Azure OpenAI Client¶
The Azure OpenAI client provides standardized access to Azure-hosted OpenAI models through the Arshai framework. It implements the full ILLM interface with support for chat, streaming, function calling, structured output, and background tasks.
Note
This documentation reflects the actual implementation based on tested functionality. The Azure client uses the same underlying OpenAI SDK with Azure-specific configuration.
Configuration¶
Basic Setup:
from arshai.llms.azure import AzureClient
from arshai.core.interfaces.illm import ILLMConfig
# Configure the client
config = ILLMConfig(
    model="gpt-4o-mini",       # Your Azure deployment name
    temperature=0.7,           # 0.0 = deterministic, 1.0 = creative
    max_tokens=500,            # Response length limit
    top_p=1.0,                 # Nucleus sampling parameter
    frequency_penalty=0.0,     # Positive values reduce repetition
    presence_penalty=0.0       # Positive values encourage new topics
)
# Create client with Azure-specific configuration
client = AzureClient(
    config=config,
    azure_deployment="your-deployment-name",  # Optional if set in env
    api_version="2024-10-21"                  # Optional if set in env
)
Environment Variables:
# Required
export AZURE_OPENAI_API_KEY="your-azure-api-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_DEPLOYMENT="your-deployment-name"
export AZURE_API_VERSION="2024-10-21"
# Optional - Azure AD token, used instead of the API key
export AZURE_OPENAI_AD_TOKEN="your-ad-token" # Alternative to API key
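Before constructing the client, it can help to fail fast when a variable is missing. The following is a minimal sketch and not part of Arshai: the helper name `missing_azure_env` is hypothetical, and it assumes, as the comments above suggest, that either `AZURE_OPENAI_API_KEY` or `AZURE_OPENAI_AD_TOKEN` is enough for authentication.

```python
import os
from typing import List, Mapping, Optional

# Variables the section above lists as required (besides credentials).
REQUIRED_VARS = ("AZURE_OPENAI_ENDPOINT", "AZURE_DEPLOYMENT", "AZURE_API_VERSION")
# Assumption: either credential is sufficient on its own.
CREDENTIAL_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_AD_TOKEN")


def missing_azure_env(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of unset variables (hypothetical helper, not an Arshai API)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if not any(env.get(name) for name in CREDENTIAL_VARS):
        missing.append(" or ".join(CREDENTIAL_VARS))
    return missing
```

Calling `missing_azure_env()` with no arguments checks the real process environment; an empty list means every variable named above is set.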
Azure-Specific Configuration:
# Initialize with explicit Azure parameters
client = AzureClient(
    config=config,
    azure_deployment="my-gpt-4-deployment",
    api_version="2024-10-21"
)
# Or let it read from environment variables
client = AzureClient(config=config)
Supported Models¶
The Azure OpenAI client supports all models available in your Azure OpenAI resource. The client works with any deployment name configured in your Azure OpenAI service, including:
- Deployment-Based Model Access
  - Unlike direct OpenAI, Azure uses deployment names rather than model names
  - Each deployment maps to a specific model version
  - Specify your deployment name in the model parameter
- Common Azure Deployments (examples):
  - gpt-4o: GPT-4o ("omni") model deployments
  - gpt-4o-mini: Fast and cost-effective deployments
  - gpt-4-turbo: High-performance model deployments
  - gpt-35-turbo: Efficient model deployments
- Enterprise Features
  - Private endpoints and VNet integration
  - Customer-managed keys and data residency
  - Azure Active Directory authentication
  - Compliance with enterprise security requirements
Note
Deployment Names: The model parameter should contain your Azure deployment name, not the underlying model name.
Regional Availability: Check your Azure region for model availability and update API versions for latest features.
Basic Usage¶
Simple conversation:
from arshai.core.interfaces.illm import ILLMInput
# Prepare input
input_data = ILLMInput(
    system_prompt="You are a helpful AI assistant specializing in Azure cloud services.",
    user_message="How do I set up Azure OpenAI with private endpoints?"
)
# Get response
response = await client.chat(input_data)
print(response["llm_response"])
print(f"Tokens used: {response['usage']['total_tokens']}")
Streaming responses:
async for chunk in client.stream(input_data):
    if chunk.get("llm_response"):
        print(chunk["llm_response"], end="", flush=True)
    if chunk.get("usage"):
        print(f"\nTotal tokens: {chunk['usage']['total_tokens']}")
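When the full text is needed after streaming, the chunks can be collected instead of printed. The sketch below assumes that each chunk carries only newly generated text in `llm_response` (as the `end=""` printing above suggests) and that usage arrives in a later chunk. `collect_stream` and `fake_stream` are hypothetical names, and `fake_stream` stands in for `client.stream(input_data)` so the example runs offline.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, Optional, Tuple


async def collect_stream(
    chunks: AsyncIterator[Dict[str, Any]],
) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Join streamed text pieces and keep the last usage dict seen."""
    parts = []
    usage = None
    async for chunk in chunks:
        text = chunk.get("llm_response")
        if text:
            parts.append(text)  # assumes each chunk carries only new text
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage


async def fake_stream() -> AsyncIterator[Dict[str, Any]]:
    """Stand-in for client.stream(input_data); not part of Arshai."""
    yield {"llm_response": "Hello"}
    yield {"llm_response": ", Azure"}
    yield {"llm_response": None, "usage": {"total_tokens": 12}}


text, usage = asyncio.run(collect_stream(fake_stream()))
print(text)   # Hello, Azure
print(usage)  # {'total_tokens': 12}
```

If the client turns out to send cumulative text rather than increments, the join would need to be replaced by keeping only the last non-empty `llm_response`.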
Function Calling¶
The Azure client supports the same function calling interface as the OpenAI client:
Regular Functions:
def check_azure_service_health(service_name: str, region: str = "eastus") -> dict:
    """Check the health status of an Azure service in a specific region."""
    # Mock implementation
    return {
        "service": service_name,
        "region": region,
        "status": "healthy",
        "last_updated": "2024-01-15T10:30:00Z"
    }

def estimate_azure_costs(service: str, tier: str, hours: int = 24) -> dict:
    """Estimate Azure service costs for a given time period."""
    base_rates = {"basic": 0.10, "standard": 0.25, "premium": 0.50}
    hourly_rate = base_rates.get(tier.lower(), 0.25)
    return {
        "service": service,
        "tier": tier,
        "hours": hours,
        "estimated_cost": hourly_rate * hours,
        "currency": "USD"
    }

input_data = ILLMInput(
    system_prompt="You are an Azure consultant. Use the provided tools to help with Azure questions.",
    user_message="Check the health of Azure OpenAI in East US and estimate costs for standard tier for 48 hours.",
    regular_functions={
        "check_azure_service_health": check_azure_service_health,
        "estimate_azure_costs": estimate_azure_costs
    },
    max_turns=10
)
response = await client.chat(input_data)
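Because `regular_functions` takes plain callables, their names, type hints, and docstrings are presumably what describes each tool to the model. How Arshai performs that conversion is not shown in this page, so the sketch below only illustrates, with the standard `inspect` module, what information a signature exposes. `describe_tool` is a hypothetical helper, not an Arshai API.

```python
import inspect
from typing import Any, Callable, Dict


def describe_tool(func: Callable[..., Any]) -> Dict[str, Any]:
    """Summarize a callable's name, docstring, and parameters (illustrative only)."""
    params = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        params[name] = {
            "type": getattr(annotation, "__name__", str(annotation)),
            # A parameter without a default must be supplied by the caller.
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
    }


def estimate_azure_costs(service: str, tier: str, hours: int = 24) -> dict:
    """Estimate Azure service costs for a given time period."""
    return {"service": service, "tier": tier, "hours": hours}


tool = describe_tool(estimate_azure_costs)
print(tool["parameters"]["hours"])  # {'type': 'int', 'required': False}
```

Whatever the actual mechanism, precise type hints and a one-line docstring on each function give the model the clearest picture of when and how to call it.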
Background Tasks (logging, monitoring, etc.):
def log_azure_usage(service: str, operation: str, user_id: str = "system"):
    """Log Azure service usage for compliance tracking."""
    import datetime
    timestamp = datetime.datetime.now().isoformat()
    print(f"[AUDIT] {timestamp} - Service: {service}, Operation: {operation}, User: {user_id}")

input_data = ILLMInput(
    system_prompt="You are an Azure AI assistant. Log all interactions for compliance.",
    user_message="Help me understand Azure OpenAI pricing models.",
    background_tasks={
        "log_azure_usage": log_azure_usage
    }
)
response = await client.chat(input_data)
# Automatically logs the interaction in the background
Structured Output¶
Generate structured data for Azure automation:
from pydantic import BaseModel, Field
from typing import List
class AzureResourceRecommendation(BaseModel):
    """Structured Azure resource recommendation."""
    resource_type: str = Field(description="Type of Azure resource (e.g., 'App Service', 'Virtual Machine')")
    tier: str = Field(description="Recommended service tier (Basic, Standard, Premium)")
    region: str = Field(description="Recommended Azure region")
    estimated_monthly_cost: float = Field(description="Estimated monthly cost in USD")
    justification: str = Field(description="Reason for this recommendation")
    configuration_steps: List[str] = Field(description="Steps to configure this resource")

input_data = ILLMInput(
    system_prompt="You are an Azure architect. Provide detailed resource recommendations.",
    user_message="I need to host a Python web application with 1000 daily users. What Azure resources do I need?",
    structure_type=AzureResourceRecommendation
)
response = await client.chat(input_data)
recommendation = response["llm_response"] # Returns AzureResourceRecommendation instance
print(f"Resource: {recommendation.resource_type}")
print(f"Tier: {recommendation.tier}")
print(f"Region: {recommendation.region}")
print(f"Monthly Cost: ${recommendation.estimated_monthly_cost}")
print(f"Steps: {', '.join(recommendation.configuration_steps)}")
Azure-Specific Features¶
Private Endpoint Support:
# Configure for private endpoint access
client = AzureClient(
    config=config,
    azure_deployment="my-private-deployment"
)
# The client automatically uses your configured Azure endpoint
Azure Active Directory Authentication:
# Use Azure AD token instead of API key
export AZURE_OPENAI_AD_TOKEN="your-aad-token"
# Don't set AZURE_OPENAI_API_KEY when using AD auth
Multi-Region Deployments:
# Configure for specific region/deployment
us_client = AzureClient(
    config=ILLMConfig(model="us-east-gpt4"),
    azure_deployment="us-east-gpt4",
    api_version="2024-10-21"
)
eu_client = AzureClient(
    config=ILLMConfig(model="eu-west-gpt4"),
    azure_deployment="eu-west-gpt4",
    api_version="2024-10-21"
)
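With one client per region, a simple failover loop can try each in turn. This is a sketch under assumptions, not an Arshai feature: `chat_with_failover` is a hypothetical helper that relies only on the `await client.chat(input_data)` call shown earlier, and `FakeClient` replaces `AzureClient` so the example runs without Azure access.

```python
import asyncio
from typing import Any, Dict, Sequence


async def chat_with_failover(clients: Sequence[Any], input_data: Any) -> Dict[str, Any]:
    """Try each client in order and return the first successful response."""
    last_error = None
    for client in clients:
        try:
            return await client.chat(input_data)
        except Exception as exc:  # broad on purpose: any failure triggers failover here
            last_error = exc
    raise RuntimeError("All regional deployments failed") from last_error


class FakeClient:
    """Stand-in for AzureClient so the sketch runs offline."""

    def __init__(self, region: str, healthy: bool) -> None:
        self.region = region
        self.healthy = healthy

    async def chat(self, input_data: Any) -> Dict[str, Any]:
        if not self.healthy:
            raise ConnectionError(f"{self.region} unavailable")
        return {"llm_response": f"answer from {self.region}"}


clients = [FakeClient("us-east", healthy=False), FakeClient("eu-west", healthy=True)]
result = asyncio.run(chat_with_failover(clients, input_data="ping"))
print(result["llm_response"])  # answer from eu-west
```

In practice the `except` clause would likely be narrowed to connectivity and rate-limit errors, since failing over on an authentication error only hides a configuration problem.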
Content Filtering Integration:
# Azure's content filtering is automatically applied
# No additional configuration needed - handled by Azure OpenAI service
try:
    response = await client.chat(input_data)
except Exception as e:
    if "content_filter" in str(e).lower():
        print("Content was filtered by Azure OpenAI safety systems")
    else:
        raise  # Unrelated errors should not be swallowed
Error Handling¶
Azure-specific error handling:
import asyncio
async def azure_chat_with_retry(client, input_data, max_retries=3):
    """Example retry logic for Azure-specific errors."""
    for attempt in range(max_retries):
        try:
            return await client.chat(input_data)
        except Exception as e:
            error_str = str(e).lower()
            if "429" in error_str or "rate" in error_str:
                # Rate limiting: back off exponentially, then retry
                if attempt == max_retries - 1:
                    raise  # Out of retries; surface the rate-limit error
                wait_time = 2 ** attempt
                await asyncio.sleep(wait_time)
                continue
            # Non-retryable errors: print a hint, then re-raise so the caller sees the failure
            if "401" in error_str or "authentication" in error_str:
                print("Check your Azure OpenAI API key or AD token")
            elif "403" in error_str or "forbidden" in error_str:
                print("Check your Azure OpenAI resource permissions")
            elif "deployment" in error_str:
                print("Check your Azure deployment name and model availability")
            raise
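A fixed `2 ** attempt` wait can cause many callers to retry at the same moment after a shared 429. A common refinement is capped exponential backoff with random jitter. The sketch below shows only the delay calculation; `backoff_delays` and its `base` and `cap` parameters are hypothetical names chosen for illustration.

```python
import random
from typing import List, Optional


def backoff_delays(
    max_retries: int,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one wait time per retry: exponential growth, capped, with full jitter."""
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(max_retries):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))  # full jitter: anywhere in [0, ceiling]
    return delays


delays = backoff_delays(5, base=1.0, cap=8.0, rng=random.Random(0))
ceilings = [1.0, 2.0, 4.0, 8.0, 8.0]
print(all(0 <= d <= c for d, c in zip(delays, ceilings)))  # True
```

Each value could replace `wait_time` in a retry loop; passing a seeded `random.Random` keeps tests reproducible.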
Configuration Validation:
try:
    client = AzureClient(config=config)
except ValueError as e:
    print(f"Azure configuration error: {e}")
    # Check AZURE_DEPLOYMENT and AZURE_API_VERSION environment variables
Network and Regional Errors:
try:
    response = await client.chat(input_data)
except Exception as e:
    message = str(e).lower()
    if "timeout" in message:
        print("Network timeout - check Azure region connectivity")
    elif "ssl" in message:
        print("SSL/TLS error - check certificate configuration")
    elif "dns" in message:
        print("DNS resolution error - check Azure endpoint URL")
    else:
        raise  # Unrecognized errors should not be swallowed
Usage Tracking¶
Azure-specific usage information:
response = await client.chat(input_data)
if response["usage"]:
    usage = response["usage"]
    print(f"Input tokens: {usage['input_tokens']}")
    print(f"Output tokens: {usage['output_tokens']}")
    print(f"Total tokens: {usage['total_tokens']}")
    print(f"Thinking tokens: {usage['thinking_tokens']}")  # For reasoning models
    # Azure-specific metadata
    print(f"Provider: {usage['provider']}")      # Will be "azure"
    print(f"Deployment: {usage['model']}")       # Your deployment name
    print(f"Request ID: {usage['request_id']}")  # For Azure support tickets
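To track consumption across many calls, the per-response usage dicts can be summed. The sketch below is one possible approach and not an Arshai API: `UsageTotals` is a hypothetical helper that reads only the `input_tokens`, `output_tokens`, and `total_tokens` keys shown above and treats missing values as zero.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class UsageTotals:
    """Running token totals across several responses (hypothetical helper)."""

    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0
    requests: int = 0

    def add(self, usage: Optional[Mapping[str, Any]]) -> None:
        """Add one response's usage dict; missing or None values count as zero."""
        if not usage:
            return
        self.input_tokens += usage.get("input_tokens") or 0
        self.output_tokens += usage.get("output_tokens") or 0
        self.total_tokens += usage.get("total_tokens") or 0
        self.requests += 1


totals = UsageTotals()
totals.add({"input_tokens": 40, "output_tokens": 60, "total_tokens": 100})
totals.add({"input_tokens": 10, "output_tokens": 15, "total_tokens": 25})
totals.add(None)  # a response without usage data is skipped
print(totals.total_tokens, totals.requests)  # 125 2
```

With a real client, the call would be `totals.add(response["usage"])` after each `client.chat(...)`.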
Performance Optimization¶
Regional Deployment Selection:
# Choose regions close to your users
config_us = ILLMConfig(model="us-deployment", temperature=0.7)
config_eu = ILLMConfig(model="eu-deployment", temperature=0.7)
Tier-Based Cost Management:
# Use different tiers for different use cases
config_dev = ILLMConfig(
    model="dev-deployment",   # Lower tier for development
    max_tokens=100,
    temperature=0.3
)
config_prod = ILLMConfig(
    model="prod-deployment",  # Higher tier for production
    max_tokens=1000,
    temperature=0.7
)
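Choosing between these configurations at startup can be driven by an environment variable. The following is a minimal sketch, assuming a variable named `APP_ENV` (hypothetical, not read by Arshai) and reusing the placeholder deployment names from the example above. It uses plain dicts so it runs without Arshai installed.

```python
import os
from typing import Dict, Mapping, Optional

# Settings copied from the dev/prod examples; deployment names are placeholders.
PROFILES: Dict[str, Dict[str, object]] = {
    "dev": {"model": "dev-deployment", "max_tokens": 100, "temperature": 0.3},
    "prod": {"model": "prod-deployment", "max_tokens": 1000, "temperature": 0.7},
}


def select_profile(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Pick settings by APP_ENV (a hypothetical variable), defaulting to 'dev'."""
    env = os.environ if env is None else env
    name = env.get("APP_ENV", "dev").lower()
    if name not in PROFILES:
        raise ValueError(f"Unknown APP_ENV {name!r}; expected one of {sorted(PROFILES)}")
    return dict(PROFILES[name])  # copy so callers cannot mutate the shared profile


settings = select_profile({"APP_ENV": "prod"})
print(settings["model"])  # prod-deployment
```

With Arshai installed, the returned dict could be unpacked as `ILLMConfig(**settings)`, since it uses the same keyword arguments as the examples above.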
Content Filtering Optimization:
# Configure prompts to work well with Azure content filtering
input_data = ILLMInput(
    system_prompt="You are a helpful, safe, and responsible AI assistant. Follow Azure content policies.",
    user_message="Help me create appropriate content for my application."
)
Enterprise Integration¶
Azure Monitor Integration:
# The Azure OpenAI resource emits platform metrics to Azure Monitor
# Configure alerting and monitoring in Azure portal
response = await client.chat(input_data)
# Metrics are recorded on the Azure side; no client code is required
Azure Key Vault Integration:
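# Store API keys securely in Key Vault
# In Azure App Service or Azure Functions, set the app setting to a Key Vault reference.
# The platform resolves the reference before your app reads the variable;
# a plain shell export would pass the literal string instead.
AZURE_OPENAI_API_KEY=@Microsoft.KeyVault(VaultName=myVault;SecretName=openai-key)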
Virtual Network Integration:
# Configure private endpoints in Azure
# Client automatically uses configured networking
client = AzureClient(config=config) # Uses your VNet configuration
Limitations and Considerations¶
- Deployment Dependencies
  Your Azure OpenAI resource must have the required model deployments configured.
- Regional Availability
  Model availability varies by Azure region. Check Azure documentation for current regional support.
- Content Filtering
  Azure applies content filtering that may affect certain use cases. Design prompts accordingly.
- Rate Limits
  Rate limits are applied per deployment. Scale deployments for higher throughput requirements.
- API Version Updates
  Azure regularly updates API versions. Keep your api_version parameter current for latest features.
Next Steps¶
Google Gemini Client - Google Gemini integration
OpenRouter Client - Multi-provider access via OpenRouter
Extending LLM Clients - Creating custom LLM clients
Agents (Layer 2) - Building agents with LLM clients