Azure

Overview

Azure gives you access to OpenAI and Anthropic models through the Azure OpenAI Service, called through DeepIntShield with the same OpenAI-compatible Chat Completions and Responses APIs. When you configure and call Azure, the things you set up are:

Deployments - map your model identifiers to Azure deployment IDs (see Deployment Selection)
Authentication - API key, Entra ID (Service Principal), or Managed Identity (see Key Configuration)
Endpoint - point at your Azure OpenAI resource endpoint
API version - configurable, with a preview version used for the Responses API (see API Versioning)
Models - both OpenAI and Anthropic (Claude) models are supported; the right request shape is applied automatically from the model you select

Supported Operations

Operation	Non-Streaming	Streaming	Endpoint
Chat Completions	✅	✅	`/openai/v1/chat/completions`
Responses API	✅	✅	`/openai/v1/responses`
Embeddings	✅	-	`/openai/v1/embeddings`
Files	✅	-	`/openai/v1/files`
List Models	✅	-	`/openai/v1/models`
Image Generation	✅	✅	`/openai/v1/images/generations`
Image Edit	✅	✅	`/openai/v1/images/edits`
Video Generation	✅	-	`/openai/v1/videos`
Image Variation	❌	❌	-
Batch	❌	❌	-
Text Completions	❌	❌	-
Speech (TTS)	❌	❌	-

1. Chat Completions

Request Parameters

Send standard OpenAI-compatible Chat Completions requests. Your model maps to the configured Azure deployment. Parameters supported by the underlying model (OpenAI or Claude) apply - see the OpenAI and Anthropic provider pages. The correct behavior is applied automatically based on the model name.

Authentication Configuration

Azure uses custom endpoint and deployment configuration:

curl -X POST https://app.deepintshield.com/v1/chat/completions \
  -H "Authorization: Bearer sk-bf-your-virtual-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/gpt-4-deployment",
    "messages": [{"role": "user", "content": "Hello"}],
    "deployment": "my-gpt4-deployment",
    "endpoint": "https://my-org.openai.azure.com"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

Key Configuration

Azure supports three authentication methods: Managed Identity (DefaultAzureCredential), Entra ID (Service Principal), and Direct (API Key). Precedence: Entra ID (if configured) → API key (if value set) → DefaultAzureCredential.

Managed Identity / DefaultAzureCredential

If no API key and no Entra ID credentials are provided, DeepIntShield automatically uses DefaultAzureCredential, which detects the auth environment.

{
  "azure_key_config": {
    "endpoint": "https://your-org.openai.azure.com",
    "api_version": "2024-10-21",
    "deployments": {
      "gpt-4": "my-gpt4-deployment"
    }
  }
}

Azure Entra ID (Service Principal)

If you set client_id, client_secret, and tenant_id, Azure Entra ID authentication will be used with priority over API key authentication.

{
  "azure_key_config": {
    "endpoint": "https://your-org.openai.azure.com",
    "client_id": "your-client-id",
    "client_secret": "your-client-secret",
    "tenant_id": "your-tenant-id",
    "scopes": ["https://cognitiveservices.azure.com/.default"],
    "api_version": "2024-10-21",
    "deployments": {
      "gpt-4": "my-gpt4-deployment",
      "gpt-4-turbo": "my-gpt4-turbo-deployment",
      "claude-3": "my-claude-deployment"
    }
  }
}

Required Azure Roles:

For OpenAI models: Cognitive Services OpenAI User
For Anthropic models: Cognitive Services AI Services User

Direct Authentication (API Key)

{
  "azure_key_config": {
    "endpoint": "https://your-org.openai.azure.com",
    "api_version": "2024-10-21",
    "deployments": {
      "gpt-4": "my-gpt4-deployment",
      "gpt-4-turbo": "my-gpt4-turbo-deployment",
      "claude-3": "my-claude-deployment"
    }
  }
}

Configuration Details:

endpoint - Azure OpenAI resource endpoint (required)
client_id - Azure Entra ID client ID (optional, for Service Principal auth)
client_secret - Azure Entra ID client secret (optional, for Service Principal auth)
tenant_id - Azure Entra ID tenant ID (optional, for Service Principal auth)
scopes - OAuth scopes for token requests (default: ["https://cognitiveservices.azure.com/.default"])
api_version - API version to use (default: 2024-10-21)
deployments - Map of model names to deployment IDs (optional, can be provided per-request)
allowed_models - List of allowed models to use from this key (optional)

Deployment Selection

Deployments can be specified at three levels (in order of precedence):

Per-request (highest priority)
```
{"deployment": "custom-deployment"}
```

Key configuration

{"deployments": {"gpt-4": "my-gpt4-deployment"}}

Model name (lowest priority, if no deployment specified) Model name is used as deployment ID directly

OpenAI Models

When using OpenAI models (GPT-4, GPT-4 Turbo, GPT-3.5-Turbo, etc.), all OpenAI-standard parameters are supported. See the OpenAI provider page for details.

Anthropic Models

When using Anthropic (Claude) models through Azure, all standard Anthropic parameters are supported, including reasoning/thinking, system messages, and tools. See the Anthropic provider page for details. The minimum reasoning budget is 1024 tokens.

API Versioning

Default version: 2024-10-21 (supports latest OpenAI features)
Preview version: preview (used for Responses API)
Custom version: Set via api_version in key config

The Responses API automatically uses the preview API version. Different API versions may have different feature support.

Streaming

Set "stream": true to receive incremental output. Streaming uses the OpenAI or Anthropic streaming format depending on the model.

2. Responses API

The Responses API is available for both OpenAI and Anthropic models on Azure and uses the preview API version.

Request Parameters

Send standard OpenAI Responses requests with instructions, input (string or array), max_output_tokens, and other parameters supported by the underlying model. The correct behavior is applied automatically based on the model name (OpenAI or Claude). This API uses the preview API version.

curl -X POST https://app.deepintshield.com/v1/responses \
  -H "Authorization: Bearer sk-bf-your-virtual-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/claude-3-sonnet",
    "input": "Hello, how are you?",
    "instructions": "You are a helpful assistant",
    "deployment": "my-claude-deployment",
    "endpoint": "https://my-org.openai.azure.com"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

For parameter details, see the OpenAI Responses API and Anthropic Responses API pages.

3. Embeddings

Embeddings are supported for OpenAI models only (not available for Anthropic models on Azure).

Request Parameters

Parameter	Notes
`input`	Text to embed (single string or array)
`model`	Maps to the configured Azure deployment
`dimensions`	Optional output size (when supported)

curl -X POST https://app.deepintshield.com/v1/embeddings \
  -H "Authorization: Bearer sk-bf-your-virtual-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": ["text to embed"],
    "deployment": "my-embedding-deployment"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

Response

The embeddings response uses the standard OpenAI format:

{
  "data": [
    {
      "object": "embedding",
      "embedding": [0.1234, -0.5678, ...],
      "index": 0
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}

4. Files API

Files operations are supported for OpenAI models only.

Supported Operations

Operation	Support
Upload	✅
List	✅
Retrieve	✅
Delete	✅
Get Content	✅

Files are stored in Azure and can be used with batch operations.

5. Image Generation

Image Generation is supported for OpenAI models on Azure and uses the OpenAI-compatible format.

Request Parameters

Azure uses the same request format as OpenAI (see OpenAI Image Generation). Your model maps to the configured Azure deployment; prompt and all other parameters use the OpenAI format.

curl -X POST https://app.deepintshield.com/v1/images/generations \
  -H "Authorization: Bearer sk-bf-your-virtual-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/dall-e-3",
    "prompt": "A sunset over the mountains",
    "size": "1024x1024",
    "n": 1,
    "deployment": "my-image-gen-deployment"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

Response

Images are returned in the standard OpenAI-compatible image-generation response format.

Streaming

Image generation streaming is supported and uses OpenAI’s streaming format with Server-Sent Events (SSE). See OpenAI Image Generation Streaming.

6. Image Edit

Image Edit is supported for OpenAI models on Azure and uses the OpenAI-compatible format.

Azure uses the same request format as OpenAI (see OpenAI Image Edit):

URL Format: {endpoint}/openai/deployments/{deployment}/images/edits?api-version={apiVersion}
Authentication: Azure API key or OAuth bearer token
Deployment Mapping: Your model identifier is mapped to the configured Azure deployment ID
Response: Same as OpenAI - returned as-is in the DeepIntShield image-generation response format
Streaming: Supported, using the same SSE streaming format as OpenAI

Endpoint: /openai/deployments/{deployment}/images/edits?api-version={apiVersion}

7. List Models

Request Parameters

None required.

Response

Lists available models/deployments configured in the Azure key. Response includes model metadata, capabilities, and lifecycle status.

{
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1687882411,
      "status": "active",
      "lifecycle_status": "stable",
      "capabilities": {
        "chat_completion": true,
        "embeddings": false
      }
    }
  ]
}

Caveats

Deployment ID Required

Severity: High Behavior: Model names must map to Azure deployment IDs Impact: Request fails without valid deployment mapping

Automatic Model Handling

Severity: Medium Behavior: OpenAI and Anthropic models are detected automatically from the model name Impact: You don’t need to indicate the underlying model family

Responses API Preview Version

Severity: Medium Behavior: The Responses API uses the preview API version, which differs from the Chat Completions API version Impact: Different API version for Responses vs Chat Completions

Version Matching for Deployments

Severity: Low Behavior: Model version differences ignored when matching to deployments Impact: gpt-4 and gpt-4-turbo can map to same deployment

8. Video Generation

Azure routes video generation to OpenAI’s Sora models via the Azure OpenAI-compatible endpoint. All parameters are identical to OpenAI Video Generation.

Supported Operations

Operation	Supported	Notes
Generate	✅	`POST /v1/videos`
Retrieve	✅	`GET /v1/videos/{id}`
Download	✅	`GET /v1/videos/{id}/content`
Delete	✅	`DELETE /v1/videos/{id}`
List	✅	`GET /v1/videos`
Remix	❌	Not supported

Configuration

HTTP Settings: API Version 2024-10-21 (configurable) | Max Connections 5000 | Max Idle 60 seconds

Endpoint Format: https://{resource-name}.openai.azure.com/openai/v1/{path}?api-version={version}

Note: DeepIntShield automatically constructs URLs using the endpoint from key configuration and the configured API version.

Setup & Configuration

Azure requires endpoint URLs, deployment mappings, and API version configuration. For detailed instructions, see Provider-Specific Authentication - Azure in the Gateway Quickstart for configuration steps in the Web UI.