
Overview

DeepintShield is OpenAI API-compatible: point your existing OpenAI SDK at the DeepintShield endpoint and your requests, responses, and errors work unchanged.

You keep your OpenAI SDK-based architecture while gaining DeepintShield features like governance, load balancing, semantic caching, and multi-provider support.

Endpoint: /openai


Install with the OpenAI extra:

pip install "deepintshield[openai]"

Then create a pre-wired client from your virtual key:

from deepintshield import DeepintShield

shield = DeepintShield(virtual_key="sk-bf-your-virtual-key")
client = shield.openai()  # pre-wired openai.OpenAI

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

Use multiple providers through the same OpenAI SDK format by prefixing model names with the provider:

import openai

client = openai.OpenAI(
    base_url="https://app.deepintshield.com/openai",
    api_key="sk-bf-your-virtual-key",
    default_headers={"x-bf-vk": "sk-bf-your-virtual-key"},
)

# OpenAI models (default)
openai_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from OpenAI!"}],
)

# Anthropic models via OpenAI SDK format
anthropic_response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet",
    messages=[{"role": "user", "content": "Hello from Claude!"}],
)

# Google Vertex models via OpenAI SDK format
vertex_response = client.chat.completions.create(
    model="vertex/gemini-2.5-flash",
    messages=[{"role": "user", "content": "Hello from Gemini!"}],
)

# Azure models
azure_response = client.chat.completions.create(
    model="azure/gpt-4o",
    messages=[{"role": "user", "content": "Hello from Azure!"}],
)

# Local Ollama models
ollama_response = client.chat.completions.create(
    model="ollama/llama3.1:8b",
    messages=[{"role": "user", "content": "Hello from Ollama!"}],
)
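
The prefix convention above can also be mirrored on the client side, for example to log or validate which provider a model string targets before sending it. The following is a minimal sketch, not DeepintShield's routing logic: `split_model` is a hypothetical helper, and the `openai` default for unprefixed names is an assumption based on the examples above.

```python
from typing import Tuple

# Assumption: unprefixed model names resolve to OpenAI, as in the examples above.
DEFAULT_PROVIDER = "openai"


def split_model(model: str) -> Tuple[str, str]:
    """Split 'provider/model' into (provider, model_name).

    Hypothetical helper for illustration only.
    """
    # Split on the first slash only, so names such as 'llama3.1:8b' stay intact.
    provider, sep, name = model.partition("/")
    if not sep:
        # No prefix present: the whole string is the model name.
        return DEFAULT_PROVIDER, model
    if not provider or not name:
        raise ValueError(f"malformed model string: {model!r}")
    return provider, name


if __name__ == "__main__":
    for m in ("gpt-4o-mini", "anthropic/claude-3-5-sonnet", "ollama/llama3.1:8b"):
        print(m, "->", split_model(m))
```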

Pass custom headers required by DeepintShield plugins (such as governance and telemetry):

import openai

client = openai.OpenAI(
    base_url="https://app.deepintshield.com/openai",
    api_key="dummy-key",
    default_headers={
        "x-bf-vk": "vk_12345",  # Virtual key for governance
    },
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello with custom headers!"}],
)
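
To keep the virtual key out of source code, the client arguments can be built from the environment. This is a minimal sketch under stated assumptions: `client_kwargs` is a hypothetical helper and `DEEPINTSHIELD_VIRTUAL_KEY` is a variable name chosen for this example, not one DeepintShield defines.

```python
import os
from typing import Dict, Mapping, Optional

BASE_URL = "https://app.deepintshield.com/openai"  # endpoint used in the examples above


def client_kwargs(
    env: Mapping[str, str] = os.environ,
    extra_headers: Optional[Mapping[str, str]] = None,
) -> Dict[str, object]:
    """Build keyword arguments for openai.OpenAI(...).

    DEEPINTSHIELD_VIRTUAL_KEY is a hypothetical variable name for this sketch.
    """
    virtual_key = env.get("DEEPINTSHIELD_VIRTUAL_KEY")
    if not virtual_key:
        raise RuntimeError("DEEPINTSHIELD_VIRTUAL_KEY is not set")
    # x-bf-vk carries the virtual key, as in the example above.
    headers = {"x-bf-vk": virtual_key}
    headers.update(extra_headers or {})
    return {
        "base_url": BASE_URL,
        "api_key": "dummy-key",  # placeholder value, as in the example above
        "default_headers": headers,
    }
```

With the `openai` package installed, the result can be passed straight through: `client = openai.OpenAI(**client_kwargs())`.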

Pass provider API keys directly in requests to bypass DeepintShield's load balancing. Any provider's key works (OpenAI, Anthropic, Mistral, and so on), because DeepintShield only looks for the Authorization or x-api-key headers. This requires the Allow Direct API keys option to be enabled in the DeepintShield configuration.

Learn more: See Key Management for enabling direct API key usage.

import openai

# Using OpenAI's API key directly
client_with_direct_key = openai.OpenAI(
    base_url="https://app.deepintshield.com/openai",
    api_key="sk-your-openai-key",  # OpenAI's API key works
)

openai_response = client_with_direct_key.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from GPT!"}],
)

# Or pass different provider keys per request
client = openai.OpenAI(
    base_url="https://app.deepintshield.com/openai",
    api_key="dummy-key",
)

# Use OpenAI key for GPT models
openai_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello GPT!"}],
    extra_headers={
        "Authorization": "Bearer sk-your-openai-key",
    },
)

# Use Anthropic key for Claude models
anthropic_response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet",
    messages=[{"role": "user", "content": "Hello Claude!"}],
    extra_headers={
        "x-api-key": "sk-ant-your-anthropic-key",
    },
)

# Use Gemini key for Gemini models
gemini_response = client.chat.completions.create(
    model="gemini/gemini-2.5-flash",
    messages=[{"role": "user", "content": "Hello Gemini!"}],
    extra_headers={
        "x-goog-api-key": "sk-gemini-your-gemini-key",
    },
)
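
When keys vary per request, the header can be chosen from the model prefix instead of being written out at each call site. The sketch below is illustrative: `direct_key_headers` is a hypothetical helper, the header names are copied from the examples above, and falling back to a bearer token for any other provider is an assumption.

```python
from typing import Dict, Mapping


def direct_key_headers(model: str, keys: Mapping[str, str]) -> Dict[str, str]:
    """Return extra_headers carrying the provider key for `model`.

    Hypothetical helper; header names mirror the examples above.
    """
    # Unprefixed model names are treated as OpenAI, as in the examples above.
    provider = model.split("/", 1)[0] if "/" in model else "openai"
    try:
        key = keys[provider]
    except KeyError:
        raise KeyError(f"no API key configured for provider {provider!r}") from None
    if provider == "anthropic":
        return {"x-api-key": key}
    if provider == "gemini":
        return {"x-goog-api-key": key}
    # OpenAI, and by assumption any other provider, uses a bearer token.
    return {"Authorization": f"Bearer {key}"}
```

The returned dictionary is meant to be passed as `extra_headers=` to `client.chat.completions.create(...)`.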

For Azure, use the AzureOpenAI client and point it at the DeepintShield integration endpoint. The x-bf-azure-endpoint header is required; it specifies your Azure resource endpoint.

from openai import AzureOpenAI

azure_client = AzureOpenAI(
    api_key="your-azure-api-key",
    api_version="2024-02-01",
    azure_endpoint="https://app.deepintshield.com/openai",  # Point to DeepintShield
    default_headers={
        "x-bf-azure-endpoint": "https://your-resource.openai.azure.com",
    },
)

azure_response = azure_client.chat.completions.create(
    model="gpt-4-deployment",  # Your deployment name
    messages=[{"role": "user", "content": "Hello from Azure!"}],
)
print(azure_response.choices[0].message.content)

Submit inference requests asynchronously and poll for results later using the x-bf-async header. This is useful for long-running requests where you don’t want to hold a connection open. See Async Inference for full details.

# Chat Completions API
import time

import openai

client = openai.OpenAI(
    base_url="https://app.deepintshield.com/openai",
    api_key="sk-bf-your-virtual-key",
    default_headers={"x-bf-vk": "sk-bf-your-virtual-key"},
)

# Submit async request
initial = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    extra_headers={"x-bf-async": "true"},
)

# If choices are present, the request completed synchronously
if initial.choices:
    print(initial.choices[0].message.content)
else:
    # Poll until completed
    while True:
        time.sleep(2)
        poll = client.chat.completions.create(
            model="openai/gpt-4o-mini",
            messages=[{"role": "user", "content": "Tell me a short story."}],
            extra_headers={"x-bf-async-id": initial.id},
        )
        if poll.choices:
            print(poll.choices[0].message.content)
            break

The same flow with the Responses API:

import time

import openai

client = openai.OpenAI(
    base_url="https://app.deepintshield.com/openai",
    api_key="sk-bf-your-virtual-key",
    default_headers={"x-bf-vk": "sk-bf-your-virtual-key"},
)

# Submit async request
initial = client.responses.create(
    model="openai/gpt-4o-mini",
    input="Tell me a short story.",
    extra_headers={"x-bf-async": "true"},
)

# If status is "completed", the request completed synchronously
if initial.status == "completed":
    print(initial.output_text)
else:
    # Poll until completed
    while True:
        time.sleep(2)
        poll = client.responses.create(
            model="openai/gpt-4o-mini",
            input="Tell me a short story.",
            extra_headers={"x-bf-async-id": initial.id},
        )
        if poll.status == "completed":
            print(poll.output_text)
            break

Header | Description
x-bf-async: true | Submit the request as an async job. Returns immediately with a job ID.
x-bf-async-id: <job-id> | Poll for results of a previously submitted async job.
x-bf-async-job-result-ttl: <seconds> | Override the default result TTL (default: 3600s).
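
The polling loops above run until the job completes. In practice a deadline is often useful, so a stuck job does not block forever. The following is a minimal sketch of a bounded polling helper; `poll_until` and its parameters are hypothetical names for this example and are not part of DeepintShield or the OpenAI SDK.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fetch() every `interval` seconds until is_done(result) is true.

    Raises TimeoutError if the next attempt would start after the deadline.
    `sleep` and `clock` are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() + interval > deadline:
            raise TimeoutError(f"job not completed within {timeout} seconds")
        sleep(interval)
```

For the Chat Completions flow, `fetch` would wrap the `client.chat.completions.create(...)` call that sends `x-bf-async-id`, and `is_done` would be `lambda r: bool(r.choices)`. For the Responses flow, `is_done` would be `lambda r: r.status == "completed"`.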

The OpenAI integration supports every feature that is available in both the OpenAI SDK and DeepintShield core: if the SDK supports a feature and DeepintShield supports it, the feature works through the integration.