
API Reference Overview

The DeepintShield gateway exposes an HTTP API that is drop-in compatible with the OpenAI, Anthropic, and Google Gemini wire formats. Point your existing client at the gateway, supply a virtual key, and every request is guarded, cached, routed, and logged before it reaches a provider - no SDK changes required.

Point your client at the hosted DeepintShield cloud gateway:

https://app.deepintshield.com

If you run the Enterprise VPC / Self-Hosted data plane, replace the host with your own control plane (for example https://<your-deepintshield-host>) - the API surface remains the same. See Setting up the gateway.

Inference requests are authenticated with a virtual key. Create and manage keys from the Web UI - see Virtual Keys.

The gateway accepts the virtual key in any of the following headers, so you can keep using whichever header your existing client already sends:

| Header | Typical client |
|---|---|
| `x-bf-vk: <virtual-key>` | DeepintShield-native clients |
| `Authorization: Bearer <virtual-key>` | OpenAI SDKs and most HTTP clients |
| `x-api-key: <virtual-key>` | Anthropic SDK |
| `x-goog-api-key: <virtual-key>` | Google Gemini SDK |

    curl -X POST https://app.deepintshield.com/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $DEEPINTSHIELD_VIRTUAL_KEY" \
      -d '{
        "model": "openai/gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello, DeepintShield!"}]
      }'
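
The same request can be built from Python with only the standard library. The sketch below is illustrative rather than an official client: the function name `build_chat_request` and the style labels (`native`, `openai`, `anthropic`, `gemini`) are made up for this example, while the header names and the endpoint come from this page. It constructs the request without sending it.

```python
import json
import urllib.request

GATEWAY_URL = "https://app.deepintshield.com"  # hosted gateway from this page

# One entry per accepted header. The dictionary keys are illustrative
# labels chosen for this sketch, not gateway terminology.
AUTH_STYLES = {
    "native": lambda key: ("x-bf-vk", key),
    "openai": lambda key: ("Authorization", f"Bearer {key}"),
    "anthropic": lambda key: ("x-api-key", key),
    "gemini": lambda key: ("x-goog-api-key", key),
}


def build_chat_request(virtual_key, style="openai", base_url=GATEWAY_URL):
    """Build (but do not send) a chat completion request."""
    header_name, header_value = AUTH_STYLES[style](virtual_key)
    body = {
        "model": "openai/gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello, DeepintShield!"}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", header_name: header_value},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body. For a self-hosted deployment, pass your own host as `base_url`.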

These routes accept and return the OpenAI request/response format and work with any configured provider - the gateway translates as needed. Send them to the gateway base URL.

| Method | Path | Purpose |
|---|---|---|
| POST | `/v1/chat/completions` | Chat completions (streaming and non-streaming) |
| POST | `/v1/responses` | OpenAI Responses API |
| POST | `/v1/completions` | Legacy text completions |
| POST | `/v1/embeddings` | Embeddings |
| POST | `/v1/rerank` | Reranking |
| POST | `/v1/audio/speech` | Text-to-speech |
| POST | `/v1/audio/transcriptions` | Speech-to-text |
| POST | `/v1/images/generations` | Image generation |
| GET | `/v1/models` | List available models |
| GET | `/health` | Gateway health check (no auth) |

Streaming responses follow the standard OpenAI Server-Sent Events format - set "stream": true in the request body. See Streaming.
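
As a rough illustration of consuming such a stream, the sketch below parses OpenAI-style Server-Sent Events: each event is a `data: <json>` line carrying a `choices[].delta.content` fragment, and the stream ends with `data: [DONE]`. The function name `iter_stream_text` is made up for this example, and the parser assumes the gateway emits the standard OpenAI chunk shape, as stated above.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style SSE lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI format
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

The response object returned by `urllib.request.urlopen` can be passed in directly, since iterating over it yields one line at a time.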

Every inference endpoint above has an /v1/async/... variant that returns a job id you can poll, which is useful for long-running or batched work:

| Method | Path | Purpose |
|---|---|---|
| POST | `/v1/async/chat/completions` | Submit an async chat completion job |
| GET | `/v1/async/chat/completions/{job_id}` | Retrieve the result of a job |

The same pattern (POST to submit, GET with {job_id} to retrieve) applies to responses, embeddings, audio/speech, audio/transcriptions, and the images/* endpoints.
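
The submit-then-poll flow can be sketched as follows. This page does not document the shape of the job response, so the job id field (`id`) and the completion check (`status == "completed"`) are assumptions to replace with the real field names. `send` is a hypothetical callable you supply that performs the HTTP request and returns the parsed JSON.

```python
import time


def async_path(endpoint, job_id=None):
    """Map e.g. 'chat/completions' to its async submit or retrieve path."""
    path = f"/v1/async/{endpoint.strip('/')}"
    return path if job_id is None else f"{path}/{job_id}"


def run_async_job(send, endpoint, body, *, job_id_field="id",
                  is_done=lambda r: r.get("status") == "completed",
                  interval=1.0, max_polls=30):
    """Submit a job, then poll until is_done(result) is true.

    send(method, path, body) must return the parsed JSON response.
    job_id_field and is_done are ASSUMED defaults, not documented values.
    """
    submitted = send("POST", async_path(endpoint), body)
    job_id = submitted[job_id_field]
    for _ in range(max_polls):
        result = send("GET", async_path(endpoint, job_id), None)
        if is_done(result):
            return result
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")
```

Passing the transport in as `send` keeps the polling logic independent of any HTTP library and makes it easy to exercise with a stub.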

If you prefer to keep your client speaking a provider’s exact dialect, send requests to the matching prefix. The gateway still applies guardrails, caching, routing, and logging, then forwards in the provider’s native format.

Prefix every OpenAI path with /openai:

POST /openai/v1/chat/completions
POST /openai/v1/responses
POST /openai/v1/embeddings
GET /openai/v1/models

See the OpenAI integration.
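
A minimal sketch of the prefixing rule, assuming the hosted gateway URL from this page. The helper name `openai_passthrough_url` is made up for this example.

```python
GATEWAY_URL = "https://app.deepintshield.com"  # hosted gateway from this page


def openai_passthrough_url(openai_path, base_url=GATEWAY_URL):
    """Turn an OpenAI path such as '/v1/models' into its passthrough URL."""
    if not openai_path.startswith("/"):
        openai_path = "/" + openai_path
    return f"{base_url.rstrip('/')}/openai{openai_path}"
```

For a client that takes a configurable base URL ending in `/v1`, the equivalent setting would be the gateway host followed by `/openai/v1`, with the virtual key supplied as the API key.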

DeepintShield also exposes passthrough routes for Azure OpenAI, Amazon Bedrock, Cohere, LiteLLM, LangChain, and PydanticAI. Browse them all under Integrations.

Errors are returned with standard HTTP status codes and a JSON body. Common cases:

| Status | Meaning |
|---|---|
| 401 | Missing or invalid virtual key |
| 403 | Request blocked by a guardrail or governance policy |
| 429 | Rate limit or budget exceeded |
| 5xx | Upstream provider or gateway error |
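
One way a client might act on these codes is sketched below. The retry policy (exponential backoff on 429 and 5xx, immediate failure on 401 and 403) is a common client-side convention and an assumption here, not documented gateway behavior. `GatewayError`, `classify_status`, and `call_with_retry` are names invented for this example.

```python
import time


class GatewayError(Exception):
    """Raised when a request fails and is not (or no longer) retried."""

    def __init__(self, status, kind, body):
        super().__init__(f"{status} ({kind})")
        self.status, self.kind, self.body = status, kind, body


def classify_status(status):
    """Map an HTTP status to the categories in the table above."""
    if 200 <= status <= 299:
        return "ok"
    if status == 401:
        return "auth"      # missing or invalid virtual key
    if status == 403:
        return "blocked"   # guardrail or governance policy
    if status == 429:
        return "limited"   # rate limit or budget exceeded
    if 500 <= status <= 599:
        return "server"    # upstream provider or gateway error
    return "other"


def call_with_retry(do_request, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call do_request() -> (status, body); retry 429/5xx with backoff."""
    for attempt in range(max_attempts):
        status, body = do_request()
        kind = classify_status(status)
        if kind == "ok":
            return body
        if kind in ("limited", "server") and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
            continue
        raise GatewayError(status, kind, body)
```

A 429 caused by an exhausted budget will not clear on retry, so callers may want to inspect the JSON error body before backing off.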