Anthropic
Overview
Call Anthropic (Claude) models through DeepIntShield using the same OpenAI-compatible Chat Completions and Responses APIs you use for every other provider. You send standard OpenAI-style requests and DeepIntShield handles Anthropic’s native format for you. A few Anthropic-specific behaviors are worth knowing:
- Reasoning - the reasoning object drives Claude’s thinking, with a minimum token budget (see Reasoning / Thinking)
- Cache control - add cache_control directives to enable Anthropic prompt caching (see Cache Control)
- Anthropic-specific parameters - pass fields like top_k via extra_params
Supported Operations
| Operation | Non-Streaming | Streaming | Endpoint |
|---|---|---|---|
| Chat Completions | ✅ | ✅ | /v1/messages |
| Responses API | ✅ | ✅ | /v1/messages |
| Text Completions | ✅ | ❌ | /v1/complete |
| Embeddings | ❌ | ❌ | - |
| Speech (TTS) | ❌ | ❌ | - |
| Transcriptions (STT) | ❌ | ❌ | - |
| Image Generation | ❌ | ❌ | - |
| Files | ✅ | - | /v1/files |
| Batch | ✅ | - | /v1/messages/batches |
| List Models | ✅ | - | /v1/models |
1. Chat Completions
Request Parameters
Send standard OpenAI-compatible Chat Completions requests. temperature and top_p pass through directly. The following parameters are not supported by Anthropic and are ignored: frequency_penalty, presence_penalty, logit_bias, logprobs, top_logprobs, seed, parallel_tool_calls, service_tier.
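Because ignored parameters are dropped silently, it can help to surface them on the client before sending. The sketch below is only an illustration: the helper name split_ignored_params is hypothetical, and the set of names is copied from the list above.

```python
# Parameters the text above lists as ignored for Anthropic models.
IGNORED_FOR_ANTHROPIC = frozenset({
    "frequency_penalty", "presence_penalty", "logit_bias", "logprobs",
    "top_logprobs", "seed", "parallel_tool_calls", "service_tier",
})


def split_ignored_params(request: dict) -> tuple[dict, list[str]]:
    """Return (request without ignored fields, sorted names of ignored fields).

    Hypothetical client-side helper. The gateway ignores these fields on its
    own, so this only makes the behavior visible to the caller.
    """
    kept = {k: v for k, v in request.items() if k not in IGNORED_FOR_ANTHROPIC}
    ignored = sorted(k for k in request if k in IGNORED_FOR_ANTHROPIC)
    return kept, ignored


request = {
    "model": "anthropic/claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.2,
    "seed": 7,
    "logprobs": True,
}
kept, ignored = split_ignored_params(request)
print(ignored)  # ['logprobs', 'seed']
```

Logging the second return value is one way to notice that a setting such as seed has no effect on this provider.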
Extra Parameters
Pass Anthropic-specific fields such as top_k through extra_params (SDK) or directly in the request body (Gateway):

    curl -X POST https://app.deepintshield.com/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "x-bf-vk: $DEEPINTSHIELD_VIRTUAL_KEY" \
      -d '{
        "model": "anthropic/claude-3-5-sonnet",
        "messages": [{"role": "user", "content": "Hello"}],
        "top_k": 40
      }'

Anthropic also accepts a top-level "cache_control": {"type": "ephemeral"} object on requests to enable automatic prompt caching.
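The same request can be assembled from code. This is a minimal sketch using only the Python standard library; it mirrors the URL, header, and environment variable of the curl command, while the function name build_chat_request and its cache flag are hypothetical. The request is built but not sent.

```python
import json
import os
import urllib.request

GATEWAY_URL = "https://app.deepintshield.com/v1/chat/completions"


def build_chat_request(prompt: str, top_k: int, cache: bool = False) -> urllib.request.Request:
    """Build (but do not send) a Chat Completions request with Anthropic extras."""
    payload = {
        "model": "anthropic/claude-3-5-sonnet",
        "messages": [{"role": "user", "content": prompt}],
        "top_k": top_k,  # Anthropic-specific field, passed in the body (Gateway style)
    }
    if cache:
        # Top-level directive described above for automatic prompt caching.
        payload["cache_control"] = {"type": "ephemeral"}
    headers = {
        "Content-Type": "application/json",
        # Same header and environment variable as the curl example.
        "x-bf-vk": os.environ.get("DEEPINTSHIELD_VIRTUAL_KEY", ""),
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_chat_request("Hello", top_k=40, cache=True)
body = json.loads(req.data)
print(body["top_k"], body["cache_control"])
# To actually send it: urllib.request.urlopen(req)
```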
Cache Control
Cache directives can be added to system messages, user messages, and tool definitions to enable prompt caching:

    curl -X POST https://app.deepintshield.com/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "x-bf-vk: $DEEPINTSHIELD_VIRTUAL_KEY" \
      -d '{
        "model": "anthropic/claude-3-5-sonnet",
        "messages": [
          {
            "role": "user",
            "content": [
              {
                "type": "text",
                "text": "This is cached context",
                "cache_control": {"type": "ephemeral"}
              }
            ]
          }
        ],
        "system": [
          {
            "type": "text",
            "text": "You are a helpful assistant",
            "cache_control": {"type": "ephemeral"}
          }
        ]
      }'

Reasoning / Thinking
Documentation: See DeepIntShield Reasoning Reference
Use the reasoning object to enable Claude’s thinking:
- reasoning.effort enables thinking
- reasoning.max_tokens sets the token budget for thinking
Critical Constraints
- Minimum budget: reasoning.max_tokens must be at least 1024; requests below this fail with an error
- Dynamic budget: a value of -1 is converted to 1024 automatically

    {"reasoning": {"effort": "high", "max_tokens": 2048}}

Tool & Image Support
- Tools: Standard OpenAI-style tool definitions are supported, including tool_choice values auto, none, required, and specific tool selection.
- Images: Both URL images ({"type": "image_url", "image_url": {...}}) and base64 data-URL images are supported in message content.
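To make the two bullets concrete, here is a sketch of a request body that carries a base64 data-URL image and forces one specific tool. It assumes the standard OpenAI shapes for image parts ({"url": ...} inside image_url) and for function tools; the get_weather tool, the helper image_part_from_bytes, and the placeholder bytes are all hypothetical.

```python
import base64


def image_part_from_bytes(data: bytes, media_type: str = "image/png") -> dict:
    """Wrap raw image bytes as a base64 data-URL content part."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{media_type};base64,{encoded}"},
    }


# Hypothetical tool, written in the standard OpenAI function-tool shape.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "anthropic/claude-3-5-sonnet",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is the weather in the city shown here?"},
            # Placeholder bytes, not a real image; read a file in practice.
            image_part_from_bytes(b"\x89PNG placeholder bytes"),
        ],
    }],
    "tools": [get_weather_tool],
    # Specific tool selection: require a call to one named tool.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```

Replacing the tool_choice value with "auto", "none", or "required" covers the other modes listed above.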
Response
Responses come back in the standard OpenAI-compatible shape, so you read the same fields you use for other providers:
- finish_reason (stop, length, tool_calls)
- usage.prompt_tokens / usage.completion_tokens - token counts, with cache usage rolled into prompt_tokens
- usage.prompt_tokens_details.cached_read_tokens / cached_write_tokens - cache read/write breakdown when prompt caching is used
- reasoning_details - Claude thinking output (when reasoning is enabled)
- Tool call arguments are returned as a JSON string in tool_calls
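The fields above can be read as in the following sketch. The sample response is invented for illustration and follows the standard OpenAI Chat Completions layout (choices[0].message.tool_calls), which is assumed here; summarize_chat_response is a hypothetical helper.

```python
import json


def summarize_chat_response(response: dict) -> dict:
    """Pull out finish reason, cache usage, and decoded tool calls."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    details = usage.get("prompt_tokens_details") or {}
    tool_calls = choice["message"].get("tool_calls") or []
    return {
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "cached_read_tokens": details.get("cached_read_tokens", 0),
        "cached_write_tokens": details.get("cached_write_tokens", 0),
        # Arguments arrive as a JSON string, so decode them before use.
        "tool_calls": [
            (call["function"]["name"], json.loads(call["function"]["arguments"]))
            for call in tool_calls
        ],
    }


# Invented sample data, not real gateway output.
sample = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_1",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
            }],
        },
    }],
    "usage": {
        "prompt_tokens": 120,
        "completion_tokens": 18,
        "prompt_tokens_details": {"cached_read_tokens": 100, "cached_write_tokens": 0},
    },
}
summary = summarize_chat_response(sample)
print(summary["tool_calls"])  # [('get_weather', {'city': 'Paris'})]
```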
Streaming
Set "stream": true to receive incremental chunks. Output is delivered as standard OpenAI-compatible streaming deltas (content text, tool-call arguments, and reasoning text arrive progressively).
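A client has to stitch these deltas back together. The sketch below assumes the usual OpenAI server-sent-event framing (lines starting with "data:" and a final "data: [DONE]") and the usual delta fields for content and tool calls. Reasoning deltas are left out because their field name is not given here. The sample lines are invented.

```python
import json


def accumulate_stream(sse_lines):
    """Fold OpenAI-style SSE lines into final text and per-call tool arguments."""
    text_parts = []
    tool_args = {}  # tool-call index -> list of argument fragments
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                text_parts.append(delta["content"])
            for call in delta.get("tool_calls") or []:
                fragment = call.get("function", {}).get("arguments") or ""
                tool_args.setdefault(call["index"], []).append(fragment)
    return "".join(text_parts), {i: "".join(p) for i, p in tool_args.items()}


def sse(delta: dict) -> str:
    """Build one sample event line for the demo below."""
    return "data: " + json.dumps({"choices": [{"delta": delta}]})


lines = [
    sse({"content": "Hel"}),
    "",
    sse({"content": "lo"}),
    sse({"tool_calls": [{"index": 0, "function": {"name": "get_weather", "arguments": '{"city":'}}]}),
    sse({"tool_calls": [{"index": 0, "function": {"arguments": ' "Paris"}'}}]}),
    "data: [DONE]",
]
text, args = accumulate_stream(lines)
print(text, json.loads(args[0]))  # Hello {'city': 'Paris'}
```

Tool-call arguments are only valid JSON once every fragment for that index has arrived, so decoding is deferred until the stream ends.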
Caveats
Minimum Reasoning Budget
Behavior: reasoning.max_tokens must be >= 1024
Impact: Requests with lower values fail with an error
Dynamic Budget Conversion
Behavior: reasoning.max_tokens = -1 is treated as 1024
Impact: Dynamic budgeting is not supported
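Both caveats can be checked locally before a request goes out. This is a minimal sketch of the two rules as stated above; the function name effective_reasoning_budget and the choice of ValueError are illustrative, and the gateway's actual error format is not shown.

```python
MIN_REASONING_BUDGET = 1024  # minimum stated above
DYNAMIC_BUDGET = -1          # sentinel that is treated as 1024


def effective_reasoning_budget(max_tokens: int) -> int:
    """Return the budget that would be applied, or raise if the request would fail."""
    if max_tokens == DYNAMIC_BUDGET:
        return MIN_REASONING_BUDGET
    if max_tokens < MIN_REASONING_BUDGET:
        raise ValueError(
            f"reasoning.max_tokens must be >= {MIN_REASONING_BUDGET}, got {max_tokens}"
        )
    return max_tokens


print(effective_reasoning_budget(2048))  # 2048
print(effective_reasoning_budget(-1))    # 1024
```

Calling effective_reasoning_budget(512) raises ValueError, mirroring the request failure described above.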
2. Responses API
The Responses API uses the same underlying /v1/messages endpoint. Send standard OpenAI Responses requests; DeepIntShield handles the Anthropic format.
Request Parameters
temperature and top_p pass through directly. Pass Anthropic-specific fields such as top_k, include, and stop through extra_params (SDK) or directly in the request body (Gateway):

    curl -X POST https://app.deepintshield.com/v1/responses \
      -H "Content-Type: application/json" \
      -H "x-bf-vk: $DEEPINTSHIELD_VIRTUAL_KEY" \
      -d '{
        "model": "anthropic/claude-3-5-sonnet",
        "input": "Hello, how are you?",
        "top_k": 40
      }'

Cache Control
Cache directives can be added to instructions (system) and input messages to enable prompt caching:

    curl -X POST https://app.deepintshield.com/v1/responses \
      -H "Content-Type: application/json" \
      -H "x-bf-vk: $DEEPINTSHIELD_VIRTUAL_KEY" \
      -d '{
        "model": "anthropic/claude-3-5-sonnet",
        "instructions": "You are a helpful assistant. This instruction is cached.",
        "instructions_cache_control": {"type": "ephemeral"},
        "input": [
          {
            "type": "text",
            "text": "Answer this question",
            "cache_control": {"type": "ephemeral"}
          }
        ]
      }'

Tool Support
Supported types: function, computer_use_preview, web_search, mcp. MCP tools accept server_label and server_url. Cache control is supported on instructions and input blocks (see Cache Control above).
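As an illustration, a Responses request that mixes a function tool with an MCP tool might be assembled as below. The flat function-tool layout follows the standard OpenAI Responses shape, which is assumed here. The tool names, the server label, the example.com URL, and the check_tools helper are hypothetical, and treating both MCP fields as required is an assumption of this sketch.

```python
SUPPORTED_TOOL_TYPES = {"function", "computer_use_preview", "web_search", "mcp"}


def check_tools(tools: list) -> None:
    """Raise ValueError for an unsupported tool type or an MCP tool missing a field."""
    for tool in tools:
        kind = tool.get("type")
        if kind not in SUPPORTED_TOOL_TYPES:
            raise ValueError(f"unsupported tool type: {kind!r}")
        if kind == "mcp":
            # Assumption: both fields are needed to reach the MCP server.
            missing = [f for f in ("server_label", "server_url") if not tool.get(f)]
            if missing:
                raise ValueError(f"mcp tool is missing: {', '.join(missing)}")


payload = {
    "model": "anthropic/claude-3-5-sonnet",
    "input": "Summarize the open tickets.",
    "tools": [
        {
            "type": "function",
            "name": "get_ticket",  # hypothetical function tool
            "description": "Fetch one ticket by id.",
            "parameters": {
                "type": "object",
                "properties": {"ticket_id": {"type": "string"}},
                "required": ["ticket_id"],
            },
        },
        {
            "type": "mcp",
            "server_label": "tickets",                  # hypothetical label
            "server_url": "https://mcp.example.com/sse",  # placeholder URL
        },
    ],
}
check_tools(payload["tools"])  # passes silently for the tools above
```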
Response
Responses come back in the standard OpenAI Responses shape:
- status (completed, incomplete) reflects whether the model finished or hit the token limit
- usage.input_tokens / usage.output_tokens, with cache usage broken out under input_tokens_details.cached_read_tokens and cached_write_tokens
- output items: assistant text as message, tool calls as function_call, and Claude thinking as reasoning
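A reader for these fields could look like the sketch below. The item layouts (output_text parts inside a message, name and arguments on a function_call) follow the standard OpenAI Responses shape and are assumed here. The sample response is invented and split_output is a hypothetical helper.

```python
import json


def split_output(response: dict) -> dict:
    """Group output items by kind and flag responses that hit the token limit."""
    result = {
        "truncated": response.get("status") == "incomplete",
        "text": [],
        "function_calls": [],
        "reasoning": [],
    }
    for item in response.get("output", []):
        kind = item.get("type")
        if kind == "message":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    result["text"].append(part["text"])
        elif kind == "function_call":
            # Arguments are a JSON string, as in Chat Completions.
            result["function_calls"].append((item["name"], json.loads(item["arguments"])))
        elif kind == "reasoning":
            result["reasoning"].append(item)  # kept whole; inner layout not assumed
    return result


# Invented sample data, not real gateway output.
sample = {
    "status": "completed",
    "output": [
        {"type": "reasoning", "summary": [{"type": "summary_text", "text": "Check the forecast."}]},
        {"type": "function_call", "name": "get_weather", "call_id": "call_1",
         "arguments": '{"city": "Paris"}'},
        {"type": "message", "role": "assistant",
         "content": [{"type": "output_text", "text": "Checking Paris."}]},
    ],
}
parsed = split_output(sample)
print(parsed["text"], parsed["function_calls"])
# ['Checking Paris.'] [('get_weather', {'city': 'Paris'})]
```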
Streaming
Set "stream": true to receive output as standard OpenAI Responses streaming events (text, tool-call arguments, and reasoning arrive incrementally).
3. Text Completions (Legacy)
Send a prompt with standard parameters. temperature and top_p pass through directly; top_k and stop can be set via extra_params. The response is returned in OpenAI-compatible completion shape.
4. Batch API
Request formats: requests array (CustomID + Params) or input_file_id
Pagination: Cursor-based with after_id, before_id, limit
Endpoints:
- POST /v1/messages/batches - Create
- GET /v1/messages/batches - List
- GET /v1/messages/batches/{batch_id} - Retrieve
- POST /v1/messages/batches/{batch_id}/cancel - Cancel
Response: JSONL format with {custom_id, result: {type, message}}
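Since results arrive as one JSON object per line, they can be indexed by custom_id. The sketch below assumes only the {custom_id, result: {type, message}} layout given above. The sample records and the "succeeded" and "expired" type values are illustrative (borrowed from Anthropic's Message Batches API), and parse_batch_results is a hypothetical helper.

```python
import json


def parse_batch_results(jsonl_text: str) -> dict:
    """Map custom_id -> result object for each non-empty JSONL line."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        results[record["custom_id"]] = record["result"]
    return results


# Invented sample data in the documented layout.
sample = "\n".join([
    json.dumps({"custom_id": "req-1",
                "result": {"type": "succeeded",
                           "message": {"role": "assistant", "content": "Hi"}}}),
    "",
    json.dumps({"custom_id": "req-2", "result": {"type": "expired"}}),
])
results = parse_batch_results(sample)
succeeded = [cid for cid, res in results.items() if res["type"] == "succeeded"]
print(succeeded)  # ['req-1']
```

Keying by custom_id matters because batch results are not guaranteed to come back in request order in Anthropic's API.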
5. Files API
Upload: Multipart/form-data with file (required) and filename (optional)
Endpoints: POST /v1/files, GET /v1/files (cursor pagination), GET /v1/files/{file_id}, DELETE /v1/files/{file_id}, GET /v1/files/{file_id}/content
6. List Models
Request: GET /v1/models (no body)
Multi-key support: Results are aggregated from all keys, filtered by allowed_models if configured.