AWS Bedrock

Call AWS Bedrock’s model families (Claude, Nova, Mistral, Llama, Cohere, Titan) through DeepIntShield using the same OpenAI-compatible Chat Completions and Responses APIs. You select a model by ID and DeepIntShield handles the right request shape for that family. A few things are useful to know when you call Bedrock:

  • Model selection - the right family-specific behavior is applied automatically from the model ID you pass (e.g. bedrock/anthropic.claude-3-5-sonnet-...)
  • Parameters - send standard OpenAI-style fields like max_completion_tokens and stop; see Request Parameters
  • AWS authentication - sign in with AWS credentials (access key/secret or the standard credential chain); see AWS Authentication & Configuration
  • Guardrails & service tier - pass Bedrock-specific guardrail and performance settings directly in the request body
  • Reasoning, tools, structured output - supported with the same OpenAI-style request fields, covered in the sections below
  • Images - must be base64 or data-URI (remote URLs are not supported)
Family | Chat | Responses | Text | Embeddings | Image Generation | Image Edit | Image Variation
Claude (Anthropic)
Nova (Amazon)
Mistral
Llama
Cohere
Titan
Operation | Non-Streaming | Streaming | Endpoint
Chat Completions | | | converse
Responses API | | | converse
Text Completions | | | invoke
Embeddings | | - | invoke
Files | | - | S3 (via SDK)
Batch | | - | batch
List Models | | - | listFoundationModels
Image Generation | | | invoke
Image Edit | | | invoke
Image Variation | | | invoke
Count Tokens | | - | count-tokens
Speech (TTS) | | | -
Transcriptions (STT) | | | -

Send standard OpenAI-compatible Chat Completions requests. The following are supported:

  • max_completion_tokens, temperature, top_p, stop
  • response_format for structured output (see Structured Output)
  • tools and tool_choice for function calling (see Tools)
  • reasoning for model thinking (see Reasoning / Thinking)
  • user and service_tier

The following are not supported by Bedrock and are ignored: frequency_penalty, presence_penalty, logit_bias, logprobs, top_logprobs, seed, parallel_tool_calls.
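
Because these fields are ignored rather than rejected, a client may want to detect them before sending a request. The following is a minimal client-side sketch; `split_ignored_fields` is a hypothetical helper name and is not part of DeepIntShield.

```python
# Fields the text above lists as ignored for Bedrock chat requests.
IGNORED_BY_BEDROCK = frozenset({
    "frequency_penalty", "presence_penalty", "logit_bias",
    "logprobs", "top_logprobs", "seed", "parallel_tool_calls",
})


def split_ignored_fields(payload: dict) -> tuple[dict, list[str]]:
    """Return (payload without ignored fields, sorted names of the dropped fields)."""
    kept = {k: v for k, v in payload.items() if k not in IGNORED_BY_BEDROCK}
    dropped = sorted(k for k in payload if k in IGNORED_BY_BEDROCK)
    return kept, dropped


payload = {
    "model": "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    "messages": [{"role": "user", "content": "Hello"}],
    "seed": 7,
    "logprobs": True,
    "temperature": 0.2,
}
clean, dropped = split_ignored_fields(payload)
print(dropped)  # ['logprobs', 'seed']
```

Logging the dropped names can make it easier to notice when code written for another provider relies on a field that has no effect here.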

Bedrock-specific fields can be passed directly in the request body:

curl -X POST https://app.deepintshield.com/v1/chat/completions \
-H "Authorization: Bearer sk-bf-your-virtual-key" \
-H "Content-Type: application/json" \
-d '{
"model": "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
"messages": [{"role": "user", "content": "Hello"}],
"guardrailConfig": {
"guardrailIdentifier": "guardrail-id",
"guardrailVersion": "1",
"trace": "enabled"
},
"performanceConfig": {
"latency": "optimized"
},
"top_k": 40
}'

Available Extra Parameters:

  • guardrailConfig - Bedrock guardrail configuration with guardrailIdentifier, guardrailVersion, trace
  • performanceConfig - Performance optimization with latency (“optimized” or “standard”)
  • additionalModelRequestFieldPaths - Pass-through for model-specific fields not in standard schema
  • promptVariables - Values for the variables in a prompt template (used when the request references a Bedrock Prompt management prompt)
  • requestMetadata - Custom metadata for request tracking

Prompt caching is supported via cache control directives:

curl -X POST https://app.deepintshield.com/v1/chat/completions \
-H "Authorization: Bearer sk-bf-your-virtual-key" \
-H "Content-Type: application/json" \
-d '{
"model": "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "This context will be cached",
"cache_control": {"type": "ephemeral"}
}
]
}
],
"system": [
{
"type": "text",
"text": "You are a helpful assistant",
"cache_control": {"type": "ephemeral"}
}
]
}'

Documentation: See DeepIntShield Reasoning Reference

Reasoning/thinking support varies by model family. Use the reasoning object to enable it.

Claude (Anthropic):

  • reasoning.effort enables thinking
  • reasoning.max_tokens sets the thinking token budget
  • Minimum budget: 1024 tokens. Requests with a smaller budget fail with an error; -1 is treated as 1024.
{"reasoning": {"effort": "high", "max_tokens": 2048}}

Nova:

  • reasoning.effort sets the thinking level ("low" or "high")
  • reasoning.max_tokens sets the maximum reasoning tokens
{"reasoning": {"effort": "high", "max_tokens": 10000}}
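
The 1024-token minimum and the -1 rule can be mirrored on the client so that an invalid budget is caught before the request is sent. This is only a sketch of the documented rule; `effective_reasoning_budget` is a hypothetical name, and the gateway's own validation remains authoritative.

```python
MIN_REASONING_BUDGET = 1024  # minimum thinking budget stated above


def effective_reasoning_budget(max_tokens: int) -> int:
    """Apply the documented rule: -1 becomes 1024, other values below 1024 are rejected."""
    if max_tokens == -1:
        return MIN_REASONING_BUDGET
    if max_tokens < MIN_REASONING_BUDGET:
        raise ValueError(
            f"reasoning.max_tokens must be >= {MIN_REASONING_BUDGET}, got {max_tokens}"
        )
    return max_tokens


reasoning = {"effort": "high", "max_tokens": effective_reasoning_budget(-1)}
print(reasoning)  # {'effort': 'high', 'max_tokens': 1024}
```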

A few things to know about Bedrock message content:

  • Images: Only base64 / data-URI images are supported; remote image URLs are not supported.
  • Documents: PDF, CSV, DOC, DOCX, XLS, XLSX, HTML, TXT, and MD documents are supported as file content.
  • Audio: Audio input is not supported and returns an error.

The Chat Completions request format is OpenAI-compatible. Bedrock-specific extensions (for example, a standalone cachePoint) are also accepted when using the Bedrock provider.

Block Type | Request Shape | Support
Text | {"type":"text","text":"..."} | ✅
Image | {"type":"image_url","image_url":{"url":"data:image/png;base64,..."}} | ✅ (base64/data URI only)
File | {"type":"file","file":{...}} | ✅
Input audio | {"type":"input_audio",...} | ❌ (not supported)
Standalone cache point | {"cachePoint":{"type":"default"}} (no outer type field) | ✅ (Bedrock-specific extension)
curl -X POST https://app.deepintshield.com/v1/chat/completions \
-H "Authorization: Bearer sk-bf-your-virtual-key" \
-H "Content-Type: application/json" \
-d '{
"model": "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "What is in this image?"},
{
"type": "image_url",
"image_url": {
"url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..."
}
}
]
}
]
}'
curl -X POST https://app.deepintshield.com/v1/chat/completions \
-H "Authorization: Bearer sk-bf-your-virtual-key" \
-H "Content-Type: application/json" \
-d '{
"model": "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "Summarize this document."},
{
"type": "file",
"file": {
"file_data": "JVBERi0xLjQKJcfs...",
"filename": "report.pdf",
"file_type": "application/pdf"
}
}
]
}
]
}'

Note: file_data is raw base64-encoded content (no data: URI prefix, unlike image_url). Supported document formats: pdf, txt, md, html, csv, doc, docx, xls, xlsx. Use file.file_data for document payloads; file_url and file_id are not supported for Bedrock chat content.
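
The difference between the two encodings (a data URI for images, raw base64 for documents) is easy to get wrong, so here is a small sketch that builds both block types from bytes. The helper names `image_block` and `file_block` are hypothetical; the field names come from the examples above.

```python
import base64


def image_block(data: bytes, mime_type: str = "image/png") -> dict:
    """Images are sent as a data URI: data:<mime>;base64,<payload>."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def file_block(data: bytes, filename: str, mime_type: str) -> dict:
    """Documents are sent as raw base64 with no data: prefix."""
    return {
        "type": "file",
        "file": {
            "file_data": base64.b64encode(data).decode("ascii"),
            "filename": filename,
            "file_type": mime_type,
        },
    }


# Tiny placeholder payloads: a PDF header and the PNG signature, not real files.
pdf = file_block(b"%PDF-1.4\n", "report.pdf", "application/pdf")
png = image_block(b"\x89PNG\r\n\x1a\n")
print(pdf["file"]["file_data"])  # JVBERi0xLjQK
```

Both blocks can be placed in the content array of a user message alongside a text block.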

Standalone Cache Point Example (Bedrock-specific)

{
"role": "system",
"content": [
{"type": "text", "text": "Long context to cache"},
{"cachePoint": {"type": "default"}}
]
}

This standalone cachePoint block is a DeepIntShield/Bedrock extension (not OpenAI-standard) and should be used only with the Bedrock provider.

Cache directives are supported on:

  • System content blocks (entire system message)
  • User message content blocks (specific parts)
  • Tool definitions within tool configuration

Standard OpenAI-style tool definitions are supported. tool_choice accepts:

tool_choice | Behavior
"auto" | Model decides (default)
"none" | No tool calls
"required" | Must call a tool
Specific tool | Restricts to the named function

The function.strict field is not supported by Bedrock and is ignored. Tool call arguments are returned as a JSON string in tool_calls.
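
Since arguments arrive as a JSON string, they need one decoding step before use. The sketch below assumes the usual OpenAI-compatible message shape; the sample message, the `get_weather` function name, and the `decode_tool_calls` helper are all made up for illustration.

```python
import json

# Hypothetical assistant message; the values are invented for this example.
message = {
    "role": "assistant",
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
        }
    ],
}


def decode_tool_calls(message: dict) -> list[tuple[str, dict]]:
    """Return (function name, parsed arguments) for each tool call in the message."""
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        calls.append((function["name"], json.loads(function["arguments"])))
    return calls


print(decode_tool_calls(message))  # [('get_weather', {'city': 'Paris'})]
```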

Bedrock has no native structured-output parameter, so DeepIntShield implements response_format for you. Send the standard response_format with a JSON schema and you receive structured JSON in the response content - no extra handling required.

{
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "response",
"schema": {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "number"}
}
}
}
}
}

The schema you provide is enforced on the model’s output, and the structured result is returned as the message content.

Responses come back in the standard OpenAI-compatible shape:

  • finish_reason (stop, length, tool_calls)
  • usage.prompt_tokens / usage.completion_tokens, with cache usage rolled into prompt_tokens and broken out under prompt_tokens_details.cached_read_tokens / cached_write_tokens
  • reasoning_details - model thinking output (when reasoning is enabled)
  • Tool call arguments returned as a JSON string in tool_calls

When you request structured output, the structured JSON is returned as the message content (not as a tool call), so you consume it the same way you would from any other provider.

Set "stream": true on Chat Completions or Responses to receive output as standard OpenAI-compatible streaming chunks (content text, tool-call arguments, and reasoning arrive incrementally).

Text Completions do not support streaming on Bedrock.
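
As an illustration of consuming a Chat Completions stream, the sketch below concatenates the text deltas from a list of stream lines. It assumes the wire format matches OpenAI's server-sent events (`data: {...}` lines ending with `data: [DONE]`); the text above only promises OpenAI-compatible chunks, so treat those framing details as an assumption. `collect_stream_text` is a hypothetical helper.

```python
import json


def collect_stream_text(lines) -> str:
    """Concatenate choices[].delta.content from OpenAI-style SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


# Invented sample lines standing in for a real HTTP response body.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # Hello
```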


Bedrock is available through the OpenAI-style Responses API on the same models as Chat Completions.

Send standard OpenAI Responses requests. The following are supported:

  • max_output_tokens, temperature, top_p
  • instructions for system instructions
  • input as a string or array
  • tools and tool_choice (see Chat Completions)
  • reasoning for model thinking (see Reasoning / Thinking)
  • text for structured output

Bedrock-specific fields such as include and stop can be passed directly in the request body. Cache control is supported on instructions and input messages.

curl -X POST https://app.deepintshield.com/v1/responses \
-H "Authorization: Bearer sk-bf-your-virtual-key" \
-H "Content-Type: application/json" \
-d '{
"model": "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
"input": "Hello, how are you?",
"stop": ["###"]
}'

Responses come back in the standard OpenAI Responses shape:

  • status (completed, incomplete)
  • usage.input_tokens / usage.output_tokens, with cache usage broken out under input_tokens_details.cached_read_tokens / cached_write_tokens
  • output items: assistant text as message, tool calls as function_call, model thinking as reasoning
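
To show how the three output item types might be handled, here is a sketch that groups them and extracts the assistant text. The inner shapes (`content[]` parts of type `output_text`, a `summary` list on reasoning items) follow OpenAI's Responses format and are assumptions here, as are the sample values and helper names.

```python
# Hypothetical response body; every value is invented for this example.
response = {
    "status": "completed",
    "output": [
        {"type": "reasoning", "summary": [{"type": "summary_text", "text": "Check the forecast."}]},
        {"type": "function_call", "name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Looking that up."}],
        },
    ],
}


def group_output_items(response: dict) -> dict:
    """Bucket output items by their type field."""
    grouped = {"message": [], "function_call": [], "reasoning": []}
    for item in response.get("output", []):
        grouped.setdefault(item.get("type", "unknown"), []).append(item)
    return grouped


def message_text(grouped: dict) -> str:
    """Join the output_text parts of all message items."""
    parts = []
    for item in grouped["message"]:
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


grouped = group_output_items(response)
print(message_text(grouped))  # Looking that up.
```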

Set "stream": true to receive output as standard OpenAI Responses streaming events.


Send a prompt with standard parameters (max_tokens, temperature, top_p). top_k and stop can be passed directly in the request body. The response is returned in OpenAI-compatible completion shape, with text in choices[].text. Mistral models can return multiple completions.


Supported embedding models: Titan, Cohere

Parameter | Notes
input | Text or array of texts to embed
dimensions | ⚠️ Not supported - Titan has fixed dimensions per model
encoding_format | "base64" or "float"

Titan-specific: No dimension customization; fixed output size per model version.

Cohere-specific: Uses the same parameters as the standard Cohere provider.

The embeddings response includes the embedding vector(s) and token usage in the standard OpenAI-compatible format.


Supported image generation models: Titan Image Generator v1, Titan Image Generator v2, Nova Canvas v1

Parameter | Notes
prompt | Text description of the image
n | Number of images to generate
negative_prompt | What to avoid in the image
seed | Seed for reproducibility
quality | Image quality (see Quality Mapping)
style | Image style
size | Image size in "WxH" format (e.g., "1024x1024")

The quality value maps to Bedrock’s supported quality levels:

Input Value | Result
"low" | Standard
"medium" | Standard
"high" | Premium
"default" | Standard
"premium" | Premium

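The quality mapping can be expressed as a plain lookup, which is handy for predicting which Bedrock quality level a request will use. This is a sketch of the table only; the table does not say how other values are treated, so raising an error for them is this example's own choice.

```python
# Input value -> Bedrock quality level, copied from the quality mapping table.
QUALITY_MAP = {
    "low": "Standard",
    "medium": "Standard",
    "high": "Premium",
    "default": "Standard",
    "premium": "Premium",
}


def bedrock_quality(value: str) -> str:
    """Look up the Bedrock quality level for a request's quality value."""
    try:
        return QUALITY_MAP[value]
    except KeyError:
        # Assumption: unlisted values are treated as a caller error in this sketch.
        raise ValueError(f"unmapped quality value: {value!r}") from None


print(bedrock_quality("high"))    # Premium
print(bedrock_quality("medium"))  # Standard
```
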
Generated images are returned in data[] as base64 (data[].b64_json).

curl -X POST https://app.deepintshield.com/v1/images/generations \
-H "Authorization: Bearer sk-bf-your-virtual-key" \
-H "Content-Type: application/json" \
-d '{
"model": "bedrock/amazon.nova-canvas-v1:0",
"prompt": "A futuristic cityscape with a flying car",
"size": "1024x1024",
"seed": 123,
"negative_prompt": "bikes",
"n": 2
}'
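
Because images come back as base64 in data[].b64_json, a client typically decodes them before saving or displaying them. A minimal sketch, using an invented response body and a hypothetical `decode_images` helper:

```python
import base64


def decode_images(response: dict) -> list[bytes]:
    """Decode every data[].b64_json entry into raw image bytes."""
    return [base64.b64decode(item["b64_json"]) for item in response.get("data", [])]


# Invented response with placeholder bytes instead of real image data.
sample_response = {
    "data": [
        {"b64_json": base64.b64encode(b"fake-image-1").decode("ascii")},
        {"b64_json": base64.b64encode(b"fake-image-2").decode("ascii")},
    ]
}
images = decode_images(sample_response)
print(len(images))  # 2
```

Each element of `images` can then be written to disk in binary mode, for example with `open("image-0.png", "wb")`.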

Supported image edit models: Titan Image Generator v1, Titan Image Generator v2, Nova Canvas v1

Bedrock supports three image edit task types: INPAINTING, OUTPAINTING, and BACKGROUND_REMOVAL. The type field is required and selects one of them using the lowercase values "inpainting", "outpainting", or "background_removal".

Request Parameters

Parameter | Type | Required | Notes
model | string | | Model identifier (must be a Titan or Nova Canvas model)
type | string | | Edit type: "inpainting", "outpainting", or "background_removal"
prompt | string | | Text description of the edit (required for inpainting/outpainting)
image[] | binary | | Image file(s) to edit (only first image used)
mask | binary | | Mask image file (for inpainting/outpainting)
n | int | | Number of images to generate (1-10, for inpainting/outpainting only)
size | string | | Image size: "WxH" format (e.g., "1024x1024", for inpainting/outpainting only)
quality | string | | Image quality (for inpainting/outpainting only). See Quality Mapping for supported values.
cfgScale | float | | CFG scale (pass as an extra param, for inpainting/outpainting only)
negative_text | string | | Negative prompt (pass as an extra param, for inpainting/outpainting only)
mask_prompt | string | | Mask prompt (pass as an extra param, for inpainting/outpainting only)
return_mask | bool | | Return mask in response (pass as an extra param, for inpainting/outpainting only)
outpainting_mode | string | | Outpainting mode (pass as an extra param, outpainting only): "DEFAULT" or "PRECISE"

Behavior

  • Task Type: Set type to "inpainting", "outpainting", or "background_removal". Any other value returns an error.
  • Image Handling: The first image you send is used.
  • Task-Specific Parameters:
    • Inpainting: Accepts prompt, the input image, an optional mask, and the negative_text, mask_prompt, and return_mask extra params.
    • Outpainting: Accepts the same as inpainting, plus the outpainting_mode extra param ("DEFAULT" or "PRECISE").
    • Background removal: Accepts only the input image; no other parameters apply.
  • Generation Config (inpainting and outpainting only): n (number of images), size ("WxH"), quality (see Quality Mapping), and cfgScale (extra param).
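
The per-task rules above can be checked on the client before uploading anything. The sketch below encodes which parameters apply to each edit type and reports the ones that do not; `check_edit_params` is a hypothetical helper, and it only reflects the rules listed in this section.

```python
GENERATION_CONFIG = {"n", "size", "quality", "cfgScale"}
INPAINTING_PARAMS = {"prompt", "mask", "negative_text", "mask_prompt", "return_mask"} | GENERATION_CONFIG

# Parameters that apply to each edit type, beyond the always-present fields.
TASK_PARAMS = {
    "inpainting": INPAINTING_PARAMS,
    "outpainting": INPAINTING_PARAMS | {"outpainting_mode"},
    "background_removal": set(),  # only the input image applies
}
ALWAYS_ALLOWED = {"model", "type", "image[]"}


def check_edit_params(params: dict) -> list[str]:
    """Return the sorted names of parameters that do not apply to the chosen edit type."""
    edit_type = params.get("type")
    if edit_type not in TASK_PARAMS:
        raise ValueError(f"unsupported edit type: {edit_type!r}")
    allowed = TASK_PARAMS[edit_type] | ALWAYS_ALLOWED
    return sorted(name for name in params if name not in allowed)


request = {
    "model": "bedrock/amazon.nova-canvas-v1:0",
    "type": "background_removal",
    "image[]": b"...",  # placeholder for the uploaded file
    "prompt": "remove the sky",
    "n": 2,
}
print(check_edit_params(request))  # ['n', 'prompt']
```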

Response

Returns the same response shape as image generation: images[] (base64-encoded images) and, if return_mask was true, a base64-encoded maskImage.

Streaming: Image edit streaming is not supported by Bedrock.


Supported image variation models: Titan Image Generator v1, Titan Image Generator v2, Nova Canvas v1

Request Parameters

Parameter | Type | Required | Notes
model | string | | Model identifier (must be a Titan or Nova Canvas model)
image | binary | | Image file to create variations from (supports multiple images via image[])
n | int | | Number of images to generate (1-10)
size | string | | Image size: "WxH" format (e.g., "1024x1024")
quality | string | | Image quality. See Quality Mapping for supported values.
cfgScale | float | | CFG scale (pass as an extra param)
prompt | string | | Prompt/text for variation (pass as an extra param)
negativeText | string | | Negative prompt (pass as an extra param)
similarityStrength | float | | Similarity strength (pass as an extra param): Range 0.2 to 1.0

Behavior

  • Image Handling: You can supply one or more input images via image[].
  • Variation Parameters: prompt, negativeText, and similarityStrength (range 0.2 to 1.0) can be set as extra params.
  • Generation Config: n (number of images), size ("WxH"), quality (extra param, see Quality Mapping), and cfgScale (extra param).

Response

Returns the same response shape as image generation: images[] (base64-encoded image variations).

Streaming: Image variation streaming is not supported by Bedrock.


Request formats: requests array (CustomID + Params) or input_file_id

Pagination: Cursor-based with afterId, beforeId, limit

Endpoints:

  • POST /batch - Create batch
  • GET /batch - List batches
  • GET /batch/{batch_id} - Retrieve batch
  • POST /batch/{batch_id}/cancel - Cancel batch

Response: JSONL format with {recordId, modelOutput: {...}} or {recordId, error: {...}}
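
A batch result file in this JSONL format can be split into successes and failures keyed by recordId. The sketch below assumes one JSON object per line, as described above; the contents of `modelOutput` and `error` in the sample are invented, and `split_batch_output` is a hypothetical helper.

```python
import json


def split_batch_output(jsonl_text: str) -> tuple[dict, dict]:
    """Return (outputs by recordId, errors by recordId) from batch JSONL text."""
    outputs, errors = {}, {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        if "error" in record:
            errors[record["recordId"]] = record["error"]
        else:
            outputs[record["recordId"]] = record["modelOutput"]
    return outputs, errors


# Invented sample; real modelOutput and error payloads will differ.
sample_jsonl = "\n".join([
    '{"recordId": "rec-1", "modelOutput": {"text": "ok"}}',
    '{"recordId": "rec-2", "error": {"message": "throttled"}}',
    "",
])
outputs, errors = split_batch_output(sample_jsonl)
print(sorted(outputs), sorted(errors))  # ['rec-1'] ['rec-2']
```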

Batch statuses: Validating, InProgress, Completed, Failed, Cancelling, Cancelled, Expired


Upload: Multipart/form-data with file (required) and filename (optional)

Endpoints:

  • POST /v1/files - Upload
  • GET /v1/files - List (cursor pagination)
  • GET /v1/files/{file_id} - Retrieve metadata
  • DELETE /v1/files/{file_id} - Delete
  • GET /v1/files/{file_id}/content - Download content

Note: File purpose is always "batch", status is always "processed".


Request: GET /v1/models (no body)

Returns available Bedrock models with metadata. Models can be filtered by region, deployment configuration, and an allowlist (allowed_models config). When multiple keys are configured, results are aggregated across keys.


DeepIntShield signs every Bedrock request with AWS Signature Version 4 (SigV4). Credentials are resolved in the following priority order, and STS AssumeRole can be layered on top of any of them.

1. Explicit Credentials

Provide access_key and secret_key directly in bedrock_key_config. Optionally include a session_token for pre-obtained temporary credentials.

{
"bedrock_key_config": {
"access_key": "your-aws-access-key",
"secret_key": "your-aws-secret-key",
"session_token": "optional-session-token",
"region": "us-east-1"
}
}

2. Default Credential Chain (IAM Role / Instance Profile)


Leave access_key and secret_key empty (or omit them). DeepIntShield calls the AWS SDK's LoadDefaultConfig, which automatically resolves credentials from the environment in this order:

  • Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN)
  • EKS IRSA (AWS_WEB_IDENTITY_TOKEN_FILE + AWS_ROLE_ARN)
  • ECS task role
  • EC2 instance profile (IMDS)
  • ~/.aws/credentials default profile
{
"bedrock_key_config": {
"region": "us-east-1"
}
}

Set role_arn to assume an IAM role before signing requests. AssumeRole requires a valid source identity - it works when credentials are available either via explicit access_key/secret_key in key config, or via the default credential chain (environment variables, EC2 instance profile, ECS task role, EKS IRSA, etc.). If no credentials are available from either source, AssumeRole will fail.

{
"bedrock_key_config": {
"role_arn": "arn:aws:iam::123456789012:role/BedrockRole",
"external_id": "optional-external-id",
"session_name": "my-session",
"region": "us-east-1"
}
}
Field | Required | Default | Notes
role_arn | Yes (for STS) | - | IAM role ARN to assume
external_id | No | - | Required when the role’s trust policy demands it
session_name | No | deepintshield-session | Identifies the session in CloudTrail logs
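
To make the resolution rules concrete, the sketch below summarises which credential path a given bedrock_key_config would take: explicit keys when both are set, otherwise the default chain, with AssumeRole layered on top when role_arn is present. `describe_auth` is a hypothetical helper for reasoning about a config, not DeepIntShield code; treating a config with only one of the two keys as "default chain" is an assumption of this sketch.

```python
DEFAULT_SESSION_NAME = "deepintshield-session"  # default from the field table


def describe_auth(cfg: dict) -> str:
    """Summarise the credential path implied by a bedrock_key_config dict."""
    # Assumption: both keys must be non-empty for the explicit path to apply.
    has_keys = bool(cfg.get("access_key")) and bool(cfg.get("secret_key"))
    source = "explicit keys" if has_keys else "default credential chain"
    role_arn = cfg.get("role_arn")
    if role_arn:
        session = cfg.get("session_name") or DEFAULT_SESSION_NAME
        return f"{source} -> AssumeRole({role_arn}, session={session})"
    return source


print(describe_auth({"region": "us-east-1"}))
# default credential chain
print(describe_auth({
    "role_arn": "arn:aws:iam::123456789012:role/BedrockRole",
    "session_name": "my-session",
    "region": "us-east-1",
}))
# default credential chain -> AssumeRole(arn:aws:iam::123456789012:role/BedrockRole, session=my-session)
```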

How to Use ARNs and Application Inference Profiles


When using AWS Bedrock inference profiles or application inference profiles, you must split the configuration correctly to avoid UnknownOperationException:

Field | Purpose
arn | The ARN prefix (everything before the final /resource-id). Required for URL formation when using inference profiles.
deployments | Map logical model names to the model ID or inference profile resource ID only - not the full ARN.

Application inference profiles - use the resource ID (short alphanumeric suffix) in deployments:

{
"bedrock_key_config": {
"access_key": "your-aws-access-key",
"secret_key": "your-aws-secret-key",
"session_token": "optional-session-token",
"region": "eu-west-1",
"arn": "arn:aws:bedrock:eu-west-1:123456789012:application-inference-profile",
"deployments": {
"claude-opus-4-6": "ghi56rst",
"claude-sonnet-4-5": "jkl78mno"
}
}
}

Cross-region inference profiles - use the model identifier (e.g., us.anthropic.claude-3-5-sonnet-20241022-v2:0) in deployments:

{
"bedrock_key_config": {
"access_key": "your-aws-access-key",
"secret_key": "your-aws-secret-key",
"session_token": "optional-session-token",
"region": "us-east-1",
"arn": "arn:aws:bedrock:us-east-1:123456789012:inference-profile",
"deployments": {
"claude-sonnet": "us.anthropic.claude-3-5-sonnet-20241022-v2:0"
}
}
}
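
If you start from a full profile ARN, the split described above (prefix in arn, resource ID in deployments) can be done mechanically. The sketch below shows one way to do it; `split_profile_arn` and `profile_config` are hypothetical helpers, and the ARN used is the illustrative one from the application inference profile example.

```python
def split_profile_arn(full_arn: str) -> tuple[str, str]:
    """Split a full profile ARN at its final '/' into (ARN prefix, resource ID)."""
    prefix, sep, resource_id = full_arn.rpartition("/")
    if not sep or not prefix or not resource_id:
        raise ValueError(f"expected an ARN ending in /<resource-id>, got {full_arn!r}")
    return prefix, resource_id


def profile_config(full_arn: str, logical_name: str, region: str) -> dict:
    """Build a bedrock_key_config fragment with the ARN split across arn and deployments."""
    prefix, resource_id = split_profile_arn(full_arn)
    return {
        "bedrock_key_config": {
            "region": region,
            "arn": prefix,
            "deployments": {logical_name: resource_id},
        }
    }


full_arn = "arn:aws:bedrock:eu-west-1:123456789012:application-inference-profile/ghi56rst"
print(split_profile_arn(full_arn))
# ('arn:aws:bedrock:eu-west-1:123456789012:application-inference-profile', 'ghi56rst')
```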

For step-by-step instructions on configuring AWS Bedrock authentication in the Web UI, including credentials, IAM roles, regions, and deployment mapping, see Provider-Specific Authentication - AWS Bedrock in the Gateway Quickstart.

  • Runtime API: bedrock-runtime.{region}.amazonaws.com/model/{path}
  • Control Plane: bedrock.{region}.amazonaws.com (list models)
  • Batch API: Via bedrock-runtime

HTTP Status Mapping:

Status | DeepIntShield Error Type | Notes
400 | invalid_request_error | Bad request parameters
401 | authentication_error | Invalid/expired credentials
403 | permission_denied_error | Access denied to model/resource
404 | not_found_error | Model or resource not found
429 | rate_limit_error | Rate limit exceeded
500 | api_error | Server error
529 | overloaded_error | Service overloaded

Error Response Structure:

{
"error": {
"type": "invalid_request_error",
"message": "Human-readable error message"
}
}
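
On the client side, the status mapping and error body can be combined into a single lookup. The sketch below prefers the type from the body and falls back to the status table; using api_error for statuses the table does not list is an assumption of this example, and `parse_error` is a hypothetical helper.

```python
import json

# HTTP status -> DeepIntShield error type, copied from the status mapping.
STATUS_TO_ERROR_TYPE = {
    400: "invalid_request_error",
    401: "authentication_error",
    403: "permission_denied_error",
    404: "not_found_error",
    429: "rate_limit_error",
    500: "api_error",
    529: "overloaded_error",
}


def parse_error(status: int, body: str) -> tuple[str, str]:
    """Return (error type, message) from an error response."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = {}
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        error = {}
    # Assumption: unlisted statuses fall back to api_error in this sketch.
    error_type = error.get("type") or STATUS_TO_ERROR_TYPE.get(status, "api_error")
    return error_type, error.get("message", "")


body = '{"error": {"type": "invalid_request_error", "message": "Human-readable error message"}}'
print(parse_error(400, body))
# ('invalid_request_error', 'Human-readable error message')
```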

Special Cases:

  • Cancelled requests return a request-cancelled error.
  • Timed-out requests return a request-timeout error.
  • Streaming errors are delivered in-stream with an end-of-stream indicator.
  • Malformed provider responses return a response-parsing error.

Image Format Restriction

Severity: High
Behavior: Only base64/data URI images supported; remote URLs not supported
Impact: Requests with URL-based images fail

Minimum Reasoning Budget (Claude)

Severity: High
Behavior: reasoning.max_tokens must be >= 1024
Impact: Requests with lower values fail with an error

Model Family-Specific Reasoning

Severity: Medium
Behavior: Reasoning/thinking support varies by model family
Impact: Behavior differs for Claude vs Nova vs other families

Text Completion Streaming Not Supported

Severity: Medium
Behavior: Text completion streaming returns an error
Impact: Streaming not available for the legacy completions API

Structured Output

Severity: Low
Behavior: Bedrock has no native structured-output parameter, so response_format is implemented for you under the hood
Impact: None - you receive standard schema-conforming JSON in the response content

Deployment Region Prefix Handling

Severity: Low
Behavior: Model IDs with region prefixes are matched against the deployment config
Impact: Model availability depends on deployment configuration