Skip to content

Fireworks AI

Fireworks AI is an OpenAI-compatible provider offering the same API interface with identical parameter handling. Key features:

  • Full OpenAI compatibility - Identical request/response format
  • Streaming support - Server-Sent Events with delta-based updates
  • Tool calling - Complete function definition and execution support
  • Native text completions - Served directly via /v1/completions
  • Embeddings - Served directly via /v1/embeddings
OperationNon-StreamingStreamingEndpoint
Chat Completions/v1/chat/completions
Responses API/v1/responses
Text Completions/v1/completions
Embeddings-/v1/embeddings
List Models-/v1/models
Image Generation-
Speech (TTS)-
Transcriptions (STT)-
Files-
Batch-

Add Fireworks AI as a provider with a single API key. The default base URL is https://api.fireworks.ai/inference; override it per workspace if you front Fireworks behind a custom gateway.

{
"provider": "fireworks",
"keys": [{ "value": "env.FIREWORKS_API_KEY", "models": [], "weight": 1.0 }]
}

Model identifiers use Fireworks’ fully-qualified form, e.g. accounts/fireworks/models/llama-v3p1-70b-instruct.


Fireworks AI supports all standard OpenAI chat completion parameters. For full parameter reference and behavior, see OpenAI Chat Completions.


Unlike chat-only OpenAI-compatible providers, Fireworks exposes a native Responses endpoint, so DeepIntShield forwards Responses requests directly to /v1/responses rather than converting them to chat completions.


Fireworks supports the legacy completions endpoint natively at /v1/completions, including streaming.


Embedding requests are forwarded to /v1/embeddings using the standard OpenAI embedding request/response shape.


Fireworks’ model listing endpoint returns available models with their context lengths and capabilities at /v1/models.