Cerebras
Overview
Section titled “Overview”Cerebras is a fully OpenAI-compatible provider that works with the complete set of OpenAI API features through DeepIntShield. Key characteristics:
- Complete OpenAI compatibility - All chat, text, and streaming features supported
- Full tool calling - Function definitions and parallel tool execution
- Streaming support - Server-Sent Events with token usage tracking
- Parameter preservation - Passes through all standard OpenAI parameters
- Responses API - Full support with format conversion
Supported Operations
Section titled “Supported Operations”| Operation | Non-Streaming | Streaming | Endpoint |
|---|---|---|---|
| Chat Completions | ✅ | ✅ | /v1/chat/completions |
| Responses API | ✅ | ✅ | /v1/chat/completions |
| Text Completions | ✅ | ✅ | /v1/completions |
| List Models | ✅ | - | /v1/models |
| Embeddings | ❌ | ❌ | - |
| Image Generation | ❌ | ❌ | - |
| Speech (TTS) | ❌ | ❌ | - |
| Transcriptions (STT) | ❌ | ❌ | - |
| Files | ❌ | ❌ | - |
| Batch | ❌ | ❌ | - |
1. Chat Completions
Section titled “1. Chat Completions”Request Parameters
Section titled “Request Parameters”Cerebras supports all standard OpenAI chat completion parameters. For full parameter reference and behavior, see OpenAI Chat Completions.
The following parameters are not supported by Cerebras and are ignored: prompt_cache_key, verbosity, store, service_tier.
Reasoning Parameter
Section titled “Reasoning Parameter”Cerebras follows the OpenAI-compatible reasoning convention using reasoning.effort. A thinking-token budget (reasoning.max_tokens) is not accepted by Cerebras.
Cerebras supports all standard OpenAI message types, tools, responses, and streaming formats. For details on message handling, tools, responses, and streaming, refer to OpenAI Chat Completions.
2. Responses API
Section titled “2. Responses API”Cerebras supports the Responses API with the same parameters as Chat Completions. Responses are returned in Responses format (output items instead of message content).
3. Text Completions
Section titled “3. Text Completions”Cerebras supports legacy text completion API:
| Parameter | Mapping |
|---|---|
prompt | Sent as-is |
max_tokens | max_tokens |
temperature | temperature |
top_p | top_p |
stop | stop sequences |
Response returns choices[].text with completion text.
4. Text Completions Streaming
Section titled “4. Text Completions Streaming”Streaming text completions use same SSE format as chat streaming.
5. List Models
Section titled “5. List Models”Lists available models from Cerebras with capabilities and context length information.
Unsupported Features
Section titled “Unsupported Features”| Feature | Reason |
|---|---|
| Embedding | Not offered by Cerebras API |
| Image Generation | Not offered by Cerebras API |
| Speech/TTS | Not offered by Cerebras API |
| Transcription/STT | Not offered by Cerebras API |
| Batch Operations | Not offered by Cerebras API |
| File Management | Not offered by Cerebras API |
Caveats
Section titled “Caveats”User Field Size Limit
Severity: Low Behavior: User field > 64 characters is silently dropped Impact: Longer user identifiers are lost