
Perplexity

Perplexity is an OpenAI-compatible API with built-in web search and reasoning support. Call it through DeepIntShield using the same OpenAI-compatible Chat Completions and Responses APIs. What you can use:

  • Web search parameters - search mode, domain filters, recency filters, and location-based search (see Perplexity-Specific Parameters)
  • Reasoning - reasoning.effort maps to Perplexity’s reasoning_effort (see Reasoning & Effort)
  • Search results in the response - citations, search results, and videos returned alongside the answer
  • Extended usage - separate counts for citation tokens, search queries, and reasoning tokens
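
| Operation | Non-Streaming | Streaming | Endpoint |
|---|---|---|---|
| Chat Completions | Yes | Yes | /chat/completions |
| Responses API | Yes | Yes | /chat/completions |
| Text Completions | No | No | - |
| Embeddings | No | No | - |
| Image Generation | No | No | - |
| Speech (TTS) | No | No | - |
| Transcriptions (STT) | No | No | - |
| Files | No | No | - |
| Batch | No | No | - |
| List Models | No | No | - |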

Perplexity supports most OpenAI chat completion parameters. For standard parameter reference, see OpenAI Chat Completions.

  • No function calling: tools and tool_choice are silently dropped
  • Dropped parameters: stop, logit_bias, logprobs, top_logprobs, seed, parallel_tool_calls, service_tier
  • Reasoning: the reasoning object is translated to Perplexity’s reasoning_effort field (see Reasoning & Effort)
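
To check how an existing OpenAI-style request is affected, the sketch below mimics the filtering on the client side. The helper name `preview_perplexity_request` is hypothetical and not part of DeepIntShield; the set of dropped fields is copied from the list above, and the gateway's real behavior may differ in detail.

```python
# Fields the list above says are dropped before the request reaches Perplexity.
DROPPED_FIELDS = {
    "tools", "tool_choice", "stop", "logit_bias", "logprobs",
    "top_logprobs", "seed", "parallel_tool_calls", "service_tier",
}


def preview_perplexity_request(body):
    """Return (body without dropped fields, sorted names of dropped fields).

    Hypothetical client-side helper: it only predicts what will be ignored,
    it does not talk to the gateway.
    """
    kept = {k: v for k, v in body.items() if k not in DROPPED_FIELDS}
    dropped = sorted(k for k in body if k in DROPPED_FIELDS)
    return kept, dropped


request = {
    "model": "sonar",
    "messages": [{"role": "user", "content": "Hello"}],
    "stop": ["\n\n"],
    "seed": 7,
}
kept, dropped = preview_perplexity_request(request)
print(dropped)  # ['seed', 'stop']
```

Because the fields are dropped silently, a check like this can be useful in tests to flag code paths that still rely on stop sequences or tool calls.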

Pass Perplexity-specific search and configuration fields directly in the request body:

curl -X POST https://app.deepintshield.com/v1/chat/completions \
  -H "Authorization: Bearer sk-bf-your-virtual-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar",
    "messages": [{"role": "user", "content": "What is the latest news?"}],
    "search_mode": "web",
    "language_preference": "en",
    "return_images": true,
    "return_related_questions": true,
    "disable_search": false,
    "search_domain_filter": ["news.example.com"],
    "search_recency_filter": "week"
  }'

| Parameter | Type | Description |
|---|---|---|
| search_mode | string | Search mode: "web", "academic", "news", etc. |
| language_preference | string | Language preference (e.g., "en", "fr") |
| search_domain_filter | string[] | Restrict search to specific domains |
| return_images | boolean | Include images in search results |
| return_related_questions | boolean | Return related questions |
| search_recency_filter | string | Recency filter: "hour", "day", "week", "month", "year" |
| search_after_date_filter | string | Search results after date (ISO format) |
| search_before_date_filter | string | Search results before date (ISO format) |
| last_updated_after_filter | string | Content last updated after date |
| last_updated_before_filter | string | Content last updated before date |
| disable_search | boolean | Disable web search entirely |
| enable_search_classifier | boolean | Enable search classifier |
| top_k | integer | Top-k results to use |

| Parameter | Type | Description |
|---|---|---|
| web_search_options | object[] | Array of web search option configurations with user location support |
| media_response.overrides.return_videos | boolean | Return videos in results |
| media_response.overrides.return_images | boolean | Return images in results |
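
The parameters above sit at the top level of the request body, next to model and messages. As a minimal sketch, the hypothetical helper `build_search_request` below assembles such a body and rejects recency values that are not in the table; the validation is a client-side convenience assumed for illustration, not documented gateway behavior.

```python
import json

# Values listed for search_recency_filter in the table above.
RECENCY_VALUES = {"hour", "day", "week", "month", "year"}


def build_search_request(question, *, model="sonar", domains=None,
                         recency=None, disable_search=False):
    """Build a Chat Completions body with Perplexity search fields at top level."""
    if recency is not None and recency not in RECENCY_VALUES:
        raise ValueError(f"unsupported search_recency_filter: {recency!r}")
    body = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "disable_search": disable_search,
    }
    # Optional filters are only included when the caller sets them.
    if domains:
        body["search_domain_filter"] = list(domains)
    if recency:
        body["search_recency_filter"] = recency
    return body


body = build_search_request(
    "What is the latest news?",
    domains=["news.example.com"],
    recency="week",
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the JSON payload of a POST to /v1/chat/completions, in the same way as the curl example.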

Configure detailed search behavior including location:

{
  "web_search_options": [
    {
      "search_context_size": "high",
      "user_location": {
        "latitude": 40.7128,
        "longitude": -74.0060,
        "city": "New York",
        "country": "US",
        "region": "NY"
      },
      "image_search_relevance_enhanced": true
    }
  ]
}
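
The same structure can be generated from typed values. The sketch below is illustrative only: `UserLocation` and `make_web_search_options` are hypothetical names, and the coordinate range checks are a client-side sanity check rather than something the gateway is documented to enforce.

```python
from dataclasses import dataclass, asdict


@dataclass
class UserLocation:
    latitude: float
    longitude: float
    city: str
    country: str
    region: str


def make_web_search_options(location, context_size="high"):
    """Return the web_search_options array for a single user location."""
    if not -90 <= location.latitude <= 90:
        raise ValueError("latitude must be within [-90, 90]")
    if not -180 <= location.longitude <= 180:
        raise ValueError("longitude must be within [-180, 180]")
    return [{
        "search_context_size": context_size,
        "user_location": asdict(location),
    }]


options = make_web_search_options(
    UserLocation(40.7128, -74.0060, "New York", "US", "NY")
)
print(options[0]["user_location"]["city"])  # New York
```

Note that web_search_options is an array, so the helper wraps the single configuration in a list before it is merged into the request body.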

Set the reasoning effort with reasoning.effort:

  • Supported efforts: "low", "medium", "high"
  • "minimal" is treated as "low" (Perplexity only supports low/medium/high)
  • reasoning.max_tokens is not supported and is ignored (Perplexity has no token budget control)
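
The mapping rules above can be written down as a small function. This is a sketch of the described behavior, not the gateway's source: `to_perplexity_reasoning` is a hypothetical name, and raising an error for unknown effort values is an assumption, since the rules above do not say how such values are handled.

```python
SUPPORTED_EFFORTS = {"low", "medium", "high"}


def to_perplexity_reasoning(reasoning):
    """Translate an OpenAI-style reasoning object into Perplexity fields."""
    if not reasoning:
        return {}
    effort = reasoning.get("effort")
    if effort is None:
        return {}
    if effort == "minimal":
        # Perplexity has no "minimal" level, so it becomes "low".
        effort = "low"
    if effort not in SUPPORTED_EFFORTS:
        # Assumption: unknown values are rejected here for clarity.
        raise ValueError(f"unsupported reasoning effort: {effort!r}")
    # reasoning.max_tokens is intentionally not copied: it is ignored.
    return {"reasoning_effort": effort}


print(to_perplexity_reasoning({"effort": "minimal", "max_tokens": 1024}))
# {'reasoning_effort': 'low'}
```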

Perplexity responses include additional fields for search integration:

  • citations[] - Source citations from search
  • search_results[] - Full search results with metadata
  • videos[] - Video results from search

Extended usage tracking specific to Perplexity:

| Field | Source | Description |
|---|---|---|
| completion_tokens_details.citation_tokens | usage.citation_tokens | Tokens used for citations |
| completion_tokens_details.num_search_queries | usage.num_search_queries | Number of web search queries performed |
| completion_tokens_details.reasoning_tokens | usage.reasoning_tokens | Tokens consumed by reasoning process |
| usage.cost | usage.cost | Cost of the request |

{
  "id": "...",
  "choices": [...],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 150,
    "total_tokens": 250,
    "completion_tokens_details": {
      "citation_tokens": 25,
      "num_search_queries": 3,
      "reasoning_tokens": 40
    },
    "cost": { "prompt_cost": 0.001, "completion_cost": 0.002 }
  },
  "citations": ["https://example.com/article1", "https://example.com/article2"],
  "search_results": [
    {
      "title": "...",
      "url": "...",
      "snippet": "...",
      "date": "2025-01-15"
    }
  ],
  "videos": [
    {
      "title": "...",
      "url": "...",
      "duration": 300
    }
  ]
}
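
A client can read the search-specific fields straight off the parsed response. The sketch below assumes the response has already been decoded into a Python dictionary with the shape shown above; `summarize_search_usage` is a hypothetical helper, and every field is treated as optional so a response without search data does not raise.

```python
def summarize_search_usage(response):
    """Collect citations and Perplexity-specific usage counters."""
    usage = response.get("usage", {})
    details = usage.get("completion_tokens_details", {})
    return {
        "citations": response.get("citations", []),
        "result_urls": [r.get("url") for r in response.get("search_results", [])],
        "search_queries": details.get("num_search_queries", 0),
        "citation_tokens": details.get("citation_tokens", 0),
        "reasoning_tokens": details.get("reasoning_tokens", 0),
    }


# Trimmed sample with the same shape and values as the example above.
sample = {
    "usage": {
        "total_tokens": 250,
        "completion_tokens_details": {
            "citation_tokens": 25,
            "num_search_queries": 3,
            "reasoning_tokens": 40,
        },
    },
    "citations": ["https://example.com/article1", "https://example.com/article2"],
    "search_results": [{"title": "A", "url": "https://example.com/article1"}],
}
summary = summarize_search_usage(sample)
print(summary["search_queries"], len(summary["citations"]))  # 3 2
```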

Perplexity uses the OpenAI-compatible streaming format. The event sequence is:

  • chat.completion.chunk events with delta updates
  • Standard OpenAI finish reason mapping
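
As a rough illustration of consuming such a stream, the sketch below joins the delta content from a list of server-sent event lines. It assumes the usual OpenAI wire format, where each event is a `data: {...}` line and the stream ends with `data: [DONE]`; this page does not confirm the terminator, so treat it as an assumption. `collect_stream_text` is a hypothetical helper.

```python
import json


def collect_stream_text(lines):
    """Join delta.content pieces from OpenAI-style SSE lines.

    Returns (full_text, last_finish_reason).
    """
    parts = []
    finish_reason = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream marker
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            parts.append(delta.get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


sample_lines = [
    'data: {"object": "chat.completion.chunk", "choices": '
    '[{"index": 0, "delta": {"role": "assistant", "content": "Hel"}, "finish_reason": null}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": '
    '[{"index": 0, "delta": {"content": "lo"}, "finish_reason": "stop"}]}',
    "data: [DONE]",
]
text, reason = collect_stream_text(sample_lines)
print(text, reason)  # Hello stop
```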

No Tool Support

  • Severity: High
  • Behavior: Tool-related parameters are silently dropped
  • Impact: Function calling is not available

Reasoning Effort Mapping

  • Severity: Medium
  • Behavior: "minimal" effort is mapped to "low" (Perplexity only supports low/medium/high)
  • Impact: Requested minimal effort becomes low effort

Reasoning Max Tokens Dropped

  • Severity: Low
  • Behavior: reasoning.max_tokens is silently dropped
  • Impact: No control over the reasoning token budget

Stop Sequences Not Supported

  • Severity: Low
  • Behavior: The stop parameter is silently dropped
  • Impact: Stop sequences are not enforced


Perplexity is available through the OpenAI-style Responses API, returning results in Responses format with the same search results, citations, and extended usage as Chat Completions.

The following parameters are supported:

ParameterNotes
max_output_tokensMaximum output tokens
temperature, top_pSampling controls
instructionsSystem instructions
reasoning.effortReasoning effort (see Reasoning & Effort)
text.formatStructured output format
input (string/array)Prompt input

The same Perplexity-specific search and configuration parameters as Chat Completions are also available (see Perplexity-Specific Parameters).

curl -X POST https://app.deepintshield.com/v1/responses \
  -H "Authorization: Bearer sk-bf-your-virtual-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar",
    "instructions": "You are a helpful assistant with web search capabilities",
    "input": "What is the latest news in technology?",
    "search_mode": "news",
    "return_images": true
  }'
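
The same request can be prepared from Python using only the standard library. The sketch below builds the request object without sending it; the URL, key placeholder, and field names are taken from the curl example, while `build_responses_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://app.deepintshield.com/v1"


def build_responses_request(api_key, prompt, **search_fields):
    """Prepare a POST to /responses with Perplexity search fields at top level."""
    body = {"model": "sonar", "input": prompt, **search_fields}
    return urllib.request.Request(
        f"{BASE_URL}/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_responses_request(
    "sk-bf-your-virtual-key",
    "What is the latest news in technology?",
    search_mode="news",
    return_images=True,
)
print(req.get_method(), req.full_url)
# Sending is left to the caller, for example:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```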

The response carries the same data as Chat Completions: search results, citations, and extended usage tracking are all preserved.

Responses streaming uses the same OpenAI-compatible streaming as Chat Completions, with results returned in Responses format.