Hallucination Defense & Response Consistency

Overview

DeepintShield turns two of the hardest problems in production GenAI - “the model made it up” and “the model gives a different answer every time” - into governed, measurable controls you configure in the Web UI.

The feature combines two complementary capabilities:

Hallucination Defense reduces fabrication before the prompt reaches the model (grounding, anti-fabrication, citation, and uncertainty directives plus a randomness cap), then scores sampled responses after the fact across six accuracy metrics - without adding latency to the request.
Response Consistency guarantees that identical and paraphrased questions return the same approved answer, including an admin-curated Golden Registry of canonical answers for regulated or high-stakes topics.

Crucially, consistency never weakens safety: a response is only made repeatable after it has passed your guardrails, and the guardrail verdict travels with every served answer.

Key benefits

Less fabrication at the source - proactive directives are applied inline with no extra model calls.
Per-request accuracy scoring - see how grounded, relevant, and self-consistent your output actually is, not just whether it was blocked.
Same question, same answer - equivalent questions resolve to one canonical, governed response.
Author-controlled answers - pin exact wording for compliance-sensitive questions, with an audit trail.
Pilot safely - every control can be scoped to a subset of virtual keys before a workspace-wide rollout.

When to use

Regulated and high-stakes assistants (financial, legal, healthcare, support) where a wrong-but-confident answer is unacceptable.
Knowledge-base and RAG copilots where you need proof that answers are backed by your own sources.
High-traffic FAQ and support flows where the same questions repeat and you want consistent, lower-cost answers.
Compliance copilots that must return officially approved wording for specific questions.

Hallucination Defense

Hallucination Defense has three parts, each on its own tab under Workspace → Hallucination Control: proactive Control, post-response Configuration (scoring), and the Ground Truth reference corpus.

Proactive control

The Control tab injects factual-conservatism instructions and a randomness cap before the request leaves the gateway. Turn on the master Hallucination control switch, then enable the techniques you want. Techniques compose: when you enable several, their instructions are layered together.

Technique	What it does
Stay grounded	Tells the AI to answer only from the information you provided, and to say it doesn’t know when the answer isn’t there.
Don’t make things up	Tells the AI not to invent facts, names, dates, or quotes - to leave them out when unsure.
Require sources	Asks the AI to mark facts with their source. Works best when you share source material in the prompt.
Hedge when unsure	Asks the AI to say “might”, “likely”, or “I’m not sure” instead of stating uncertain things as facts.
Reduce randomness	Makes answers more predictable by capping the temperature (default cap 0.4).

Under Advanced parameters you set:

Strictness - Low (gentle reminder), Medium (balanced, default), or High (strict - refuse to answer when there is no supporting source). Higher strictness also tightens the randomness cap.
Temperature cap - the maximum temperature allowed when Reduce randomness is on. Anything above this value is capped down.
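
The capping behavior described for Reduce randomness can be pictured with a short sketch. This is illustrative only: the function name `clamp_temperature` is hypothetical, and how the gateway treats a request with no temperature set is not stated in this document, so the sketch only covers an explicit value.

```python
DEFAULT_TEMP_CAP = 0.4  # default cap stated in this document


def clamp_temperature(requested: float, cap: float = DEFAULT_TEMP_CAP) -> float:
    """Return the requested temperature, capped down to `cap` when it is higher.

    Hypothetical helper: values at or below the cap pass through unchanged.
    """
    return min(requested, cap)


if __name__ == "__main__":
    print(clamp_temperature(1.0))  # above the cap, so it is capped down
    print(clamp_temperature(0.2))  # already below the cap, left as-is
```

Values at or below the cap are left untouched; only higher values are reduced.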

Use Apply to virtual keys to limit these techniques to specific keys. Leave it empty to apply them to every key in the workspace.

Open Workspace → Hallucination Control → Control.
Turn on the Hallucination control master switch.
Enable the techniques you want (for example, Stay grounded and Don’t make things up).
Set Strictness and, if Reduce randomness is on, the Temperature cap.
Optionally pick specific keys under Apply to virtual keys to pilot first, or leave it empty to apply to all virtual keys.
Click Save.

Response scoring (six metrics)

The Configuration tab scores responses after they are returned, so it adds no latency to the request path. Turn on Hallucination evaluation, then choose which detectors to run:

Detector	What it measures
Faithfulness	Whether the answer stays grounded in the information you provided.
Answer relevance	Whether the answer actually addresses the question that was asked.
Coherence	Whether the answer is internally consistent, with no sentences contradicting each other.
Helpfulness	How useful the answer is, graded by a small AI model (uses one extra model call per scored response).
Citation precision	Whether the facts in the answer are backed by the reference material you uploaded under Ground Truth.

These detectors are combined into an overall hallucination score so you can track accuracy per request and per virtual key.

The Helpfulness grader section lets you pick the virtual key and judge model used for the Helpfulness detector - pick a small, fast model such as Haiku, gpt-4o-mini, or Gemini Flash. It only runs when Helpfulness is enabled.

Under Performance, control cost and throughput:

Sample rate (%) - what percentage of responses to score (10% is the recommended balance of observability and cost; 0% disables scoring; 100% scores everything).
Async workers - how many checks run at the same time.
Timeout (ms) - a check is skipped if it takes longer than this. Checks never delay your responses.
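
The three Performance settings can be sketched together as follows. This is a minimal illustration under assumptions, not the gateway's implementation: `should_score`, `run_check`, and `score_response` are hypothetical names, and the semaphore is simply one common way to bound concurrency.

```python
import asyncio
import random
from typing import Awaitable, Callable, List, Optional

Check = Callable[[], Awaitable[float]]


def should_score(sample_pct: float, rng: random.Random) -> bool:
    """Select roughly `sample_pct` percent of responses (0 = none, 100 = all)."""
    return rng.random() * 100 < sample_pct


async def run_check(check: Check, timeout_ms: int) -> Optional[float]:
    """Run one detector; return None (skipped) if it exceeds the timeout."""
    try:
        return await asyncio.wait_for(check(), timeout=timeout_ms / 1000)
    except asyncio.TimeoutError:
        return None


async def score_response(checks: List[Check], workers: int, timeout_ms: int) -> List[Optional[float]]:
    """Run detectors with at most `workers` checks in flight at once."""
    gate = asyncio.Semaphore(workers)

    async def guarded(check: Check) -> Optional[float]:
        async with gate:
            return await run_check(check, timeout_ms)

    return list(await asyncio.gather(*(guarded(c) for c in checks)))
```

Because scoring happens after the response is returned, a skipped or slow check affects only the score record, not the caller.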

As with Control, Apply to virtual keys lets you pilot scoring on a subset of keys.

Open Workspace → Hallucination Control → Configuration.
Turn on the Hallucination evaluation switch.
Enable the detectors you want.
If you enabled Citation precision, upload reference material on the Ground Truth tab (below).
If you enabled Helpfulness, pick a Virtual Key and Judge model under Helpfulness grader.
Set the Sample rate, Async workers, and Timeout.
Click Save.

Ground-truth corpus

The Citation precision detector compares an answer’s claims against reference material you supply. You manage that material on the Ground Truth tab as canonical question / answer / source records.

The corpus is uploaded as a CSV with three columns:

Column	Meaning
`question`	Optional. Anchors a record to a specific question.
`expected_answer`	The reference answer that responses are checked against.
`sources`	One or more reference sources, separated by a pipe (`|`).

You attach records to a scope using the Apply records to dropdown above the uploader - choose a specific virtual key, or All virtual keys in this workspace (a wildcard that matches every key, including future ones). The chosen scope is stamped onto every row at upload time, so the CSV itself does not carry a key column.

Web UI

Open Workspace → Hallucination Control → Ground Truth.
Click Download template CSV and fill it in using your spreadsheet application.
Pick a scope under Apply records to (a specific key, or all keys).
Click Upload filled CSV and review the parsed records in the table.
Click Save ground truth.

CSV

question,expected_answer,sources
What is your data-retention policy?,We operate zero data retention by default unless caching is explicitly enabled per tenant.,policy-handbook-v3|legal-faq
Is my data used to train models?,No. Customer data is never used for model training.,policy-handbook-v3

Use Download current as CSV to export the active corpus for editing, and re-upload to update it.
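
To make the CSV layout concrete, here is a small sketch of how such a file could be parsed into records, with `sources` split on the pipe and the chosen scope stamped onto each row. The function name, the `"*"` wildcard marker, and the rule of skipping rows without an `expected_answer` are assumptions made for illustration.

```python
import csv
import io
from typing import Dict, List

WILDCARD_SCOPE = "*"  # hypothetical marker for "All virtual keys in this workspace"


def parse_ground_truth(csv_text: str, scope: str = WILDCARD_SCOPE) -> List[Dict[str, object]]:
    """Parse question/expected_answer/sources rows and stamp a scope on each."""
    records: List[Dict[str, object]] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        answer = (row.get("expected_answer") or "").strip()
        if not answer:
            continue  # assumption: a row with no reference answer is skipped
        raw_sources = row.get("sources") or ""
        records.append({
            "question": (row.get("question") or "").strip() or None,  # optional column
            "expected_answer": answer,
            "sources": [s.strip() for s in raw_sources.split("|") if s.strip()],
            "scope": scope,  # stamped at upload time; not a CSV column
        })
    return records
```

If an answer contains a comma, quote the field in the CSV so that it is not split across columns.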

Response Consistency

Response Consistency returns one canonical, governed answer to identical and semantically equivalent questions, short-circuiting the model on a hit. It is managed under Workspace → Response Consistency, with separate tabs for Policy & Modes, the Golden Registry of pinned answers, and a per-request Request Trace.

Policy and modes

On the Policy & Modes tab, turn on the Response Consistency Engine master switch, then pick a Consistency mode. Each higher tier builds on the ones below it:

Mode	What it does
Off	Pass through. Every request reaches the model.
Exact-Match	Byte-identical prompts return the stored answer.
Semantic	Paraphrases are matched by embedding similarity.
Semantic + Pinned	Adds admin-curated Golden Registry answers, served verbatim.
Strict Deterministic	Pinned + semantic, with deterministic settings on misses.
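
As an illustration of the Exact-Match tier, the sketch below keys stored answers by a hash of the prompt's bytes, so only byte-identical prompts hit. The class and function names are hypothetical, and a real key would presumably also include the policy scope; that detail is not covered here.

```python
import hashlib
from typing import Dict, Optional


def exact_key(prompt: str) -> str:
    """Hash the UTF-8 bytes of the prompt; any byte difference changes the key."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


class ExactStore:
    """Hypothetical in-memory store for exact-match answers."""

    def __init__(self) -> None:
        self._answers: Dict[str, str] = {}

    def put(self, prompt: str, answer: str) -> None:
        self._answers[exact_key(prompt)] = answer

    def get(self, prompt: str) -> Optional[str]:
        # Returns None on a miss, in which case the request goes to the model.
        return self._answers.get(exact_key(prompt))
```

Even a trailing space produces a different key, which is why the Semantic tiers exist for paraphrases and near-duplicates.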

Additional policy controls:

Semantic match threshold - the minimum similarity to treat two questions as the same. Lower (toward 0.80) means more hits; higher (toward 0.99) means fewer false matches.
Verifier mode - adjudicates borderline semantic matches to avoid confidently returning the wrong cached answer:
- Off - borderline matches are not served.
- Borderline only - only near-threshold matches are double-checked.
- Always - every semantic match is confirmed.
Scope - where the policy applies: Tenant, Application, Route, or Per-user. Narrower scopes inherit and can override wider ones. Route is recommended for conversational apps; Per-user for confidential routes.
Freshness & retention - a Cache TTL (in hours) plus Auto-invalidate on options so a model change, system-prompt or policy change, guardrail policy change, or RAG knowledge-base update never silently serves a stale answer.
Eligibility rules - only responses that passed all guardrails can be cached (always enforced). You can additionally exclude responses with residual PII or secrets, exclude tool-calling turns, and set a minimum prompt length.
Embedding model - for Semantic tiers, pick the virtual key and embedding model used to compare questions (for example text-embedding-3-small).
Apply to virtual keys - leave empty to apply to every key, or pick a subset to pilot first.
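
The interaction between the Semantic match threshold and Verifier mode can be sketched as a small decision function. This is an assumption-laden illustration: the document does not define how close to the threshold a match must be to count as borderline, so the `margin` parameter and the names `cosine_similarity` and `decide` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def decide(similarity: float, threshold: float, verifier: str, margin: float = 0.02) -> str:
    """Return 'serve', 'verify', or 'miss' for one candidate semantic match.

    `margin` is an assumed width for the borderline band just above the threshold.
    """
    if similarity < threshold:
        return "miss"                      # below threshold: go to the model
    if verifier == "always":
        return "verify"                    # every semantic match is confirmed
    if similarity < threshold + margin:    # near-threshold (borderline) match
        return "verify" if verifier == "borderline" else "miss"
    return "serve"                         # comfortably above the threshold
```

With the verifier off, a borderline match falls through to the model rather than being served, which matches the Off behavior listed above.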

Open Workspace → Response Consistency → Policy & Modes.
Turn on the Response Consistency Engine switch.
Pick a Consistency mode (start with Semantic + Pinned).
Set the Semantic match threshold and Verifier mode.
Choose a Scope, Cache TTL, and Auto-invalidate options.
For semantic tiers, select the virtual key and model under Embedding model.
Optionally pick keys under Apply to virtual keys to pilot first.
Click Save policy.
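
One way to picture the Cache TTL and Auto-invalidate on options is a stored answer that carries a fingerprint of the context it was generated under. The sketch below is a guess at the general idea, not the product's mechanism: `Context`, `Entry`, `fingerprint`, and `is_fresh` are hypothetical names, and the version strings are placeholders.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Context:
    """Hypothetical versions of everything that can invalidate a stored answer."""
    model: str
    prompt_version: str
    policy_version: str
    rag_corpus_version: str


def fingerprint(ctx: Context) -> str:
    raw = "|".join([ctx.model, ctx.prompt_version, ctx.policy_version, ctx.rag_corpus_version])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


@dataclass
class Entry:
    answer: str
    fingerprint: str
    stored_at: float  # seconds since the epoch


def is_fresh(entry: Entry, ctx: Context, ttl_hours: int, now: Optional[float] = None) -> bool:
    """An entry is served only if its context is unchanged and it is within the TTL."""
    now = time.time() if now is None else now
    if entry.fingerprint != fingerprint(ctx):
        return False  # model, prompt, policy, or RAG corpus changed
    return now - entry.stored_at <= ttl_hours * 3600
```

Under this picture, a knowledge-base refresh changes the fingerprint, so older entries stop matching without needing to be deleted one by one.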

Golden Registry (pinned answers)

The Golden Registry tab lets you pin canonical question / answer pairs that are served verbatim for regulated or high-stakes topics (it takes effect under the Semantic + Pinned and Strict Deterministic modes).

How it works:

Upload a CSV with two columns: question,answer.
Every upload is scanned by your workspace guardrails before admission. Clean rows are admitted; rejected rows are kept with the version and can be downloaded, so you can review what the scan flagged.
Each upload becomes a new immutable version, and exactly one version is active at a time.
The original CSV is retained per version for audit, recovery, and rollback. You can re-download any version, re-activate an older one, or delete one.
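
The versioning rules above (immutable versions, exactly one active, rejected rows kept for review) can be modeled in a few lines. This is a toy in-memory sketch: `GoldenRegistry`, `Version`, and the `scan` callback standing in for the guardrail scan are all hypothetical, and deletion is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Row = Tuple[str, str]  # (question, answer)


@dataclass(frozen=True)
class Version:
    """One upload; frozen so it cannot be changed after creation."""
    number: int
    admitted: Tuple[Row, ...]
    rejected: Tuple[Row, ...]  # kept so flagged rows can be reviewed


class GoldenRegistry:
    def __init__(self, scan: Callable[[Row], bool]) -> None:
        self._scan = scan  # stand-in for the guardrail scan; True means clean
        self._versions: Dict[int, Version] = {}
        self.active: Optional[int] = None

    def upload(self, rows: List[Row]) -> Version:
        admitted: List[Row] = []
        rejected: List[Row] = []
        for row in rows:
            (admitted if self._scan(row) else rejected).append(row)
        number = max(self._versions, default=0) + 1
        version = Version(number, tuple(admitted), tuple(rejected))
        self._versions[number] = version
        self.active = number  # a new upload becomes the active version
        return version

    def activate(self, number: int) -> None:
        if number not in self._versions:
            raise KeyError(f"no such version: {number}")
        self.active = number  # rollback: re-activate an older version

    def lookup(self, question: str) -> Optional[str]:
        if self.active is None:
            return None
        for q, a in self._versions[self.active].admitted:
            if q == question:
                return a  # served verbatim
        return None
```

Re-activating an older version is then a pointer change rather than a re-upload, which is what makes rollback cheap.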

Open Workspace → Response Consistency → Golden Registry.
Click CSV template and fill in your question,answer pairs.
Click Upload CSV - the file is scanned server-side; admitted rows become the new active version.
Review the active version’s rows and any flagged-row count in Version history.
Use the row actions to download, activate, or delete a version.

Request trace

The Request Trace tab gives per-request explainability for the consistency engine. For recent resolutions it shows the source of each answer - Exact match, Semantic cache, Pinned answer, or Model (miss) - along with the matched canonical question, similarity score, carried guardrail verdict, policy version, model, pinned version (when applicable), and the latency saved by avoiding the model. Use it to confirm exactly why each answer was served.
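
If trace data were exported for analysis, a summary such as hit rate by source could be computed along these lines. The record shape here is an assumption for illustration only: `TraceRecord` and its field names are hypothetical and are not an export format documented on this page.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class TraceRecord:
    source: str                        # "exact", "semantic", "pinned", or "model" (miss)
    similarity: Optional[float] = None
    latency_saved_ms: float = 0.0


def summarize(traces: Iterable[TraceRecord]) -> Dict[str, object]:
    """Count answers by source and compute the share served without the model."""
    items = list(traces)
    counts = Counter(t.source for t in items)
    hits = len(items) - counts["model"]
    return {
        "hit_rate": hits / len(items) if items else 0.0,
        "by_source": dict(counts),
        "latency_saved_ms": sum(t.latency_saved_ms for t in items),
    }
```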

Field and option reference

Hallucination Control (`semantic_cache` plugin)

Field	Values	Description
`hallucination_control_enabled`	boolean	Master switch for proactive control.
`hallucination_control_techniques`	`grounding_directive`, `anti_fabrication`, `citation_required`, `uncertainty_ack`, `temperature_clamp`	Enabled mitigation techniques.
`hallucination_control_strictness`	`low`, `medium`, `high`	How firm the injected instructions are.
`hallucination_control_temp_cap`	number (0–2)	Max temperature when randomness is capped (default 0.4).
`hallucination_control_vk_scope`	array of key IDs	Empty = all keys.
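
For readers who manage these settings outside the Web UI, the fields in the table above might be assembled and sanity-checked as shown below. The field names and allowed values come from the table; the `validate_control` helper and the idea of checking the settings client-side are assumptions, as is the use of defaults for missing fields.

```python
from typing import Any, Dict, List

TECHNIQUES = {
    "grounding_directive", "anti_fabrication", "citation_required",
    "uncertainty_ack", "temperature_clamp",
}
STRICTNESS = {"low", "medium", "high"}


def validate_control(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the settings look valid."""
    errors: List[str] = []
    if not isinstance(cfg.get("hallucination_control_enabled"), bool):
        errors.append("hallucination_control_enabled must be a boolean")
    unknown = set(cfg.get("hallucination_control_techniques", [])) - TECHNIQUES
    if unknown:
        errors.append(f"unknown techniques: {sorted(unknown)}")
    if cfg.get("hallucination_control_strictness", "medium") not in STRICTNESS:
        errors.append("hallucination_control_strictness must be low, medium, or high")
    cap = cfg.get("hallucination_control_temp_cap", 0.4)
    if isinstance(cap, bool) or not isinstance(cap, (int, float)) or not 0 <= cap <= 2:
        errors.append("hallucination_control_temp_cap must be a number from 0 to 2")
    if not isinstance(cfg.get("hallucination_control_vk_scope", []), list):
        errors.append("hallucination_control_vk_scope must be an array of key IDs")
    return errors


example = {
    "hallucination_control_enabled": True,
    "hallucination_control_techniques": ["grounding_directive", "anti_fabrication", "temperature_clamp"],
    "hallucination_control_strictness": "high",
    "hallucination_control_temp_cap": 0.4,
    "hallucination_control_vk_scope": [],  # empty = all keys
}
```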

Hallucination Evaluation (`semantic_cache` plugin)

Field	Values	Description
`hallucination_eval_enabled`	boolean	Master switch for scoring.
`hallucination_eval_detectors`	`faithfulness`, `answer_relevance`, `coherence`, `helpfulness`, `citation_precision`	Enabled detectors.
`hallucination_eval_sample_pct`	0–100	Percentage of responses scored (10 recommended).
`hallucination_eval_workers`	integer	Concurrent scoring checks.
`hallucination_eval_timeout_ms`	integer	Per-check timeout; longer checks are skipped.
`hallucination_judge_vk_id` / `hallucination_judge_model` / `hallucination_judge_provider`	string	Judge used by the helpfulness detector.
`hallucination_ground_truth`	array	Ground-truth records (managed on the Ground Truth tab).
`hallucination_eval_vk_scope`	array of key IDs	Empty = all keys.

Response Consistency (`response_consistency` plugin)

Field	Values	Description
`mode`	`off`, `exact`, `semantic`, `semantic_pinned`, `strict`	Consistency tier.
`threshold`	0.80–0.99	Minimum cosine similarity for a semantic match.
`verifier`	`off`, `borderline`, `always`	Borderline-match adjudication.
`scope`	`tenant`, `app`, `route`, `user`	Where the policy applies.
`ttl_hours`	integer	Max age before re-generation.
`invalidate_on`	`model`, `prompt`, `policy`, `rag_corpus`	Auto-invalidation triggers.
`eligibility`	`exclude_pii`, `exclude_tool_calls`, `min_prompt_chars`	Cache write guards (clean-guardrails always enforced).
`embedding_vk_id` / `embedding_model` / `embedding_provider`	string	Embedding model for semantic tiers.
`vk_scope`	array of key IDs	Empty = all keys.
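
A Response Consistency policy built from the fields above might look like the following. Field names and allowed values are taken from the table; the `validate_policy` helper, the requirement that `ttl_hours` be positive, and the omission of `eligibility` (whose exact shape is not spelled out here) are assumptions for the sake of the sketch.

```python
from typing import Any, Dict, List

MODES = {"off", "exact", "semantic", "semantic_pinned", "strict"}
VERIFIERS = {"off", "borderline", "always"}
SCOPES = {"tenant", "app", "route", "user"}
TRIGGERS = {"model", "prompt", "policy", "rag_corpus"}


def validate_policy(policy: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the policy looks valid."""
    errors: List[str] = []
    if policy.get("mode") not in MODES:
        errors.append("mode is not a known consistency tier")
    threshold = policy.get("threshold")
    if not isinstance(threshold, (int, float)) or not 0.80 <= threshold <= 0.99:
        errors.append("threshold must be between 0.80 and 0.99")
    if policy.get("verifier") not in VERIFIERS:
        errors.append("verifier must be off, borderline, or always")
    if policy.get("scope") not in SCOPES:
        errors.append("scope must be tenant, app, route, or user")
    ttl = policy.get("ttl_hours")
    if not isinstance(ttl, int) or ttl <= 0:  # assumption: TTL must be positive
        errors.append("ttl_hours must be a positive integer")
    unknown = set(policy.get("invalidate_on", [])) - TRIGGERS
    if unknown:
        errors.append(f"unknown invalidate_on triggers: {sorted(unknown)}")
    return errors


example_policy = {
    "mode": "semantic_pinned",
    "threshold": 0.92,
    "verifier": "always",
    "scope": "route",
    "ttl_hours": 24,
    "invalidate_on": ["model", "rag_corpus"],
    "vk_scope": [],  # empty = all keys
}
```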

Examples

Strict compliance copilot. Enable Hallucination Control with Stay grounded, Don’t make things up, Require sources, and Reduce randomness at High strictness; pin official answers in the Golden Registry; set Response Consistency to Semantic + Pinned with verifier Always.

RAG knowledge-base assistant. Enable Citation precision scoring with a ground-truth corpus per virtual key, keep the sample rate at 10% to track citation coverage, and turn on Auto-invalidate on RAG corpus update so a knowledge-base refresh never serves a stale cached answer.

High-volume support FAQ. Use Semantic consistency at a moderate threshold to absorb paraphrased repeats, exclude tool-calling turns from caching, and pilot on one virtual key before workspace-wide rollout.
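
The cache write guards used in these examples (guardrails passed, no residual PII, no tool-calling turns, minimum prompt length) can be sketched as a single check. The `Candidate` shape and the `is_cacheable` name are hypothetical; only the rules themselves come from the Eligibility rules described earlier.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one response being considered for caching."""
    prompt: str
    guardrails_passed: bool
    has_residual_pii: bool = False
    used_tool_calls: bool = False


def is_cacheable(
    c: Candidate,
    exclude_pii: bool = True,
    exclude_tool_calls: bool = True,
    min_prompt_chars: int = 0,
) -> bool:
    """Apply the write guards in order; any failed guard blocks caching."""
    if not c.guardrails_passed:
        return False  # always enforced, regardless of the optional settings
    if exclude_pii and c.has_residual_pii:
        return False
    if exclude_tool_calls and c.used_tool_calls:
        return False
    return len(c.prompt) >= min_prompt_chars
```

The guardrail check comes first and has no switch, mirroring the rule that only responses that passed all guardrails can be cached.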

Next steps

Semantic caching - the underlying response-caching primitive.
Virtual keys - scope these controls per key.
Guardrails - the policies that scan Golden Registry uploads and gate cacheability.
Observability - track hallucination scores and consistency hit rates over time.