
Orchestrating trust: A structural solution to AI hallucinations in regulated industries

Last updated 28 November 2025
Technology

Large Language Models (LLMs) promise a future of customer service that is seamless, empathetic, and capable of complex problem-solving. But in the rush to adopt them, enterprises are ignoring a critical question: what happens when this human-like AI gets it wrong?

In high-stakes, regulated industries like financial services, insurance and telecom, an AI hallucination could be more akin to a compliance nightmare than a quirky bug. A new study on end-user perceptions of LLM errors, co-authored by researchers at SINTEF and boost.ai, makes one thing abundantly clear: for any organization where trust is currency, relying on generative models without strict governance poses an unacceptable risk.

The hallucination hierarchy

The study, "LLM Hallucinations in Conversational AI for Customer Service," surveyed 274 potential end-users to understand how they perceive different kinds of AI errors. The results showed that not all AI errors are created equal.

The single most severe, trust-destroying error is "factual inconsistency": providing factually incorrect information. This was rated as significantly more problematic than any other mistake, including not understanding a request, contradicting itself, or omitting information.

When an AI provides incorrect information, it fundamentally erodes user trust in both the technology and the service provider behind it. As one study participant noted, "It makes the chatbot redundant if I cannot trust the answer. It makes it fundamentally unreliable".

High stakes demand high-stakes control

Why this strong reaction? Because in customer service, the stakes are real. Users aren’t asking an AI to write a poem; they’re asking about their mortgage, their insurance policy, or their data plan.

The research confirms that users assess severity based on potential negative implications, with risks related to personal finances deemed "particularly problematic". The study explicitly warns that hallucinations in this domain can lead to financial or legal ramifications and reputational damage.

If an AI wrongly tells a user they are covered by insurance when they aren't, the user doesn't just get annoyed—they may want to "submit a formal complaint or sue the service provider". This is the critical risk that regulated industries face.

The “silent error” trap

Given that LLM hallucinations are considered an inherent characteristic of the technology, “likely to persist to some degree”, the path forward cannot be to simply hope for the best.

This is where we must contrast LLMs with traditional chatbots based on Natural Language Understanding (NLU). The research highlights a crucial difference: breakdowns in intent-based AI typically stem from an inability to assist users rather than from conveying misleading information.

While the study suggests users are statistically more forgiving of omissions (incomplete information) than of outright falsehoods, there is a dangerous nuance: an omission is only forgivable if the user spots it. If an AI says “Yes, you are covered” but omits the crucial detail “if you have the Platinum Plan”, the user has been misled just as effectively as if the AI had lied.

Pure generative AI struggles here. Instructing an LLM to simply say “I don’t know” is notoriously difficult; these models are statistically driven to predict the next token, opening up the potential for half-truths.

Orchestration as the mechanism of governed AI

The only responsible solution is a hybrid model managed by AI orchestration.

In essence, we move beyond intent-hierarchy models to a coordination layer that manages context and routing. Rather than relying on a single model to handle everything, an AI orchestration layer uses fine-tuned LLMs to interpret user input and direct it to the most relevant specialized agent.

The architecture changes the risk profile by separating routing from execution:

  • Routing (the Orchestrator): Generative AI is used here to understand the user. It excels at interpreting complex intents like “I need to file a claim”.

  • Execution (the Agent): Once the Orchestrator identifies a high-stakes intent, it routes the query to a specialized agent. Crucially, for sensitive topics, this agent can be rule-based.

This combination greatly reduces the risk of factual inconsistency. A rule-based agent follows a compliance-approved flow; it cannot “omit” an exclusion clause or invent a policy, because it is not predicting the answer but simply executing a process.
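
To make the separation concrete, here is a minimal sketch of what routing versus execution might look like in code. It is a hypothetical illustration, not boost.ai's implementation: every name (Orchestrator, RuleBasedCoverageAgent, classify_intent, and so on) is invented for this post, and the LLM call is stubbed with a keyword match.

```python
# Minimal sketch of the routing/execution split described above.
# All names are illustrative; this is not an actual product API.

from dataclasses import dataclass

@dataclass
class AgentResponse:
    text: str
    handoff_to_human: bool = False

class RuleBasedCoverageAgent:
    """Executes a compliance-approved flow; it never generates free text."""
    APPROVED_ANSWERS = {
        ("travel", "platinum"): "Yes, trip cancellation is covered under your Platinum Plan.",
        ("travel", "basic"): "Trip cancellation is not included in the Basic Plan.",
    }

    def handle(self, product: str, plan: str) -> AgentResponse:
        answer = self.APPROVED_ANSWERS.get((product, plan))
        if answer is None:
            # Outside the approved flow: escalate instead of guessing.
            return AgentResponse("Let me connect you with an advisor.", handoff_to_human=True)
        return AgentResponse(answer)

class GenerativeSmalltalkAgent:
    """Low-stakes queries can go to a generative model (stubbed here)."""
    def handle(self, user_message: str) -> AgentResponse:
        return AgentResponse(f"(LLM-generated reply to: {user_message!r})")

class Orchestrator:
    """Uses an LLM only to interpret the request, then routes to a specialist."""
    def __init__(self):
        self.coverage_agent = RuleBasedCoverageAgent()
        self.smalltalk_agent = GenerativeSmalltalkAgent()

    def classify_intent(self, user_message: str) -> str:
        # In production this would be a call to a fine-tuned LLM;
        # a keyword match stands in for it here.
        return "coverage_question" if "covered" in user_message.lower() else "smalltalk"

    def route(self, user_message: str, plan: str) -> AgentResponse:
        intent = self.classify_intent(user_message)
        if intent == "coverage_question":
            return self.coverage_agent.handle(product="travel", plan=plan)
        return self.smalltalk_agent.handle(user_message)

print(Orchestrator().route("Am I covered for trip cancellation?", plan="basic").text)
```

The point of the sketch is the boundary: the generative component only decides where a request goes, while the answer itself comes from a flow that compliance has signed off on.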

Proper use case evaluation

This structure also supports a key implication from the study: use LLMs only where they add value. AI orchestration lets enterprises evolve the system incrementally, mapping out their agents and consciously deciding which should be generative (for flexibility) and which must be rule-based (for safety).
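
One lightweight way to run that evaluation is to keep an explicit map of use cases to agent types, so the generative surface area is always visible for compliance review. The map below is purely illustrative; the topics and rationales are invented for this post.

```python
# Hypothetical use-case map: each topic is consciously assigned to a
# generative or rule-based agent, with the reasoning recorded.

AGENT_MAP = {
    "opening_hours":      {"type": "generative", "rationale": "low stakes, benefits from flexible phrasing"},
    "coverage_questions": {"type": "rule_based", "rationale": "regulated content, compliance-approved flow"},
    "claims_filing":      {"type": "rule_based", "rationale": "legal ramifications if mishandled"},
    "general_smalltalk":  {"type": "generative", "rationale": "no factual commitments required"},
}

def generative_topics(agent_map: dict) -> list[str]:
    """List the topics still routed to a generative agent, for periodic review."""
    return [topic for topic, cfg in agent_map.items() if cfg["type"] == "generative"]

print(generative_topics(AGENT_MAP))
```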

It transforms the AI from a black box into a transparent team of agents, ensuring that accurate, quality-assured information is always the priority.

Managing user perception and recovery

Finally, an orchestrated approach vastly improves the user experience even when less severe errors occur.

The research found that users are forgiving of errors that are immediately apparent, provided they have a path to resolution. The key is agency. An AI orchestration layer manages the entire dialogue context, ensuring that if a specialized agent cannot help, the user is seamlessly handed off to another agent or a human, minimizing the "waste of time" and "emotional distress" identified in the study.
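
As a final, hypothetical sketch (again, not an actual product API), the idea is that the orchestration layer owns the dialogue context, so an escalation carries the full transcript rather than forcing the user to start over.

```python
# Sketch of a context-preserving handoff managed by the orchestration layer.

from dataclasses import dataclass, field

@dataclass
class DialogueContext:
    """Dialogue state owned by the orchestration layer, not by any single agent."""
    transcript: list[str] = field(default_factory=list)

def deliver(context: DialogueContext, agent_reply: str, resolved: bool) -> str:
    """Pass the agent's reply through, or escalate with the full context attached."""
    if resolved:
        context.transcript.append(f"bot: {agent_reply}")
        return agent_reply
    # The escalation carries the whole transcript, so the user is not asked to
    # repeat themselves to the next agent or human advisor.
    handover_note = " | ".join(context.transcript) or "no prior turns"
    return f"Connecting you to an advisor. Context shared: {handover_note}"

context = DialogueContext(transcript=["user: I need to change my insurance coverage"])
print(deliver(context, agent_reply="", resolved=False))
```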

The future of AI in customer service is not a single, all-powerful generative model. That approach is a gamble on your customers’ trust. Instead, it lies in an intelligent, orchestrated system that coordinates specialized agents.

By understanding what users actually fear—factual errors with real-world consequences—we can design AI agents that are trustworthy, controllable, and above all, safe to deploy in a highly regulated environment.