A three-step framework for governing AI as a unified, accountable system
Imagine this scenario: A customer reaches out to an airline chatbot with an urgent question. He wants to know if the airline provides refunds or discounts for “bereavement fares.” The chatbot walks the customer through the process, suggesting that he can book the flight at full price and request a refund via claims once he completes the travel.
The customer follows the instructions, books the flight, completes the travel, and then submits a refund claim. So far, it is a straightforward process.
But here is the problem: The airline denies the claim, stating that refunds cannot be requested after travel. They must always be claimed before flying. This information is also publicly available on their website, but it directly contradicts what the bot said. When the customer explains that he followed “their” chatbot’s instructions to the letter, the airline responds that the bot is responsible for its “own” mistake, implying that AI can hallucinate and that it is the customer’s responsibility to verify information with a human.
What can a customer do? He booked an urgent, same-day flight on the promise of a refund. His circumstances didn’t allow for other options, and now the claim is denied.
This is a real event that occurred in 2022 when Air Canada’s chatbot provided a passenger with refund advice that directly contradicted the airline’s bereavement policy. The failure here was not a “hallucination” in the traditional sense. Instead, the bot was like a store representative who wasn’t given the rulebook but was told to “make the customer happy” at all costs.
It is a classic example of the “Point Solution Trap”: individual solutions are deployed as isolated systems to support customers and prevent escalations across the customer journey, but they rarely speak to each other.
Having spent the last 13 years in machine learning — including the last eight as a CX data scientist designing experience metrics, service analytics frameworks, and AI-driven decision systems for B2B SaaS — I have seen this ‘trust gap’ firsthand. In my experience, even the most technologically advanced models fail when they are governed in silos as ‘check-the-box’ deployments rather than as vital parts of a single, unified engine.
As we move through 2026, this trap is becoming impossible to ignore. Customer experience is starting to show clear signals of where such gaps exist and how quickly they can escalate into more critical, more prominent issues. We have become exceptional at building high-performance models, yet we lag significantly behind in governing the ecosystems they inhabit.
Why Static Governance Fails in Practice
Most current AI governance frameworks are “gate-based.” They check for biases at the point of deployment and assume that the current state will remain stable. In a complex B2B environment, this assumption is a major risk.
When an AI-enabled system suggests a complicated workaround to a customer, that “decision” does not exist in a vacuum; it is a direct extension of the brand’s legal and ethical commitments.
In 2024, British Columbia’s Civil Resolution Tribunal ruled that Air Canada is a “single entity.” It cannot separate its chatbot’s actions from its company’s rules. The airline was ordered to compensate the customer for its solution’s mistake.
The ruling confirms a critical executive mandate: All point solutions are part of a unified system, and an AI’s output is exactly equivalent to the business’s word. As highlighted by Harvard Law School, rising corporate compliance failures prove that a purely enforcement-driven model is no longer adequate to prevent misconduct or promote a truly ethical culture.
In this case, the airline’s chatbot was governed as a conversational interface — a touchpoint — rather than a support representative of the business who could be held accountable. If the bot is not integrated as part of the “system,” the organization remains inherently exposed and vulnerable to even more serious liabilities. Ideally, the chatbot should have been integrated with Air Canada’s Service-Level Agreements (SLAs), human agent workflows, and all policy repositories.
According to Gartner, organizations operationalizing AI transparency, trust, and security (AI TRiSM, short for AI Trust, Risk and Security Management) will see a 50% improvement in AI adoption by 2026. But we can’t get there with a checklist. We need a robust and reliable system for sustainable adoption.
Escape Plan: A Framework for Systems-Led AI Governance
To avoid the “Air Canada effect,” CX and Service organizations must shift from managing individual models to governing the entire service ecosystem. This transition requires a move away from point solutions toward an architecture where data, AI, and human expertise operate within a single, integrated loop.
Here is the three-step blueprint, inspired by the way high-performers like Octopus Energy orchestrate their intelligence.
1. Establish Decision Lineage by Unifying Data
I have frequently seen organizations suffer from fragmented data: billing records, support history, qualitative feedback, and product usage exist in separate silos. When AI operates within these gaps, it lacks the context required for accuracy and is prone to hallucinations. Systems governance begins by linking these streams together so every automated output has a verifiable “ancestry.” As noted in Collibra’s primer guide, a correct lineage is critical for building a transparent data ecosystem.
We integrate the core operational datasets, namely CRM (customer data), ERP (billing, transactions), and real-time usage logs, into a single, centralized, API-first architecture. In essence, this is a digital interface that lets different software systems integrate seamlessly and talk to each other instantly, without manual data uploads. Think of this as a “shared drive” for your AI, ensuring it always pulls from the latest version of each record rather than from random documents. This eliminates the friction when a customer moves from a bot to a human agent, as the data trail remains continuous.
In 2016, Octopus Energy started building Kraken, a unified platform that houses their billing, metering, and customer interactions in one place. With this systemic view, the AI and the human agent see the same real-time data, so the “Decision Lineage” always remains clear. There is no discrepancy between what the system knows and what is actually shared with the customer.
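As a minimal sketch of what decision lineage could look like in code, the Python below attaches the source record behind an answer to the answer itself, and always picks the most recent version of that record. The names (SourceRecord, LineageAnswer, answer_with_lineage), the schema, and the sample policy text are hypothetical illustrations; they are not Kraken’s design or any vendor’s API.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceRecord:
    """One piece of evidence an answer relies on (hypothetical schema)."""
    system: str          # e.g. "policy_repo", "crm", "erp"
    record_id: str
    version_date: date
    text: str


@dataclass
class LineageAnswer:
    """An automated reply plus the records it was derived from."""
    reply: str
    sources: list = field(default_factory=list)

    def is_traceable(self) -> bool:
        # An answer with no sources has no verifiable ancestry.
        return len(self.sources) > 0


def latest_version(records, record_id: str) -> Optional[SourceRecord]:
    """Return the most recent version of a record, or None if it is unknown."""
    matches = [r for r in records if r.record_id == record_id]
    return max(matches, key=lambda r: r.version_date, default=None)


def answer_with_lineage(records, policy_id: str) -> LineageAnswer:
    """Quote the current policy and carry its source along with the reply."""
    policy = latest_version(records, policy_id)
    if policy is None:
        # No source of truth found: hand off rather than improvise.
        return LineageAnswer(reply="Let me connect you with an agent.")
    return LineageAnswer(reply=policy.text, sources=[policy])


# Made-up sample data: two versions of the same (hypothetical) policy entry.
records = [
    SourceRecord("policy_repo", "bereavement-refund", date(2021, 3, 1),
                 "Refunds may be requested after travel."),
    SourceRecord("policy_repo", "bereavement-refund", date(2022, 6, 1),
                 "Refunds must be requested before travel."),
]
answer = answer_with_lineage(records, "bereavement-refund")
print(answer.reply, answer.is_traceable())
```

The design choice worth noting is the fallback: when no source record exists, the sketch hands off to a human instead of producing an untraceable reply.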
2. Implement Contextual Guardrails via Dual-Model Architecture
CX leaders are starting to realize that standard safety measures cannot keep pace with rapid AI evolution in complex B2B service environments. For reliability, we need a secondary validation layer that understands your business’s specific contracts and technical constraints.
We call this layer a Dual-Model Architecture — a system with two consecutive AI models where the first one drafts the response, and a second model automatically verifies it before anything is shared with the customer.
While your primary AI talks to the customer, a secondary “Evaluator” model, a policy and accuracy “checker” trained exclusively on your SLAs and internal policies, audits the output in real time. It works like an automated peer review that catches a mistake before it reaches the customer’s inbox.
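The draft-then-verify flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production design: draft_reply and evaluate_reply are plain functions standing in for the two models, and MAX_AUTO_CREDIT is a hypothetical policy limit invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    approved: bool
    reason: str


# Hypothetical policy limit; a real evaluator would be grounded in SLAs and policies.
MAX_AUTO_CREDIT = 50.0


def draft_reply(request: dict) -> dict:
    """Stand-in for the primary model: proposes a reply and a credit amount."""
    return {
        "text": f"We have applied a credit of ${request['credit']:.2f}.",
        "credit": request["credit"],
    }


def evaluate_reply(draft: dict) -> Verdict:
    """Stand-in for the evaluator model: checks the draft against policy."""
    if draft["credit"] > MAX_AUTO_CREDIT:
        return Verdict(False, "credit exceeds auto-approval limit")
    return Verdict(True, "within policy")


def respond(
    request: dict,
    drafter: Callable[[dict], dict] = draft_reply,
    evaluator: Callable[[dict], Verdict] = evaluate_reply,
) -> str:
    """Send the draft only if the evaluator approves it; otherwise escalate."""
    draft = drafter(request)
    verdict = evaluator(draft)
    if not verdict.approved:
        return f"Escalated to a human agent ({verdict.reason})."
    return draft["text"]


print(respond({"credit": 20.0}))
print(respond({"credit": 500.0}))
```

Because the evaluator runs on every draft, each reply costs one extra call, which is where the added latency and cost of a second model come from.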

Octopus Energy used this approach within Kraken to automate policy validation. If the system flags a request that conflicts with company policy — like an unauthorized credit — the AI is blocked from sending it.
This approach involves a deliberate operational tradeoff. Kraken’s “Magic Ink” tool uses a verification system that annotates facts with sources. By introducing a secondary model, Octopus accepted a slight increase in computational latency and API costs. This extra layer is a strategic necessity and a deliberate choice of accuracy over speed. In high-stakes service, speed at the cost of precision is a liability. This is how CX leaders ensure integrity and maintain customer trust that unmonitored automation can quickly break.
3. Redefine the Human Role as a System Orchestrator
In a systems-led environment, the human agent’s role shifts from resolving individual support issues to acting as an expert who oversees and improves the entire infrastructure.
We redesign the agent’s job to capture the “why” behind every correction. When an agent fixes an AI’s answer, they categorize the error (e.g., “Outdated Documentation”, “Policy Change”). We are not just looking for a thumbs-up or thumbs-down; we are looking for the precise reason for the rework, which makes the system smarter with each piece of feedback.
How Octopus Orchestrates the System: Since every agent and AI tool shares the same “memory” in the Kraken platform, a human’s correction does more than fix one customer’s issue. It updates the system’s logic for every future interaction. To support this, Octopus moved away from speed-based metrics like Average Handle Time (AHT). Instead, it tracks the Repeat Contact Rate. If a customer calls back for the same issue, the system failed, no matter how fast the first call was.
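One possible way to compute a Repeat Contact Rate is sketched below in Python. The definition used here is an assumption: a contact counts as a repeat when the same customer raises the same issue within a 7-day window of their previous contact about it. The window length, the tuple layout, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta


def repeat_contact_rate(contacts, window_days=7):
    """Share of contacts that repeat an earlier contact.

    `contacts` is a list of (customer_id, issue, timestamp) tuples. A contact
    is a repeat if the same customer raised the same issue within the
    preceding `window_days` days.
    """
    window = timedelta(days=window_days)
    last_seen = {}
    repeats = 0
    # Process contacts in time order so "previous contact" is well defined.
    for customer, issue, ts in sorted(contacts, key=lambda c: c[2]):
        key = (customer, issue)
        if key in last_seen and ts - last_seen[key] <= window:
            repeats += 1
        last_seen[key] = ts
    return repeats / len(contacts) if contacts else 0.0


contacts = [
    ("c1", "billing", datetime(2026, 1, 1)),
    ("c1", "billing", datetime(2026, 1, 3)),  # repeat: same issue, 2 days later
    ("c2", "meter", datetime(2026, 1, 2)),
    ("c1", "billing", datetime(2026, 2, 1)),  # outside the window, not a repeat
]
rate = repeat_contact_rate(contacts)
print(rate)  # 0.25
```

Unlike Average Handle Time, this number does not improve when a call is closed quickly; it only improves when the issue stays resolved.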
The challenge here was primarily cultural. Leadership had to explicitly separate agent performance from speed-based quotas to allow for “system-fixing” feedback. Good Energy achieved a 32% reduction in calls following the implementation of Kraken’s unified system.
By monitoring how often agents override the AI, leaders can identify exactly where the “source of truth” needs a patch. This reduces the cost of “fixing the fix” (rework by humans) and the long-term operational expense of repetitive manual work, resulting in permanent improvement for the entire AI stack. Instead of old-school retrospective audits, your staff resolves knowledge gaps as they happen.
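A small Python sketch of override monitoring follows. It assumes each interaction is logged as a dictionary with an `overridden` flag and, when an agent corrected the AI, a `reason` category; both field names and the sample log are hypothetical.

```python
from collections import Counter


def override_report(interactions):
    """Summarize how often agents override AI drafts, and why.

    Each interaction is a dict with `overridden` (bool) and, when overridden,
    a `reason` string chosen by the agent.
    """
    total = len(interactions)
    reasons = Counter(i["reason"] for i in interactions if i["overridden"])
    overrides = sum(reasons.values())
    rate = overrides / total if total else 0.0
    # most_common() sorts reasons from most to least frequent.
    return rate, reasons.most_common()


interactions = [
    {"overridden": False},
    {"overridden": True, "reason": "Outdated Documentation"},
    {"overridden": True, "reason": "Policy Change"},
    {"overridden": True, "reason": "Outdated Documentation"},
    {"overridden": False},
]
rate, top_reasons = override_report(interactions)
print(f"{rate:.0%}", top_reasons)
```

In this sample, the most frequent reason points to the documentation as the first place the “source of truth” needs a patch.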
Measuring Governance Health: From Volume to Velocity
By implementing this governance framework, we move away from measuring how fast we resolve an issue and start measuring how effectively the system prevents rework. By tethering your AI solutions to a unified system, you don’t just reduce your cost to serve; you build a brand that customers actually trust.
According to Kraken Technologies’ performance data, this systems-led approach allows service teams to use AI to draft over 40% of digital communications. This orchestration is the primary driver behind their reported 40% lower cost-to-serve compared to traditional utilities, which suggests that governance isn’t a latency driver but a cost-saving engine.
Up until 2025, the goal for CX leaders was to “implement AI.” But in 2026 and beyond, the winners will be those who orchestrate a robust, self-correcting system that understands their business as well as they do.
Disclaimer: All opinions expressed in this article are my own and do not represent the opinions or official positions of any current or former employer.
AI tools assisted in research for this article. All insights and conclusions reflect the author’s professional experience and expertise.