The 3 AI Failures Everyone Misdiagnosed - And What They Reveal About the Coming Agent Crisis
Air Canada, Microsoft, Google: three AI failures everyone misdiagnosed. The WEF framework reveals what actually broke and why agents will make it worse.

These weren't "hallucinations," "bugs," or "bad fine-tuning."
They were failures of classification, authority, operational context and governance. And most people completely missed it.
The World Economic Forum just published a framework that gives us the vocabulary to understand what actually went wrong, and why these incidents are warnings about what's coming. Their frameworks often become the foundation for regulatory policy across G20 economies and the evaluation criteria for enterprise procurement. When they publish "AI Agents in Action: Foundations for Evaluation and Governance," they're setting the vocabulary that regulators, auditors, and compliance teams will use for the next five years.
The framework does something critical: it separates AI agents from AI models. Agents aren't just smarter chatbots. They're systems with autonomy, authority, sensors, effectors and operational context. That distinction changes everything.
I'm going to walk through three high-profile AI failures from the past two years. In each case, the public diagnosis was wrong. And in each case, the WEF framework reveals exactly what broke and why it's about to get much worse.
Case Study 1: Air Canada Chatbot (An Authority Failure Disguised as a Hallucination)
In 2022, Jake Moffatt visited Air Canada's website to book a bereavement flight after his grandmother's death. The airline's chatbot told him he could purchase full-price tickets and retroactively apply for bereavement rates within 90 days.
The chatbot was wrong. When Moffatt requested the discount, Air Canada refused, arguing the chatbot was a "separate legal entity responsible for its own actions." British Columbia's Civil Resolution Tribunal rejected that argument and ordered Air Canada to pay roughly CA$812 in damages, interest, and fees.
The real root cause
Everyone called this a hallucination problem. It wasn't.
The chatbot had de facto authority to make binding commitments on behalf of the company, but no mechanism to validate those commitments against actual policy. No authority boundaries. No constraint layer. No policy enforcement.
The bot could speak with the company's voice and make promises in the company's name, but nobody had defined what it was actually allowed to commit to. This is what the WEF framework calls authority misassignment: giving systems power to act without explicitly defining what that power includes.
Why agents make it worse
If you can't stop a simple text interface from making unauthorized commitments, you won't constrain an autonomous agent with tool use and API permissions. An agent with access to your CRM, email system, and pricing database could make thousands of unauthorized commitments before anyone notices. The Air Canada incident was one customer and $812. Scale that to an agent with autonomy and watch the liability compound.
What should have been in place
Agents need explicit authority boundaries tied to their role, not implicit permissions based on what they can technically access. Commitments that create obligations or liabilities must be validated against actual policy before they're communicated. And there needs to be traceability: a clear audit log showing what the agent was authorized to do, what it actually did, and who's responsible when those diverge.
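To make that concrete, here is a minimal sketch of an authority boundary, a pre-send policy check, and an audit trail for a customer-facing bot. Every name in it (Commitment, AuthorityPolicy, validate_commitment, audit_log) is hypothetical, for illustration only, not Air Canada's or any vendor's actual system.

```python
# Minimal sketch: explicit authority boundaries with a pre-send policy check and audit log.
# All names here are hypothetical, for illustration only.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Commitment:
    kind: str          # e.g. "information", "refund", "fare_adjustment"
    amount_cad: float  # monetary exposure; 0 for purely informational replies

@dataclass
class AuthorityPolicy:
    allowed_kinds: set[str]  # what this agent may commit to, tied to its role
    max_amount_cad: float    # the liability ceiling delegated to it

@dataclass
class AuditEntry:
    timestamp: str
    commitment: Commitment
    allowed: bool
    reason: str

audit_log: list[AuditEntry] = []

def validate_commitment(c: Commitment, policy: AuthorityPolicy) -> bool:
    """Check a drafted commitment against explicit policy before it reaches the customer."""
    if c.kind not in policy.allowed_kinds:
        allowed, reason = False, f"'{c.kind}' is outside this agent's authority"
    elif c.amount_cad > policy.max_amount_cad:
        allowed, reason = False, "amount exceeds delegated authority"
    else:
        allowed, reason = True, "within authority"
    audit_log.append(AuditEntry(datetime.now(timezone.utc).isoformat(), c, allowed, reason))
    return allowed

# This bot may answer questions, but may not promise retroactive fare adjustments.
policy = AuthorityPolicy(allowed_kinds={"information"}, max_amount_cad=0.0)
print(validate_commitment(Commitment("fare_adjustment", 812.0), policy))  # False: escalate to a human
```

The specific fields matter less than the separation of concerns: the model drafts, a policy layer decides, and the audit log records who authorized what.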
Case Study 2: Microsoft Copilot (A Sensor Boundary Failure)
Microsoft Copilot's integration with Microsoft 365 created an unexpected privacy crisis in late 2024. Organizations deploying Copilot discovered that employees could suddenly access sensitive information they shouldn't have: CEO emails, HR documents, confidential meeting notes.
The problem wasn't a security breach. It was a sensor-scoping failure. Copilot was designed to index and synthesize information across Microsoft 365 apps. It worked exactly as designed. But many organizations had never properly configured granular permissions because there was no tool powerful enough to exploit those misconfigurations at scale.
Until Copilot.
As one Microsoft employee described it to Business Insider: "Now when Joe Blow logs into an account and kicks off Copilot, they can see everything. All of a sudden Joe Blow can see the CEO's emails."
The real root cause
This wasn't over-eager summarization. The system had broader sensor access than its task required. Identity boundaries, data partitions and permission layers weren't aligned with what the tool actually needed to do its job.
Copilot didn't just expose misconfigurations: it amplified them. The operational environment was complex, with interconnected apps, legacy permissions, and unclear data ownership. Most organizations had never evaluated Copilot in a real-world permission environment, and its ability to traverse and aggregate data at scale turned theoretical permission gaps into actual security incidents.
The WEF framework stresses that agents must be evaluated in representative environments that mirror real deployment conditions, including understanding all sensor surfaces available to them. Microsoft's customers skipped that step.
Why agents make it worse
Copilot was primarily a perception tool: it could see and summarize data, but had limited ability to act. Sensor-boundary mistakes in perception tools are embarrassing. In autonomous agents, they become action-boundary failures. Agents don't just perceive data: they execute actions based on what they see. Overly broad sensor access means agents can exfiltrate intellectual property, manipulate sensitive datasets, or make decisions based on information they were never meant to have.
What should have been in place
Sensor domains need to be explicitly scoped to task requirements, not defaulted to "everything the user can technically access." Agents need identity-aware permissions, with sensor boundaries defined independently from the human user who deployed them. And organizations need to evaluate their agents in environments that reflect production complexity (messy permissions, interconnected systems, legacy access controls) before those agents go live.
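As a rough sketch of the first two points, the example below gives the agent its own declared sensor scope and permits a read only when both the deploying user's permissions and the agent's scope allow it. SensorScope and can_read are made-up names for illustration, not Microsoft's or any vendor's real API.

```python
# Minimal sketch: task-scoped sensor boundaries, checked independently of the user's permissions.
# SensorScope and can_read are hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass(frozen=True)
class SensorScope:
    """What the agent itself may read, declared up front and independent of its human user."""
    sources: frozenset[str]  # resource prefixes the task actually requires

def can_read(resource: str, user_permissions: set[str], agent_scope: SensorScope) -> bool:
    # The agent may read a resource only if the deploying user is permitted
    # AND the resource falls inside the agent's explicitly declared scope.
    return resource in user_permissions and any(
        resource.startswith(prefix) for prefix in agent_scope.sources
    )

# A legacy misconfiguration: this user can technically see the CEO's inbox.
user_permissions = {"sharepoint:/sales/q3-forecast.xlsx", "mail:ceo/inbox"}
agent_scope = SensorScope(sources=frozenset({"sharepoint:/sales"}))

print(can_read("sharepoint:/sales/q3-forecast.xlsx", user_permissions, agent_scope))  # True
print(can_read("mail:ceo/inbox", user_permissions, agent_scope))                      # False: out of scope
```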
Case Study 3: Google Gemini Image Generation (A Goal-Misgeneralization Failure)
In February 2024, Google paused Gemini's ability to generate images of people after widespread criticism over historical inaccuracies. Users discovered that Gemini was producing images of racially diverse Nazi soldiers, non-white U.S. Founding Fathers, and Black female popes when asked for historically grounded depictions.
Google acknowledged the issue publicly. Prabhakar Raghavan, Google's Senior VP, explained that the system had been tuned to ensure diverse representation in open-ended prompts. But the tuning "failed to account for cases that should clearly not show a range" — like historically specific contexts where accuracy matters more than diversity.
The real root cause
This wasn't political bias. It was goal misgeneralization. Constraints were embedded through implicit fine-tuning signals rather than structural, inspectable policy logic. The system learned a rule ("maximize diversity in depictions of people") but couldn't distinguish where that rule applied and where it didn't.
It generalized the goal beyond its intended domain because there was no explicit, testable constraint policy. When constraints live inside fine-tuning rather than inside verifiable rule layers, agents misgeneralize because there is no explicit model of where a constraint applies in the operational context.
Why agents make it worse
Misgeneralization in a text-to-image tool is embarrassing. In autonomous agents, it becomes dangerous. If an agent learns the wrong boundary conditions for one domain, it will misgeneralize in others. An agent optimizing for "customer satisfaction" might approve fraudulent refunds. An agent optimizing for "efficiency" might skip required compliance checks. An agent optimizing for "helpfulness" might share confidential data with unauthorized users. Add tool access, and misgeneralization becomes mis-execution at scale.
What should have been in place
Constraints need to be explicit, inspectable, and testable: not learned through fine-tuning. Agents need deterministic safety rails that can be validated across edge cases before deployment. And when agents operate in domains where context matters, they need transparent override layers that allow human operators to correct boundary conditions in real time. This is what the WEF means by progressive governance: the more autonomous the system, the more explicit the constraints must be.
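To illustrate the difference between a learned rule and a structural one, here is a minimal sketch of a constraint layer in which every rule carries an explicit applicability predicate that can be unit-tested against edge cases before deployment. ConstraintRule and the request fields are hypothetical, not Google's implementation.

```python
# Minimal sketch: constraints as inspectable rules with an explicit model of where they apply.
# ConstraintRule and the request fields are hypothetical, for illustration only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ConstraintRule:
    name: str
    applies: Callable[[dict], bool]  # explicit, testable model of WHERE the rule applies
    enforce: Callable[[dict], dict]  # deterministic effect when it does

def is_open_ended_people_prompt(req: dict) -> bool:
    # The diversity rule applies only to generic depictions of people,
    # never to historically specific requests.
    return req["subject"] == "people" and not req["historically_specific"]

diversity_rule = ConstraintRule(
    name="diverse_representation",
    applies=is_open_ended_people_prompt,
    enforce=lambda req: {**req, "sampling": "diverse"},
)

def apply_constraints(req: dict, rules: list[ConstraintRule]) -> dict:
    for rule in rules:
        if rule.applies(req):  # each boundary condition can be unit-tested in isolation
            req = rule.enforce(req)
    return req

# The edge case that fine-tuning misgeneralized: a historically specific request.
req = {"subject": "people", "historically_specific": True, "sampling": "default"}
print(apply_constraints(req, [diversity_rule]))  # unchanged: the rule does not apply here
```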
The Pattern: Missing Agent Architecture
Across all three failures, the diagnosis was wrong and the pattern was identical.
Air Canada: everyone said "hallucination" when the real problem was unbounded authority. Microsoft: everyone said "privacy bug" when the real problem was sensor-scoping failure. Google: everyone said "woke AI" when the real problem was goal misgeneralization through implicit constraints.
These were not failures of AI intelligence. They were failures of agent architecture and governance infrastructure.
Air Canada deployed a system that could make commitments without defining what it was allowed to commit to. Microsoft deployed a system that could access data without scoping what it needed to access. Google deployed a system that applied constraints without defining where those constraints should apply.
In every case, the organization treated the AI like better software: a smarter chatbot, a more capable search tool, a more creative image generator. But agents aren't just "better software." They're systems with autonomy, authority, sensors and effectors operating in complex environments. They require fundamentally different architecture.
The WEF framework identifies four foundational pillars that were missing in all three cases: classification (defining role, authority and operational context), evaluation (testing in representative environments), risk assessment (understanding how agent behavior interacts with context), and progressive governance (scaling safeguards to autonomy level).
When these foundations are missing, even "simple" AI systems behave like ungoverned agents: making commitments they shouldn't, accessing data they don't need, and applying rules in contexts where they don't belong.
Why This Matters Now
The next wave of AI failures won't come from smarter models. They'll come from deploying autonomous agents on top of the same broken assumptions that caused these three incidents.
Authority failures won't just cost an airline $812. They'll result in unauthorized financial transactions at scale. Sensor boundary failures won't just expose a CEO's inbox. They'll enable agents to exfiltrate intellectual property or manipulate sensitive datasets. Goal misgeneralization won't just produce embarrassing images. It'll cause agents to optimize for the wrong objectives in critical domains like healthcare, finance, or infrastructure management.
Organizations that don't address these gaps won't just face technical failures — they'll face compliance failures. The vocabulary is set. The evaluation criteria are defined. The question is whether organizations will build the infrastructure before the next incident, or after.
What CapiscIO Is Building
The WEF paper makes one point unavoidable: before organizations ask what agents can do, they must answer who is being allowed to act in their systems, and under what authority.
CapiscIO builds the trust layer that makes that possible.
We verify the identity of every agent, define its role, and bind its authority boundaries to a cryptographically signed agent card. Sensor and effector domains are declared explicitly. Capabilities are inspectable. Constraint policies are deterministic, not hidden inside fine-tuning. Downstream systems can use CapiscIO's attestation to enforce their own authorization logic.
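To make the idea concrete, here is a hypothetical sketch of the kind of information such an agent card might carry. This is not CapiscIO's actual schema, and the hash below stands in for a real cryptographic signature (for example, Ed25519) issued by the organization.

```python
# Hypothetical sketch of an agent card; NOT CapiscIO's actual schema or signing scheme.
import hashlib
import json

agent_card = {
    "agent_id": "support-bot-01",
    "role": "customer_support",
    "authority": {"commitments": ["information"], "max_liability_cad": 0},
    "sensors": ["kb:/public-policies"],  # what the agent may read
    "effectors": ["chat:reply"],         # what the agent may do
}

# Stand-in for a real signature from the issuing organization; in practice this
# would be an asymmetric signature that downstream systems verify against a public key.
payload = json.dumps(agent_card, sort_keys=True).encode()
attestation = {"card": agent_card, "digest": hashlib.sha256(payload).hexdigest()}

# A downstream system recomputes the digest (or verifies the signature) before
# honoring the card and enforcing its own authorization logic.
assert hashlib.sha256(json.dumps(attestation["card"], sort_keys=True).encode()).hexdigest() == attestation["digest"]
print("agent card verified:", attestation["card"]["agent_id"])
```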
CapiscIO operationalizes the WEF's four pillars (classification, evaluation inputs, risk context, and governance signaling) in a way organizations can deploy today.
CapiscIO doesn't replace governance. It enables it.
This is what progressive governance looks like in practice: trust signals that scale with autonomy and environmental complexity, aligned with the classification, evaluation, and risk principles the WEF outlines.
The companies that succeed with AI agents won't be the ones with the most advanced models. They'll be the ones with the foundation to govern them. CapiscIO provides that foundation so organizations can deploy agents safely, predictably and with confidence.

Creator of CapiscIO, the developer-first trust infrastructure for AI agent discovery, validation and governance. With two decades of experience in software architecture and product leadership, he now focuses on building tools that make AI ecosystems verifiable, reliable, and transparent by default.


