The Confused Deputy Problem Is Coming for Multi-Agent Systems
In 1988, Norman Hardy described a security vulnerability he called the Confused Deputy Problem. It was largely ignored. The conditions that make it dangerous are now standard in multi-agent AI systems.

In 1988, a computer scientist named Norman Hardy described a security vulnerability that had no name yet. He called it the Confused Deputy Problem. His paper was concise, precise, and largely ignored by practitioners for the next three decades, because the conditions that make the problem dangerous were relatively rare in traditional software architectures.
Those conditions are now standard in every multi-agent AI system deployed in production.
The Original Problem
Hardy's example involved a compiler running on a time-sharing system. The compiler had legitimate authority to write to a billing file — it needed to record resource usage. Users could invoke the compiler and specify an output file for their compiled code.
A malicious user discovered they could name their output file the same as the billing file. When the compiler ran, it dutifully wrote the user's output to the billing file, overwriting billing records. The compiler was not compromised. It was not tricked into doing something outside its legitimate capabilities. It was confused: it used its own legitimate authority to serve the interests of a less-privileged principal who asked it to.
The compiler is the deputy. It has authority the user does not. The user confused the deputy into exercising that authority on the user's behalf.
This is not a bug in the traditional sense. The compiler worked exactly as designed. The security failure is architectural: the compiler conflated the authority it held with the authority of the party making the request.
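Hardy's scenario condenses to a few lines. The sketch below is illustrative rather than a reconstruction of his actual system (the file names and classes are hypothetical): the compiler writes with its own authority to whatever path the user names, and never asks whether the user could write there.

```python
# Illustrative sketch of Hardy's confused deputy (names are hypothetical).
BILLING_FILE = "/sys/billing"

class Compiler:
    """Runs with authority to write the billing file; ordinary users do not."""
    def __init__(self, fs):
        self.fs = fs  # a toy filesystem: path -> contents

    def compile(self, source, output_path):
        # The flaw: the compiler uses ITS authority on a USER-chosen path,
        # never checking whether the requesting user may write there.
        self.fs[output_path] = f"compiled({source})"
        # Legitimate use of the same authority: record resource usage.
        self.fs[BILLING_FILE] = self.fs.get(BILLING_FILE, "") + "1 unit\n"

fs = {BILLING_FILE: "existing records\n"}
# The attack: the user simply names the billing file as the output file.
Compiler(fs).compile("prog.c", BILLING_FILE)
# The billing records are now clobbered by the user's compiled output.
```

Note that no check the compiler could run on the file contents would help; the missing information is whose authority the write should have been performed under.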
Why It Was Manageable Before
In traditional software architectures, the conditions for confused deputy attacks are present but relatively contained. A web application might be confused into reading files outside its intended directory. An API might be confused into returning data it should not. Access control lists, capability checks, and input validation address most of these cases.
The pattern is manageable because in most systems, the deputy is a well-defined, purpose-built piece of software with a narrow interface. The narrower that interface, the less a confused deputy can be tricked into doing.
Multi-agent systems break this assumption entirely.
The Multi-Agent Version
Consider a multi-agent workflow. A user authorizes a planning agent to "help with our quarterly budget review." The planning agent, attempting to be helpful, delegates a data retrieval task to a finance data agent. The finance data agent has broad read access to financial systems — it was provisioned that way because it serves multiple workflows across the organization. The planning agent makes a request that the finance data agent interprets as within scope. The finance data agent retrieves data far beyond what the user intended to expose.
No credential was stolen. No system was hacked. The finance data agent was confused into using its legitimate authority to serve the interests of a workflow that should have been bounded to a much narrower scope.
Now extend this across organizational boundaries. Acme Corp's orchestration agent delegates a task to TaxCo's processing agent. TaxCo's processing agent has legitimate access to TaxCo's internal financial systems. Acme's orchestration agent, through poor scope declaration, effectively gains access to TaxCo's internal systems by proxy. TaxCo's processing agent is the confused deputy.
This is not hypothetical. It is the natural failure mode of any multi-agent system where agents have different privilege levels and can make requests of each other without cryptographic binding of the originating authority to the delegation chain.
Why the Problem Is Worse in Agent Systems
Three properties of LLM-based agent systems make the confused deputy problem significantly more severe than in traditional software.
Deputies have wide interfaces. Traditional software components have narrow, well-defined interfaces. An LLM-based agent can be asked to do almost anything its tool surface permits. The range of things a confused deputy can be directed to do is bounded only by its tool access, not by a narrow API contract.
Intent is ambiguous at delegation time. When Agent A delegates to Agent B, the scope of that delegation is often expressed in natural language or left implicit. "Help with the budget" is not a precise authorization boundary. The confused deputy receives an ambiguous mandate and fills in the gaps using its own authority.
Delegation chains are long and invisible. A user may authorize an agent that spawns three more agents, each of which calls two tools. By the time a confused deputy executes an action, the chain of principals that authorized it is five hops long and opaque to every participant. No one in the chain has a complete picture of what was originally authorized.
The Standard Defenses That Do Not Work Here
The industry has two standard responses to this problem, and both are insufficient for multi-agent systems.
Principle of Least Privilege. Give every agent only the minimum access it needs. This is correct in principle and nearly impossible to implement in practice for general-purpose agents. You cannot enumerate minimum access for an agent that will be asked different questions by different users in different contexts. And least privilege at provisioning time does nothing to prevent a narrowly provisioned agent from being confused into using its legitimate access for purposes it was never meant to serve.
OAuth scopes. Limit what a token allows by assigning narrow scopes at issuance. This addresses the deputy's authority ceiling but not the confusion itself. A token scoped to finance:read on TaxCo's systems is still a legitimate token. A confused deputy with that token will use it when asked, regardless of whether the requesting agent had authority to make that request.
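The gap is visible in a few lines. In this hypothetical sketch, the scope check passes because the token genuinely is scoped to finance:read; nothing in the check binds the token to the principal on whose behalf the request was actually made.

```python
# Hypothetical sketch: a scope check constrains WHAT, not ON WHOSE BEHALF.
VALID_TOKENS = {"taxco-agent-token": {"finance:read"}}

def handle_request(token, action, requesting_agent):
    scopes = VALID_TOKENS.get(token, set())
    if action not in scopes:
        return "denied"
    # The deputy's token is legitimate, so the request succeeds even though
    # the requesting agent was never granted this access itself.
    return f"financial data released to {requesting_agent}"

# TaxCo's agent uses its own valid token to serve Acme's orchestrator.
result = handle_request("taxco-agent-token", "finance:read", "acme-orchestrator")
```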
Both approaches constrain what the deputy can do. Neither prevents the deputy from being confused about who it is actually serving.
The Structural Solution
The confused deputy problem has a structural solution that was identified in the security research literature decades ago: capability-based security.
In a capability-based system, authority is not attached to the agent — it is attached to the specific interaction. An agent does not have authority in the abstract; it has a specific capability that was passed to it by a principal who had that capability. The capability is unforgeable and non-transferable except through explicit delegation. If an agent wants to use a capability on behalf of someone else, it must explicitly delegate that capability, and delegation can only produce a subset of what the agent holds.
This is the architectural principle that defeats the confused deputy: a deputy that can only act within explicitly delegated capabilities cannot be confused into using authority it was never given. The question "whose authority is this action being performed under?" has a cryptographically verifiable answer at every step.
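The object-capability idea can be sketched in a few lines of Python. This is a minimal illustration, not a production design (class and action names are invented): authority lives in a capability object, and delegation can only produce a subset of what the delegator holds.

```python
# Minimal object-capability sketch (names are hypothetical).
class Capability:
    def __init__(self, actions, issuer):
        self._actions = frozenset(actions)
        self._issuer = issuer  # obtainable only via explicit delegation

    def allows(self, action):
        return action in self._actions

    def delegate(self, actions):
        """Delegation must narrow: the child is a subset of the parent."""
        child = frozenset(actions)
        if not child <= self._actions:
            raise PermissionError("delegation may not broaden authority")
        return Capability(child, issuer=self)

# The user holds the root capability and narrows it for the planning task.
root = Capability({"budget:read", "budget:write", "ledger:read"}, issuer="user")
planner_cap = root.delegate({"budget:read"})

# The planner cannot mint broader authority for a sub-agent than it holds.
broadened = True
try:
    planner_cap.delegate({"ledger:read"})
except PermissionError:
    broadened = False
```

A deputy that holds only `planner_cap` has no ambient authority to be confused into using: every action it takes is attributable to the capability it was explicitly handed.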
In the multi-agent context, this translates into three requirements.
First, every delegation must be an explicit, signed artifact. Not an implicit assumption that Agent B trusts Agent A because they are in the same network or the same organization. A signed delegation credential that says: Agent A, holding these specific capabilities, explicitly delegates the following subset to Agent B for this transaction.
Second, delegation must be monotonically narrowing. What Agent B receives from Agent A can be equal to or narrower than what Agent A holds. It cannot be broader. This is enforced cryptographically, not organizationally. The invariant holds regardless of whether Agent B is in the same organization as Agent A, the same data center, or across the internet.
Third, the delegation chain must be verifiable by downstream participants. When Agent C receives a request from Agent B, Agent C must be able to verify the complete chain: that Agent B's authority came from Agent A, that Agent A's authority came from the user, that each delegation in the chain was a valid narrowing, and that the action Agent B is requesting falls within the scope of the delegated authority. Without verifiability, a downstream agent cannot distinguish a legitimate delegation from an attempted confused deputy attack.
What This Looks Like in Practice
At CapiscIO, the mechanism that implements this is the Authority Envelope — a signed delegation artifact that every agent in a chain must carry and that every downstream participant must verify.
When a user initiates a workflow, a root Authority Envelope is issued, binding the originating principal's identity to a declared capability class and a set of constraints. Every agent that receives a delegation gets a derived Authority Envelope that can only be narrower than its parent. The derivation is cryptographically bound: the child envelope's signature covers a hash of the parent envelope, so the chain cannot be forged or altered.
When a downstream agent receives a request, its Policy Enforcement Point verifies the full chain before executing any action. It checks that each envelope in the chain is validly signed, that each delegation is a valid monotonic narrowing, and that the requested action falls within the capability class and constraints of the innermost envelope. If any of these checks fail, the action is denied.
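The checks above can be sketched compactly. This is an illustrative model, not CapiscIO's actual wire format or field names, and it uses a single shared HMAC key for brevity where a real system would use per-agent asymmetric keys: each envelope's signature covers a hash of its parent, and verification walks the chain checking signatures, binding, narrowing, and scope.

```python
import hashlib
import hmac
import json

KEY = b"demo-signing-key"  # stand-in for per-agent asymmetric keys

def _canon(payload):
    # Canonical serialization so signatures and hashes are stable.
    return json.dumps(payload, sort_keys=True).encode()

def _sign(payload):
    return hmac.new(KEY, _canon(payload), hashlib.sha256).hexdigest()

def issue(scope, parent=None):
    payload = {
        "scope": sorted(scope),
        # The child's signed payload covers a hash of the parent envelope,
        # so the chain cannot be reordered or altered after issuance.
        "parent_hash": hashlib.sha256(_canon(parent)).hexdigest() if parent else None,
    }
    return {"payload": payload, "sig": _sign(payload)}

def verify_chain(chain, action):
    """Policy Enforcement Point: signatures, chain binding, narrowing, scope."""
    parent = None
    for env in chain:
        if not hmac.compare_digest(env["sig"], _sign(env["payload"])):
            return False  # forged or altered envelope
        if parent is not None:
            expected = hashlib.sha256(_canon(parent)).hexdigest()
            if env["payload"]["parent_hash"] != expected:
                return False  # chain binding broken
            if not set(env["payload"]["scope"]) <= set(parent["payload"]["scope"]):
                return False  # delegation attempted to broaden authority
        parent = env
    return action in parent["payload"]["scope"]

root = issue({"budget:read", "budget:write"})          # user's root envelope
child = issue({"budget:read"}, parent=root)            # narrowed delegation
ok = verify_chain([root, child], "budget:read")        # within delegated scope
blocked = verify_chain([root, child], "budget:write")  # outside child scope
```

The scope check deliberately runs against the innermost envelope: a downstream agent honors what was delegated to its caller, not what any ancestor in the chain happened to hold.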
A confused deputy attack in this model requires the attacker to forge a delegation chain. That requires either compromising a signing key or producing a valid signature without the key. The former is a key management problem; the latter breaks the cryptographic primitives. Both are significantly harder than the current attack, which in most deployed agent systems requires nothing more than a carefully worded request.
The Relevant Standards Activity
The industry is converging on this architecture, though slowly and incompletely.
The IETF WIMSE working group is developing workload identity standards that address the identity half of the problem — giving workloads cryptographically verifiable identities. OAuth's emerging on-behalf-of drafts address one-hop delegation in a token-based model. Neither standard fully addresses multi-hop chains or cross-organizational delegation with monotonic narrowing semantics.
The academic literature on capability-based security goes back to the 1970s; later milestones include the E programming language, object-capability models, and Mark Miller's work at Google on Caja. The theoretical foundation is solid. The gap is production-ready infrastructure that implements these principles for modern agent communication protocols.
That gap is what CapiscIO is built to close. The Confused Deputy Problem is thirty-six years old. Multi-agent AI systems have given it a new deployment context where the conditions for exploitation are the default, not the exception.

Creator of CapiscIO, the developer-first trust infrastructure for AI agent discovery, validation and governance. With two decades of experience in software architecture and product leadership, he now focuses on building tools that make AI ecosystems verifiable, reliable, and transparent by default.


