Tal Peretz

Fine-Grained Permissions and Identity Management for AI Agents

The scale of MCP adoption inside enterprises is the part most security teams haven't caught up to. The GitHub MCP server crossed 2 million weekly installs in early 2026 and the Postgres MCP server (which hands AI agents direct SQL access to any Postgres database) crossed 800,000 in the same time window. These tools are giving AI agents production system access at scale, on developer laptops, right now.

This attack surface has already been weaponized. We're seeing cases where unofficial MCP servers with 1,500 weekly downloads are being silently modified to add new fields into their instructions. OWASP added Shadow MCP servers as the ninth-ranked risk in the 2025 MCP Top 10 because the pattern is now consistent enough to standardize.

In this piece, we cover the current state of permissions and identity management for AI agents. We highlight where traditional security methods break down when applied to agents, then address those gaps with a layered model that allows enterprises to safely deploy AI agents at scale.

The nuances of agent identity

The operational pattern looks quite similar in most cases. A developer installs an MCP server from GitHub, authenticates with a personal API token, and connects it to their AI client. That connection now sits outside the identity system. When the developer leaves, the connection persists, since nobody knew it existed in the first place.

Most teams push agents through their existing IAM stack. We've done this before, and it fails predictably every time. That stack was built for humans who log in, do their work, and then log out. Its sessions assume someone is sitting at a keyboard making one decision at a time. Agents violate almost every one of those assumptions, so the resulting identity model cannot enforce fine-grained permissions at the level agents actually require.

Agents are non-deterministic

The first assumption that breaks is pre-provisioning an agent's access. A service account follows code, calling a known API with known parameters at a familiar cadence. An agent decides at runtime which tools to invoke based on its own reasoning over whatever context it has been given. You cannot predefine what an agent will do, which means you cannot pre-provision the exact credentials it needs. Grant too little and it stops mid-task. Grant too much and you have an autonomous actor spending permissions it should never have had.

Standard OAuth has one subject per token

OAuth was built around a single subject per token. When an agent acts on behalf of a user, the system needs to track two identities at the same time: the user and the agent executing the action. Standard token structures collapse these two identities into one. When the agent later exceeds its scope, the audit log cannot answer basic accountability questions. Did the user intend this action? Was the agent operating within the delegated boundary? The question of who decided to act has to be separable from the question of whose data is being accessed, and standard OAuth does not recognize that separation.

The blast radius for RBAC is compounded

A role grants access to a resource, not a task. An agent's job is typically hyper-specific and short-lived. Granting an agent read access to every table in your data warehouse because it needs to pull one sales report over-provisions by orders of magnitude for the actual task at hand. The naive fix is more granular roles, which tends to balloon into thousands of micro-roles for specific tasks that become impossible to audit. The stakes are even higher because over-provisioned agents can bulk-edit records and trigger destructive workflows in seconds, before any alert fires.

Confused deputy attacks

The confused deputy attack happens when an agent with valid credentials is manipulated by malicious input. This can take the form of a tool description with embedded exfiltration instructions or a prompt injection hidden in a document the agent was asked to summarize. Every identity check passes, which is what makes this class of attack so difficult to catch. The problem is not authentication. It is the inability to express "this agent can read public issues but cannot write to pull requests that reference private repositories" as a policy. Authorization at session start does not stop this.

One important clarification here is that policy enforced at the tool-call layer governs whether a call should happen. It does not govern what data flows through an authorized call. Credentials embedded in tool outputs and hidden characters designed to steer an agent’s next action both pass through a permitted call unless the content itself is scanned.
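To make the content-level check concrete, here is a minimal sketch of a scanner that runs on a tool's output after the call has already been authorized. The function name `scan_tool_output` and the two credential patterns are illustrative assumptions, not a complete rule set; a production scanner would carry far more patterns and would also inspect tool descriptions.

```python
import re
import unicodedata

# Illustrative patterns only (assumption): a real scanner would carry many more.
CREDENTIAL_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_personal_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan_tool_output(text: str) -> list[str]:
    """Return findings for one tool output; an empty list means nothing matched."""
    findings = []
    for name, pattern in CREDENTIAL_PATTERNS.items():
        if pattern.search(text):
            findings.append(f"credential:{name}")
    # Unicode category "Cf" covers invisible format characters such as
    # zero-width spaces and bidirectional overrides.
    hidden = sorted(
        {f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"}
    )
    if hidden:
        findings.append("hidden_chars:" + ",".join(hidden))
    return findings
```

A proxy could run a check like this on every response and block or redact the output before it reaches the agent's context.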

Agent delegation is difficult to represent

Existing IAM systems can’t represent a chain of agents delegating to other agents. Consider an orchestrator agent that spawns a research subagent, which in turn spawns a code-execution agent. Hierarchy-based authorization graphs that try to model this chain break under write load. When thousands of agents are spawning subagents dynamically, the graph itself becomes the bottleneck before any actual security decision gets made.

These failures amplify each other, and their combination is the default configuration we have seen in most enterprise MCP deployments running right now: an agent connecting through a shadow MCP server, using long-lived credentials outside the identity system, operating under a role designed for a human, authorized at connection time, and possibly spawning subagents along the way. So how do we address all of this?

How to manage agent identity and permissions at scale

The answer is a layered approach. No single feature solves everything. The instinct is to over-restrict. That pushes developers underground. The companies that actually solve this go the other direction. They build a governed path that is faster than the ungoverned one. When the compliant path has less friction than the shadow path, the shadow problem disappears on its own.

Detect existing MCP servers first

First, find the MCP connections that are already in your environment. Device scanning through existing MDM tools like Rippling and Jamf catches connections that route around the gateway entirely. From there, the goal is to redirect those connections to a governed catalog that’s genuinely easier to use than the ungoverned path. Only then does policy enforcement at the tool-call layer actually matter, because only then is enough traffic flowing through the control plane for agent policies to govern.
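As a rough illustration of the discovery step, the sketch below parses the JSON text of one AI client config file and lists the MCP servers it declares. It assumes the file uses a top-level `mcpServers` object, as several MCP clients do; the function name `list_mcp_servers` is hypothetical, and collecting the files from each device is left to the MDM tooling.

```python
import json


def list_mcp_servers(config_text: str) -> list[dict]:
    """Extract MCP server entries from the JSON text of one client config file."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError:
        return []
    if not isinstance(config, dict):
        return []
    servers = config.get("mcpServers", {})  # assumed top-level key
    if not isinstance(servers, dict):
        return []
    results = []
    for name, entry in servers.items():
        if not isinstance(entry, dict):
            continue
        env = entry.get("env")
        results.append({
            "name": name,
            "command": entry.get("command"),
            "url": entry.get("url"),
            # Report only the variable names, never the secret values.
            "env_keys": sorted(env) if isinstance(env, dict) else [],
        })
    return results
```

Reporting environment variable names without their values is a deliberate choice here: the inventory shows which servers were configured with personal tokens without copying those tokens into yet another system.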

Use PBAC for access control

Fine-grained agent permissions require authorization that evaluates at runtime against the full request context. This means who the user is, who the agent is, what tool is being called, what arguments are being passed, what IP the request originates from, what OAuth status is in effect, and what annotations the tool carries. A policy like "this agent can query this object only when the table prefix matches sales_ or finance_ and the request originates from a corporate IP range" is a declarative rule in policy-based access control (PBAC). In standard fine-grained authorization (FGA) methods, it is custom application logic that every consuming service has to implement and maintain independently.
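The example policy above can be sketched as a single function evaluated per call. This is a minimal illustration, not a real policy engine: the `ToolCall` fields, the tool name `query`, and the 10.0.0.0/8 corporate range are all assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """Request context for one tool invocation (hypothetical shape)."""
    user: str
    agent: str
    tool: str
    table: str
    source_ip: str


ALLOWED_TABLE_PREFIXES = ("sales_", "finance_")
# Hypothetical corporate range, chosen only for illustration.
CORPORATE_NETWORKS = (ipaddress.ip_network("10.0.0.0/8"),)


def allow_query(call: ToolCall) -> bool:
    """Allow a query only for permitted table prefixes from a corporate IP."""
    if call.tool != "query":
        return False
    if not call.table.startswith(ALLOWED_TABLE_PREFIXES):
        return False
    try:
        ip = ipaddress.ip_address(call.source_ip)
    except ValueError:
        return False  # malformed address: deny rather than guess
    return any(ip in network for network in CORPORATE_NETWORKS)
```

Because the rule reads the arguments and origin of each call, the same agent can be allowed on `sales_q3` and denied on `hr_salaries` within one session, which a role assigned at connection time cannot express.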

The authorization model runs on two distinct layers that solve two different problems. The platform layer governs who can approve MCP servers, create policies, view audit logs, and manage settings. This is enforced at every dashboard API endpoint. The tool-call layer is where PBAC governs what actually happens at invocation time, evaluated per call at the proxy. A user needs the right role at the platform layer and a matching PBAC policy at the tool layer.

Asymmetric access control

The sharpest property of PBAC for agent authorization is one that standard FGA fundamentally cannot express. For on-behalf-of (OBO) agents, agent policies and user policies are evaluated independently, in sequence. Access is granted only if both allow it. The agent side can be stricter. FGA's symmetric intersection check treats both sides the same. It cannot express "users can, but agents acting as them cannot."
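A small sketch of the asymmetric check follows. The two policies and the `crm` tool are hypothetical; the point is only that the agent policy and the user policy are separate functions evaluated one after the other, so the agent side can deny something the user side allows.

```python
from typing import Callable

# A policy maps (tool, action) to an allow/deny decision.
Policy = Callable[[str, str], bool]


def user_policy(tool: str, action: str) -> bool:
    # Hypothetical rule: the human user may read and delete CRM records.
    return tool == "crm" and action in {"read", "delete"}


def agent_policy(tool: str, action: str) -> bool:
    # Hypothetical stricter rule: the agent may only read CRM records.
    return tool == "crm" and action == "read"


def authorize_obo(
    tool: str, action: str, agent: Policy, user: Policy
) -> tuple[bool, str]:
    """Evaluate the agent policy, then the user policy; both must allow."""
    if not agent(tool, action):
        return False, "denied by agent policy"
    if not user(tool, action):
        return False, "denied by user policy"
    return True, "allowed"
```

With these rules, the user can delete a record directly, but `authorize_obo("crm", "delete", agent_policy, user_policy)` is denied at the agent step. Returning the reason alongside the decision also tells the audit log which side refused.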

Each agent is also treated as a first-class OAuth client with its own Client ID and Client Secret, its own policies, and its own audit trail. When the agent operates autonomously, it uses an M2M (machine-to-machine) token and only agent-level policies apply. When it acts on behalf of a user, it uses a token that correctly captures this relationship. This is what makes the delegation chain machine-readable in every audit log entry.
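One way to carry both identities in a token is the `act` (actor) claim defined in RFC 8693 (OAuth 2.0 Token Exchange), where `sub` names the party whose data is accessed and nested `act` objects name the current and prior actors. The sketch below assumes that claim layout, which the text above does not specify; the helper names `delegation_chain` and `audit_entry` are hypothetical.

```python
def delegation_chain(claims: dict) -> list[str]:
    """Return [subject, current actor, prior actors...] from decoded token claims."""
    chain = [claims["sub"]]
    actor = claims.get("act")
    # Each nested "act" object names one further actor in the chain.
    while isinstance(actor, dict):
        chain.append(actor.get("sub", "<unknown>"))
        actor = actor.get("act")
    return chain


def audit_entry(claims: dict, tool: str) -> dict:
    """Build an audit record that keeps subject and actors separate."""
    chain = delegation_chain(claims)
    return {
        "tool": tool,
        "subject": chain[0],
        "actors": chain[1:],
        "delegated": len(chain) > 1,
    }
```

For an M2M token there is no `act` claim, so the subject is the agent itself and `delegated` is false. For an on-behalf-of token, the entry records the user as subject and each agent in the chain as an actor.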

Broker credentials to delegated agents

When an agent makes a delegated call to an OAuth-protected MCP server, a credential broker resolves the right session at call time. It checks for the calling user’s personal session grant for that server first. If none exists, it falls back to a shared session grant that an admin has previously promoted for that agent. If neither exists, the call fails. The agent gets the data it needs without ever holding the keys to get more. This is what eliminates the problem of long-lived tokens.
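The resolution order can be sketched as follows. This is a simplified in-memory model under stated assumptions: the class and field names are hypothetical, and the string "session handles" stand in for whatever the broker actually stores. The agent never receives the handle; the broker would use it to make the upstream call.

```python
from dataclasses import dataclass, field


class NoSessionGrant(Exception):
    """Raised when neither a personal nor a shared grant exists."""


@dataclass
class CredentialBroker:
    # (user, server) -> session handle granted by that user
    personal: dict[tuple[str, str], str] = field(default_factory=dict)
    # (agent, server) -> session handle promoted by an admin for that agent
    shared: dict[tuple[str, str], str] = field(default_factory=dict)

    def resolve(self, user: str, agent: str, server: str) -> str:
        """Pick the session for one delegated call, at call time."""
        # 1. The calling user's personal grant for this server wins.
        if (user, server) in self.personal:
            return self.personal[(user, server)]
        # 2. Otherwise fall back to the shared grant for this agent.
        if (agent, server) in self.shared:
            return self.shared[(agent, server)]
        # 3. Neither exists: the call fails.
        raise NoSessionGrant(f"no grant for {user!r} or {agent!r} on {server!r}")
```

Resolving per call rather than per connection means that revoking a grant takes effect on the very next tool call.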

Closing thoughts

Fine-grained permissions and identity management for agents, built on a proper control plane, are what make deploying AI safely and at full scale possible. The framing that forces enterprises to choose between AI adoption and controlled risk is a failure of architecture. Agents are a distinct identity class. They are not humans and they are not service accounts. The teams that understand this build agent identity as a first principle.

That is what Runlayer is built for. Runlayer takes all of these controls and puts them in one platform. For more information, check us out here.

May 18, 2026
