AI Authentication and Authorization
This article explains that AI security is an extension of existing identity and authorization patterns, not a new discipline. It covers three AI use cases (RAG, tool use, agentic systems) through the lens of authentication and authorization, using a banking example. Key principles include a deterministic identity layer, filtering documents before they reach the model in RAG, and a chain of identity for agents.
By Dan Moore
Human identity is the source of AI authority.
I know what you're thinking: another article about AI security? Stick with me. This one is different because it's grounded in a simple, almost obvious truth that the industry keeps forgetting in its rush to ship agents: the same identity and authorization patterns that secured the API boom of the 2010s are exactly what you need to secure AI systems today.
If you've built OAuth integrations, managed API keys, or set up role-based access control, you already have most of the knowledge you need. AI auth is not a new discipline. It's an extension of existing best practices.
"Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." — John von Neumann
"Anyone who lets AI access resources without deterministic safeguards is, of course, in a state of folly." — The Author
The von Neumann quote is a classic warning against expecting a deterministic process to produce truly random output. The inverse principle applies here. AI systems are probabilistic: they reason, hallucinate, and improvise. But the identity layer that governs who they act for and what they're allowed to do must be deterministic. Identity is not something to "vibe."
This article walks through three AI use cases:
retrieval-augmented generation (RAG)
tool use (MCP and APIs)
agentic systems
It examines them through the lens of authentication, authorization, and identity management. The examples use FusionAuth, but the article also notes where standards-based solutions exist.
We'll use a single running example throughout: you are an engineering manager at a bank, looking to use AI to improve support desk operations for both employees and customers.
A Quick Overview of the Use Cases#
Before we dive in, let's define what we're working with.
Retrieval-augmented generation (RAG) augments the data available to an AI model by feeding it documents at query time. Your bank employees or customers ask a question, and the RAG system retrieves relevant internal documents and then provides them to an LLM to ground the LLM's answer. The key auth concern: not every user should see every document. A customer sees different documents than a teller, who sees different documents than a VP.
Tool use (MCP and APIs) allows AI systems to take actions like reading from a database, updating a customer record, or calling an external service. The Model Context Protocol (MCP) is an emerging standard for connecting AI tools to services, but plain APIs with rich documentation work too. The key auth concern: controlling what each tool can do, and on whose behalf.
Agentic systems are semi-autonomous, task-oriented workflows that can read data, take action across multiple systems, and ask for human input when needed. They are non-deterministic software components that chain together reasoning steps. The key auth concern: maintaining a chain of identity from the human who authorized the workflow all the way through to every action taken, as well as limiting agents' access.
Here's how these map to what an identity provider can help with:
| Scenario | Authorization | Authentication | Identity Management |
| --- | --- | --- | --- |
| RAG | Yes | Yes (framework-specific) | Yes (framework-specific) |
| Tool Use | Yes | Yes | Yes |
| AI Agents | Yes | Yes | Yes |
Now let's dig into each of these use cases.
RAG: Making Sure the Model Never Sees What It Shouldn't#
Here's the scenario.
You have bank documents related to customer support tasks, such as loan agreements, customer agreements, compliance policies, wealth management playbooks, and fraud investigation procedures. You want to make them available for customers and employees to query through an AI interface. But not all documents should be available to every user. Customer support, fraud and security, disputes and chargebacks, and loan servicing teams each need access to different document sets. And don't forget customers themselves.
Companies like LinkedIn, DoorDash, and Vimeo already use RAG in production. The pattern is well-established.
Why Identity Matters for RAG#
When answering a query, the LLM should never even see documents the user shouldn't have access to. You don't have to craft some clever prompt. You're not relying on the model to keep secrets. With the right authorization framework, you're filtering documents before they reach the model.
This is primarily an authorization problem. You authenticate the user (prove they are who they claim to be), process their query, pull documents from the vector datastore, then filter the documents based on which documents the user is allowed to query.
The model only receives documents that pass the authorization check.
Implementation#
The implementation follows a straightforward pipeline:
1. Chunk your documents into segments suitable for vector search.
2. Build an authorization schema that maps users and roles to document access.
3. Store metadata alongside your document chunks in the vector database, including which roles, departments, or users can access each chunk.
4. On retrieval, authenticate the user and get their identity claims.
5. Filter by user and document attributes that you stored in step 3 before passing results to the LLM.
For authentication, some frameworks use JWTs; others use API keys. The filtering mechanism depends on your RAG framework as well. For example, LangChain allows you to build a retriever wrapper which calls out to an authorization service before returning results.
For the authorization checks, use a fine-grained authorization (FGA) system. FusionAuth FGA by Permify is one option. It provides deterministic authorization that can be deployed on-site for data safety and scales with your needs.
Your authorization logic should be centralized in a single source of truth, regardless of which RAG framework you're using. The retrieval filter should call that logic, and it should be deterministic, not probabilistic.
Here's a simplified diagram of the request flow, assuming the proper metadata was stored with the documents at load time.
flowchart LR
    subgraph User["User"]
        A[Authenticate]
    end
    subgraph RAG["RAG Pipeline"]
        B[Query Vector DB]
        C[Filter using FGA with User and Document Attributes]
        D[Return Authorized Chunks]
    end
    subgraph LLM["LLM"]
        E[Generate Response Including Chunks]
    end
    A --> B --> C --> D --> E
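The flow in the diagram can be sketched in a few lines of framework-agnostic Python. This is a minimal illustration, not a production retriever: `Chunk`, `User`, `retrieve`, `is_authorized`, and `build_context` are hypothetical names, the keyword match stands in for a real vector search, and the role-overlap check stands in for a call to a central authorization service such as an FGA system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    allowed_roles: frozenset  # access metadata stored with the chunk at load time


@dataclass(frozen=True)
class User:
    user_id: str
    roles: frozenset  # taken from verified identity claims, never from the prompt


def is_authorized(user: User, chunk: Chunk) -> bool:
    """Stand-in for a call to a central authorization service (assumption)."""
    return bool(user.roles & chunk.allowed_roles)


def retrieve(query: str, index: list[Chunk]) -> list[Chunk]:
    """Stand-in for a vector search: a naive keyword match."""
    terms = query.lower().split()
    return [c for c in index if any(t in c.text.lower() for t in terms)]


def build_context(user: User, query: str, index: list[Chunk]) -> list[str]:
    """Return only the chunk texts the user may see; this is all the LLM gets."""
    candidates = retrieve(query, index)
    return [c.text for c in candidates if is_authorized(user, c)]


INDEX = [
    Chunk("Wire transfer limits for retail customers",
          frozenset({"customer", "teller"})),
    Chunk("Fraud investigation procedure for wire transfer disputes",
          frozenset({"fraud_analyst"})),
]

customer = User("c-1", frozenset({"customer"}))
context = build_context(customer, "wire transfer", INDEX)
```

Both chunks match the query, but only the first one reaches `context` for the customer. The filter is ordinary set logic, so the same user and the same metadata always produce the same result.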
But what about capturing that metadata? Documents don't always map cleanly to a single access level, and some documents need different access levels for different chunks. Chunking can also drop metadata that was attached to the source document.
For instance, a compliance PDF might contain sections accessible to all employees alongside sections restricted to the legal team. Make sure your chunking pipeline can handle this.
So, plan to capture the user and access metadata as part of your RAG process. If you want the LLM to never see documents the user shouldn't access, you have to make sure the user and access data is available.
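One way to keep access metadata attached through chunking is to split each section separately and copy the section's access list onto every chunk it produces. The sketch below assumes the document has already been split into sections with known access rules; `Section`, `chunk_section`, and `chunk_document` are hypothetical names, and word-count splitting stands in for whatever chunking strategy your pipeline uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    text: str
    allowed_roles: frozenset  # who may read this section of the document


def chunk_section(section: Section, max_words: int = 50) -> list[dict]:
    """Split one section into chunks, copying its access metadata to each."""
    words = section.text.split()
    chunks = []
    for start in range(0, len(words), max_words):
        chunks.append({
            "text": " ".join(words[start:start + max_words]),
            "allowed_roles": sorted(section.allowed_roles),
        })
    return chunks


def chunk_document(sections: list[Section], max_words: int = 50) -> list[dict]:
    """Chunk section by section so no chunk spans two access levels."""
    return [c for s in sections for c in chunk_section(s, max_words)]


compliance_pdf = [
    Section("General conduct rules that apply to every employee",
            frozenset({"employee"})),
    Section("Privileged guidance on pending litigation",
            frozenset({"legal"})),
]
chunks = chunk_document(compliance_pdf, max_words=4)
```

Because chunks never cross a section boundary, a restricted paragraph cannot end up in a chunk labeled for all employees.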
Tool Use: MCP and APIs#
Suppose you want to allow customer service team members to use AI tools to update bank customer information — contact details, account preferences, service requests. But different tools are available to different roles, and even with the same tools, different users have different limits. A tier-one support agent might be able to update a phone number but not adjust a credit limit.
Two Paths: MCP and APIs#
The Model Context Protocol (MCP) is an emerging standard that makes any API or service accessible to AI tooling in a structured way. Companies like Block, Bloomberg, and Amazon are already using MCP internally. But MCP isn't the only option — plain APIs work well too. AI models are capable of figuring out API semantics from good docs.
The most recent version of MCP at the time of publishing uses OAuth 2.1 and the authorization code grant for authentication of an AI system or tool. There are also extensions under development for use of the client credentials grant.
APIs re-use traditional authentication methods: API keys or access tokens.
The same gateway patterns you've been using since the REST API era can help rate-limit or monitor access for either MCP or API servers.
MCP Implementation#
Here's how to set up MCP with identity:
Build an MCP server on top of your existing APIs and services. Configure your MCP server to point to an identity provider which supports OAuth 2.1. MCP clients should be either preregistered or created dynamically.
When an MCP client tries to access an MCP server, the MCP server should redirect to the configured identity provider, which will authenticate the user driving the MCP client and then issue a token. The token is then presented to the MCP server.
Learn more about MCP and implementation.
You may need to add fine-grained authorization to the services the MCP server is accessing if you need granular control beyond what OAuth scopes provide.
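As an illustration of the difference, the sketch below layers a per-role limit on top of a scope check before a tool call is executed. The tool names, scope strings, roles, and credit limit caps are all invented for this example, and in a real system the second check would likely be a call to your central authorization service rather than an in-memory table.

```python
from dataclasses import dataclass

# Hypothetical mapping from tool name to the OAuth scope it requires.
TOOL_SCOPES = {
    "update_phone": "customer:write",
    "adjust_credit_limit": "credit:write",
}

# Hypothetical per-role caps: the fine-grained rule that scopes can't express.
CREDIT_LIMIT_CAPS = {"tier_one": 0, "tier_two": 5000, "supervisor": 50000}


@dataclass(frozen=True)
class Caller:
    user_id: str
    role: str
    scopes: frozenset  # from the validated access token


def authorize_tool_call(caller: Caller, tool: str, args: dict) -> tuple[bool, str]:
    """Coarse scope check first, then the fine-grained per-role limit."""
    required = TOOL_SCOPES.get(tool)
    if required is None:
        return False, "unknown tool"
    if required not in caller.scopes:
        return False, f"missing scope {required}"
    if tool == "adjust_credit_limit":
        cap = CREDIT_LIMIT_CAPS.get(caller.role, 0)
        new_limit = args.get("new_limit")
        if not isinstance(new_limit, (int, float)) or not 0 <= new_limit <= cap:
            return False, f"role {caller.role} may not set limit {new_limit}"
    return True, "ok"


tier_one = Caller("u1", "tier_one", frozenset({"customer:write"}))
tier_two = Caller("u2", "tier_two", frozenset({"customer:write", "credit:write"}))
```

Here `tier_one` can call `update_phone` but is stopped at the scope check for `adjust_credit_limit`, while `tier_two` passes the scope check and is then held to the 5000 cap.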
API Implementation#
For API access, the pattern is even simpler:
Use your existing APIs and services; no MCP server required.
Authenticate users with your identity provider.
Get an access token.
If the AI has a web tool, it can access the API using REST calls, passing the token.
Consider publishing an SDK for your API as well. Again, you may need to add fine-grained authorization to your APIs and services if you need granular control beyond what OAuth scopes provide.
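Passing the token usually means sending it as a bearer token in the `Authorization` header. The sketch below builds such a request with the standard library without sending it. The base URL, the `/customers/{id}` path, and the `build_update_request` helper are assumptions for illustration, not a real bank API.

```python
import json
import urllib.request
from urllib.parse import quote


def build_update_request(base_url: str, customer_id: str,
                         access_token: str, changes: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated PATCH request."""
    return urllib.request.Request(
        url=f"{base_url}/customers/{quote(customer_id, safe='')}",
        data=json.dumps(changes).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",  # token from the IdP
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


request = build_update_request(
    "https://api.bank.example",   # placeholder base URL
    "42",
    "test-token",                 # placeholder; use the real access token
    {"phone": "555-0100"},
)
# urllib.request.urlopen(request) would send it; the API then validates the
# token and applies its own authorization rules before making the change.
```

The API remains the enforcement point: whatever the model asks for, the request succeeds only if the token's user is allowed to make that change.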
You probably have some infrastructure around authentication and your APIs that you might be able to re-use. For example, multiple API gateways work with FusionAuth.
Agentic Systems: Go Forth And Do Work#
This is where things get interesting and where new thinking in AI auth needs to happen.
Agents are non-deterministic software components that can be prompted to complete a task with varying levels of autonomy. They scale to tens or hundreds of instances, interact with humans, APIs, and MCP tools, and chain together reasoning steps.
The Scenario#
Your bank wants to automate new business account setup. A new business needs checking accounts, savings accounts, merchant services, and payroll setup. An agent needs to:
Assess the business type and recommend a package
Gather business documents (EIN, articles of incorporation) from a document store
Check creditworthiness via an API
Schedule an onboarding session with a relationship manager via a calendar service
This is a multi-step workflow dealing with messy data and external services, which is exactly what agents are good for. But it also means they will be reading sensitive documents, calling external APIs, and scheduling meetings on behalf of a human. The stakes are high.
Chain of Identity#
Here's the foundational concept for securing agents: you need to know who authorized what, when.
When a human kicks off an agent workflow, that human's identity needs to travel with the agent through every step. If the agent reads a file, you need to know which human authorized that read. If the agent schedules a meeting, you need to know on whose behalf. If something goes wrong, you need an audit trail back to the originator.
This audit trail is the chain of identity.
How deeply to carry the human identity depends on your needs. If you're doing authorization checks at each step, the identities you need to carry depend on what those rules evaluate. If you're primarily logging and debugging, you may only need the human identity and the current agent identity. For schedule-triggered agent workflows, the chain might start with a service account or the author of the cron job.
You implement this using signed JWTs, which ensure that the chain of identity is preserved through your system.
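To make the idea concrete, here is a minimal sketch of a chain of identity carried in signed JWT claims. It uses a hand-rolled HS256 signature and a shared key purely so the example is self-contained; in practice your identity provider would issue the tokens and you would verify them with a maintained JWT library and asymmetric keys. The nested `act` (actor) claim follows the convention defined in RFC 8693, and `delegate` and `identity_chain` are hypothetical helper names.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, key: bytes) -> str:
    """Sign claims as an HS256 JWT (illustration only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"


def verify_jwt(token: str, key: bytes) -> dict:
    """Check signature, algorithm, and expiry; return the claims."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    signing_input = f"{parts[0]}.{parts[1]}"
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(parts[2])):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(parts[0])).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(parts[1]))
    if claims.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return claims


def delegate(parent_claims: dict, agent_id: str, ttl_seconds: int = 300) -> dict:
    """Claims for an agent's token: the human stays `sub`, the agent is `act`."""
    now = int(time.time())
    actor = {"sub": agent_id}
    if "act" in parent_claims:
        actor["act"] = parent_claims["act"]  # earlier actors nest inside
    exp = min(now + ttl_seconds, parent_claims.get("exp", now + ttl_seconds))
    return {"sub": parent_claims["sub"], "act": actor, "iat": now, "exp": exp}


def identity_chain(claims: dict) -> list[str]:
    """Human first, then each agent in the order it joined the workflow."""
    actors = []
    actor = claims.get("act")
    while actor:
        actors.append(actor["sub"])
        actor = actor.get("act")
    return [claims["sub"]] + actors[::-1]


KEY = b"demo-only-shared-secret"  # assumption: never hard-code real keys
human = {"sub": "manager-17"}
onboarding = delegate(human, "onboarding-agent")
credit = delegate(onboarding, "credit-check-agent")
token = sign_jwt(credit, KEY)
chain = identity_chain(verify_jwt(token, KEY))
```

Each service that receives the token can verify it and log `chain`, so a credit check performed by the second agent still traces back to `manager-17`. Short lifetimes limit how long a delegated token stays useful if it leaks.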
FusionAuth doesn't currently support OAuth Token Exchange (RFC 8693), but the Vend JWT API achieves the same chain of identity semantics. If your identity provider supports token exchange natively, that's a standards-based alternative.
The FusionAuth Vend JWT API lets you create tokens that embed the originating user and prop
[truncated for AI cost control]