Zero Trust Infrastructure for Multi-LLM Context Routing

Why traditional OAuth hits a wall and we need UMA
Ever tried sharing a medical record with a specialist or letting a tax app see your bank data without just handing over your password? It usually ends up being a mess of “all or nothing” permissions that makes security teams lose sleep.
Traditional OAuth 2.0 is great for “I let this app post to my feed,” but it hits a wall when you want to share your stuff with someone else. It was built for an app to act on your behalf, not for delegating access to a third party like a doctor or an accountant. This leads to messy, broad permissions that are a total security nightmare.
In a standard flow, you authorize an app to use your data. There isn’t a clean way to say “Let my son see my prescriptions, but only for the next 48 hours.” Most apps just ask for a giant bucket of scopes. If you want to share one file in a healthcare portal, you often end up giving access to the whole folder because the API doesn’t know how to be granular.
According to SSOJet, traditional OAuth hits these specific walls:

Client acts for you: Standard flows don’t account for a second human (the Requesting Party) entering the mix.
Static permission buckets: Traditional scopes are often hardcoded and broad, which makes it hard to share a single specific resource.
The “Owner’s Brain” problem: OAuth 2.0 lacks a native way to let the data owner set policies at a central hub, the Authorization Server (AS), that can be checked later.

One of the biggest headaches is that the owner usually has to be “online” to hit the authorize button. But what if a consultant needs access at 3 AM? A 2018 report from the Kantara Initiative noted that UMA 2.0 was designed to fix these “asynchronous” sharing gaps, allowing policies to exist before the request even happens.
This setup means a patient can share labs with a doctor without being online the moment the doctor clicks “open.” The policy is already at the AS, waiting.
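To make the “policy is already waiting” idea concrete, here is a minimal Python sketch of a time-boxed sharing rule, using the “my son, 48 hours” example from above. The `SharingPolicy` shape and the `is_allowed` helper are hypothetical illustrations; UMA 2.0 does not standardize how an AS stores or evaluates policies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SharingPolicy:
    """A rule the owner stores at the AS ahead of time (hypothetical shape)."""
    resource_id: str
    requesting_party: str
    scopes: frozenset
    expires_at: datetime


def is_allowed(policy, resource_id, requesting_party, scope, now):
    """Evaluate a stored policy; the owner does not need to be online."""
    return (
        policy.resource_id == resource_id
        and policy.requesting_party == requesting_party
        and scope in policy.scopes
        and now < policy.expires_at
    )


created = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
policy = SharingPolicy(
    resource_id="prescriptions",
    requesting_party="son@example.com",
    scopes=frozenset({"view"}),
    expires_at=created + timedelta(hours=48),
)

# 3 AM the next day is still inside the 48-hour window.
inside_window = is_allowed(policy, "prescriptions", "son@example.com", "view",
                           datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc))
# Three days later the window has closed.
after_window = is_allowed(policy, "prescriptions", "son@example.com", "view",
                          datetime(2024, 1, 4, 9, 0, tzinfo=timezone.utc))
print(inside_window, after_window)
```

The point of the sketch is that the decision depends only on stored data and the current time, so nobody has to wake Alice up at 3 AM.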
Next, we’ll look at the actors in this dance.

Meet the cast: Five actors in the UMA 2.0 dance
So, we’re diving into the cast of this UMA 2.0 dance. It’s basically a five-person play where nobody really trusts each other without a specific hall pass. If you’ve ever tried to manage API access for a third-party app, you know it’s a headache. UMA just makes that headache someone else’s problem (the authorization server).
First up, you’ve got the Resource Owner (RO). This is usually just a regular person, like Alice, who owns the data. Then there is the Requesting Party (RqP). This is the “other” person, like Alice’s accountant or her doctor, who actually wants to see the files.
The Client is the tool the RqP uses. Think of it as the tax app or the medical portal. It’s the middleman that does the actual fetching, but it only acts when the RqP gives it the green light.

Then we have the Authorization Server (AS). This is the “brain,” like PingAM. It holds all the rules Alice made. The Resource Server (RS) is where the data actually sits (like a bank API or a cloud drive).
These two are “loosely coupled.” The RS doesn’t need to know why someone is allowed in; it just checks with the AS using the protection API. According to Ping Identity, this setup lets the AS manage the “grant rules” while the data stays put.

The PAT: The RS uses a Protection API Access Token (PAT) to talk to the AS.
Permission Tickets: If a client shows up without a token, the RS asks the AS for a “ticket” to hand back to them.

It’s a bit of a back-and-forth, but it keeps the sensitive logic away from your actual data. Next, we’ll look at how these resources get grouped together.

Understanding Resource Sets
Before we get into the technical handshake, we’ve got to talk about Resource Sets. In UMA 2.0, you don’t just register every single byte of data individually; that would be a nightmare for your database. Instead, you group things into a “Resource Set.”
A Resource Set is basically a collection of one or more resources that you manage under a single policy. Think of it like a folder in Google Drive. You might have ten different PDFs in there, but you register them as one “Tax Documents 2024” set at the AS. This lets the owner set one rule, like “my accountant can view this,” that applies to everything in that bucket. It’s way more efficient than making a thousand API calls for a thousand files.
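As a rough illustration of the folder analogy, the Python sketch below groups ten PDFs under a single registration body. The `ResourceSet` class and the file paths are made up for this example; the `name`, `type`, and `resource_scopes` fields follow the UMA resource description format.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ResourceSet:
    """Local grouping of files that share one policy (hypothetical helper)."""
    name: str
    type_uri: str
    scopes: list
    files: list = field(default_factory=list)

    def registration_body(self):
        """One JSON description covers every file in the set."""
        return json.dumps({
            "name": self.name,
            "type": self.type_uri,
            "resource_scopes": self.scopes,
        })


tax_docs = ResourceSet(
    name="Tax Documents 2024",
    type_uri="http://finance.com/folder",
    scopes=["view"],
    # Ten local PDFs; the individual paths never leave the RS.
    files=[f"/tax/2024/form-{n}.pdf" for n in range(1, 11)],
)

print(len(tax_docs.files), "files, one registration:")
print(tax_docs.registration_body())
```

Note that the file list stays on the RS side. The AS only learns about the set and its scopes, which is what keeps the registration count low.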
Next, we’ll look at the technical handshake.

The technical handshake: PATs, Tickets, and RPTs
Ever wonder how these systems actually talk to each other when you aren’t around? It’s basically a high-stakes game of “show me your pass” involving three main tokens that keep your data from leaking out to the wrong people.
Before anything else happens, the RS (Resource Server) needs its own login to the AS. This is the PAT. It’s just a standard OAuth 2.0 token but with a very specific uma_protection scope. Honestly, without this, the RS is just a lonely server that can’t tell the AS about any new files or ask for permission tickets.
This token is the glue. It binds the owner, the RS, and the AS into a single trust circle. If the RS wants to register a new healthcare record or a bank statement, it has to show the PAT first to prove it has the right to manage that specific user’s stuff.
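Here is a small Python sketch of what asking for a PAT might look like, assuming the RS uses an ordinary authorization code flow. The endpoint URL, client ID, and redirect URI are placeholders, not values from any real deployment; the only UMA-specific piece is the `uma_protection` scope.

```python
from urllib.parse import urlencode


def pat_authorization_url(authorize_endpoint, client_id, redirect_uri, state):
    """Build a standard OAuth 2.0 authorization request asking for a PAT."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "uma_protection",  # the scope that makes the token a PAT
        "state": state,
    })
    return f"{authorize_endpoint}?{query}"


def is_pat(token_response):
    """Check that the granted scopes (space-delimited) include uma_protection."""
    return "uma_protection" in token_response.get("scope", "").split()


# All values below are hypothetical.
url = pat_authorization_url(
    "https://as.example.com/authorize",
    client_id="resource-server-1",
    redirect_uri="https://rs.example.com/callback",
    state="xyz123",
)
print(url)
print(is_pat({"access_token": "abc", "scope": "openid uma_protection"}))
```

Checking the granted scope before using the token is a cheap guard: if the AS silently dropped `uma_protection`, every later protection API call would fail anyway.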
This is where the “handshake” gets interesting. When a client app shows up at the RS without a token, the RS doesn’t just say “go away.” Instead, it asks the AS for a permission ticket, a short-lived handle, and returns a 401 Unauthorized response with that ticket in the WWW-Authenticate header for the client to find.
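The RS side of that exchange can be sketched in a few lines of Python. The function names are illustrative and no HTTP calls are made; the sketch only shows the JSON body the RS would send to the AS permission endpoint and the shape of the 401 challenge it hands back to the client. The realm, AS URI, and ticket values are placeholders.

```python
import json


def permission_request_body(requests):
    """Build the body the RS sends to the AS permission endpoint.

    `requests` is a list of (resource_id, scopes) pairs. UMA 2.0 accepts a
    single JSON object for one resource or a JSON array for several.
    """
    if not requests:
        raise ValueError("at least one resource is required")
    items = [
        {"resource_id": resource_id, "resource_scopes": list(scopes)}
        for resource_id, scopes in requests
    ]
    return json.dumps(items[0] if len(items) == 1 else items)


def uma_challenge(realm, as_uri, ticket):
    """Return the status code and header for a tokenless client request."""
    header = f'UMA realm="{realm}", as_uri="{as_uri}", ticket="{ticket}"'
    return 401, {"WWW-Authenticate": header}


# Placeholder values; a real ticket comes back from the AS.
print(permission_request_body([("set-112", ["view"])]))
status, headers = uma_challenge("example", "https://as.example.com", "ticket-123")
print(status, headers["WWW-Authenticate"])
```

The `as_uri` in the header matters as much as the ticket: it tells the client which AS to visit, which is what lets the RS and AS stay loosely coupled.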

The client then takes that ticket to the AS token endpoint to trade it for a Requesting Party Token (RPT). This is the actual “key” that lets them in. According to Justin Richer, UMA 2.0 simplified this by using a standard OAuth extension grant called urn:ietf:params:oauth:grant-type:uma-ticket. Because it uses this specific grant type, devs can use regular libraries instead of writing custom code from scratch.
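On the client side, the trade looks roughly like the Python sketch below: pull the ticket out of the challenge header, then build a form-encoded token request with the uma-ticket grant type. The helper names and sample header values are assumptions for illustration, and the header parsing is deliberately simple rather than a full HTTP auth-param parser.

```python
import re
from urllib.parse import parse_qs, urlencode

UMA_GRANT = "urn:ietf:params:oauth:grant-type:uma-ticket"


def parse_uma_challenge(header):
    """Extract realm, as_uri, and ticket from a WWW-Authenticate value."""
    if not header.startswith("UMA "):
        raise ValueError("not a UMA challenge")
    return dict(re.findall(r'(\w+)="([^"]*)"', header))


def rpt_request_body(ticket):
    """Form-encoded body for the AS token endpoint."""
    return urlencode({"grant_type": UMA_GRANT, "ticket": ticket})


# Sample header with placeholder values.
challenge = parse_uma_challenge(
    'UMA realm="example", as_uri="https://as.example.com", ticket="ticket-123"'
)
body = rpt_request_body(challenge["ticket"])
print(challenge["as_uri"])
print(parse_qs(body))
```

Because this is a plain token endpoint request with a different `grant_type`, most OAuth client libraries that support custom grants can send it without UMA-specific code.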

PAT: The RS’s credentials to talk to the brain (AS).
Permission Ticket: A temporary “I was here” note passed from RS to Client.
RPT: The final access token the client uses to actually get the data.

It sounds like a lot of back-and-forth, but it keeps the “who is allowed to see what” logic far away from your actual database. Next up, we’ll look at the hurdles you’ll face when building this.

Implementation hurdles and engineering best practices
Upgrading from OAuth 2.0 to UMA 2.0 is a bit like swapping a bike for a jet engine: it’s fast and powerful, but you’ve got to watch the gauges or things get messy. Honestly, I’ve seen teams get so hyped on granular control they forget about the actual network costs.
The biggest hurdle is “scope explosion.” When you let users share everything, devs go overboard and create unique scopes for every file ID. Suddenly your AS is choking on ten thousand scopes nobody can track.
Latency is the silent killer here because of the extra “handshake” trading tickets for tokens. If your AS isn’t tuned, users will just stare at a loading spinner.

Generic scopes: Instead of read:file:123, use read:document. Let the RS handle the specific ID check once the RPT proves the user has the right permission.
Cache RPTs: Caching validation results at the RS saves network trips to the AS. Just keep the TTL short so revocations actually take effect.
Batching: If a client needs three files, the RS should grab one ticket for all of them at once.
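The first two tips can be sketched together in Python: a short-TTL cache for RPT validation results, plus a check that pairs a generic scope with a specific resource ID. The `IntrospectionCache` class and `permits` function are hypothetical helpers; the `active` and `permissions` fields mirror the shape of a UMA token introspection response, and the clock is injectable so the expiry logic can be exercised without sleeping.

```python
import time


class IntrospectionCache:
    """Keep RPT validation results for a short time to skip AS round trips."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, rpt):
        entry = self._entries.get(rpt)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[rpt]  # stale: force a fresh check at the AS
            return None
        return result

    def put(self, rpt, result):
        self._entries[rpt] = (self._clock(), result)


def permits(introspection, resource_id, scope):
    """Generic scope plus a specific resource ID check done at the RS."""
    if not introspection or not introspection.get("active"):
        return False
    return any(
        p.get("resource_id") == resource_id
        and scope in p.get("resource_scopes", [])
        for p in introspection.get("permissions", [])
    )


now = [0.0]  # fake clock for the demo
cache = IntrospectionCache(ttl_seconds=30, clock=lambda: now[0])
cache.put("rpt-abc", {
    "active": True,
    "permissions": [{"resource_id": "set-1", "resource_scopes": ["read:document"]}],
})
fresh_hit = permits(cache.get("rpt-abc"), "set-1", "read:document")
now[0] = 31.0  # past the TTL
stale_lookup = cache.get("rpt-abc")
print(fresh_hit, stale_lookup)
```

The TTL is the revocation window: with 30 seconds, a revoked RPT keeps working for at most 30 seconds at this RS. Pick the number based on how long that is acceptable for your data.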

To get this into production, you need a solid sequence for how the RS talks to the AS. Here is a quick curl example for registering a resource set:
# Register a resource set; <PAT> is the RS's protection API access token.
curl -X POST \
  -H 'Authorization: Bearer <PAT>' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "2024 Tax Docs",
    "resource_scopes": ["view", "print"],
    "type": "http://finance.com/folder"
  }' \
  https://pingam.example.com/uma/resource_set

When you run this, the AS returns an _id in the response. This is super important: the RS must store this _id and map it to the local file or folder. Without saving that mapping, the RS won’t know which resource set to ask about when a user tries to access a specific file later.
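A minimal Python sketch of that bookkeeping is below. The `ResourceIndex` class, the sample response, and the file paths are invented for illustration, and an in-memory dict stands in for whatever durable store a real RS would use.

```python
import json


class ResourceIndex:
    """Remember which registered set each local path belongs to."""

    def __init__(self):
        self._set_for_path = {}

    def record_registration(self, response_text, local_paths):
        """Parse the AS response and map every local path to its _id."""
        resource_id = json.loads(response_text)["_id"]
        for path in local_paths:
            self._set_for_path[path] = resource_id
        return resource_id

    def resource_id_for(self, path):
        """Return the set to request a ticket for, or None if unregistered."""
        return self._set_for_path.get(path)


index = ResourceIndex()
# Hypothetical registration response; real ones may carry extra fields.
sample_response = '{"_id": "KX3A-39WE"}'
index.record_registration(
    sample_response,
    ["/tax/2024/w2.pdf", "/tax/2024/1099.pdf"],
)
print(index.resource_id_for("/tax/2024/w2.pdf"))
print(index.resource_id_for("/photos/cat.jpg"))
```

A lookup that returns None is a useful signal too: the file was never registered, so the RS should refuse the request rather than ask the AS for a ticket.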
Permission tickets MUST be single-use to stop hijacking. If you get the architecture right now, you aren’t just building a feature; you’re building real trust.
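For teams writing their own AS logic, the single-use rule for permission tickets can be sketched like this in Python. The `TicketStore` class and its five-minute default lifetime are assumptions for the example, not values from the UMA spec or any product.

```python
import secrets
import time


class TicketStore:
    """Issue permission tickets that can be redeemed exactly once."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._pending = {}

    def issue(self, permissions):
        ticket = secrets.token_urlsafe(32)  # unguessable handle
        self._pending[ticket] = (self._clock() + self._ttl, permissions)
        return ticket

    def redeem(self, ticket):
        """Return the requested permissions, or None if unknown, used, or expired."""
        entry = self._pending.pop(ticket, None)  # pop makes replay impossible
        if entry is None:
            return None
        expires_at, permissions = entry
        if self._clock() >= expires_at:
            return None
        return permissions


now = [0.0]  # fake clock for the demo
store = TicketStore(ttl_seconds=300, clock=lambda: now[0])
ticket = store.issue([{"resource_id": "set-1", "resource_scopes": ["view"]}])
first_use = store.redeem(ticket)
second_use = store.redeem(ticket)
print(first_use is not None, second_use)
```

Removing the ticket before checking anything else is the key design choice: even a failed or expired redemption burns the ticket, so a stolen copy is worthless after the first attempt.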

Tooling Spotlight: Building the backbone with SSOJet
UMA 2.0 fails hard if your authorization server doesn’t actually know who Bob from accounting is. You can’t delegate access to someone who doesn’t exist in your system, right?
This is where SSOJet comes in to save the day. It acts as the plumbing that connects your messy user directories into a single source of truth. It helps manage the “blast radius,” which is basically how much damage happens if a single credential gets compromised. By centralizing identity, you limit that impact.

Bridging OIDC identities: SSOJet shrinks the blast radius by linking existing identities to the UMA dance.
SCIM and directory sync: It provides the SCIM backbone needed to make sure Bob’s permissions are updated in real time if he leaves the company.
Handling claims: It makes sure the AS has the right data to check against Alice’s sharing policies.
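To show what “Bob leaves the company” looks like on the wire, here is a Python sketch of a generic SCIM 2.0 PatchOp body that deactivates a user. This is the standard SCIM message format, not SSOJet’s API, which the post does not document; the user path in the comment is a placeholder.

```python
import json


def scim_deactivate_patch():
    """Build a SCIM 2.0 PatchOp body that marks a user as inactive."""
    return {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [
            {"op": "replace", "path": "active", "value": False},
        ],
    }


# A directory would typically send this as PATCH /Users/<bob's id>
# (placeholder path) to each connected service.
patch_body = json.dumps(scim_deactivate_patch())
print(patch_body)
```

Once the identity layer marks Bob inactive, the AS can refuse to issue him new RPTs, which is why short token lifetimes and directory sync work well together.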

Honestly, I’ve seen teams try to build this sync themselves and it always breaks. Using an API-first platform just lets you focus on the actual sharing logic. UMA 2.0 is finally a tech that matches the privacy promises we make. It’s just the right way to build.

*** This is a Security Bloggers Network syndicated blog from Gopher Security's Quantum Safety Blog, authored by Gopher Security’s Quantum Safety Blog. Read the original post at: https://www.gopher.security/blog/zero-trust-infrastructure-multi-llm-context-routing

About Author

AndyC

Andy Curtis is an award-winning security consultant, researcher and public speaker. He has been working in the computer security industry since the early 1990s, having been employed by state and federal government, leading healthcare and banking providers across three continents. He has given talks about computer security for some of the world’s largest companies, worked with law enforcement agencies on investigations into hacking groups, and is a regular voice on TV and radio explaining IT security threats.
