An outline for an ICAP/OCAP-inspired distributed authorization scheme, adjusted to the (post-)web era.
Whenever I don’t write much here, I feel like I’m not producing anything of value. Just to make myself feel better, I want to start with a quick update on what’s been happening.
I was awarded a grant by the Internet Society Foundation for research into next generation internet technologies. This, for me, has a few major effects.
First, it means I could hire some help in doing this R&D work with me. Adrian, my first freelance collaborator, is going to focus on conflict-free replicated data types. How precisely that fits into the Interpeer Project, I will write about later.
Second, it’s a step towards doing this full-time, which is what I wanted when I set out on this project. Not immediately, but over the course of this year.
Third, it’s the push I needed to invest a bit of money into forming a non-profit organization to manage the project. This in turn will permit access to more funds, some tax breaks, and overall help in forming and running a small research team. I am very much looking forward to doing that, and Adrian is the first step here!
Lastly, it means my attention is pulled in more directions. That’s not good. It means my work on getting the channel protocol fully implemented hit a few more stumbling blocks, but… well, it looks like this may be resolved with time.
Yeah, that doesn’t look so bad any longer. With that out of the way, I want to focus on one area of R&D I’m tackling at the moment, distributed authorization.
Not every reader will be familiar with the terminology. The kind of problem we typically describe with “allowing”, “permitting” or “trusting” folks usually gets abbreviated as triple-A, which stands for:
Authentication, which is the act of establishing the identity of the person we’re dealing with.
Authorization, which is the act of establishing whether this person is allowed to do what they’re trying to do.
And accounting, which is the act of leaving a paper trail of the person’s activities.
Distributed authentication is essentially a solved problem. With a public key cryptography scheme, it’s easy to establish whether a remote party is in possession of a particular private key. What remains to be done is to check that this remote party can produce some credentials verifying their identity.
That latter part, essential as it is to some use cases, is so use case specific that it’s not easy to generalize. But it is “solved” in the sense that it’s well enough understood: one needs a process that ties the checking of credentials into the checking of private key possession, and signs a record of that. Only the process details vary.
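As a rough illustration of that possession check, here is a minimal challenge-response sketch. Python’s standard library has no asymmetric signing, so HMAC over a shared key stands in for a real public-key signature scheme such as Ed25519; all function names here are my own.

```python
import hmac
import os
from hashlib import sha256

# NOTE: HMAC with a shared key is only a stand-in so this sketch runs with
# the standard library alone; a real deployment would use asymmetric keys.

def sign(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, sha256).digest()

def verify(key: bytes, message: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(key, message), signature)

# The verifier issues a fresh random challenge ...
challenge = os.urandom(32)

# ... the remote party proves possession of the key by signing it ...
key = b"alice's key material"
proof = sign(key, challenge)

# ... and the verifier checks the proof.
assert verify(key, challenge, proof)
```

The fresh challenge is what makes this a possession check rather than a replayable credential: an eavesdropper cannot reuse an old proof against a new challenge.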
Distributed accounting is an interesting issue, but not a particularly difficult one. Instead of merely creating a record of activities, one also has to distribute it. What’s fascinating to me is that this may be the only legitimate use case for a blockchain I’ve encountered so far. But that does not mean it requires a crypto currency — just some distributed ledger.
The story of distributed authorization is a lot more interesting, however.
Let’s Do the Timewarp Again
If we head back to the early days of the Intarwebs, the 80s and 90s, it actually turns out that we had some kind of distributed authorization scheme already. There was a rather bitter conflict in CS academia over whether this scheme was better or worse than centralized authorization, with the latter eventually winning out.
The reason for this is the distributed version had a bunch of unsolved issues, solution proposals for which often centralized parts of the overall scheme again. Proponents of centralized approaches used this to argue that this effectively proved the superiority of their view. Such purity wars are fascinatingly petty in retrospect, but of course they can shape the course of history.
In either case, without a winning hybrid or fully distributed implementation available, vendors in need of providing actually functioning authorization schemes quite obviously turned to what worked. And so we are here today.
I’ve skirted around naming the old technology here a little because it’s easy to get a bit confused by it: it doesn’t actually encompass all of the things we’d associate with a modern authorization scheme. These were the early days, after all.
In the last decade or so, some authorization schemes developed for the web captured a few of the ideas again, but I have not seen mention of this history — I’m not sure if that is some lingering part of the historical conflict at work, or simply a case of independent reinvention.
I know that for my own part, I “reinvented” the same thing before becoming aware of the history. This has been brewing in my head for a couple of years now, but it was reading Christine Lemmer-Webber’s work at the Spritely Institute that helped me connect the dots.
All of this to say that this isn’t exactly new stuff. What I haven’t seen yet, though, is an approach to the issue that takes things back to generalized principles, and provides a library for a wide variety of uses. That is essentially the R&D focus for me for the next weeks.
Right, so what are we talking about?
Christine’s work on CapTP is inspired by Object Capabilities, or OCAP for short. The fundamental idea of OCAP is that in order to grant authorization to some party, that party needs to provide a token of sorts that proves they’re authorized.
OCAP was originally designed around the idea of object oriented programming environments, where each object will only respond to a method invocation if such a token is provided. The literature often refers to objects-as-processes and arbitrating kernels; clearly the idea was that OCAP was an approach for an operating system to provide local security when one program or component wanted to invoke functionality of another.
OCAP tokens are meant to be copyable, something encodable as byte sequences. That means that during an authorization phase, a user creates a token and hands it off to some process. When that process at a later stage presents the token to another object, the token contains enough information for that object to grant or deny access.
Crucially, though, OCAP did not really consider the identity of the caller to be of any importance. Later work such as Li Gong’s ICAP did put their focus on such issues, but didn’t really account for anything the scale of the web — and couldn’t, really, as it somewhat predates the web.
It’s hard to take OCAP seriously as a distributed authorization mechanism in and of itself. Of course, modern interpretations such as CapTP’s underpinnings are quite different in nature.
OAuth and JWT
The web developed authorization schemes such as OAuth and JSON Web Tokens. Both take ideas from OCAP in the sense that they produce copyable tokens that prove authorization. Unfortunately, both focus relatively narrowly on a specific type of use case which makes them somewhat less suitable for fully distributed authorization — or not without jumping through some hoops.
Rather than break this down here, though, let’s instead go back to first principles and sketch out a proper distributed authorization scheme. It’s then an exercise for the reader to compare this to OAuth, JWT, or whatever else they’d like.
Breaking the Timeline
No, no, this isn’t some multiverse kind of stuff. I promise. I’ll just compare centralized vs distributed authorization briefly.
You essentially have three parties in this kind of system. Alice wants to access something of Bob’s that Carol is keeping at her home. Alice asks Carol for access. What should Carol do?
Obviously if Bob kept his own stuff, there wouldn’t really be a problem. Bob can decide quite well on his own whether or not to give Alice his ~~comic books~~ things. It’s the introduction of the third party that creates the issue.
In centralized schemes, Carol will tell Alice to hold off until she’s asked Bob if it’s okay to give Alice the thing. If Bob replies, all is well. The main weakness of this kind of scheme is that Bob is a single point of failure here. If he’s on holiday, Alice can’t get anything. If he’s burned out by all of his friends constantly asking whether it’s okay to give his stuff away, he might need the break, though. Yep, that’s a Denial of Service metaphor.
In distributed schemes, Carol will ask Alice for some token that proves to her that Bob granted permission. She doesn’t have to talk to Bob, who is currently painting happy little clouds. No, different Bob. But it works, so let’s run with it. Everything between Alice and Carol is fine.
The weakness of such a distributed scheme is that between the times that ~~the oceans drank Atlantis~~ Bob gave the token to Alice and Alice handed it to Carol, Bob may have changed his mind. Maybe Alice was mean about his hair. The point is, Carol cannot know.
But, for the moment, take away that all distributed schemes really do is move the asking for and granting of permission to an earlier point in time. Alice still needs to ask Bob at some point, and Bob needs to respond with a token. That simple breaking of the timeline into two distinct events is enough to take Bob out of the picture as the breaking point.
Anatomy of a Token
The above should be enough to start putting together some details about what a token should contain. We’ll make modifications to the scheme as we address the issues.
OCAP didn’t use identities; it was all about the capability to access an object. We didn’t name the ~~comic books~~ thing above, but sticking with the generic object term works quite well. Basically, an OCAP token consisted of:
An object identity.
Read, write, or a combination of such access attributes.
ICAP extended this with the identity of the authorized party.
If you’re like me, this looks familiar. Two major centralized authorization schemes contain records that look surprisingly similar; both Access Control Lists and Role-Based Access Control keep lists of such records. The difference is that ACLs identify individuals, while in RBAC the identity is a group of sorts.
Another way of looking at the difference between centralized and distributed schemes, then, is that it’s not Carol who keeps a list of these records. Instead, Alice and everyone else presents to Carol only the single record that pertains to themselves.
How does Carol know the token is valid? Well, in the beginning I wrote about public key cryptography. If Bob signs the token, Carol can verify the signature, and know it’s all good.
It helps to generalize the above a little. And one way that may inspire this generalization is to realize that the above is actually a semantic triple. I sometimes like to describe these in grammatical terms, as a subject, a predicate and an object.
The subject is the authorized party. This could be a person’s public key (hash), or some group descriptor for RBAC-style schemes.
The predicate is the kind of access granted. In OCAP it’s typically similar to file system access attributes. In e.g. PGP’s Web of Trust, it’s the trust level. The important thing to understand is that Alice doesn’t really need to know precisely what the predicate means; this is something that Bob tells Carol. We can use arbitrary strings here.
The object is a unique identifier for some kind of, well, object. This could be some kind of component or service endpoint, but it could also be a file.
We have a signature over the other attributes that can be used to verify that the token was issued by the expected party.
And since that party is not necessarily known in advance, we should also require a signer identifier, which is probably again a public key (hash).
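A minimal sketch of such a token, with illustrative field names of my own choosing, and again using HMAC as a self-contained stand-in for a real public-key signature:

```python
import hmac
import json
from dataclasses import dataclass, asdict
from hashlib import sha256

# Field names are illustrative, not a fixed wire format. HMAC with a
# shared key stands in for a real public-key signature scheme.

@dataclass
class Token:
    issuer: str      # identifies the signing key, e.g. a public key hash
    subject: str     # the authorized party
    predicate: str   # opaque access string, meaningful to issuer and verifier
    object: str      # identifier of the thing access is granted to
    signature: bytes = b""

    def payload(self) -> bytes:
        # Serialize everything except the signature, deterministically.
        data = asdict(self)
        data.pop("signature")
        return json.dumps(data, sort_keys=True).encode()

def issue(key: bytes, token: Token) -> Token:
    token.signature = hmac.new(key, token.payload(), sha256).digest()
    return token

def valid(key: bytes, token: Token) -> bool:
    expected = hmac.new(key, token.payload(), sha256).digest()
    return hmac.compare_digest(expected, token.signature)

bobs_key = b"bob's key material"
token = issue(bobs_key, Token("bob", "alice", "read", "comic-books"))
assert valid(bobs_key, token)
```

Because the signature covers the whole triple plus the issuer field, tampering with any of them invalidates the token.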
(FWIW, in JWT these attributes are called “claims”, and a specific kind of claim, “entitlements”, is used in OAuth+JWT for authorization. Interestingly, neither specifies a format for the entitlements, for much the same reason as I’ve given above; they can be opaque to the token bearer. The building blocks are there.)
As previously mentioned, revocation tends to be the problem with distributed schemes. We’re seeing much the same issues with e.g. TLS certificates. TLS of course follows a different model, where the client establishes some trust in the server, while here Carol (the server) needs to establish trust in Alice (the client). In either case, trust is checked by the recipient of a token, and in real-time. This re-introduces Bob as a single point of failure, though.
JWT and TLS certificates both introduce a concept of time, though it’s often optional in these schemes. Time attributes limit the temporal scope in which a token is valid, usually expressed as a tuple of not-before and not-after attributes (why people chose this negated nomenclature instead of from and until shall remain a mystery). These should be mandatory token elements.
The effect is that if Carol receives the token within the specified time span, and the token signature is valid, she can grant access. If either the signature or the time slot don’t match, she rejects the attempt.
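Carol’s check might then look something like this sketch; the from/until naming follows the article, while the dict layout and the HMAC stand-in signature are assumptions of mine:

```python
import hmac
import json
import time
from hashlib import sha256

def check(token: dict, issuer_key: bytes, now=None) -> bool:
    """Grant only if both the signature and the time window are valid."""
    now = time.time() if now is None else now
    # Recompute the signed payload: everything except the signature itself.
    payload = json.dumps(
        {k: v for k, v in token.items() if k != "sig"}, sort_keys=True
    ).encode()
    good_sig = hmac.compare_digest(
        hmac.new(issuer_key, payload, sha256).digest(), token["sig"]
    )
    in_window = token["from"] <= now <= token["until"]
    return good_sig and in_window

bobs_key = b"bob's key material"
token = {
    "from": 100.0, "until": 200.0,
    "subject": "alice", "predicate": "read", "object": "comic-books",
}
token["sig"] = hmac.new(
    bobs_key, json.dumps(token, sort_keys=True).encode(), sha256
).digest()

assert check(token, bobs_key, now=150.0)      # inside the window
assert not check(token, bobs_key, now=250.0)  # expired
```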
So why are revocation mechanisms necessary when this kind of mechanism already exists? Technically, only for emergencies. If Alice did insult Bob’s magnificent hair, Bob may want to tell the world to immediately stop helping Alice out. I get that, Bob’s hair is great and does not deserve this.
The thing is, for a distributed authorization scheme, it’s not only possible to break the timeline, it’s often necessary. The key term here is that the system should be eventually consistent. That’s how “web scale” is achieved, and anything that either demands immediate consistency or blocks until consistent will have scalability issues.
To break the timeline, let’s assume for a while that Carol has a means of eventually receiving Bob’s revocations. But let’s also explicitly rule out a real-time revocation check with Bob. We might as well go centralized again, then. There are ways to protect things in an eventually consistent system that go beyond the token anatomy, and that I’d like to explore in follow-up posts. For now, consider that Carol will receive revocations at some point.
What does a revocation revoke? In TLS, it’s an entire certificate. But a certificate is just a signed set of attributes, much like happens in JWT or our scheme above. A revocation is then itself a token, except that it doesn’t grant any authorization, it revokes it. If we add a type attribute that can contain either value, we’ve got both kinds of token already modeled.
The concept of a revocation token is pretty nifty, but it also introduces complexity in two ways. Carol needs to store revocation tokens she received, on the chance that the token’s subject will request access at a later point. That introduces state into Carol’s checking of grant tokens that was previously unnecessary, and with it a storage burden on Carol.
However, all of this can be at least partially mitigated. Remember, revocation tokens are intended for emergencies. They shouldn’t be issued en masse.
Carol only needs to store revocation tokens from issuers she has some kind of relationship with, in this case from Bob. She can immediately discard Eve’s revocation tokens.
A revocation token also has the from and until attributes. Once the current time has advanced past its until attribute, she no longer needs to store the token (I’ll come to this point again). Crucially, if Bob creates the revocation token such that its until attribute matches or exceeds that of the longest-lasting grant token, Bob can help Carol keep storage down. Once that time expires, Carol no longer requires the revocation token to process older grant tokens. Newer grant tokens are by definition issued after the revocation was lifted, and may be trusted again.
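A sketch of how Carol might prune her revocation store under these rules. All structures and names are illustrative; only the until attribute and the trusted-issuer rule come from the text above:

```python
# Carol keeps only revocations from issuers she has a relationship with,
# and drops any whose "until" time has passed.
trusted_issuers = {"bob"}

revocations = [
    {"issuer": "bob", "subject": "alice", "until": 200.0},
    {"issuer": "eve", "subject": "mallory", "until": 9999.0},  # untrusted issuer
    {"issuer": "bob", "subject": "dave", "until": 50.0},       # already expired
]

def prune(revocations, now):
    return [
        r for r in revocations
        if r["issuer"] in trusted_issuers and r["until"] >= now
    ]

kept = prune(revocations, now=100.0)
assert kept == [{"issuer": "bob", "subject": "alice", "until": 200.0}]
```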
Revocation tokens do not have to be issued for a single subject, object or predicate. Each of these fields can be lists. In this way, Bob can undo a number of grant tokens in one fell swoop. Wildcards would help to simplify these lists to let Carol know that nobody may read Bob’s ~~comic books~~ thing from now on.
Carol is not required to forward the revocation token to any other party (though she may as an optimization mechanism). That means Carol is also not required to store the revocation token itself. She only needs to verify its validity, and then can use an optimized storage format to record the receipt.
The other consistency issue relates to time. Any time-based scheme requires that clocks between participating machines are synchronized. This kind of time dependency is often reasonable, but at minimum presents a bootstrapping issue for machines not yet synchronized with some shared clock.
Additionally, Bob may have been lax in choosing good expiry times for grant and revocation tokens. Let’s say Bob grants access to Alice forever. And then revokes access forever. And then reconsiders, and grants Alice access for some limited time.
Carol can’t really decide to give Alice access here. The revocation token hasn’t expired, and eclipses the limited time span Bob wants to give Alice access for.
Other systems, such as certificate authorities signing TLS certificates, introduce sequence numbers for this kind of problem. If each token contained a sequence number, Carol would know that the limited time access is in fact the latest thing Bob has said on the issue (that Carol received; it’s still only eventually consistent).
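With sequence numbers, conflict resolution reduces to picking the issuer’s latest statement. A sketch of Bob’s grant-revoke-regrant sequence from above, assuming an illustrative dict layout:

```python
# When Carol holds several tokens from the same issuer about the same
# claim, the highest sequence number reflects the issuer's latest decision.

def latest(tokens):
    return max(tokens, key=lambda t: t["seq"])

history = [
    {"seq": 1, "type": "grant",  "until": float("inf")},  # access forever
    {"seq": 2, "type": "revoke", "until": float("inf")},  # revoked forever
    {"seq": 3, "type": "grant",  "until": 500.0},         # limited re-grant
]

# The limited re-grant wins, despite the unexpired revocation.
assert latest(history)["type"] == "grant"
assert latest(history)["seq"] == 3
```

Note that this is still only eventually consistent: Carol can only rank the tokens she has actually received.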
The blockchain-without-proof kind of approach (aka a linked list) would be not to introduce a sequence number, but a hash of the preceding token. The benefit is that it lets Carol check precisely the sequence of events in Bob’s decision making process. The huge downside is that Carol needs to reconstruct the full sequence of tokens between any two she received before she can process anything. Let’s not do that, it doesn’t scale.
Sequence numbers don’t address the time synchronization issue, however. Is this an issue?
Time What is Time?
Let’s assume for a moment we’re in a future where distributed authentication is ubiquitous, and also encompasses time synchronization functions. Basically, Bob gives Alice a token that says she can read the clock he’s hung on Carol’s wall.
While it’s always possible to make exceptions for bootstrapping issues, it’s also the case that it’s best to eliminate them. When we look at issues around, for example, automatically provisioning IoT devices such that they can join a secure network, every unauthorized step in the process can break security altogether - at least in principle, if not in practice.
So what can we do to address this issue?
A simple solution is to issue Alice a token for clock reading that doesn’t expire. She can read the clock, adjust herself, and only then make requests for other things. But as we just discovered in the previous section, tokens with infinite or very long expiration times don’t really play well with revocations.
We have sequence numbers to fix such issues, but Carol’s storage requirements are lowest if Bob chooses decent expiry times and never issues revocations. In other words, sequence numbers shouldn’t really be Carol’s main concern; she should typically decide by expiration time whether she accepts a token.
That actually raises an interesting point. Carol is still an independent agent. Sure, Bob will only store his stuff at her place if they have some kind of trust relationship going. But if Carol wants to take a mental health day because she’s worn out by the insistence of Bob’s friends, nothing should prevent her from that. That is, Carol must honour Bob’s tokens (including revocations), but is otherwise still governed by local policy.
Local policy means that she can also reject tokens for reasons of her own. That may eventually lead to damaging her trust relationship with Bob, but if she suspects Alice is up to no good, she still needs to be able to do this kind of thing.
In other words, the nominal relationship is for Carol to honour Bob’s wishes. She can be stricter if she wishes to be. But she can’t be more lenient.
Unless, of course, Bob explicitly tells her that she can.
Really, this only applies to token expiry times. Other attributes have pretty binary meanings. But what if Bob had a way of issuing a token that instead of saying “this token is valid forever” said “this token may or may not be honoured at a reasonable point in time; I leave it up to you what to do”?
Leaving out the expiry times doesn’t do the trick; that’s semantically equivalent to saying time does not matter. Time does matter, but it’s up to Carol to decide how much. This needs some kind of explicit flag.
Let’s introduce a field expiry policy, which can have the values “local” or “issuer”. If the field is not given, “issuer” is the default. That would permit Bob to issue a token to Alice for reading his clock at Carol’s, and if Alice’s clock is waaaay off, Carol may still decide to respond positively.
Bob must be very careful with this mechanism. At the same time, Carol can be as strict with this as she wishes to be — either by honouring expiry timestamps precisely, or by e.g. rate-limiting such tokens, etc. It introduces some leniency into the system that does not introduce other consistency issues.
The field does not make much sense in revocation tokens. Let’s just forbid it in those instead of ruminating about potential meanings.
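One way Carol might evaluate such a field, sketched with a hypothetical grace period standing in for her local policy; the expiry policy values and default come from the text, everything else is assumption:

```python
# With "issuer" policy the time window is strict; with "local" policy
# Carol may apply her own tolerance, here a configurable grace period.

def in_window(token: dict, now: float, local_grace: float = 0.0) -> bool:
    policy = token.get("expiry_policy", "issuer")  # "issuer" is the default
    grace = local_grace if policy == "local" else 0.0
    return token["from"] - grace <= now <= token["until"] + grace

strict = {"from": 100.0, "until": 200.0}
lenient = {"from": 100.0, "until": 200.0, "expiry_policy": "local"}

assert in_window(strict, now=150.0)                       # inside the window
assert not in_window(strict, now=230.0, local_grace=60.0)  # grace not allowed
assert in_window(lenient, now=230.0, local_grace=60.0)     # Carol's call
```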
OAuth and JWT, Again
Hopefully the above exploration has provided enough background that comparing to OAuth and JWT is now simple. In summary, JWT can be a valid serialization format for such distributed tokens, but also addresses non-authorization-related use cases.
OAuth’s authorization flow, on the other hand, is inherently centralized. While some of the token anatomy is related in concept, the control flow in distributed authorization is quite different, and most importantly can be spread out over time.
A distributed authorization scheme inspired by OCAP, JWT, PGP and other prior technology does not seem impossible. While this article does not cover every possible corner case, it outlines an anatomy of authorization tokens that allows for eventually consistent, and therefore highly scalable, distributed authorization systems.
The token elements are:
An issuer id. This identifies the key with which to verify the signature.
A token type, with possible grant or revoke values.
A strictly increasing sequence number. This helps disambiguate conflicting tokens.
A scope field. This extra layer of indirection serves no purpose now, but I may introduce different scopes later. For now, it refers to:
A tuple of from and until time stamps. These identify the nominal time period in which the token applies.
An expiry policy, with possible local or issuer values. This specifies whether the above time stamps are to be strictly evaluated as the issuer specified them, or the agent may process them according to local policy.
A claims field. This is a list of semantic triples describing the authorization:
A subject, which may be a public key (hash) identifying a user, a system-defined group identifier, or a wildcard indicating “any subject”.
A predicate, which is a system-defined string or a wildcard indicating “any predicate”.
An object, which is a unique identifier of the object for which authorization is managed, or a wildcard indicating “any object owned by the issuer”.
Finally, a signature over all of the above made with the issuer private key.
The claims list is something I alluded to briefly before. Rather than making each of the tuple fields lists, a list of tuples is a better mechanism for disambiguating how tuple components relate to each other.
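Put together, a token following the element list above might serialize like this; the field names mirror the list, while all values are made up purely for illustration:

```python
import json

token = {
    "issuer": "some-public-key-hash",      # key with which to verify the signature
    "type": "grant",                       # or "revoke"
    "seq": 42,                             # strictly increasing per issuer
    "scope": {
        "from": "2022-05-01T00:00:00Z",
        "until": "2022-06-01T00:00:00Z",
        "expiry_policy": "issuer",         # or "local"
    },
    "claims": [
        # [subject, predicate, object], with "*" as a wildcard
        ["alice-key-hash", "read", "comic-books"],
        ["friends-group-id", "comment", "*"],
    ],
    "signature": "made-up-signature-bytes",  # over all of the above
}

print(json.dumps(token, indent=2))
```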
That’s the Lego block. Some thoughts on revocations and tightening security in eventually consistent systems without introducing single points of failure will follow.
A quick note on subscriptions: I’m trying to keep doing this work, and have set up subscriptions so that people can help me pay my bills. That means some articles are effectively paywalled — if you can’t pay, fear not: there is a special launch promotion open.
You’ll get free lifetime access, and I can keep the paywall up for folks coming from the outside. And if you are in the right space to be extra awesome and pay for a subscription, that’s all the more appreciated!