OAuth Briefly

A Brief (But Relatively Thorough) OAuth Manual

Jeromy Van Dusen
18 min read · Dec 4, 2020

What is OAuth?

OAuth is an authorization framework to allow a third-party application to gain limited access to an HTTP resource. Note authorization, not authentication. Authentication is not a part of OAuth. OAuth in conjunction with an authentication mechanism such as OpenID Connect results in a complete authentication/authorization solution. Also note HTTP. OAuth is designed for the web.

What are the different versions all about?

OAuth 1.0 was first drafted in 2007.

OAuth 1.0a was drafted in 2009 to address a session fixation security flaw in the 1.0 protocol. This was then published in 2010.

OAuth 2.0 was published in 2012. It deprecates the previous version of OAuth.

OAuth 2.1 is, as of the time of this article being written in December 2020, currently being drafted. It is intended to consolidate and simplify the most used features of OAuth 2.0 while addressing some security concerns.

So, which one should I use?

For now, use OAuth 2.0 along with some of the recommendations from OAuth 2.1, primarily:

  • Use PKCE with the authorization code grant.
  • Avoid using the implicit grant wherever possible.

Those recommendations will make more sense shortly.

Can you explain the OAuth lingo?

OAuth uses several terms that need to be fully understood.

Roles

First, there are four roles involved in an OAuth transaction:

Resource Owner

This is the entity, either a person or software, with the authority to grant access to a protected resource. In the context of the OAuth goal of allowing an application to gain limited access to an HTTP resource, this is whoever or whatever is authorized to grant that application access to the resource.

Resource Server

This is the server hosting the protected resource. It allows access to the resources when appropriate access tokens are provided in the incoming request.

Client

This is software acting on behalf of and with the authorization of the resource owner making requests to the resource server. A client can be any kind of software running on any kind of device.

Authorization Server

This is the server that provides access tokens to the client after successfully authenticating and authorizing the resource owner.

Note that these are defined as “roles”, not necessarily as distinctly separate “entities”. A single entity could act as multiple roles. Specifically, the resource server and the authorization server are two different roles, but it is entirely possible, although not necessary, that they can be implemented by the same server.

Other Terms

In addition to the four roles, a few other key terms to understand are:

Grant

This is the act of granting the client access to the resource server. In other words, a grant is the whole point of OAuth. We will dig into the various types of grants shortly.

Scope

This is OAuth’s attempt to limit and control the access granted to a client. The intent is to allow the resource owner to define different scopes of access that can be requested by a client, as well as to allow the resource owner to change the scope after access has been granted, although that part is rarely implemented in the real world.

Tokens

Tokens are the physical things (well, data) that are given out to a client to represent the access that has been granted. A token is like a ticket to a concert. It indicates that the client is allowed in.

What does all that mean in simple, practical terms?

Well, let’s look at a couple of use cases.

Federated Identity

A common use case is where a web application wants to delegate authentication and authorization to an external system. This external system could be a website that supports OpenID Connect, such as Google, Facebook, Twitter, etc. It could also be an OpenID Connect implementation owned and operated centrally within a given organization.

For example, a web application (client) wants access to a few pieces of information about the user (resource owner), such as their email address (limited by the scope), from Facebook (both the resource server, as it stores the user’s information, and the authorization server).

This allows the web application to let the user use the application without having to manage sign-ups or logins, delegating those responsibilities to Facebook instead. It also provides a convenience for the user in that they do not need to create an account on the web application, instead simply reusing their existing Facebook account.

Microservices Architecture

Another use case is where an application is architected using microservices. Each microservice is not expected to manage accounts nor provide a login screen. Instead, authentication and authorization will be delegated to a central authorization server.

For example, the user (resource owner) of a single-page web application (client) wants access to information provided by a microservice (resource server). In order to access that information, the client obtains an access token from the central account management service (authorization server) and provides that token to the microservice.

Okay, so how do I use it?

You use OAuth by implementing one of the authorization grant flows. There are a few different flows that handle a few different use cases.

Authorization Code Grant

This is the primary flow that should be used in most use cases involving users. The gist of this flow is that the authorization server provides the client with an authorization code, which the client can then exchange for an access token.

To understand this flow, it’s helpful to understand the trust relationship between the roles involved. The user, acting in the role of resource owner, trusts the client application and wants to give it access to information. The client application doesn’t know who the user is and therefore doesn’t trust them but trusts the authorization server to provide information. The authorization server is aware of the client application but doesn’t trust it enough to give it information about a user without explicit permission from the user, whom it trusts.

[Figure: Trust Relationship]

For a federated identity use case, the process goes like this:

1. The user browses to a web application that allows login using an existing Facebook account. The user clicks a login link, which takes them to a Facebook login page, passing along to Facebook the following additional pieces of information:

  • response_type=code: This tells Facebook that the client is using the authorization code grant flow.
  • client_id: This is an ID that tells Facebook what client application is requesting the grant. The application must have been registered with Facebook as a valid client in advance.
  • redirect_uri: This is a URL that Facebook will use to redirect the user back to the client application with the grant. The URL must have been registered with Facebook in advance as part of the client registration. Facebook should not redirect to an unknown URL, even if the Client ID represents a known client.
  • scope: This is the scope of information being requested by the client. The valid scopes, along with what information is passed along to the client for each scope, must have been registered with Facebook in advance as part of the client registration. Multiple scopes can be provided, or just a single scope.
  • state: This is a unique random string provided to Facebook, which Facebook will provide back with its response. The client can validate that Facebook’s response contained the expected state as a measure to prevent CSRF attacks.
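As a rough sketch of step 1, the login link could be assembled like this. The endpoint URL, client ID, redirect URI, and scope names below are placeholders invented for illustration, not real Facebook values.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical values: a real client gets these from its registration
# with the authorization server.
AUTHORIZE_URL = "https://auth.example.com/oauth/authorize"
CLIENT_ID = "my-client-id"
REDIRECT_URI = "https://app.example.com/callback"


def build_authorization_url(scopes, state):
    """Build the URL the user's browser is sent to in step 1."""
    params = {
        "response_type": "code",      # authorization code grant
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": " ".join(scopes),    # multiple scopes are space-separated
        "state": state,
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


# Generate an unguessable state and remember it (for example, in the
# user's session) so it can be checked when the redirect comes back.
state = secrets.token_urlsafe(32)
login_url = build_authorization_url(["email", "public_profile"], state)
```

The important design choice here is that the state comes from a cryptographically secure random source and is stored on the client side before the user leaves.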

2. If the user does not already have an active Facebook session, then Facebook will present them with a login screen. If they have already logged in and created a session, they will go directly to the next step.

3. Once they are logged in to Facebook, Facebook will inform the authenticated user that a specific application, as determined by the Client ID, has requested specific information and access, as determined by the scope, and will ask the user to confirm that they approve of providing this information to the client application.

4. Once the user clicks the button or link to allow this, Facebook redirects the user using the Redirect URI that was provided by the client application, passing along to the client the following additional pieces of information in the redirect URL:

  • state: This is the same state string that the client provided.
  • code: This is the authorization code that grants access to the requested information. This is not an access token. It is a short-lived code that the client can exchange for an access token.
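When the redirect in step 4 arrives, the client might handle it along these lines. This is a minimal sketch, assuming the callback URL is available as a string and the expected state was saved earlier; the function name and example URL are made up for illustration.

```python
import hmac
from urllib.parse import urlparse, parse_qs


def extract_authorization_code(callback_url, expected_state):
    """Return the authorization code from the redirect, or raise ValueError."""
    query = parse_qs(urlparse(callback_url).query)
    returned_state = query.get("state", [""])[0]
    # Compare the returned state with the one we sent, in constant time.
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch: possible CSRF attempt")
    # The authorization server reports failures in an "error" parameter.
    if "error" in query:
        raise ValueError("authorization failed: " + query["error"][0])
    if "code" not in query:
        raise ValueError("no authorization code in redirect")
    return query["code"][0]


code = extract_authorization_code(
    "https://app.example.com/callback?code=abc123&state=xyz", "xyz"
)
```

Checking the state before looking at anything else means a forged redirect is rejected before its code is ever used.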

5. The web application makes a POST request to a known Facebook URL to exchange the authorization code for an access token, passing along to Facebook the following additional pieces of information:

  • grant_type=authorization_code: This tells Facebook that the client has an authorization code that it wants to exchange for an access token.
  • code: This is the authorization code that Facebook provided.
  • redirect_uri: This is the same redirect URL that the client provided before. Not all authorization server APIs require this. If it is required, it is an additional security check to be sure that a known redirect URL is being used.
  • client_id: This is the client’s ID again, telling Facebook which client is requesting the access token. For additional security, the authorization server can choose to double-check that the authorization code provided is one that was assigned to this client.
  • client_secret: This is the client’s secret key to prove that it really is the client making the request. This secret must have been registered with Facebook in advance as part of the client registration. In the case of a mobile application or single-page application where secrets cannot be safely stored on the client, this is not used. We’ll discuss that later.
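The exchange request in step 5 could look something like the following for a traditional server-side web application. The token endpoint URL and credential values are hypothetical, and the sketch only builds the request rather than sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_URL = "https://auth.example.com/oauth/token"  # hypothetical endpoint


def build_token_request(code, redirect_uri, client_id, client_secret):
    """Build the POST that exchanges an authorization code for a token."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_token_request(
    "abc123", "https://app.example.com/callback", "my-client-id", "my-client-secret"
)
# Sending it would be: urllib.request.urlopen(request)
```

Note that the parameters travel in the POST body, not in the URL, so the secret stays out of logs that record request URLs.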

6. If everything checks out, Facebook responds with an access token, providing the following pieces of information:

  • access_token: This is the access token. OAuth does not specify what kind of token is used. It could be either:
    an opaque token: This is just a meaningless unique string of characters. When a resource server receives this token, it must contact the authorization server to exchange the token for the details of the user and what permissions they have, or:
    a structured token: This token contains the user’s information. When a resource server receives this token, they can read it directly to determine who the user is and what permissions they have. This is typically a JWT (JSON Web Token).
  • token_type: This is the type of token. Generally, the value of this attribute will be “bearer”, indicating a bearer token.
  • expires_in: This is the duration of time for which the token is valid. This is not required; however, it is recommended that tokens be short lived.
  • refresh_token: This is essentially a pre-authorization code that can be exchanged later for a new token, once the current token expires. This is optional.
  • scope: This is the scope within which the authorization server has granted access. It is normally optional but must be provided if the authorization server has granted access within a different scope than the one that was requested. Multiple scopes can be provided, or just a single scope.
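A client might read that response roughly as follows. This is a sketch, assuming the response body is JSON; the TokenResponse class and its field names are my own illustration and not part of any library.

```python
import json
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenResponse:
    access_token: str
    token_type: str
    expires_at: Optional[float]     # absolute time, or None if not provided
    refresh_token: Optional[str]
    scope: Optional[str]


def parse_token_response(raw_json, requested_scope, now=None):
    """Turn the token endpoint's JSON into a TokenResponse."""
    data = json.loads(raw_json)
    now = time.time() if now is None else now
    expires_in = data.get("expires_in")
    return TokenResponse(
        access_token=data["access_token"],
        token_type=data["token_type"],
        # Convert the relative lifetime into an absolute expiry time.
        expires_at=None if expires_in is None else now + int(expires_in),
        refresh_token=data.get("refresh_token"),
        # If scope is omitted, the granted scope matches what was requested.
        scope=data.get("scope", requested_scope),
    )


example = parse_token_response(
    '{"access_token": "abc", "token_type": "bearer", "expires_in": 3600}',
    requested_scope="email",
    now=1000.0,
)
```

Storing an absolute expiry time rather than the raw expires_in value makes it easy to check later whether the token needs refreshing.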

7. The web application uses the access token to make various API calls for whatever information it needs.
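With a bearer token, the usual way to present it is in the Authorization header of each API request. A minimal sketch, with a made-up API URL and token value:

```python
from urllib.request import Request


def build_api_request(url, access_token):
    """Attach a bearer access token to an outgoing API request."""
    return Request(url, headers={"Authorization": "Bearer " + access_token})


request = build_api_request("https://api.example.com/me", "example-access-token")
# Sending it would be: urllib.request.urlopen(request)
```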

The client application could be a traditional web application, where all processing occurs on the server side. In this case, step #5 where the client exchanges the authorization code for an access token is handled away from prying eyes in a direct communication between client application server, which resides within the client application’s protected network, and authorization server.

However, the client application could be a mobile or single-page application, where processing occurs on the mobile device or the browser. In this case, step #5 is initiated by code that runs outside of the client application’s protected network. Therefore, the client secret is not an option, as that secret would have to leave the protected network, which is a security risk. To address this, the PKCE extension was created.

Authorization Code Grant with PKCE Extension

PKCE stands for Proof Key for Code Exchange and is generally pronounced “pixie”. It allows a client that cannot safely store a secret to prove its identity to the authorization server without providing the client secret.

As this is an extension to the authorization code grant, the same steps already described are still applicable. The PKCE extension merely adds a couple of details to the existing steps:

1. Before step #1, the client application generates a cryptographically random string, known as a “code verifier”. It then creates a “code challenge”, which is a hash of the code verifier. The code challenge can be generated as a Base64 URL encoded SHA256 hash, or if the client does not have the ability to do that, it can just be the plain text code verifier. When the client directs the user to Facebook in step #1, two additional pieces of information are provided:

  • code_challenge: This is the generated code challenge.
  • code_challenge_method=S256: This tells Facebook that the code challenge is a SHA256 hash. If the plain code verifier is provided as the challenge, then this will contain the value “plain” instead. This attribute can also be omitted, which implies “plain”.

2. Facebook will store the code challenge temporarily, along with the authorization code that it generates.

3. When the web application attempts to exchange the authorization code for an access token in step #5, it provides one additional piece of information instead of the client secret:

  • code_verifier: This is the code verifier that was used originally to generate the hashed code challenge that was provided to Facebook.

4. Facebook calculates a SHA256 hash of the code verifier provided in the exchange request, Base64 URL encodes it, and compares the result to the code challenge that was provided in the original request. Alternatively, if a “plain” code challenge was used, it simply compares the two as they are. If there’s a match, Facebook will return the access token.
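Both sides of the PKCE exchange can be sketched with nothing but the standard library. The function names here are my own; the point is that the client keeps the verifier to itself and sends only the challenge up front, and the server later hashes the verifier it receives and compares that against the challenge it stored.

```python
import base64
import hashlib
import hmac
import secrets


def generate_code_verifier():
    """Client side: 32 random bytes become 43 URL-safe characters."""
    return secrets.token_urlsafe(32)


def code_challenge_s256(code_verifier):
    """Base64 URL encoded SHA256 hash of the verifier, without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(stored_challenge, method, code_verifier):
    """Server side: check the verifier against the stored challenge."""
    if method == "S256":
        expected = code_challenge_s256(code_verifier)
    else:  # "plain": the challenge is the verifier itself
        expected = code_verifier
    return hmac.compare_digest(expected.encode(), stored_challenge.encode())


verifier = generate_code_verifier()          # kept private by the client
challenge = code_challenge_s256(verifier)    # sent with the first request
```

Because the hash cannot be reversed, someone who intercepts the challenge and the authorization code still cannot produce the verifier needed to redeem the code.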

In OAuth 2.1, the PKCE extension is required, so start using it now.

Client Credentials Grant

This is the primary flow that should be used in most use cases not involving users, such as service-to-service communication. The gist of this flow is that the service requests an access token for itself, providing its own authentication details. The process goes like this:

1. The service requests an access token from the authorization server, passing along the following pieces of information:

  • grant_type=client_credentials: This tells the authorization server that the client is requesting access to its own resources and requests an access token.
  • scope: This is the scope of information being requested and is an optional component of this grant type. Multiple scopes can be provided, or just a single scope.
  • client_id: The client ID that has been registered in advance with the authorization server.
  • client_secret: The client secret that has been registered in advance with the authorization server. This, along with the client ID, provides the authentication mechanism for this request.

2. If the provided credentials are valid, the authorization server responds with the access token. The pieces of information provided in the response are the same as those described in step #6 of the Authorization Code Grant flow.
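For a service-to-service call, the request in step 1 might be built like this. The endpoint, service name, secret, and scope are hypothetical, and the sketch sends the credentials as form fields to match the description above.

```python
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_URL = "https://auth.example.com/oauth/token"  # hypothetical endpoint


def build_client_credentials_request(client_id, client_secret, scopes=()):
    """Build the POST a service uses to get a token for itself."""
    fields = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scopes:                      # scope is optional for this grant
        fields["scope"] = " ".join(scopes)
    return Request(
        TOKEN_URL,
        data=urlencode(fields).encode("ascii"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_client_credentials_request(
    "billing-service", "s3cr3t", ["invoices.read"]
)
```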

Device Authorization Grant

This flow is used by devices that are OAuth enabled but do not provide an easy means of entering login credentials, such as a Smart TV or another Internet-enabled device. The gist of this flow is that the device instructs the user to log in using another device, such as their computer or smart phone. For a Google-connected device, the process goes like this:

1. The device requests a device code from Google, passing along the following pieces of information:

  • client_id: This tells Google who is requesting the device code.
  • scope: This is optional and indicates the scope of access that is being requested. Multiple scopes can be provided, or just a single scope.

2. Google responds with the following pieces of information:

  • device_code: This is the unique device code for this request.
  • user_code: This is a short, unique code for the user to use during the authorization process.
  • verification_uri: This is the URL that the user should use to authorize the device.
  • interval: This is the polling interval the device should use, as will be described shortly.
  • expires_in: This tells the device when the device code will expire.

3. The device instructs the user to browse to the URL provided in the verification_uri attribute and give it the code provided in the user_code attribute. While waiting for the user to do that, the device begins polling Google at the provided interval, passing along the device code, waiting for a response indicating that the user has completed the authorization process.

4. The user uses a browser on another device to browse to the given URL and enter the given code. Google will collect credentials, inform the user what has been requested by the device, and prompt the user to approve or reject the request.

5. Upon receiving approval from the user, Google responds to the device’s next polling attempt with the access token. The pieces of information provided in the response are the same as those described in step #6 of the Authorization Code Grant flow.
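The polling in steps 3 through 5 could be sketched as below. The post_token_request argument is a hypothetical callable standing in for whatever HTTP code posts the form fields to the token endpoint and returns the decoded JSON; the grant type string and the error codes are the ones defined by the device authorization grant specification (RFC 8628).

```python
import time

DEVICE_GRANT_TYPE = "urn:ietf:params:oauth:grant-type:device_code"


def poll_for_token(post_token_request, client_id, device_code,
                   interval, expires_in, sleep=time.sleep):
    """Poll until the user approves, denies, or the device code expires."""
    waited = 0
    while waited < expires_in:
        sleep(interval)
        waited += interval
        response = post_token_request({
            "grant_type": DEVICE_GRANT_TYPE,
            "device_code": device_code,
            "client_id": client_id,
        })
        error = response.get("error")
        if error is None:
            return response          # success: contains the access token
        if error == "authorization_pending":
            continue                 # the user has not finished yet
        if error == "slow_down":
            interval += 5            # the server asked us to back off
            continue
        # access_denied, expired_token, or anything else is final.
        raise RuntimeError("device authorization failed: " + error)
    raise TimeoutError("device code expired before the user approved")
```

Passing in the sleep function makes the loop easy to exercise in a test without actually waiting.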

Refresh Token Grant

Access tokens are, or at least generally should be, short lived. The point of using a refresh token is so that the user is not forced to go through the whole login and authorize process every time their access token expires. Instead, the client application can, behind the scenes, quietly obtain a new access token without user intervention, and without storing any sensitive user data.

This is the flow used in any of the previous use cases that provide a refresh token. Therefore, this flow is never used on its own, but is rather an additional flow used alongside other flows, such as the Authorization Code Grant flow. The gist of this flow is that the service requests to exchange a refresh token that was provided with the previous access token for a new access token once the previous access token has expired. The process goes like this:

1. The service requests an access token from the authorization server, passing along the following pieces of information:

  • grant_type=refresh_token: This tells the authorization server that the client has a refresh token that it wants to exchange for an access token.
  • refresh_token: This is the refresh token.
  • scope: This is the scope of information being requested and is an optional component of this grant type. Multiple scopes can be provided, or just a single scope. However, the scope requested here cannot include additional scopes that were not in the original request for the previous access token that this refresh token came with. Generally, this attribute is not used, and the new token will use the same scope as the previous token.
  • client_id: This is the client’s ID, telling the authorization server which client is requesting the access token. This is not always required. See the next point for details.
  • client_secret: This is the client’s secret key to prove that it really is the client making the request. The client_id and client_secret are required if the original flow that was used to get the previous access token and its refresh token also used these attributes. If these attributes were not used, such as in the case of an Authorization Code Grant with PKCE extension, then these attributes are not provided in this request.

2. The authorization server responds with a new access token. The pieces of information provided in the response are the same as those described in step #6 of the Authorization Code Grant flow. There may or may not be a new refresh token provided. If no new refresh token is provided, then the currently held refresh token can be reused for future requests.
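A sketch of the client's side of this flow, assuming the token response has already been decoded into a dictionary. Both function names are invented for illustration.

```python
def build_refresh_fields(refresh_token, client_id=None, client_secret=None,
                         scope=None):
    """Form fields for a refresh request; optional ones are left out."""
    fields = {"grant_type": "refresh_token", "refresh_token": refresh_token}
    if scope is not None:
        fields["scope"] = scope
    # Only confidential clients that used a secret originally send these.
    if client_id is not None and client_secret is not None:
        fields["client_id"] = client_id
        fields["client_secret"] = client_secret
    return fields


def next_refresh_token(current_refresh_token, token_response):
    """Keep the old refresh token unless the server issued a new one."""
    return token_response.get("refresh_token", current_refresh_token)


fields = build_refresh_fields("old-refresh-token")
kept = next_refresh_token("old-refresh-token", {"access_token": "new"})
rotated = next_refresh_token(
    "old-refresh-token",
    {"access_token": "new", "refresh_token": "new-refresh-token"},
)
```

Here kept ends up as the old refresh token and rotated as the new one, which covers both cases described in step 2.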

Implicit Grant

This is a deprecated flow that has been removed from OAuth 2.1. Do not use this flow if you can avoid it. If you think you cannot avoid it, try a little harder just to be sure. Unless you’re using really outdated technology, I bet you can probably use Authorization Code Grant with PKCE. Yes, it’s more complex. It’s also more secure.

The focus of this flow was to handle the scenario where client secrets cannot safely be stored in the client application. This is precisely the problem that the PKCE extension solves. This flow made sense when cross-origin requests were not possible. Today, we have CORS to address this.

If you still think you need this flow, realize that it is not secure. Here’s an article that gives a concise overview of the issues it has: https://www.taithienbo.com/why-the-implicit-flow-is-no-longer-recommended-for-protecting-a-public-client/

This used to be the primary simplified flow for native and JavaScript applications. The gist of this flow is that the service requests an access token from the authorization server, and the authorization server returns it directly in the redirect. For a federated identity use case, the process goes like this:

1. The user browses to a web application that allows login using an existing Facebook account. The user clicks a login link, which takes them to a Facebook login page, passing along to Facebook the same information that was provided in the Authorization Code Grant flow, with one modification:

  • response_type=token: This tells Facebook that the client is using the implicit grant flow and requests a token to be returned directly.

2. Facebook responds in the same way that it did in the Authorization Code Grant flow, except that when it returns the redirect back to the client application in step #4, the redirect URL contains the following additional pieces of information instead of the authorization code:

  • access_token: This is the access token. This is also one of the causes of security concerns. The token itself is included directly in the redirect URL. This means it can be logged in the browser’s history. Also note that no client secret was provided in order to get this token in the first place. The authorization server has no idea who it is returning the token to, nor can it confirm that it will even be received by the requestor, and ultimately it can be logged and potentially exposed in various ways on the client side. This is all bad.
  • token_type: This is the type of token. Generally, the value of this attribute will be “bearer”, indicating a bearer token.
  • expires_in: This is the duration of time for which the token is valid. This is not required; however, it is recommended that tokens be short lived.

Note that the implicit grant will never include a refresh token. When the access token expires, the process must be repeated.

Password Grant

This is another deprecated flow. There is no reason to use it anymore, so don’t. I’m not even going to go into the details, but the gist of it is that it’s like the implicit flow, except that the service actually collects and transmits the user’s login credentials along with the request for an access token. So, essentially, it implements authentication in the client application. That’s a terrible idea.

What type of token should the authorization server return?

As mentioned previously, OAuth does not specify what an access token must look like. It is only concerned with how access tokens are provided. Also mentioned previously, there are primarily two types of access tokens: opaque and structured. Deciding what type of token to use is not a decision that OAuth can guide but is nevertheless a decision that must be made whenever implementing a service that uses OAuth, so the token types are described in a bit more detail here.

An opaque token is just a random string of characters that means nothing to the holder of the token. It is used by the authorization server as a key to look up the user associated with it.

A structured token, typically a JWT which is generally pronounced “jot” and stands for JSON Web Token, has the user information encoded in it and is therefore readable by the holder of the token. It is not encrypted, but it is cryptographically signed to ensure that it cannot be tampered with to gain additional access.
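To make "readable but signed" concrete, here is a toy JWT built with only the standard library. It assumes the HS256 algorithm with a shared demo key; real systems would normally use a maintained JWT library, often with asymmetric keys, and would also check claims such as expiry. The helper names are my own.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    padding = "=" * (-len(text) % 4)    # restore the stripped padding
    return base64.urlsafe_b64decode(text + padding)


def sign_jwt(claims, key):
    """Build header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url_encode(json.dumps(header).encode())
        + "."
        + b64url_encode(json.dumps(claims).encode())
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def read_claims_unverified(token):
    """Anyone holding the token can do this: no key is required."""
    payload = token.split(".")[1]
    return json.loads(b64url_decode(payload))


def verify_jwt(token, key):
    """Return the claims if the signature is valid, else raise ValueError."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError("invalid signature")
    return read_claims_unverified(token)


token = sign_jwt({"sub": "user-1", "scope": "read"}, b"demo-key")
```

Changing even one character of the payload changes the expected signature, so a tampered token fails verification even though its contents remain perfectly readable.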

So, which one is better? Well, as with everything in the world of software, the choice is a matter of trade-offs, and the correct answer will depend on your specific needs. The short answer is, if you need stronger security, opaque tokens are preferred. If you need lower latency, structured tokens are preferred. Why? Well, let’s explore the trade-offs the two types of tokens provide.

With an opaque token, a resource server needs to make an additional call to the authorization server whenever it receives a token to obtain the user details that the token represents. This means additional network traffic on each request, and whatever latency is associated with that traffic. It also means additional load on the authorization server. Some system implementers choose to use caching to mitigate those issues but doing so defeats the benefits of an opaque token, along with adding additional complexity.

The benefit of an opaque token is increased security. This is because the token contains no meaningful information and must be exchanged for meaningful, and more importantly current, information from the authorization server. If a user’s access has changed since the token was issued, the resource server will receive the latest information about that user when requesting the information from the authorization server. Additionally, since the token contains no meaningful information, it cannot be tampered with to alter the access.
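The "always current" property of opaque tokens can be illustrated with a toy in-memory authorization server. This class is purely a stand-in for illustration; the lookup result is loosely modeled on token introspection responses, and the permissions field is my own invention.

```python
import secrets


class AuthorizationServer:
    """Toy in-memory stand-in for an authorization server."""

    def __init__(self):
        self._tokens = {}         # opaque token -> user id
        self._permissions = {}    # user id -> set of permissions

    def issue_token(self, user_id, permissions):
        self._permissions[user_id] = set(permissions)
        token = secrets.token_urlsafe(32)   # random, carries no information
        self._tokens[token] = user_id
        return token

    def set_permissions(self, user_id, permissions):
        self._permissions[user_id] = set(permissions)

    def revoke(self, token):
        self._tokens.pop(token, None)

    def introspect(self, token):
        """What a resource server would call on every request."""
        user_id = self._tokens.get(token)
        if user_id is None:
            return {"active": False}
        return {
            "active": True,
            "sub": user_id,
            "permissions": sorted(self._permissions[user_id]),
        }


server = AuthorizationServer()
token = server.issue_token("user-1", ["read", "write"])
before = server.introspect(token)
server.set_permissions("user-1", ["read"])   # access changes after issuance
after = server.introspect(token)             # the same token sees the change
```

The token string never changes, yet the second lookup reflects the reduced permissions, which is exactly what a structured token cannot do on its own.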

With a structured token, the user information is encoded right in the token. This means the resource server is not receiving user details directly from the authorization server, and simply trusts that the information in the token is correct due to it being cryptographically signed. In theory, it may be possible to alter the information in the token, though in reality the cryptographic signature makes it extremely difficult. Having the information in the token also means that it is stale information as it represents the user’s access as it was at the time the token was issued, which may or may not still be correct information. Some system implementers choose to add a token black-list or other service that the resource server can contact to determine if there are changes to the token that it needs to be aware of, but doing so defeats the benefits of a structured token, along with adding additional complexity.

The benefit of a structured token is that a resource server does not need to make an additional call to the authorization server to find out who the token represents, as that information is right there in the token. This means less network traffic, less load on the authorization server, and lower latency.

Okay. I think I get it now.

Awesome. Happy authorizing!
