Handling Cookies is a Minefield

HTTP cookies are small pieces of data, set by either Javascript or HTTP servers, that are essential for maintaining state on the otherwise stateless system known as the World Wide Web. Once set, web browsers will continue to send them along with every properly scoped HTTP request until they expire.

I had been more than content to ignore the vagaries of how cookies function until the end of time, except that one day I stumbled across this innocuous piece of Javascript:

const favoriteCookies = JSON.stringify({
  ginger: "snap",
  peanutButter: "chocolate chip",
  snicker: "doodle",
});

document.cookie = `cookieNames=${favoriteCookies}`;

This code functioned completely fine, as far as browsers were concerned. It took a piece of boring (but tasty) JSON and saved the value into a session cookie. While this was slightly unusual (most code will serialize JSON to base64 prior to setting it as a cookie value), there was nothing here that browsers had any issue with. They happily allowed the cookie to be set and sent it along to the backend web server in the Cookie HTTP request header:

GET / HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Connection: keep-alive
Cookie: cookieNames={"ginger":"snap","peanutButter":"chocolate chip","snicker":"doodle"}
Host: example.com

Which was all well and good, until it got passed along to some code that used the Go standard library. The Go standard library couldn't parse the cookie, leading to cascading failures all the way up the stack. So what happened?

The Specification

Cookies were initially defined in RFC 2109 (1997) and subsequently updated in RFC 2965 (2000) and RFC 6265 (2011); a draft revision (RFC 6265bis) is currently in progress, and it is what this article references.

There are two sections of the RFC that pertain to cookie values:

Section 4.1.1 (on how servers should send cookies)

Informally, the Set-Cookie response header field contains a cookie,
which begins with a name-value-pair, followed by zero or more
attribute-value pairs. Servers SHOULD NOT send Set-Cookie header
fields that fail to conform to the following grammar:

set-cookie        = set-cookie-string
set-cookie-string = BWS cookie-pair *( BWS ";" OWS cookie-av )
cookie-pair       = cookie-name BWS "=" BWS cookie-value
cookie-name       = 1*cookie-octet
cookie-value      = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
cookie-octet      = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                      ; US-ASCII characters excluding CTLs,
                      ; whitespace DQUOTE, comma, semicolon,
                      ; and backslash

Section 5.6 (on how browsers should accept cookies)

A user agent MUST use an algorithm equivalent to the following algorithm
to parse a set-cookie-string:

1. If the set-cookie-string contains a %x00-08 / %x0A-1F / %x7F character
   (CTL characters excluding HTAB):
     Abort these steps and ignore the set-cookie-string entirely.

2. If the set-cookie-string contains a %x3B (";") character:
     The name-value-pair string consists of the characters up to, but not
     including, the first %x3B (";"), and the unparsed-attributes consist
     of the remainder of the set-cookie-string (including the %x3B (";")
     in question).

   Otherwise:
     The name-value-pair string consists of all the characters contained
     in the set-cookie-string, and the unparsed-attributes is the empty
     string.

There are three things that should immediately jump out to you:

  1. What servers SHOULD send and what browsers MUST accept are not aligned, a classic example of the tragedy of following Postel's Law.
  2. There is nothing here that limits what cookie values are acceptable for browsers to send to servers, aside from the semicolon delimiter. This might be fine if servers only received cookies they themselves had set, but cookies can also come from document.cookie and contain values outside the %x21, %x23-2B, %x2D-3A, %x3C-5B, and %x5D-7E characters as allowed by Set-Cookie.
  3. It doesn't acknowledge how standard libraries that handle Cookie headers should behave: should they act like user agents or like servers? Should they be permissive or strict? Should they behave differently in different contexts?

And herein lies the crux of the issue I ran into: everything behaves differently, and it's a miracle that cookies work at all.

Web Browsers

First, let's start with how web browsers behave. The teams behind Gecko (Firefox), Chromium, and WebKit (Safari) work together constantly, so it would be reasonable to expect them to all behave the same… right?

Before we dig in, remember that the RFC contradicts itself: it says that servers should only send Set-Cookie values made up of ASCII characters other than control characters, whitespace, double quotes, commas, semicolons, and backslashes, but that browsers should accept any cookie value that does not contain control characters or the semicolon delimiter.
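To make the mismatch concrete, here's a quick sketch in Python (my own illustration, not code from any spec or browser) that computes which octets user agents must accept but servers were never supposed to send:

# The cookie-octet set from Section 4.1.1: what servers SHOULD limit
# themselves to when sending cookie values.
set_cookie_octets = set(
    [0x21]
    + list(range(0x23, 0x2C))   # %x23-2B
    + list(range(0x2D, 0x3B))   # %x2D-3A
    + list(range(0x3C, 0x5C))   # %x3C-5B
    + list(range(0x5D, 0x7F))   # %x5D-7E
)

# What Section 5.6 requires user agents to accept: any octet other than the
# CTLs (excluding HTAB) and the semicolon delimiter.
ctls_excluding_htab = set(range(0x00, 0x09)) | set(range(0x0A, 0x20)) | {0x7F}
ua_accepted_octets = set(range(0x00, 0x100)) - ctls_excluding_htab - {0x3B}

# The difference: HTAB, space, DQUOTE, comma, backslash, and 0x80-0xFF.
print([hex(o) for o in sorted(ua_accepted_octets - set_cookie_octets)])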


Firefox

Firefox's code for valid cookie values looks like this:

bool CookieCommons::CheckValue(const CookieStruct& aCookieData) {
  // reject cookie if value contains an RFC 6265 disallowed character - see
  // https://bugzilla.mozilla.org/show_bug.cgi?id=1191423
  // NOTE: this is not the full set of characters disallowed by 6265 - notably
  // 0x09, 0x20, 0x22, 0x2C, and 0x5C are missing from this list.
  const char illegalCharacters[] = {
      0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x0A, 0x0B, 0x0C,
      0x0D, 0x0E, 0x0F, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
      0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F, 0x3B, 0x7F, 0x00};

  const auto* start = aCookieData.value().BeginReading();
  const auto* end = aCookieData.value().EndReading();

  auto charFilter = [&](unsigned char c) {
    if (StaticPrefs::network_cookie_blockUnicode() && c >= 0x80) {
      return true;
    }
    return std::find(std::begin(illegalCharacters), std::end(illegalCharacters),
                     c) != std::end(illegalCharacters);
  };

  return std::find_if(start, end, charFilter) == end;
}
†accepting 0x7F was fixed in bug 1797235 (Firefox 108)

Firefox accepts five characters which the RFC recommends that servers not send:

  • 0x09 (horizontal tab)
  • 0x20 (spaces)
  • 0x22 (double quotes)
  • 0x2C (commas)
  • 0x5C (backslashes)

This was initially done to provide parity with Chrome in some long-ago era and lingers on in both codebases.

Astute observers might note that Firefox has a network.cookie.blockUnicode setting that this code checks against, and which rejects all values of 0x80 and above. That groundwork was laid as a result of this research and can be tracked in bug 1797231.


Chromium

The Chromium code for valid cookie values looks like so:

bool ParsedCookie::IsValidCookieValue(const std::string& value) {
  // IsValidCookieValue() returns whether a string matches the following
  // grammar:
  //
  // cookie-value       = *cookie-value-octet
  // cookie-value-octet = %x20-3A / %x3C-7E / %x80-FF
  //                       ; octets excluding CTLs and ";"
  //
  // This can be used to determine whether cookie values contain any invalid
  // characters.
  //
  // Note that RFC6265bis section 4.1.1 suggests a stricter grammar for
  // parsing cookie values, but we choose to allow a wider range of characters
  // than what's allowed by that grammar (while still conforming to the
  // requirements of the parsing algorithm defined in section 5.2).
  //
  // For reference, see:
  //  - https://crbug.com/238041
  for (char i : value) {
    if (HttpUtil::IsControlChar(i) || i == ';')
      return false;
  }
  return true;
}

// Whether the character is a control character (CTL) as defined in RFC 5234
// Appendix B.1.
static inline bool IsControlChar(char c) {
  return (c >= 0x00 && c <= 0x1F) || c == 0x7F;
}

Chrome is slightly more restrictive than Firefox, refusing to accept 0x09 (horizontal tab) in cookie values.

Nevertheless (and contrary to the RFC), it is perfectly happy to receive and send spaces, double quotes, commas, backslashes, and Unicode characters.


Safari (WebKit)

I wasn't able to get access to Safari's cookie-storage code, as it is buried inside the closed-source CFNetwork framework. That said, we can still examine its behavior by running this piece of Javascript:

for (let i = 0; i < 256; i++) {
  let paddedIndex = i.toString().padStart(3, '0') +
    '_' + '0x' + i.toString(16).padStart(2, '0');

  // set a cookie with a name of "cookie" + decimal char + hex char
  // and a value of the character surrounded by a space and two dashes
  document.cookie = `cookie${paddedIndex}=-- ${String.fromCharCode(i)} --`;
}

document.cookie='cookieUnicode=🍪';

cookie007_0x07   --           localhost   /   Session   16 B
cookie008_0x08   --           localhost   /   Session   16 B
cookie009_0x09   --      --   localhost   /   Session   21 B
cookie010_0x0a   --           localhost   /   Session   16 B
cookie011_0x0b   --           localhost   /   Session   16 B
                   (snip for brevity)
cookie030_0x1e   --           localhost   /   Session   16 B
cookie031_0x1f   --           localhost   /   Session   16 B
cookie032_0x20   --   --      localhost   /   Session   21 B
cookie033_0x21   -- ! --      localhost   /   Session   21 B
cookie034_0x22   -- " --      localhost   /   Session   21 B
cookie035_0x23   -- # --      localhost   /   Session   21 B
                   (snip for brevity)
cookie042_0x2a   -- * --      localhost   /   Session   21 B
cookie043_0x2b   -- + --      localhost   /   Session   21 B
cookie044_0x2c   --,--        localhost   /   Session   19 B
                   (snip for brevity)
cookie092_0x5c   -- \ --      localhost   /   Session   19 B

As Safari stops processing a cookie once it sees a disallowed character, it's easy to see that 0x09 (horizontal tab), 0x20 (space), 0x22 (double quote), and 0x5C (backslash) are okay, but that 0x7F (delete) and the 0x80-FF (high ASCII / Unicode) characters are disallowed.

Unlike Firefox and Chrome, which follow the RFC's instruction to “abort these steps and ignore the cookie entirely” when encountering a cookie with a control character, Safari does not ignore the cookie; instead, it accepts the value up until that character.

Oddly enough, this quest has uncovered a bizarre Safari bug — setting a cookie with a value of -- , -- seems to result in it trimming the whitespace around the comma.

Standard Libraries

Golang

Let's start with Golang's cookie code, which is where I ran into the problem in the first place.

// sanitizeCookieValue produces a suitable cookie-value from v.
// https://tools.ietf.org/html/rfc6265#section-4.1.1
//
// We loosen this as spaces and commas are common in cookie values
// but we produce a quoted cookie-value if and only if v contains
// commas or spaces.
// See https://golang.org/issue/7243 for the discussion.
func sanitizeCookieValue(v string) string {
    v = sanitizeOrWarn("Cookie.Value", validCookieValueByte, v)
    if len(v) == 0 {
        return v
    }
    if strings.ContainsAny(v, " ,") {
        return `"` + v + `"`
    }
    return v
}

func validCookieValueByte(b byte) bool {
    return 0x20 <= b && b < 0x7f && b != '"' && b != ';' && b != '\\'
}

Golang falls relatively close to the RFC's wording on how servers should behave with Set-Cookie, differing only by allowing 0x20 (space) and 0x2C (comma), since they commonly occur in the wild.

You can already see the bind that standard libraries are stuck in: they have to receive values from browsers in line with Section 5, but also send cookies as per Section 4.1.1.

This can have pretty serious consequences, as you can see from running this code:

package main

import (
  "fmt"
  "net/http"
)

func main() {
  rawCookies := `cookie1=foo; ` +
    `cookie2={"ginger":"snap","peanutButter":"chocolate chip","snicker":"doodle"}; ` +
    `cookie3=bar`

  header := http.Header{}
  header.Add("Cookie", rawCookies)
  request := http.Request{Header: header}

  fmt.Println(request.Cookies())
}

Which outputs only:

[cookie1=foo cookie3=bar]

It invisibly drops a cookie that all major browsers accept, without raising any sort of error to indicate that this is happening. Still, dropping a cookie it didn't understand, without any other side effects, isn't as bad as it could be.


PHP

Many languages, such as PHP, don't expose a native function for parsing cookies, which makes it somewhat difficult to say definitively what PHP does and does not allow.

That said, we can set cookies using the code below and see how PHP responds:

[0x09, 0x0D, 0x10, 0x20, 0x22, 0x2C, 0x5C, 0x7F, 0xFF].forEach(i => {
  let paddedIndex = i.toString().padStart(3, '0') + '_' +
    '0x' + i.toString(16).padStart(2, '0');

  document.cookie = `cookie${paddedIndex}=-- ${String.fromCharCode(i)} --`;
});

document.cookie='cookieUnicode=🍪';

Output:

cookie009_0x09: -- --
cookie016_0x10: -- --
cookie013_0x0d: -- --
cookie032_0x20: -- --
cookie034_0x22: -- " --
cookie044_0x2c: -- , --
cookie092_0x5c: -- \ --
cookie127_0x7f: -- --
cookie255_0xff: -- ÿ --
cookieUnicode: 🍪

When it comes to control characters, PHP's behavior is all over the place. 0x00-0x09 all work fine, as do characters like 0x0D (carriage return), but if you use 0x10 (data link escape) or 0x7F (delete), PHP will error out entirely with a 400 Bad Request.


Python

import http.cookies

raw_cookies = (
    'cookie1=foo; '
    'cookie2={"ginger":"snap","peanutButter":"chocolate chip","snicker":"doodle"}; '
    'cookie3=bar'
)

c = http.cookies.SimpleCookie()
c.load(raw_cookies)

print(c)

Output:

>>> Set-Cookie: cookie1=foo

Python invisibly aborts the loading of any further cookies inside SimpleCookie.load() as soon as it encounters one it doesn't understand. This can be very dangerous when you consider that a subdomain could feasibly set a cookie on the base domain that would completely break cookie handling across every domain of a given site.
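If you're stuck with SimpleCookie, one defensive pattern (my own sketch, not anything endorsed by the standard library documentation) is to split the header yourself and feed load() one pair at a time, so that a single unparseable cookie can't take the rest down with it:

import http.cookies

raw_cookies = (
    'cookie1=foo; '
    'cookie2={"ginger":"snap","peanutButter":"chocolate chip","snicker":"doodle"}; '
    'cookie3=bar'
)

jar = http.cookies.SimpleCookie()
for pair in raw_cookies.split(';'):
    # load() gives up at the first pair it can't parse, so hand it one pair
    # at a time; pairs it can't handle are either ignored or raise CookieError
    try:
        jar.load(pair.strip())
    except http.cookies.CookieError:
        continue

print(jar)

With the header from earlier, this at least recovers cookie1 and cookie3 instead of stopping after the first cookie.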

It's even messier when it comes to control characters:

import http.cookies

for i in range(0, 32):
    raw_cookie = f"cookie{hex(i)}={chr(i)}"

    c = http.cookies.SimpleCookie()
    c.load(raw_cookie)

    for name, morsel in c.items():
        print(f"{name}: value: {repr(morsel.value)}, length: {len(morsel.value)}")

Output:

>>> cookie0x9: value: '', length: 0
>>> cookie0xa: value: '', length: 0
>>> cookie0xb: value: '', length: 0
>>> cookie0xc: value: '', length: 0
>>> cookie0xd: value: '', length: 0

Here we can see that Python invisibly drops a lot of cookies with control characters, and loads others improperly. Note that if you surround those values with other characters, like so:

raw_cookie = f"cookie{hex(i)}=aa{chr(i)}aa"

Then none of the control-character cookies will load at all. Overall, Python is extremely inconsistent and unpredictable when it comes to loading cookies.


Ruby

require "cgi"

raw_cookie = 'ginger=snap; ' +
  "cookie=chocolate \x13 \t \" , \\ \x7f 🍪 chip; " +
  'snicker=doodle'

cookies = CGI::Cookie.parse(raw_cookie)

puts cookies
puts cookies["cookie"].value()
puts cookies["cookie"].value().to_s()

Output:

{"ginger"=>#<CGI::Cookie: "ginger=snap; path=">, "cookie"=>#<CGI::Cookie: "cookie=chocolate+%13+%09+%22+%2C+%5C+%7F+%F0%9F%8D%AA+chip; path=">, "snicker"=>#<CGI::Cookie: "snicker=doodle; path=">}
chocolate    " , \  🍪 chip
cookie=chocolate+%13+%09+%22+%2C+%5C+%7F+%F0%9F%8D%AA+chip; path=

The Ruby library appears pretty permissive: it accepts every character during parsing, and percent-encodes values when the cookie is serialized back into a string.

This may well be the optimal behavior (if such a thing can be said to exist with cookies), but I can certainly see cases where code setting a cookie via document.cookie would not expect to see it reflected back in percent-encoded form.


Rust

use cookie::Cookie;

fn main() {
    let c = Cookie::parse("cookie=chocolate , \" \t foo \x13 ñ 🍪 chip;").unwrap();
    println!("{:?}", c.name_value());
}

Output:

("cookie", "chocolate , \" \t foo \u{13} ñ 🍪 chip")

Rust doesn't ship any cookie handling facilities by default, so this is looking at the popular cookie crate. As configured by default, it appears to be the most permissive of the programming languages, accepting any UTF-8 string tossed at it.

The World Wide Web, aka Why This Matters

The wildly differing behavior between browsers and languages certainly makes for some riveting tables, but how does all this play out in the real world?

When I first discovered this in the real world, it was only through sheer luck that it wasn't a catastrophe. A manual tester playing around with a third-party library update had run into a strange set of errors on our testing site. Had they not brought it to my attention, the update (which did something unlikely to be caught in automated testing) would certainly have been pushed to production. Every future website visitor would then have received a broken cookie and been locked out with an inscrutable error until the update was reverted and the cookies were cleared out.

And that's exactly the problem with this specification ambiguity: it's such an easy mistake to make that millions of websites and companies are only an intern away from a complete meltdown. Nor is it limited to tiny websites on obscure frameworks; major websites such as Facebook, Netflix, WhatsApp, and Apple are all affected.

You can see for yourself how easy a mistake this is to make by pasting this simple code fragment into your browser console, substituting .grayduck.mn for the domain you're testing, e.g. .facebook.com:

document.cookie="unicodeCookie=🍪; domain=.grayduck.mn; Path=/; SameSite=Lax"
websites hate this one weird trick

Facebook

the facebook error page has images that the cookie also breaks

Instagram & Threads

instagram produces this spartan 500 error
no surprise that threads is much the same

Netflix

netflix returns a NSES-500 error
and unfortunately the help page is broken too

Okta

every okta login page returns a 400 error

WhatsApp

whatsapp's descriptively named “whatsapp error”

Amazon

most parts of amazon work, but pieces of it are randomly broken

Amazon Web Services

the AWS login console returns a 400 error and crashes out

Apple Support

apple support is unable to load your devices

Best Buy

navigation works, but the search functionality does not

eBay

mostly fixed, but pieces of eBay still error out with a 400 error

Home Depot

home depot will be fixing it

Intuit

intuit was the only site to identify the cause of the error

Outlook

another 400 error shows how systematic the problem is

How do we fix this?

It's probably not much of a surprise that fixing problems in 30-year-old foundational specifications is really, really hard. And for this problem, it's unlikely that there is a good fix.

Blocking these cookies on the browser side was considered and worked on by both Mozilla and Google.

But it turns out that unilateral blocking is quite complicated: while non-ASCII cookies aren't super common overall, affecting a bit under 0.01% of all cookies, telemetry has found that they are considerably more common in countries like Argentina, Mexico, and Finland. While Mozilla did implement a preference that could be toggled quickly (network.cookie.blockUnicode), it hasn't been enabled due to behavioral compatibility issues with Chromium.

Fixing it on the server side is potentially feasible, but it affects millions of websites, and most of the errors caused by this problem are buried deep inside programming languages and web frameworks. Places like Facebook and Netflix might be able to mitigate the issue, but the average website operator is not going to have the time or ability to resolve it.

In truth, the real fix for this issue almost certainly lies with the IETF HTTP Working Group updating the cookie specification to be consistent with itself and strict about how systems that handle cookies should behave. Whether non-ASCII characters are allowed should be the same on the server side as it is in user agents.

And regardless, the steps for how browsers, programming languages, and frameworks should process cookies need to be explicit, in much the way that modern W3C standards such as Content Security Policy are. Aborting the processing of other cookies because one cookie is malformed is unacceptable when doing so can lead to such a wide variety of unexpected failures.

These processing steps should probably look something like the following:

• Start with field-value
• Split on ";", giving a list of "raw-cookie-pair" strings. Comma is NOT treated as a synonym for semicolon in order to support combining headers, despite RFC 7230 section 3.2.2.

For each raw-cookie-pair:
  ◦ If the pair does not contain =, then skip to next raw-cookie-pair
  ◦ Remove leading and trailing whitespace
  ◦ Treat portion before first = as cookie-name-octets
  ◦ Treat portion after first = as cookie-value-octets
  ◦ If cookie-value-octets starts with DQUOTE, then:
    ‣ Remove one DQUOTE at start
    ‣ Remove one DQUOTE at end, if present
  ◦ If resulting cookie-name-octets, cookie-value-octets, or both are unacceptable to server, then skip to next raw-cookie-pair
  ◦ Process resulting [cookie-name-octets, cookie-value-octets] tuple in server defined manner

Servers SHOULD further process cookie-name-octets and reject any tuple where the cookie-name-octets are not a token. Servers SHOULD further process cookie-value-octets and reject any tuple where the cookie-value-octets contain octets not in cookie-octet.
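For illustration, here's a rough sketch of those steps in Python (my own interpretation of the proposal above, with placeholder names, and without the stricter token and cookie-octet checks):

def parse_cookie_header(field_value):
    cookies = []
    for raw_pair in field_value.split(";"):        # split into raw-cookie-pairs
        if "=" not in raw_pair:                    # no "=": skip to the next pair
            continue
        raw_pair = raw_pair.strip()                # remove leading/trailing whitespace
        name, _, value = raw_pair.partition("=")   # split on the first "="
        if value.startswith('"'):                  # remove one DQUOTE at the start...
            value = value[1:]
            if value.endswith('"'):                # ...and one at the end, if present
                value = value[:-1]
        # a stricter server would reject non-token names and values containing
        # octets outside cookie-octet before accepting the tuple
        cookies.append((name, value))
    return cookies

print(parse_cookie_header(
    'cookie1=foo; '
    'cookie2={"ginger":"snap","peanutButter":"chocolate chip","snicker":"doodle"}; '
    'cookie3=bar'
))

Run against the Cookie header from the beginning of this article, all three cookies survive, JSON and all.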


Summary Table

  1. not as defined in RFC 5234, but instead as \x00-08 and \x0A-1F (i.e. CTLs minus HTAB and DEL)
  2. Mozilla stopped allowing CTLs in document.cookie as of Firefox 108
  3. does not abort processing and ignore the cookie as the RFC requires
  4. seems to remove whitespace around commas in some conditions
  5. sometimes allows 0x0A-0D but fails to store them in the cookie value, aborts at other times

Thanks

I couldn't have written this article without a bunch of help along the way. I'd like to thank:

  • Po-Ning Tseng, for helping me investigate this issue in the first place
  • Dan Veditz at Mozilla, for his inexhaustible knowledge and endless kindness
  • Frederik Braun, for his helpful early feedback
  • Steven Bingler at Google, for pushing on getting this issue fixed
  • Peter Bowen, for his thoughts on how cookie processing probably should happen
  • Chris Palmer and David Schinazi, for their insightful proofreading
  • Stefan Bühler, who stumbled across some of this stuff over a decade ago
  • kibwen on HackerNews, for pointing out the Rust crate situation

[Category: Standards] [Permalink]


Refresh vs. Long-lived Access Tokens

One question which I frequently receive is:

Why would you want to use long-lived refresh tokens that generate short-lived access tokens as commonly seen in OAuth 2.0, versus long-lived access tokens? Aren’t you simply replacing one long-lived token with another?

Before diving into everything, some vocabulary to clarify:

Definitions

  • Access token: a secret token that clients can exchange with servers to get access to their resources. These can either be long-lived (and potentially never expire), or short-lived, where they might last for only hours to days.
  • Refresh token: a long-lived secret token that itself does not grant access to resources, but which instead can be exchanged with an authorization server for a short-lived access token (a rough sketch of this exchange follows these definitions)
  • Authorization server: the server(s) which consume refresh tokens and issue access tokens
  • Resource server: the server(s) which consume and validate access tokens, and grant access to authorized services if the token is valid
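For reference, the exchange itself is just a POST to the authorization server's token endpoint using the refresh_token grant defined in OAuth 2.0 (RFC 6749, section 6). Here's a minimal sketch in Python with the requests library; the endpoint and credentials are made-up placeholders:

import requests

# Hypothetical values: substitute your own authorization server and client credentials.
TOKEN_ENDPOINT = "https://auth.example.com/oauth2/token"
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"
REFRESH_TOKEN = "stored-long-lived-refresh-token"

# Exchange the long-lived refresh token for a short-lived access token.
response = requests.post(
    TOKEN_ENDPOINT,
    data={
        "grant_type": "refresh_token",
        "refresh_token": REFRESH_TOKEN,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    },
    timeout=10,
)
response.raise_for_status()

token = response.json()
access_token = token["access_token"]                           # short-lived; sent to resource servers
expires_in = token.get("expires_in")                           # lifetime in seconds, if provided
new_refresh_token = token.get("refresh_token", REFRESH_TOKEN)  # the server may rotate the refresh token

The client then presents the access token to resource servers (typically in an Authorization: Bearer header) until it expires, at which point it repeats the exchange.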

Why Refresh Tokens

There isn’t any one huge advantage that immediately stands out in favor of refresh tokens. Instead, there are a number of incremental improvements that add up towards making it the overall superior design.

  • It simplifies revocation, for much the same reason that digital certificates (as used in HTTPS) are slowly changing to 90-day validity by default. Long-lived access tokens require every system that receives them to constantly check a central server to see whether the token has been revoked.

    When using a refresh token, only the authorization server needs to check for revocation, and the self-contained, stateless nature of the short-lived access tokens it generates means that systems which consume them only need to check that they haven't expired and that their signature is valid (see the sketch following this list). While this doesn't matter as much in smaller-scale systems with few resource servers, it both eases development as systems grow and can result in significant performance gains.

  • Short-lived access tokens limit the impact of a leak or compromise. While refresh tokens tend to live on and only transit between two endpoints (the client and the authorization server), access tokens are transmitted to every single resource server that requires them.

    Because long-lived access tokens violate the core security axiom of minimizing how often long-lived tokens cross trust boundaries, they are immensely more difficult to secure against compromise.

    even if the compromised system is repaired, all access tokens sent to it will forever be untrustworthy

    With a refresh token design, the authorization server and its storage can be robustly secured, with access to those systems extremely limited. On the other hand, resource servers are run by dozens of teams with a wide range of technology stacks and security postures. As a result, resource servers are far more likely to leak an access token through improper logging, poor access control, analytics, attacker compromise, etc.

    In the event of a leak or compromise, the impact is far more limited with refresh tokens than it is with long-lived access tokens. Instead of simply fixing the underlying issue and letting the short-lived access tokens expire, long-lived access tokens require you to either revoke every affected token or monitor for abuse on an indefinite basis.

    access tokens disclosed to a repaired but previously compromised system will expire quickly without manual intervention

  • Refresh tokens provide incremental improvements to client security, thanks to their intermittent use. Because long-lived access tokens get used across numerous services on every request, they effectively have to live in memory. Refresh tokens can live in secure enclaves or keychains, and their infrequent use in both memory and on the network provides some mitigation against transient attacks.

  • Refresh tokens allow for flexibility in future access grants. When using a refresh token, the authorization server is free to either add or remove individual permissions granted to access tokens as time goes on and system designs change. The immutable nature of long-lived access tokens adds significant complexity to permission changes, short of turning them into a quasi-refresh token by exchanging them for different access tokens or by using them to query a centralized permission store.

  • Although a pretty minor benefit, the process of refreshing access tokens lets abuse teams build better detection heuristics. Having a history of refresh and access token behavior enables more powerful anti-abuse detection than a long-lived access token alone.
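As mentioned in the first point above, here is roughly what that stateless check can look like on a resource server, assuming the access tokens are signed JWTs; this sketch uses the PyJWT library, and the key file, audience, and function name are placeholders:

import jwt  # PyJWT

# The authorization server's public key, distributed to resource servers out
# of band (or fetched from a JWKS endpoint); placeholder path here.
AUTH_SERVER_PUBLIC_KEY = open("auth-server-public.pem").read()

def validate_access_token(token):
    # Validate a short-lived, self-contained access token without calling
    # out to a central revocation service.
    try:
        claims = jwt.decode(
            token,
            AUTH_SERVER_PUBLIC_KEY,
            algorithms=["RS256"],           # only accept the expected algorithm
            audience="my-resource-server",  # hypothetical audience claim
        )
    except jwt.InvalidTokenError:
        return None  # expired, bad signature, wrong audience, etc.
    return claims    # e.g. subject, scopes, and expiry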

Why Not Refresh Tokens

While there are a number of upsides to using refresh tokens, there are also some downsides:

  • Increased client complexity as a result of having to implement the logic to exchange refresh tokens for access tokens. Although this is typically a one-time cost and is often abstracted away by OAuth libraries, it nevertheless adds time to initial client implementations when compared to a simple access token.

  • The very nature of authorization servers means that they act as a single point of failure. This can be mitigated to a significant degree by tweaking access token lifetimes, building redundancy and resiliency around authorization servers, and having clients request refreshed access tokens comfortably before expiration to avoid temporary “blips.”

    However, the very design of having a centralized authorization server gating the creation of new access tokens means that long outages on these systems can nevertheless result in all dependent resource servers being unable to operate.

    Note that long-lived access tokens have their own single point of failure in their need for centralized revocation servers, although systems are commonly designed to “fail open” if their revocation status servers are unavailable, despite the trade-off in security that this entails.

Conclusion

I hope this helps to clarify what the upsides and downsides of refresh tokens are, and why modern applications tend to be designed around refresh tokens. While they aren’t perfect, the combined benefits of using refresh tokens and short-lived access tokens are pretty substantial.

Additional Information

[Category: Security] [Permalink]


Cache-Control Recommendations

Cache-Control is one of the most frequently misunderstood HTTP headers, due to its overlapping and perplexingly-named directives. Confusion around it has led to numerous security incidents, and many configurations across the web contain unsafe or impossible combinations of directives. Furthermore, the interactions between various directives can have surprisingly different behavior depending on your browser.

The objective of this document is to provide a small set of recommendations for developers and system administrators that serve documents over HTTP to follow. Although these recommendations are not necessarily optimal in all cases, they are designed to minimize the risk of setting invalid or dangerous Cache-Control directives.

Recommendations

Don't cache (default)
  Safe for PII: Yes
  Use cases: API calls, direct messages, pages with personal data, anything you're unsure about
  Header value: max-age=0, must-revalidate, no-cache, no-store, private

Static, versioned resources
  Safe for PII: No
  Use cases: versioned files (such as JavaScript bundles, CSS bundles, and images), commonly with names such as loader.0a168275.js
  Header value: max-age=n, immutable

Infrequently changing public resources, or low-risk authenticated resources
  Safe for PII: No
  Use cases: images, avatars, background images, and fonts
  Header value: max-age=n

Don't cache (default): max-age=0, must-revalidate, no-cache, no-store, private

When you're unsure, the above is the safest possible set of Cache-Control directives. It instructs browsers, proxies, and other caching systems not to cache the contents of the response. Although it can have a significant performance impact if used on frequently-accessed public resources, it is a safe state that prevents the caching of any information.

It may seem that using no-store alone should stop all caching, but it only prevents the caching of data to permanent storage. Many browsers will still allow these resources to be cached in memory, even though they never get written to disk. This can cause issues on shared systems, such as a browser continuing to show cached documents containing sensitive information to users who have since logged out.

Although no-store may seem sufficient to instruct content delivery networks (CDNs) to not cache private data, many CDNs ignore these directives to varying degrees. Adding private in combination with the above directives is sufficient to disable caching both for CDNs and other middleboxes.
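As an example of what this looks like in practice, here's a sketch that applies the safe default to every response that hasn't explicitly chosen something else; it assumes a Flask application, but the header value is the same in any framework:

from flask import Flask

app = Flask(__name__)

@app.after_request
def set_default_cache_control(response):
    # Only apply the safe default if the view didn't set its own policy.
    if "Cache-Control" not in response.headers:
        response.headers["Cache-Control"] = (
            "max-age=0, must-revalidate, no-cache, no-store, private"
        )
    return response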

Static, versioned resources: max-age=n, immutable

If you have versioned resources such as JavaScript and CSS bundles, this instructs browsers (and CDNs) to cache the resources for n seconds, while not purging their caches even when the user intentionally refreshes the page. This maximizes performance, while minimizing the amount of complexity that needs to get pushed further downstream (e.g. to service workers). Care should be taken to ensure this combination of directives isn't used on private or mutable resources, as the only way to "bust" the cache is to use an updated source document that refers to new URLs.

The value to use for n depends upon the application, and is ideally set to a bit longer than the expected document lifetime. One year (31536000) is a reasonable value if you're unsure, but you might want to use as low as a week (604800) for resources that you want the browser to purge faster.

Infrequently changing public resources or low-risk authenticated resources: max-age=n

If you have public resources that are likely to change, set a max-age equal to a number of seconds (n) that makes sense for your application. Using max-age alone will still allow user agents to use stale resources in some circumstances, such as when there is poor connectivity.

There is no need to add must-revalidate outside of the unlikely circumstance where the resource contains information that must be reloaded if the resource is stale.
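To round out the picture, the other two recommendations might be applied like this; again a Flask-flavored sketch, where the routes, directories, and max-age values are placeholders to adjust for your own application:

from flask import Flask, send_from_directory

app = Flask(__name__)

@app.route("/assets/<path:filename>")
def versioned_asset(filename):
    # Versioned bundles (e.g. loader.0a168275.js): cache for a year and never
    # revalidate, since a new version of the file gets a new URL.
    response = send_from_directory("assets", filename)
    response.headers["Cache-Control"] = "max-age=31536000, immutable"
    return response

@app.route("/avatars/<path:filename>")
def avatar(filename):
    # Infrequently changing public resources: cache for a day, and allow stale
    # copies to be used if revalidation isn't possible.
    response = send_from_directory("avatars", filename)
    response.headers["Cache-Control"] = "max-age=86400"
    return response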

Directives

For brevity, this only covers the most common directives used inside Cache-Control. If you are looking for additional information, the MDN article on Cache-Control is pretty exhaustive. Note that its recommendations differ from the recommendations in this document.

max-age=n (and s-maxage=n)

  • instructs the user agent to cache a resource for n seconds, after which time it is considered "stale"
  • s-maxage works the same as max-age, but only applies to intermediary systems such as CDNs

no-store

  • tells user agents and intermediates not to cache anything at all in permanent storage, but note that some browsers will continue to cache in memory

no-cache

  • contrary to everything you would think, does not tell browsers not to cache, but instead forces them to check to see if the resource has been updated via ETag or Last-Modified
  • essentially the same as max-age=0, must-revalidate

must-revalidate

  • forces a validation when cache is stale – this can mean that browsers will fail to use a cached resource if it is stale but the site is down
  • generally only useful for things like HTML with time-specific or transactional data inside
  • if max-age is set, must-revalidate doesn't do anything until it expires

immutable

  • indicates that the body response will never change
  • when combined with a large max-age, instructs the browser not to check whether the resource is still valid, even when the user purposefully chooses to refresh their browser

public

  • indicates that even normally non-cacheable responses (typically those requiring Authorization) can be cached on public systems, such as CDNs and proxies
  • recommended to not use unless you're certain, as it's probably better to waste bytes than to make the mistake of having a private document get cached on a CDN

private

  • indicates that caching can happen only in private browser (or client) caches, not on CDNs
  • note that this wording can be deceiving, as “private” documents are frequently cached on CDNs, with high-entropy URLs
  • documents behind authentication are an example of a good target for the private directive

stale-while-revalidate=n

  • instructs browsers to use cached resources which have been stale for less than n seconds, while also firing off an asynchronous request to refresh the cache so that the resource is fresh on next use
  • great for services where some amount of staleness is acceptable (e.g. weather forecasts, profile images, etc.)
  • can provide a decent performance boost, as long as you're careful to avoid any issues where you require multiple resources to be fresh in a synchronized manner
  • browser support is still limited, so if you decrease max-age to compensate, note that it will affect browsers that don't yet support stale-while-revalidate

Common anti-patterns and pitfalls

Surveys of Cache-Control across the internet have identified numerous anti-patterns in broad usage. This list is not meant to be exhaustive, but simply to demonstrate how complex and sometimes misleading the Cache-Control header can be.

  • max-age=604800, must-revalidate

While there are times when max-age and must-revalidate are useful in combination, for the most part this says that you can cache a file for a week but must then refuse to use it once it goes stale, even if the hosting server is down. Instead, use max-age=604800, which says to cache it for a week while still allowing the use of a stale version if the resource is unavailable.

  • max-age=604800, must-revalidate, no-cache

no-cache tells user agents that they must check to see if a resource is unmodified via ETag and/or Last-Modified with each request, and so neither max-age=604800 nor must-revalidate do anything.

  • pre-check=0, post-check=0

You still see these directives appearing in Cache-Control responses, as part of some long-treasured lore for controlling how Internet Explorer caches. But these directives have never worked, and you're wasting precious bytes1 by continuing to send them.

  • Expires: Fri, 09 April 2021 12:00:00 GMT

While the HTTP Expires header works the same way as max-age in theory, the complexity of its date format means that it is extremely easy to make a minor error that looks valid but that browsers treat as max-age=0. As a result, it should be avoided in favor of the far simpler max-age directive.

  • Pragma: no-cache

Not only is the behavior of Pragma: no-cache largely undefined, but HTTP/1.0 client compatibility hasn't been necessary for about 20 years.

Glossary

  • fresh — a resource that was last validated less than max-age seconds ago
  • immutable — a resource that never changes, as opposed to mutable
  • stale — the opposite of fresh, a resource that was last validated more than max-age seconds ago
  • user agent — a user's browser, mobile client, etc.
  • validated — the user agent requested a resource from a server, and the server either provided an up-to-date resource or indicated that it hasn't changed from the last request

Learn More

Footnotes

  1. Technically not true thanks to HTTP/2 header compression, but don't send them regardless.

[Category: Security] [Permalink]


Analysis of the Alexa Top 1M sites (April 2019)

Prior to the release of the Mozilla Observatory in June of 2016, I ran a scan of the Alexa Top 1M websites. Despite these defensive technologies having been available for years, their usage rates were frustratingly low. A lack of tooling, combined with poor and scattered documentation, had led to minimal awareness of countermeasures such as Content Security Policy (CSP), HTTP Strict Transport Security (HSTS), and Subresource Integrity (SRI).

Since then, a number of additional assessments have been done, including in October 2016, June 2017, and February 2018. All three surveys demonstrated clear and continual improvement in the state of web security. With a year having gone by since the last survey, it seemed like the perfect time to give the world wide web another assessment.

April 2019 Scan

Technology                                February 2018  April 2019    % Change         % Change
                                                                       (Feb. 2018)      (All-Time [1])
Content Security Policy (CSP)             .022% [2]      .026% [2]     +18%             +420%
                                          .112% [3]      .142% [3]     +27%             +1083%
Cookies (Secure/HttpOnly) [4]             8.97%          10.79%        +20%             +474%
  — Cookies (SameSite) [4]                               .514%
Cross-origin Resource Sharing (CORS) [5]  96.89%         97.57%        +.70%            +4.0%
HTTPS                                     54.31%         71.67%        +32%             +142%
HTTP → HTTPS Redirection                  21.46% [6]     35.92% [6]    +67%             +610%
                                          32.82% [7]     52.15% [7]    +59%             +485%
Public Key Pinning (HPKP)                 1.07%          1.73%         +62%             +302%
  — HPKP Preloaded [8]                    0.70%          1.73%         +141%            +308%
Strict Transport Security (HSTS) [9]      6.03%          8.68%         +44%             +396%
  — HSTS Preloaded [8]                    .631%          .570%         -10%             +261%
Subresource Integrity (SRI)               0.182% [11]    0.770% [11]   +323%            +5033%
X-Content-Type-Options (XCTO)             11.72%         16.27%        +38%             +163%
X-Frame-Options (XFO) [12]                12.55%         16.42%        +31%             +140%
X-XSS-Protection (XXSSP) [13]             10.36%         11.74%        +13%             +133%

Number of sites successfully scanned: 976,431

The overall growth in adoption continues to be encouraging, particularly the rise in HTTPS usage and in redirections to HTTPS. An additional 170,000 sites in the Alexa Top 1M now support HTTPS, and about 190,000 more of the top million websites now send visitors there automatically by redirecting to their HTTPS counterpart.

Subresource Integrity has also seen a sharp increase in uptake, as more and more libraries and content delivery networks work to make its usage a simple copy-and-paste operation. We've also seen X-Content-Type-Options gain significantly increased usage, particularly given that it enables cross-origin read blocking and helps protect against side-channel attacks like Meltdown and Spectre.

While the usage of Content Security Policy has continued to grow, it seems to be slowing down a bit. Tools like the Mozilla Laboratory make policy generation a lot easier, but it still remains extremely difficult to retrofit CSP to old and sprawling websites like so many of the top million.

Lastly, whether a result of policy changes in how the HTTP Strict Transport Security preload list is administered or some weird bug in my code, the percentage of the Alexa Top 1M contained in the preload list fell slightly. Oddly enough, of the 20,105 sites that set preload, only 5,540 of them are actually preloaded.

Mozilla Observatory Grading

Progress continues to be made amongst the Alexa Top 1M websites, but the vast majority still do not use Content Security Policy, Strict Transport Security, or Subresource Integrity. When properly used, these technologies can nearly eliminate huge classes of attacks against sites and their users, and so they are given significant weight in Observatory grading.

Here are the overall grade changes over the last year. Please keep in mind that what is being tested now isn't the same as what was being tested three years ago. An A+ in April 2016 was considerably easier to acquire than an A+ is now.

Grade   April 2016   October 2016   June 2017   February 2018   April 2019   % Change
A+      .003%        .008%          .013%       .018%           .028%        +58%
A       .006%        .012%          .029%       .011%           .014%        +26%
B       .202%        .347%          .622%       1.08%           1.48%        +37%
C       .321%        .727%          1.38%       2.04%           1.82%        -11%
D       1.87%        2.82%          4.51%       6.12%           4.62%        -24%
F       97.60%       96.09%         93.45%      90.73%          92.03%       +1.43%

It's interesting to notice growth at both the top and the bottom. Over the last year, Observatory tests have gotten more difficult, particularly with regards to loading JavaScript over protocol-independent URLs such as this:

<script src="//example.com/script.js">

As a result, the bifurcation in scores likely indicates that more sites have decided to take web security seriously while others at the tail have fallen further into failure.

The Mozilla Observatory recently passed an important milestone of 10 million scans and has now helped over 175,000 websites improve their web security.

That's a big number, but I would love to see it continue to grow. So please share the Observatory so that the web can keep getting safer. Thanks so much!



Footnotes:

  1. Since April 2016
  2. Allows 'unsafe-inline' in neither script-src nor style-src
  3. Allows 'unsafe-inline' in style-src only
  4. Amongst sites that set cookies
  5. Disallows foreign origins from reading the domain's contents within user's context
  6. Redirects from HTTP to HTTPS on the same domain, which allows HSTS to be set
  7. Redirects from HTTP to HTTPS, regardless of the final domain
  8. As listed in the Chromium preload list
  9. max-age set to at least six months
  10. Percentage is of sites that load scripts from a foreign origin
  11. Percentage is of sites that load scripts
  12. CSP frame-ancestors directive is allowed in lieu of an XFO header
  13. Strong CSP policy forbidding 'unsafe-inline' is allowed in lieu of an XXSSP header

[Category: Security] [Permalink]


Lore of MTG - Battlemage

Magic: The Gathering: BattleMage

Released in 1997, Magic: The Gathering - BattleMage was a real-time strategy game by Acclaim Entertainment, published for the PlayStation and PC. Its gameplay bears little resemblance to the Magic we know today. Nevertheless, it is filled with an incredible amount of lore from early Magic history.

Due to its age and rarity – as well as the storyline's many branching paths – this lore had long since been considered lost to the Vorthos community.

Given Magic’s return to Dominaria, and BattleMage's significance in cards such as Time of Ice, I thought it best to crawl through BattleMage's code to extract the lore contained within.

I do hope you enjoy these texts, which are ordered as they appear in the game. Please contact me if you notice any mistakes in the editing. Thanks!

Stories

Geography

[Category: Magic] [Permalink]