It seems risky to me. HTTP compression is fine for static resources, but for some dynamic resources served over SSL, it seems like HTTP compression might be dangerous. It looks to me like HTTP compression can, in some circumstances, allow for CRIME-like attacks.
Consider a web application that has a dynamic page with the following characteristics:
It is served over HTTPS.
HTTP compression is supported by the server (this page will be sent to the browser in compressed form, if the browser supports HTTP compression).
The page has a CSRF token on it somewhere. Suppose the CSRF token is fixed for the lifetime of the session. This is the secret that the attack will try to learn.
The page contains some dynamic content that can be specified by the user. For simplicity, let us suppose that there is some URL parameter that is echoed directly into the page (perhaps with some HTML escaping applied to prevent XSS, but that is fine and will not deter the attack described).
Then I think CRIME-style attacks might allow an attacker to learn the CSRF token and mount CSRF attacks on the web site.
Let me give an example. Suppose the target web application is a banking website on www.bank.com, and the vulnerable page is https://www.bank.com/buggypage.html. Suppose the bank ensures that the banking stuff is only accessible by SSL (https). And suppose that if the browser visits https://www.bank.com/buggypage.html?name=D.W., then the server will respond with an HTML document looking something vaguely like this:
<html>...<body>
Hi, D.W.! Pleasure to see you again. Some actions you can take:
<a href="/closeacct&csrftoken=29238091">close my account</a>,
<a href="/viewbalance&csrftoken=...">view my balance</a>, ...
</body></html>
Suppose you are browsing the web over an open Wifi connection, so that an attacker can eavesdrop on all of your network traffic. Suppose that you are currently logged into your bank, so your browser has an open session with your bank's website, but you're not actually doing any banking over the open Wifi connection. Suppose moreover that the attacker can lure you to visit the attacker's website http://www.evil.com/ (e.g., maybe by doing a man-in-the-middle attack on you and redirecting you when you try to visit some other http site).
Then, when your browser visits http://www.evil.com/, that page can trigger cross-domain requests to your bank's website, in an attempt to learn the secret CSRF token. Notice that Javascript is allowed to make cross-domain requests. The same-origin policy does prevent it from seeing the response to a cross-domain request. Nonetheless, since the attacker can eavesdrop on the network traffic, the attacker can observe the length of all encrypted packets and thus infer something about the length of the resources that are being downloaded over the SSL connection to your bank.
In particular, the malicious http://www.evil.com/ page can trigger a request to https://www.bank.com/buggypage.html?name=closeacct&csrftoken=1 and look at how well the resulting HTML page compresses (by eavesdropping on the packets and looking at the length of the SSL packet from the bank). Next, it can trigger a request to https://www.bank.com/buggypage.html?name=closeacct&csrftoken=2 and see how well the response compresses. And so on, for each possibility for the first digit of the CSRF token. One of those should compress a little bit better than the others: the one where the digit in the URL parameter matches the first digit of the CSRF token in the page. This allows the attacker to learn the first digit of the CSRF token.
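To make the side channel concrete, here is a rough Python sketch of the idea. Everything in it is an assumption for illustration: `render_page` and `observed_length` are hypothetical stand-ins for the bank's page and for the packet length an eavesdropper would see, the `name` value is echoed without any escaping, and plain zlib stands in for whatever compression the server really applies. It only shows that a reflected guess matching the token yields a shorter compressed body than a non-matching guess of the same length.

```python
import zlib

# Hypothetical stand-ins for the example above; not any real server's code.
SECRET_TOKEN = "29238091"


def render_page(name: str) -> bytes:
    """Build the example page, echoing the `name` parameter verbatim."""
    html = (
        "<html>...<body>\n"
        f"Hi, {name}! Pleasure to see you again. Some actions you can take:\n"
        f'<a href="/closeacct&csrftoken={SECRET_TOKEN}">close my account</a>,\n'
        '<a href="/viewbalance&csrftoken=...">view my balance</a>, ...\n'
        "</body></html>\n"
    )
    return html.encode("utf-8")


def observed_length(name: str) -> int:
    """Compressed size of the page: roughly what an eavesdropper can measure."""
    return len(zlib.compress(render_page(name)))


prefix = "closeacct&csrftoken="

# Two guesses with the same plaintext length: one matches the token, one does not.
right = observed_length(prefix + "29238091")
wrong = observed_length(prefix + "13579024")
print(right, wrong)

# One measurement per candidate first digit, mirroring the requests described above.
first_digit_lengths = {d: observed_length(prefix + d) for d in "0123456789"}
print(first_digit_lengths)
```

The full-token comparison shows a clear gap. For a single digit the difference is at most about one byte, and Huffman coding or byte rounding can hide it, so several candidates may tie in `first_digit_lengths`; a real attacker would likely need extra tricks (padding, repeated measurements) to separate them.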
In this way, it appears that the attacker can learn each digit of the CSRF token, recovering them digit-by-digit, until the attacker learns the entire CSRF token. Then, once the attacker knows the CSRF token, he can have his malicious page on www.evil.com trigger a cross-domain request that contains the appropriate CSRF token -- successfully defeating the bank's CSRF protections.
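The digit-by-digit recovery can be sketched as a greedy loop over a length oracle. This is only an illustration under stated assumptions: `recover_secret` and `toy_oracle` are hypothetical names, and `toy_oracle` is an idealized oracle in which every additional matching character saves exactly one byte. Real compressed lengths are noisier than that, so this shows the search strategy rather than a working exploit.

```python
from typing import Callable


def recover_secret(oracle: Callable[[str], int], prefix: str,
                   length: int, alphabet: str = "0123456789") -> str:
    """Greedily recover `length` symbols, one position at a time.

    `oracle(guess)` is assumed to return the observed response length when
    `guess` is echoed into the page; a shorter response suggests a longer match.
    """
    known = ""
    for _ in range(length):
        # Try every candidate for the next position and keep the one
        # whose response was shortest.
        best = min(alphabet, key=lambda c: oracle(prefix + known + c))
        known += best
    return known


def toy_oracle(guess: str) -> int:
    """Idealized oracle: each matching leading character saves one byte."""
    target = "closeacct&csrftoken=29238091"
    matched = 0
    for g, t in zip(guess, target):
        if g != t:
            break
        matched += 1
    return 200 - matched


print(recover_secret(toy_oracle, "closeacct&csrftoken=", 8))  # prints 29238091
```

With 10 candidates per position and an 8-digit token, this needs about 80 requests instead of the 10^8 a blind guess would require, which is what makes the attack practical if the oracle is reliable.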
It seems like this may allow an attacker to mount a successful CSRF attack on web applications, when the conditions above apply, if HTTP compression is enabled. The attack is possible because we are mixing secrets with attacker-controlled data in the same payload, and then compressing and encrypting that payload.
If there are other secrets that are stored in dynamic HTML, I could imagine that similar attacks might become possible to learn those secrets. This is just one example of the sort of attack I am thinking of. So, it seems to me that using HTTP compression on dynamic pages that are accessed over HTTPS is a bit risky. There might be good reasons to disable HTTP compression on all resources served over HTTPS, except for static pages/resources (e.g., CSS, Javascript).
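As a rough sketch of the kind of policy I have in mind, the decision could look something like the following. The function name `should_compress`, the `STATIC_EXTENSIONS` set, and the idea of deciding by file extension are all assumptions made for illustration; a real server would express this in its own configuration, and `path` is assumed to carry no query string.

```python
import posixpath

# Assumption: these extensions identify static resources that contain no secrets.
STATIC_EXTENSIONS = {".css", ".js"}


def should_compress(path: str, is_https: bool) -> bool:
    """Return True if the response for `path` may be compressed."""
    if not is_https:
        # The concern here is only about responses sent over HTTPS.
        return True
    extension = posixpath.splitext(path)[1].lower()
    return extension in STATIC_EXTENSIONS


print(should_compress("/static/site.css", is_https=True))  # True
print(should_compress("/buggypage.html", is_https=True))   # False
```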