It seems that researchers have announced yet another vulnerability regarding OpenSSL, separate from the DROWN issue, called CacheBleed. You can view the research paper here.
How does it work, and what exactly should people worry about here?
Side-channel attacks are currently all the rage. The core idea is that while a cryptographic algorithm is a mathematical object whose inputs and outputs can be expressed abstractly, it must ultimately run on some physical hardware somewhere, and that execution may leave traces in various ways that reveal bits of information about the data being processed, in particular the private key.
It all began with the publication of timing attacks against RSA by Paul Kocher, back in 1996. In that case, the RSA private key could be rebuilt by making some precise measurements of the execution time of the RSA private operation (decryption) and correlating it with the input data.
There can be many ways by which hardware leaks information. Embedded systems, and in particular smart cards, are especially vulnerable because they get their power, and possibly even their clock signal, from the outside; and, more generally, they have to keep their secrets secret even when they are, quite literally, in the hands of the attacker. Software platforms, e.g. web servers, have it much easier, comparatively speaking.
There have been a lot of demonstrations in lab conditions of how timing attacks could be applied to algorithms and implementations on software platforms. There are two levels at play here:
The algorithm can have a data-dependent execution path leading to varying execution time even in the abstract model of the algorithm execution. This is what happens with usual square-and-multiply algorithms for modular exponentiation. Kocher's attack is about such an algorithm.
The algorithm may be fully "constant-time" within the abstract model, but still exhibit data-dependent execution time variation, directly or indirectly, because of the way it is implemented. For instance, a window-based optimization of square-and-multiply may still leak information because the table elements accessed in memory depend on the private key bits, and this implies varying behaviour with regard to caches.
A CPU cache is a pervasive optimization in all modern hardware platforms except the most reduced (smart cards do not usually have a cache, but that's about it). Caches work by keeping a copy of recently accessed data in a special piece of RAM that the CPU can read much more efficiently than common RAM. Details depend a lot on the cache technology (cache line size, alignment, write-back or write-through, associativity...). Cache-based attacks try to leverage this behaviour: throughout its execution, the algorithm accesses some data elements, evicting other data from the cache, so that the next read of the evicted data triggers a roundtrip to main RAM, which is detectably slow (in modern systems, an access that goes all the way to main RAM may take hundreds of clock cycles).
The attacker needs rather accurate access to the hardware to make cache-based measurements. In most demonstrations, the attacker can run his own code on the same hardware. This could be another process (in the mainframe model), but it also works if the attacker runs another virtual machine that happens to be scheduled on the same CPU as the target system. Since modern CPUs are multi-core, this is a normal occurrence even in the most reputable clouds. In some rare cases, lab-level demonstrations of network-based cache timing attacks have been run, by calling another service on the same machine that happens to exercise the relevant cache lines. In the network case, the attack still requires precise timing (we are talking about microseconds here), so it is generally assumed to be very hard to pull off from anywhere but the local LAN.
Defence against side-channel attacks entails using constant-time code. Basically, the memory access pattern shall be fixed for all possible secret values. This applies to both data accesses and code (since code itself is read from RAM). In particular, there must be no data-dependent conditional branches.
Writing constant-time crypto code can be tough. However, efficient constant-time implementations are known, e.g. for AES, and it is possible to make a good, fast and yet constant-time RSA implementation.
OpenSSL developers, however, did not go that far. They made an effort, but they tried to cut corners, and it backfired. This is what CacheBleed is about. The RSA implementation in OpenSSL is not constant-time in the sense explained above; it merely makes a spirited effort at keeping a fixed access pattern to cache lines, as opposed to actual memory slots. Thus it was constant-time only on some very specific hardware, not on all of it, and cache-based timing attacks remained possible. CacheBleed uses the "attacker's code on the same hardware" model, whose main practical incarnation nowadays would be two VMs running in the same cloud.
Do not panic, though, because it is unclear whether cache-based timing attacks are really practical. We have lots of lab demonstrations, but no actual usage in the wild has ever been spotted.
One important point to consider is that cache-based attacks are not restricted to cryptographic algorithms. Anything that processes secret data is conceptually vulnerable, and if a machine is doing encryption, then it certainly manipulates confidential data. Cryptographers tend to focus on cryptographic keys because keys concentrate confidentiality and are high-value targets; also, that's their job. However, the scope of cache timing attacks is larger. In other words, if cache timing attacks apply at all to your system, then you have a big problem, and the cryptographic layer is only a small part of it.
What should you do? The same thing as always: install the security patches. It should go without saying. That it still needs to be said is embarrassing. For you, I mean.
And, on a more general basis, if you intend to "go cloud" and share hardware with complete strangers, well, think twice. A really good cloud provider will enforce hardware separation for its big customers, i.e. ensure (contractually and technically) that a given machine will never concurrently run VMs from distinct customers. Alternatively, there is a lot of good to be said about private clouds.