3

Suppose I plugged a HWRNG into my Linux machine, used OpenSSL to generate an RSA key pair, and encrypted some text with AES. Later, a researcher reported that the HWRNG was backdoored. Should I consider those keys safe to use, since the Linux kernel mixes in data not just from the HWRNG but from other sources as well? How about vice versa?

I read Can we trust encryption when the random number generator is compromised? but that question seems to assume the HWRNG is the ONLY source.

Mike Ounsworth
Hartman
    This depends on the HWRNG you are using and how you interface it with OpenSSL. Most extra random sources have a user-mode daemon that mixes their entropy into the kernel pool; while this could maliciously decrease entropy, it cannot really affect security, especially since Linux uses RDRAND to whiten all bytes from that store. – eckes Sep 03 '17 at 16:45

2 Answers

4

There's no evidence that Intel's RDRAND instruction has been backdoored, just speculation based on it being a high-profile target for such attacks, and on the known nation-state capability to introduce subtle silicon-level backdoors into products at the supply chain level. While such a backdoor would be incredibly unlikely to be used in a non-targeted way, and even less likely to affect your average user, it's still an interesting concept.

Linux's random number generator implementation (i.e. the one exposed by /dev/urandom - forget about /dev/random for now because it's irrelevant for almost every use-case) involves several key components, the most important of which for this discussion are the entropy pool, the mixing system, and the generator itself.

The entropy pool is a block of memory which has data fed into it which is expected to be highly unpredictable. The goal is to introduce enough uncertainty that it is highly infeasible for anyone to guess what the data inside the pool is. The pool does not need to be large; even a few hundred bytes is quite sufficient to be effective.

The individual sources of data, which might include CPU debug registers, performance counters from the chipset, timing measurements from the display driver and HID, and all sorts of other feeds, are not independently useful as random numbers for cryptography. It is only in combination that they are useful, because in total it becomes very difficult to predict the resulting state. To combine these sources properly you need a mixing function that coalesces a set of independently somewhat-predictable values into a single highly unpredictable value, without losing any of that unpredictability in the process, while also resisting cases where an attacker controls or knows the value of a small number of the inputs. Cryptographic hash functions are great at this: you can feed in a concatenated string and get a single fixed-length unpredictable result. This output can then be fed into the entropy pool by xor'ing it with the existing data inside it, which has the useful property that even if the attacker knows the output of a single mixing step, it tells them nothing about previous or subsequent steps and should not allow them to predict the pool contents.
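To make the hash-then-xor idea concrete, here is a minimal sketch. It is a toy model, not the kernel's actual code: the pool size, the use of SHA-256, the length-prefixing, and the sample values are all assumptions chosen for illustration.

```python
import hashlib
from typing import Iterable

POOL_SIZE = 32  # bytes; illustrative only, not the kernel's real pool size


def mix_into_pool(pool: bytes, samples: Iterable[bytes]) -> bytes:
    """Hash all samples together, then XOR the digest into the pool."""
    h = hashlib.sha256()
    for sample in samples:
        # Length-prefix each sample so (b"a", b"bc") and (b"ab", b"c") hash differently.
        h.update(len(sample).to_bytes(4, "big"))
        h.update(sample)
    digest = h.digest()  # 32 bytes, the same length as the toy pool
    return bytes(p ^ d for p, d in zip(pool, digest))


# Hypothetical measurements, each one somewhat guessable on its own.
samples = [b"irq@1699", b"disk@0042", b"hid@7f3a"]
pool = mix_into_pool(bytes(POOL_SIZE), samples)
```

An attacker who controls one of the samples still cannot steer the digest without knowing the others, which is the property the paragraph above relies on.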

The /dev/urandom implementation takes contents from the entropy pool, generates a key from them, and uses it to key a stream cipher that acts as a pseudorandom function (PRF); since kernel 4.8 this is ChaCha20. This allows it to generate a huge amount of cryptographically useful random data (many gigabytes) from only a few hundred bits of data in the entropy pool, and do so very quickly (hundreds of megabytes per second). This alleviates the need for continuous sources of random data: even under heavy pressure to produce random bits, the PRF only needs re-keying periodically, which gives the system time to acquire more data from entropy sources like performance counters.
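The "small seed in, lots of output out" step can be sketched as follows. Note the loud assumption: Python's standard library ships no block or stream cipher, so this uses HMAC-SHA-256 in counter mode as a stand-in PRF. It is not the kernel's construction, and `derive_key` and `generate` are hypothetical names.

```python
import hashlib
import hmac


def derive_key(pool: bytes) -> bytes:
    """Derive a fixed-length key from the pool contents (toy derivation)."""
    return hashlib.sha256(b"toy-key-derivation" + pool).digest()


def generate(key: bytes, nbytes: int) -> bytes:
    """Expand the key into nbytes of output by running the PRF over a counter."""
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        # Stand-in PRF: HMAC-SHA-256(key, counter). Each call yields 32 bytes.
        out.extend(hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest())
        counter += 1
    return bytes(out[:nbytes])


key = derive_key(bytes(range(32)))  # pretend these 32 bytes came from the pool
stream = generate(key, 1_000_000)   # a megabyte of output from a 256-bit key
```

The point of the design is that the output is only as unpredictable as the key: anyone who learns or can force the pool contents can regenerate the whole stream.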

This is where the concern about RDRAND comes in. The implementation within the kernel does not use the mixing function like other sources do. Instead, its output is directly xor'ed into the entropy pool. This means that a backdoor in the silicon could potentially have knowledge of the pool contents and output complementary values which produce a predictable or known state, allowing the output of future calls to the RNG to be predicted. If you assume that such a backdoor exists, then the Linux entropy pool construction is vulnerable to such an attack.
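The attack is easy to see in a toy model. The sketch below assumes exactly what the paragraph assumes: a hypothetical malicious generator that can read the pool and whose output is xor'ed in without hashing. The function names are made up for illustration.

```python
import secrets


def xor_into_pool(pool: bytes, value: bytes) -> bytes:
    """Direct XOR into the pool, with no mixing function in between."""
    return bytes(p ^ v for p, v in zip(pool, value))


def backdoored_rdrand(pool: bytes, target: bytes) -> bytes:
    """Hypothetical malicious generator: sees the pool, returns pool XOR target."""
    return bytes(p ^ t for p, t in zip(pool, target))


pool = secrets.token_bytes(32)  # genuinely unpredictable starting state
target = bytes(32)              # state the attacker wants the pool to end up in

pool_after = xor_into_pool(pool, backdoored_rdrand(pool, target))
# pool XOR (pool XOR target) == target, so the honest entropy is cancelled out.
```

Had the generator's output gone through a hash together with other inputs first, the attacker could no longer pick the value that lands in the pool.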

Now keep in mind that all of this is highly theoretical. Surreptitiously introducing a silicon-level backdoor into the Intel supply chain is a fairly tall order even for nation states. That backdoor would also need to discern when to trigger: only when an RDRAND execution is part of the Linux kernel RNG, not when some other software or driver uses it, and not during any unit testing (whose code may be identical to that used in the real kernel source). On top of all of that, it would need to discern the location of the Linux entropy pool in memory and perform the necessary memory reads, which is very difficult at that level of attack implementation. TL;DR - it'd be one hell of a feat for anyone, even the NSA or GCHQ, to pull off, and I cannot possibly imagine them risking the exposure of that kind of capability by leveraging it against a mass-market product rather than only in targeted cases.

The Linux RDRAND implementation caught Linus Torvalds some flak on the mailing lists, but he steadfastly defends the approach. His argument is that in the case of such a sophisticated hardware attack, there are so many other (and easier) ways that the processor could expose cryptographic keys or even provide remote access to the system, and that the massive performance increase is worth bypassing the mixer for. With the recent revelations of the Intel AMT vulnerabilities, and the lack of openness and accountability for IME/SMM firmware, it seems quite reasonable to expect that much easier avenues of attack would be preferred.

Polynomial
  • (a) The OP didn't mention RDRAND. (b) The xoring approach that you describe above was changed in [December 2013](http://www.metzdowd.com/pipermail/cryptography/2013-December/019065.html), so it is no longer a concern. – Ángel Sep 03 '17 at 22:28
  • Holy cow! We now know what your opinion is! – Mike Ounsworth Sep 04 '17 at 01:00
  • Great read, I learned some stuff. +1 for the observation that the kernel's hw_rng tools XOR directly, bypassing the mixing function. -1 for focusing so heavily on `rdrand` when it was not mentioned in the question. Sorry :( – Mike Ounsworth Sep 04 '17 at 01:35
1

[I assume that your OpenSSL was in its default configuration, pulling from /dev/(u)random and not from the HW RNG directly.]

tl;dr

Short answer: this is a complex topic and depends on a bazillion factors, so there's no clear Yes / No answer here. But once the HW RNG has been exposed as backdoored, your risk is certainly higher.

Whether it's high enough for you to destroy the keys depends on your system setup when you generated them, and the value of the data they protect. If it's your family photos, then it's probably not worth it. If it's national military secrets, well, stop posting on this site.

If you're using a standard Linux distro with a mouse and keyboard attached that you were busy jiggling while you generated your keys, then your additional risk from a backdoored HW RNG is pretty much zero.


The Linux RNG entropy sources, in brief

Core entropy sources

The Linux Random Number Generator itself generates entropy from three sources, see the kernel source file random.c:

  • add_input_randomness() – time stamps of human input device events (key presses, mouse events, etc)
  • add_interrupt_randomness() - time stamps of CPU interrupt events
  • add_disk_randomness() – the time delay between the CPU requesting a disk IO event, and the IO event returning

All of these things are "mixed" into the entropy pool behind /dev/random and /dev/urandom. As each event comes in, the LRNG estimates the number of bits of entropy that this is adding to the pool. The vast majority of events count for 0 bits, but get mixed in anyway - the theory being that mixing predictable numbers is not harmful so long as it's not over-estimating the number of bits of entropy that it has in the pool.
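Here's a rough sketch of "always mix, but credit conservatively". It is a simplified toy, not the estimator in random.c: the `ToyPool` class, the hash-chain mixing, the 8-bit cap per event, and the jitter heuristic (smaller of the first- and second-order timestamp deltas) are all assumptions for illustration.

```python
import hashlib


class ToyPool:
    MAX_CREDIT_PER_EVENT = 8  # hypothetical cap, in bits

    def __init__(self) -> None:
        self.state = bytes(32)
        self.entropy_bits = 0
        self._last_time = 0
        self._last_delta = 0

    def add_event(self, timestamp: int) -> int:
        """Mix an event timestamp into the pool; return the bits credited."""
        delta = abs(timestamp - self._last_time)
        delta2 = abs(delta - self._last_delta)
        self._last_time, self._last_delta = timestamp, delta

        # Every event gets mixed in, whether or not it earns any credit.
        self.state = hashlib.sha256(self.state + timestamp.to_bytes(8, "big")).digest()

        # Credit only the unpredictable part: perfectly periodic events
        # have delta2 == 0 and therefore earn nothing.
        jitter = min(delta, delta2)
        credit = min(max(jitter.bit_length() - 1, 0), self.MAX_CREDIT_PER_EVENT)
        self.entropy_bits = min(self.entropy_bits + credit, 256)
        return credit


pool = ToyPool()
# After the very first event, a perfectly regular source earns no credit...
regular = [pool.add_event(t) for t in (1000, 2000, 3000, 4000)]
# ...while an event arriving off-schedule earns a few bits.
jittery = pool.add_event(5137)
```

Under-crediting is harmless here; the failure mode to avoid is believing the pool holds more entropy than it really does.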

HW RNG support in the kernel

More recent kernel versions are also aware of some models of hardware RNGs which, depending on what's on your motherboard and the compile flags your kernel was built with, may be mixed in to the entropy pool as well.

user-land daemons

Ok, so (to my knowledge) that's it for entropy sources within the kernel itself. As @eckes suggests in comments, there are many user-land tools (daemons) that will take random bytes from {wherever} and pipe them into /dev/random, which results in them getting mixed in and their number of bits of entropy being estimated. I don't know if distributions commonly come with such daemons pre-installed, but I wouldn't be shocked; for example, network packet timing is a common source of entropy for routers and such, and I think this needs to be done via a daemon (please correct me in comments so I can learn something).


Conclusion

So, the Linux RNG has many sources from which to seed its entropy pool. As long as you were getting good entropy from at least one of those sources at the time that you generated your keys, you're fine.

As you allude to in your question, a backdoored RNG is only really dangerous if it's your only source of entropy; if anything else is being mixed into the LRNG along with your backdoored bitstream, then the attacker will need to either control or guess those other bit streams as well.
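To put a number on "control or guess", here's a toy sketch. It assumes the attacker knows the backdoored stream completely and can observe something derived from the seed; `seed_from` and `brute_force_other` are hypothetical helpers, and SHA-256 stands in for the kernel's mixing.

```python
import hashlib
import secrets
from typing import Optional


def seed_from(backdoored: bytes, other: bytes) -> bytes:
    """Toy seed: hash of both input streams."""
    return hashlib.sha256(backdoored + other).digest()


def brute_force_other(backdoored: bytes, observed_seed: bytes, nbytes: int) -> Optional[bytes]:
    """Attacker who knows `backdoored` tries every possible `other` value."""
    for guess in range(256 ** nbytes):
        candidate = guess.to_bytes(nbytes, "big")
        if seed_from(backdoored, candidate) == observed_seed:
            return candidate
    return None


backdoored = b"\x00" * 32            # fully known to the attacker
weak_other = secrets.token_bytes(2)  # only 16 bits of honest entropy
seed = seed_from(backdoored, weak_other)

# 2**16 guesses is trivial, so the attacker recovers the other stream.
recovered = brute_force_other(backdoored, seed, 2)
```

With 16 honest bits the search finishes almost instantly; with 128 or more honest bits from the keyboard, interrupts, or disk, the same loop would need about 2**128 guesses, which is why one good source is enough.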

Mike Ounsworth