20

I'd like to generate a bunch of keys for long term storage on my MacBook.

What's a good way to:

  1. measure the amount of entropy and ensure it is sufficient before each key is generated, and
  2. increase the entropy if needed?

Something similar to Linux's /dev/urandom and /dev/random

Brian Armstrong
  • 1,015
  • 2
  • 11
  • 16
  • 3
    This is off-topic and needs to be migrated to Security SE, but I'll quickly answer the first question: `/dev/random` will block until the system estimates it has the entropy you asked for, so all you have to do is read and wait (if you don't want to block, there are system calls to get the current estimated entropy). – Thomas Sep 20 '13 at 00:36
  • 4
    This question appears to be off-topic because it is about software use of cryptography. – Thomas Sep 20 '13 at 00:37
  • This question appears off-topic because it is about the use of cryptography in a specific operating system environment. Of the three sets to which this question clearly belongs, the operating system looks like the strongest discriminant. I'd suggest a migration to `AskDifferent`. – dan Sep 20 '13 at 08:01

2 Answers

25

First, note that on OS X, /dev/random and /dev/urandom have the same behavior; /dev/urandom is provided as a compatibility measure with Linux. So, while you will see much advice online to use /dev/random because it will block if insufficient entropy is available, that is not the case under OS X.

/dev/random uses the Yarrow CSPRNG (developed by Schneier, Kelsey, and Ferguson). According to the random(4) manpage, the CSPRNG is regularly fed entropy by the 'SecurityServer' daemon. Unlike on Linux, reading from /dev/urandom does not exhaust an entropy "pool" and then fall back on a CSPRNG. Instead, Yarrow has a long-term state that is reseeded regularly from entropy collected in two different pools (a fast and a slow pool), and reading from /dev/random pulls directly from the CSPRNG.

Still, the random(4) manpage notes:

If the SecurityServer system daemon fails for any reason, output quality will suffer over time without any explicit indication from the random device itself.

which makes one feel very unsafe. Furthermore, it does not appear that OS X exposes a Linux-style /proc/sys/kernel/random/entropy_avail interface, so there is no way to measure how much entropy the SecurityServer daemon has fed Yarrow, nor does there appear to be a way to obtain the current size of the entropy pools.
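
For contrast, here is a minimal Python sketch of the Linux-side check being referred to; it assumes a kernel that exposes /proc/sys/kernel/random/entropy_avail, and simply reports the absence of that interface on OS X:

    #!/usr/bin/env python3
    # Read the kernel's entropy estimate, in bits (Linux-only interface).
    ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"

    def entropy_estimate() -> int:
        with open(ENTROPY_AVAIL) as f:
            return int(f.read().strip())

    if __name__ == "__main__":
        try:
            print(f"Estimated entropy: {entropy_estimate()} bits")
        except FileNotFoundError:
            print("No entropy_avail interface here (e.g. on OS X).")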

That being said, if you are concerned about the amount of entropy currently available, you can write directly to /dev/random to feed the CSPRNG more data, e.g. `echo 'hello' > /dev/random` (of course, hopefully you would use actually good, random data here). Truth be told, though, the default behavior probably suffices; but if you're feeling paranoid, use a Linux live distro, maybe with a hardware RNG attached, and generate your keys from /dev/random.
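
If you do want to feed the pool something better than 'hello', a sketch along these lines would do; the source path below is a placeholder for whatever hardware RNG or dice-generated file you actually trust:

    #!/usr/bin/env python3
    # Mix externally gathered randomness into the kernel CSPRNG by writing it
    # to /dev/random (works on both OS X and Linux; on Linux this stirs the
    # pool but does not increase the kernel's entropy estimate).
    EXTERNAL_SOURCE = "/path/to/external-randomness.bin"  # hypothetical path

    def feed_dev_random(num_bytes: int = 64) -> None:
        with open(EXTERNAL_SOURCE, "rb") as src, open("/dev/random", "wb") as dst:
            dst.write(src.read(num_bytes))

    if __name__ == "__main__":
        feed_dev_random()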

Reid
  • 408
  • 4
  • 9
  • The quote does *NOT* make me feel very safe at all. It is basically saying you may get low quality entropy without any way to know whether it is the case. Sounds pretty much like the worst case scenario to me. – Calimo Dec 12 '14 at 13:15
  • @Calimo I think that was either a typo or sarcasm, not sure which. – Jack O'Connor May 08 '15 at 17:33
  • It means two things: 1) you will continue to get pseudo-random numbers somehow correlated to the last injection of entropy and 2) if the state leaks at some point an attacker might be able to predict future random numbers. 1) doesn't really matter since we have good PRNGs nowadays, and 2) matters maybe in your threat model but I doubt it. – David 天宇 Wong Jul 07 '17 at 11:16
  • @Reid, re "does not exhaust an entropy.."; Which OS? From https://www.2uo.de/myths-about-urandom/structure-yes.png (https://www.2uo.de/myths-about-urandom/#before-linux-4.8) it seems like it wouldn't be exhausted anyway since `/dev/random` takes output **from** a CSPRNG. – Pacerier Feb 17 '18 at 23:51
  • @David天宇Wong, You mean CSPRNG? – Pacerier Feb 18 '18 at 00:12
  • @Reid, re "hardware RNG"; And which do you use? Cost prohibitive? – Pacerier Feb 18 '18 at 00:13
  • @Pacerier PRNG (cryptography) = CSPRNG (applied cryptography) – David 天宇 Wong Feb 20 '18 at 17:39
  • @David天宇Wong, Not sure I understood what you're trying to say there. – Pacerier Mar 06 '18 at 10:16
  • I meant that these terms are equivalent – David 天宇 Wong Mar 06 '18 at 20:12
  • @David天宇Wong, "applied cryptography" is different from "cryptography"? – Pacerier Mar 09 '18 at 12:49
  • It's related, but different at the same time. For example, I haven't heard of the term CSPRNG in cryptography – David 天宇 Wong Mar 09 '18 at 15:54
  • @David天宇Wong, By the term "applied cryptography", you are referring to that book right? – Pacerier Mar 09 '18 at 16:19
  • no :o I'm referring to actual applications of cryptography used in the real world. Usually the industry is referred to as "applied cryptography" whereas the academy does "theoretical cryptography". Sometimes they merge but that's basically it. – David 天宇 Wong Mar 12 '18 at 08:14
  • UPDATE: please note that with recent macOS kernel versions (at least >= Catalina), "Apple confirmed the kernel CPRNG is a Fortuna-derived design targeting a 256-bit security level". See https://apple.stackexchange.com/a/365529. No more Yarrow, and hopefully no less secure than Linux anymore. – Blacklight May 19 '20 at 08:24
16

In fact both the Mac OS X man page and the Linux man page are afflicted with the same disease: they talk of entropy as if it were some sort of gasoline that is consumed upon usage.

In reality, it does not work that way. Entropy is a measure of what a system could have been. In both Linux and Mac OS X, there is an internal "entropy pool", namely a set of bits which have been filled from hardware events. These events are supposed to be neither predictable nor measurable by attackers, so the pool, at any time, may have accumulated "n bits of entropy", meaning that if the attacker wanted to guess the pool contents, he would need, on average, 2^(n-1) tries. This pool is then extended into an arbitrarily long stream of pseudorandom bytes with a cryptographically secure PRNG.

Suppose that you have a pool with n bits of entropy. Now generate one gigabyte of data with the PRNG, seeded with that pool. What is the pool entropy after the generation? Still n bits! Indeed, the pool could still have 2^n possible contents, and an attacker trying to guess the pool contents is not helped in any way by having obtained the gigabyte of pseudorandom bytes. Why is that so? Because the cryptographically secure PRNG is cryptographically secure. That's exactly what is meant by "cryptographically secure": output "looks random" and cannot be predicted with higher probability than that of guessing the internal state, even if previous output has been observed in large quantities.
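
To make the principle concrete, here is a small Python illustration using SHAKE-256 as the expanding function; the kernel PRNGs (Yarrow, Fortuna, the Linux CRNG) use different constructions, but the property being relied on is the same:

    import hashlib

    def expand(seed: bytes, length: int) -> bytes:
        # Extendable-output function: any amount of output from a fixed seed.
        return hashlib.shake_256(seed).digest(length)

    seed = b"\x00" * 16                  # stand-in for a pool state with 128 bits of entropy
    stream = expand(seed, 1024 * 1024)   # a megabyte here; a gigabyte works the same way
    # Observing `stream` does not help an attacker guess `seed`: from their
    # point of view the seed still has all of its 2^128 possible values.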

However, for some reason, both the Mac OS X and Linux man page writers are apparently convinced that the randomness somehow degrades upon usage. This would make some sort of sense if the PRNG were weak and leaked information about the pool; but you would not want to use such a PRNG anyway. In other words, when someone writes "the output from /dev/(u)random loses quality after some time", he is implicitly telling you "the PRNG is broken and leaks data".

Fortunately, even if the man page writers are somewhat fuzzy in their heads about what entropy is, the implementations are better. To ensure security, what is needed is that the entropy pool at some point reaches a decent level of entropy (128 bits are enough); from that point and until the machine is next rebooted, /dev/urandom will yield high-quality randomness which is perfectly appropriate for all practical usages, including, yeah, generating long-term PGP or SSH keys. The tricky point is to ensure that no randomness is extracted before that moment of optimal entropiness.
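
In code, "use /dev/urandom for key material" really is that simple; Python's os.urandom reads the kernel CSPRNG on both Linux and OS X, and feeding the bytes into an actual PGP or SSH key generator is left to the respective tools:

    import os

    # 32 bytes = 256 bits of key material straight from the kernel CSPRNG.
    key_material = os.urandom(32)
    print(key_material.hex())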

When the machine boots up, the pool is empty. Then hardware events are gathered, continuously. FreeBSD is a Unix-like operating system which does everything right in that respect:

  • The OS maintains an "entropy estimate" of how much entropy it has gathered since boot time.
  • If the entropy estimate is below a given threshold, /dev/random will block and refuse to produce pseudo-randomness. It will wait until some decent entropy has been gathered.
  • Once the threshold has been reached, /dev/random will happily produce gigabytes of pseudorandomness, without ever blocking.
  • /dev/urandom is an alias for /dev/random.

This behaviour is nice and good: don't output bad randomness, but don't refuse to output tons of randomness when it can be done safely. Mac OS X imported a lot of kernel code from FreeBSD, so we may imagine that Mac OS X behaves like FreeBSD in that respect (this should be checked by inspecting the source code).

Linux is not as good, but not so bad either. On Linux:

  • /dev/random uses an entropy estimator, and (through flawed reasoning, as explained above) decreases this estimate when random bytes are output. /dev/random blocks when the estimate is below a given threshold.
  • /dev/urandom never blocks.

So /dev/urandom does the Right Thing, except right after boot: /dev/urandom may agree to produce pseudorandom bytes even if the entropy pool is still too shallow (i.e. too "predictable" by the attacker). To avoid this problem, Linux distributions apply a nifty fix: they save a random seed for the next boot. When the machine boots, the boot scripts inject that file into the pool, then produce a new file to be injected upon the next boot (sketched below). This mechanism reduces the window of "poor entropy" to the very first boot, when the OS was first installed. Afterwards, /dev/urandom is properly seeded, will produce high-quality randomness, and will never block.
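
A rough Python sketch of that seed-file mechanism; real boot scripts do essentially this in their own way (the seed path varies by distribution, and some implementations additionally credit entropy through a kernel ioctl, which is omitted here):

    import os

    SEED_FILE = "/var/lib/misc/random-seed"   # example path; distributions differ

    def restore_seed() -> None:
        # At boot: mix the saved seed into the pool by writing it to /dev/urandom.
        if os.path.exists(SEED_FILE):
            with open(SEED_FILE, "rb") as src, open("/dev/urandom", "wb") as dst:
                dst.write(src.read())

    def save_seed(size: int = 512) -> None:
        # Immediately afterwards (and again at shutdown): store a fresh seed
        # for the next boot, so a seed is never reused.
        with open(SEED_FILE, "wb") as dst:
            dst.write(os.urandom(size))

    restore_seed()
    save_seed()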

To sum up, just use /dev/urandom. It is fine, despite the millenarian prophecies of some man page writers; even on Linux, where a window of "low entropy" theoretically exists, distribution vendors apply corrective mechanisms.

(Note, though, that not all Linux derivatives may be fixed, in particular embedded systems: generating an entropy seed file for the next boot works only as long as there is a writable filesystem, which is not necessarily the case on such devices.)

Tom Leek
  • 168,808
  • 28
  • 337
  • 475
  • 3
    I have to disagree with your disagreement. The data in the entropy pool is nowhere near uniformly distributed, so in order to have nice bits they must mix it up, distil it, with hash functions or equivalent constructions. If the PRNG is broken, then the estimate of "how much entropy is revealed" is valid only as long as these hash-like constructions do their job properly, i.e. as long as they are not broken. But the PRNG itself uses the same constructions! So it is a security measure which works only as long as it is not needed. – Tom Leek Sep 25 '13 at 21:23
  • Another point is that experience shows that these entropy estimates have _never_ saved anybody from poor PRNG doom. In all cases of bad PRNG which have been reported so far in various OS, none of them was due to a poor PRNG used beyond some internal entropy measure; all of them were due to a poor PRNG used _at all_, and failing because of a design or implementation error which would not have been avoided (and, indeed, was not) with a blocking `/dev/random`. I thus maintain my point: indeed, the distinction between `/dev/random` and `/dev/urandom` is "virtually useless" (but alas not harmless). – Tom Leek Sep 25 '13 at 21:27
  • Re "This pool is then extended"; you mean the entire pool? Or only the last x bits form as the input to the CSPRNG? – Pacerier Feb 18 '18 at 00:16
  • Re "as been observed in large quantities"; Definition of ? – Pacerier Feb 18 '18 at 00:18
  • Re "the PRNG is broken and leaks data"; Actually they probably don't mean "is" but "may be broken either in the past/present/future". – Pacerier Feb 18 '18 at 00:20
  • Re "distributions apply a nifty fix"; which distributions you referring to? – Pacerier Feb 18 '18 at 00:32