12

The Raspberry Pi has a built-in hardware random number generator, but there doesn't appear to be any public documentation on the hardware, and even if there were, it would be questionable (what company is going to publicly admit that there might be problems with their own hardware if they can get away with it?).

So the question is: what is the best method of testing a hardware random number generator? If I filled a 10 GB file with numbers generated by the hardware random number generator, are there any tools on my main computer I could use to check for patterns in the output?

I find it hard to believe that there is no third-party way to verify whether a random number generator (be it software or hardware) is safe to use. Otherwise, why would anyone ever trust a random number generator that wasn't open source and couldn't be independently audited?

I'd like to use the Raspberry Pi to generate some private keys, but the normal software random number generator doesn't gather enough entropy to make this viable (it takes ages to generate a key), whereas with the hardware random number generator it is much more practical. I just don't know whether I can trust the hardware random number generator or not.

AviD
Cromulent

7 Answers

11

There are several factors at play here:

  • The physical HRNG mechanism used inside the ARM microprocessor.
  • The supporting logic on the silicon inside the microprocessor.
  • The microcode deployed on the microprocessor.
  • The implementation of the RNG inside Linux.

According to the manufacturer, that particular chip uses noise from a reverse-biased transistor in an open-collector HRNG design. This is precisely the hardware I'm using in my current HRNG project, only at a much larger scale: it's one of the cheapest ways to implement a HRNG directly in silicon, and where I'm using discrete transistors, the ARM processor simply shrinks this down to a few blips of silicon on the chip die.

We know that this mechanism is a strong source of randomness because it relies on a phenomenon called quantum tunnelling, which is known to be probabilistic and random. The basic idea is that electrons will randomly "jump" across the band gap inside the transistor, producing a randomly fluctuating signal. We can then amplify this (a simple PNP transistor amplifier will do) and interpret the output as either a 1 or a 0 by sampling it at a fixed frequency: if the signal exceeds a certain threshold it's a 1, otherwise it's a 0.

One slight deficiency of this mechanism is that any DC leakage will skew the output towards 1s appearing more often than 0s. To get around this, we can use a simple trick called von Neumann decorrelation, which involves encoding the bit pairs 01 and 10 as 0 and 1 respectively, and discarding all 00 and 11 pairs. This produces a statistically unbiased stream.
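The decorrelation step itself is only a few lines of code. A minimal Python sketch (the function name is mine, not from any particular library):

```python
def von_neumann(bits):
    """Von Neumann decorrelation: map the bit pair (0, 1) to 0 and
    (1, 0) to 1, discarding (0, 0) and (1, 1) pairs entirely.

    Removes bias from independent-but-biased input bits, at the cost
    of throwing away at least half of the input.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)  # pair 01 -> 0, pair 10 -> 1
    return out
```

Note the cost: a heavily biased source yields few 01/10 pairs, so the output rate drops well below half the input rate.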

I can almost certainly guarantee that this is the mechanism by which it produces random numbers. There's one major alternative (two cross-coupled NOT gates driven into metastability), but it's covered by an Intel patent, so I doubt ARM is using it. Unfortunately, the only way to know for certain is to grab some acid, decap a chip, and start reverse engineering the silicon under a microscope. I don't happen to have the spare gear available for this.

The potential vulnerability that you might find in such exploration is that high frequency clock signals or other HF data lines are routed very close to the HRNG or its supporting logic, leading to a potential for interference. This is usually unlikely in ICs, but we're talking about high sensitivity applications here, so it's worth looking for. Even if it were the case, though, I don't see it providing a useful skew from a cryptanalytic perspective.

The next potential avenue for exploitation is microcode. Recently, a researcher demonstrated that it was possible to patch the microcode of Intel processors to look for unique patterns in the instruction buffer and detect when the RDRAND instruction was being used to fill the /dev/random pool. The patch would then identify the position of the pool buffer in the processor cache and effectively zero it, causing the zeroed pool to be copied back into system memory. This meant that /dev/random would constantly output the same attacker-controlled value, with obviously devastating results. If a similar but more subtle trick were employed in the ARM microcode, it might be possible to massively reduce the entropy of the HRNG in a way known only to the creator of the patch. Detecting such tricks would be difficult, but it could be done by extracting the microcode and analysing it.

Finally, the last problem is the RNG design inside the Linux kernel. The /dev/random pool is usually fed from a number of state-based sources, using a mixing algorithm built upon a cryptographic function. However, when RDRAND or similar instructions are available, the engine simply XORs the data over the pool. This isn't a good idea, because it makes it easy to interfere with the pool in a meaningful way by producing chosen values. If the stronger mixing function were used instead, an attacker would need to break that function (or do something far more conspicuous) to gain meaningful control over the pool.
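To illustrate why plain XOR mixing is fragile, here is a toy Python sketch (the byte values are arbitrary placeholders, not real pool contents): because XOR is its own inverse, an attacker who knows what bytes they injected can cancel their own contribution exactly, whereas a hash-based mix offers no such algebraic shortcut.

```python
import hashlib

pool = b"\x13\x37" * 8  # stand-in for the current pool state
evil = b"\xd0\x0d" * 8  # attacker-controlled "random" input

# XOR mixing: injecting the same bytes a second time cancels the
# first injection entirely, since (p ^ e) ^ e == p.
mixed = bytes(p ^ e for p, e in zip(pool, evil))
unmixed = bytes(m ^ e for m, e in zip(mixed, evil))
assert unmixed == pool  # attacker contribution removed

# Hash mixing: no choice of input "undoes" an earlier contribution
# without breaking the hash function itself.
hashed = hashlib.sha256(pool + evil).digest()
```

The same linearity is what would let an attacker who knows the pool state steer the XOR-mixed output towards a chosen value.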

There is not an obvious answer to your question. Hardware random number generators can be very high quality, but it's difficult to analyse their implementation given only a consumer device. That being said, if you're going to distrust your hardware, you can't really make any guarantees in the first place. If you want to limit the scope of distrust entirely to the quality of random numbers produced, then design your supporting system around that fact.

Polynomial
  • 1
    "Recently, a researcher demonstrated that it was possible to patch microcode for Intel processors to look for unique patterns in the instruction buffer and detect when the RDRAND instruction was being used to fill the `/dev/random` pool." Any chance of a citation/hyperlink to that research? Thanks! – sampablokuper Oct 31 '15 at 01:29
  • 1
    @sampablokuper [*Stealthy Dopant-Level Hardware Trojans* - Becker et al.](http://sharps.org/wp-content/uploads/BECKER-CHES.pdf). Theodore Ts'o also wrote a PoC to demonstrate an attack on RDRAND by modifying the instruction's behaviour in bochs: [Twitter link](https://twitter.com/defusesec/status/408975222163795969), [explanation](http://pastebin.com/A07q3nL3) – Polynomial Nov 04 '15 at 13:58
  • @Polynomial Would you happen to have links to doc about “According to the manufacturer…”? – sunknudsen Aug 24 '21 at 18:35
  • @sunknudsen Unfortunately it looks like the original docs have been pulled since Broadcom started clamping down on putting the SoC datasheets behind an NDA. – Polynomial Aug 24 '21 at 20:00
  • @Polynomial Thanks! – sunknudsen Aug 24 '21 at 20:02
7

I have had the same problem in the past. The United States National Institute of Standards and Technology (NIST) publishes a suite of statistical tests for exactly this purpose (the Statistical Test Suite described in SP 800-22). It reports a p-value for each test, and it includes everything from simple checks, such as counting zeros and ones to see whether they are close to 50%, to sophisticated ones such as spectral analysis. There is an extensive, easy-to-read manual that explains the underlying theory and how to properly run the tests and interpret their results.
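The simplest test in the suite, the frequency (monobit) test, fits in a few lines. This Python sketch follows the formula given in SP 800-22 (the function name is mine):

```python
import math

def monobit_p_value(bits):
    """NIST SP 800-22 frequency (monobit) test.

    Maps bits to +/-1, sums them, and returns a p-value via the
    complementary error function. Values below ~0.01 suggest the
    balance of ones and zeros is suspicious.
    """
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))
```

A perfectly balanced stream scores a p-value of 1.0; a stream of all ones scores essentially 0.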

Fixed Point
  • The link in this answer now points to their RNG project. If anyone else is looking for the same information, here is a snapshot from the Internet Archive as of the month this answer was written: https://web.archive.org/web/20131210181724/http://csrc.nist.gov/groups/ST/toolkit/rng/index.html – FlippingBinary Mar 23 '22 at 16:32
5

I find it hard to believe that there is no third party way to verify whether a random number generator (be it software or hardware) is safe to use.

Unfortunately, this is the case. Unless the RNG is so horrendously bad that there is a visible bias in its output, it is very difficult to detect subtle weaknesses that could serve as backdoors without looking at the source.

  • Not strictly true - it is at least possible to decap the IC and use a microscope to verify that the internal HRNG design is of sufficient quality. – Polynomial Dec 22 '13 at 02:50
  • But then you can't use your decapped IC because you acided it to death. So maybe you buy two of them and use a random number generator to decide which one to decap and check, then assume the one you haven't destroyed is identical. But you can't trust the random number generator to decide which to decap, because maybe it's biased to choose the real RNG and hide the bad one! – daniel Mar 27 '17 at 13:10
  • @daniel: I assume this is a joke, but in case it's not, how would a biased RNG know you're generating a random number for that purpose, much less which of the chips is the real one? – flarn2006 Feb 10 '22 at 04:13
4

Regarding the tools to test the predictability of your random data: the de facto standard is the dieharder suite.
A Raspberry Pi-specific example, to be run after installing the kernel module and rng-tools:

apt-get install dieharder
cat /dev/hwrng | dieharder -g 200 -d 0 -vv

(The `-g 200` flag tells dieharder to read raw random bytes from stdin; without it, dieharder ignores the piped data and runs its tests against its own built-in mt19937 generator instead.)

#=============================================================================#
#            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
#=============================================================================#
       rng_name    |rands/second|   Seed   |
    stdin_input_raw|  1.34e+06  |2497347415|
#=============================================================================#
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
#=============================================================================#
   diehard_birthdays|   0|       100|     100|0.79423698|  PASSED  
2

Well, there are a few things you can try. I would point you to this page: https://calomel.org/entropy_random_number_generators.html

Check your Level of Randomness

You can test the level of entropy/randomness in your output file using ent. "Ent can be used against any file containing data and will test the randomness of that data."
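The headline figure ent reports is the Shannon entropy of the file in bits per byte (close to 8.0 for good random data, noticeably lower for biased or structured data). If you want to see what that measures, here is a small Python sketch computing the same quantity:

```python
import math
from collections import Counter

def shannon_entropy_bits_per_byte(data):
    """Shannon entropy of a byte string, in bits per byte.

    Counts byte frequencies and sums -p * log2(p) over all byte
    values that occur. The maximum possible value is 8.0.
    """
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Bear in mind that high entropy by this measure is necessary but not sufficient: compressed or encrypted files also score near 8.0 without being suitable as key material.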

Increase your entropy

There is a daemon called "rngd" that you can use. It runs in the background and significantly increases the amount of entropy available on the system. The quality of the data is about the same as what /dev/random would produce; rngd simply makes more of it per second.

It is worth bearing in mind the following article: http://arstechnica.com/security/2013/12/we-cannot-trust-intel-and-vias-chip-based-crypto-freebsd-developers-say/

The main change this has forced is to view hardware random number generators as pseudo-random number generators. Treating them that way means precautions can be taken to make the generated numbers more trustworthy, such as using their output only as a seed for a second (cryptographic) random number generator.
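As a sketch of that precaution, here is one simplified way to use the hardware output only as a seed, expanding it with a deterministic hash-based generator. SHA-256 in counter mode stands in for a proper DRBG here, and the fixed seed bytes are placeholders for a read from /dev/hwrng:

```python
import hashlib

def csprng_stream(seed, nbytes):
    """Expand a seed into a keystream by hashing seed || counter.

    Even if the seed source turns out to be somewhat biased, an
    attacker must still invert SHA-256 to learn anything about the
    raw hardware input from the stream.
    """
    out = b""
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:nbytes]

# In practice the seed would come from the hardware generator:
#   seed = open("/dev/hwrng", "rb").read(32)
```

For real key generation you would use a vetted construction (e.g. a NIST SP 800-90A DRBG) rather than this hand-rolled expansion, but the principle is the same.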

You could make your own random number generator. There are many interesting projects out there that will show you how.

DevilCode
1

A computer is a deterministic machine, and must always perform the same steps in the same order. A pseudo-random number generator is a software algorithm that produces "unpredictable" numbers under certain conditions: knowing any output of the generator will not help you determine the numbers generated before the sequence you know, and knowing any output will not help you determine future output. Of course, knowing the internal state allows you to predict future output. The initial state is called the seed.

In order to produce truly unpredictable numbers, the algorithm's state needs to be seeded with an unknowable value. But a mathematical algorithm can't produce one. So hardware designers create hardware that offers a measure of unpredictability.

If you don't trust the hardware, you can still gather as many sources of unknown information as possible and mix them together. If one source is biased, it will simply contribute less to the overall unknown state. These algorithms are known as entropy gatherers. The output is run through a cryptographic sponge function, which absorbs unknown data at whatever rate it arrives, then emits pseudorandom bytes on demand.
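As a rough Python illustration of the sponge idea, SHAKE-256 (a real sponge construction available in the standard library) can absorb several sources and then squeeze out bytes on demand. The source values here are placeholders for real measurements (interrupt timings, HRNG reads, disk seek jitter, and so on):

```python
import hashlib

def mix_entropy_sources(sources, nbytes):
    """Absorb several (possibly weak) entropy sources into a
    SHAKE-256 sponge and squeeze out pseudorandom bytes.

    A biased or even attacker-known source cannot reduce the output
    quality as long as at least one source is unpredictable.
    """
    sponge = hashlib.shake_256()
    for src in sources:
        # Length-prefix each source so concatenation is unambiguous.
        sponge.update(len(src).to_bytes(8, "big"))
        sponge.update(src)
    return sponge.digest(nbytes)
```

Changing any single source, even by one bit, completely changes the squeezed output.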

John Deters
  • 2
    This is largely untrue - modern systems have many ways of acquiring highly entropic data. On-die HRNGs can utilise truly random phenomena such as Johnson-Nyquist noise and quantum tunnelling to produce strong random numbers to great effect. – Polynomial Dec 22 '13 at 03:22
1

Testing a physical random number generator is similar to testing good-quality pseudo-random number generators (mathematical functions), and it can make use of the same tools developed for them: FIPS 140, STS 1.2.1, diehard, dieharder, TestU01, etc.

Interpreting the results of these tests is somewhat different when dealing with a true random number generator. When testing a generator based on a mathematical function, a test failure reveals a flaw in the basic generator design; with a physically based generator, however, we expect tests to fail occasionally on any given sample. As long as those failures do not repeat frequently across different samples, the generator is still producing what we expect it to.
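This expectation can be quantified: at a significance level of 0.01, even a perfect generator "fails" any given test about 1% of the time, so across a large battery of tests some failures are more likely than not. A small Python sketch of the binomial calculation (the function name is mine):

```python
from math import comb

def prob_at_least_k_failures(num_tests, alpha, k):
    """Probability that a perfect generator fails at least k of
    num_tests independent tests, each at significance level alpha."""
    p_fewer = sum(comb(num_tests, i) * alpha**i * (1 - alpha)**(num_tests - i)
                  for i in range(k))
    return 1 - p_fewer

# With 100 tests at alpha = 0.01, at least one "failure" occurs
# roughly 63% of the time even for an ideal generator.
```

So a single failed test in one run says very little; only failures that recur across independent samples are evidence of a real problem.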

Another difference is that a mathematically based generator only needs to be tested once to establish its validity; a physically based generator, however, should be tested regularly, since changes in the physical properties of the hardware collecting the randomness (entropy) can affect the validity of its output.

I have put together a web site detailing my ongoing explorations into the design, implementation, use, and testing of these physical generators. It contains more information if you're interested, including four sets of tests run on four different four-gigabyte samples collected from the Pi's internal hardware random number generator:

Overview of true random number generators

Raspberry Pi Tests