(Caveat: I certainly don't claim that HAVEGE lives up to its claims. I have not checked their theory or implementation.)
To get randomness, HAVEGE and similar systems feed on "physical events", and in particular on the timing of physical events. Such events include occurrences of hardware interrupts (which, in turn, carry data about key strokes, mouse movements, incoming ethernet packets, the time for a hard disk to complete a write request...). HAVEGE also claims to feed on all the types of cache misses which occur in a CPU (L1 cache, L2 cache, TLB, branch prediction...); the behaviour of these elements depends on what the CPU has been doing over the previous few thousand clock cycles, so there is potential for some "randomness". All of this hinges on being able to measure the current time with great precision (not necessarily accuracy), which is where the rdtsc instruction comes into play: it returns the current contents of an internal counter which is incremented at each clock cycle, so it offers sub-nanosecond precision.
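To make the general idea concrete, here is a toy sketch of "feeding on timing" (this is not HAVEGE's actual algorithm, and the pool variable and mixing step are made up for illustration; do not use it as a real RNG): read the cycle counter around operations whose duration depends on cache and interrupt activity, and fold the deltas into a small pool.

#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang for x86 */

/* Toy jitter harvester (illustration only): the delta between two
   counter reads varies with cache misses, branch mispredictions and
   interrupts, which is the kind of noise HAVEGE claims to exploit. */
int main(void)
{
    static volatile unsigned char buf[1 << 16];
    uint64_t pool = 0;
    uint64_t prev = __rdtsc();
    long i;

    for (i = 0; i < 100000; i ++) {
        buf[(i * 8191) & 0xFFFF] ^= (unsigned char)i;     /* provoke cache activity */
        uint64_t now = __rdtsc();
        pool = (pool << 1 | pool >> 63) ^ (now - prev);   /* rotate pool, mix in delta */
        prev = now;
    }
    printf("pool = %016llx\n", (unsigned long long)pool);
    return 0;
}

The point is only that the low-order bits of those deltas fluctuate; whether they fluctuate unpredictably enough is exactly the claim one would have to check.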
For a virtual machine system, there are three choices with regard to this instruction:
- Let the instruction go to the hardware directly.
- Trap the instruction and emulate it.
- Disable the instruction altogether.
If the VM manager chooses the first solution, then rdtsc has all the needed precision and should work as well as it would on a physical machine for the purpose of gathering entropy from hardware events. However, since this is a virtual machine, it is an application on the host system; it does not get the CPU all the time. From the point of view of the guest operating system using rdtsc, this looks as if its CPU were "stolen" occasionally: two successive rdtsc instructions, nominally separated by a single clock cycle, may report an increase of the counter by several million. In short, when rdtsc is applied directly on the hardware, the guest OS can use it to detect the presence of a hypervisor.
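As a rough, user-space illustration of that effect (only a sketch; a guest kernel can do this more convincingly, since it knows when it has not scheduled anything else, and ordinary interrupts already cause sizeable gaps even on bare metal), one can record the largest gap observed between two consecutive counter reads in a tight loop; scheduling quanta taken away by a host show up as much larger jumps:

#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang for x86 */

/* Record the largest gap between two back-to-back counter reads.
   With a raw hardware counter, time "stolen" by the host keeps
   accumulating while the VM is descheduled, so huge gaps appear. */
int main(void)
{
    uint64_t max_gap = 0;
    uint64_t prev = __rdtsc();
    long i;

    for (i = 0; i < 100000000L; i ++) {
        uint64_t now = __rdtsc();
        if (now - prev > max_gap) {
            max_gap = now - prev;
        }
        prev = now;
    }
    printf("largest gap: %llu cycles\n", (unsigned long long)max_gap);
    return 0;
}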
The second solution is meant to make the emulation more "perfect" by maintaining a virtual per-VM cycle counter, which keeps track of the cycles actually allocated to that VM. The upside is that rdtsc, from the point of view of the guest, no longer exhibits the "stolen cycles" effect. The downside is that the emulation works by triggering and trapping a CPU exception, which raises the cost of the rdtsc opcode from a few dozen clock cycles (it depends on the CPU brand; some execute rdtsc in less than 10 cycles, others take 60 or 70) to more than a thousand cycles. If the guest executes rdtsc a lot (as HAVEGE is prone to do), it will slow down to a crawl. Moreover, the exception handling code will disrupt the measurement: instead of timing the hardware event, the code ends up timing the execution of the exception handler, which can conceivably lower the quality of the extracted randomness.
The third solution (disabling rdtsc) will simply prevent HAVEGE from returning good randomness. Since it internally uses a PRNG, the output may still fool statistical analysis tools, because there is a huge difference between "looking random" and "being unpredictable" (statistical analysis tools follow the "looks random" path, but cryptographic security relies on unpredictability).
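To illustrate the gap between those two notions: a simple non-cryptographic generator such as xorshift64 produces output that looks fine to casual statistical tests, yet a single observed output reveals its entire internal state, and therefore the whole future stream. A minimal demonstration (the variable names are just for this example):

#include <stdint.h>
#include <stdio.h>

/* Marsaglia's xorshift64: statistically decent, cryptographically void,
   because the output is the internal state itself. */
static uint64_t xorshift64(uint64_t *state)
{
    uint64_t x = *state;

    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

int main(void)
{
    uint64_t secret = 0x0123456789abcdefULL;   /* "victim" generator state */
    uint64_t leaked = xorshift64(&secret);     /* one output is observed... */
    uint64_t attacker = leaked;                /* ...and it IS the new state */

    printf("victim's next output:  %016llx\n",
        (unsigned long long)xorshift64(&secret));
    printf("attacker's prediction: %016llx\n",
        (unsigned long long)xorshift64(&attacker));
    return 0;
}

Both lines print the same value; passing a statistical test says nothing about resisting that kind of prediction.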
The VirtualBox manual claims that VirtualBox, by default, follows the first method (rdtsc is unconditionally allowed and executed on the hardware directly), but it may be configured to apply the second solution (which you do not want in this case).
To test what your VM does, you can try this small program (compile with gcc -W -Wall -O on Linux; the -O flag is important):
#include <stdio.h>

#if defined(__i386__)

/* 32-bit x86: emit the rdtsc opcode bytes; "=A" binds the 64-bit
   result to the edx:eax register pair. */
static __inline__ unsigned long long rdtsc(void)
{
    unsigned long long int x;

    __asm__ __volatile__ (".byte 0x0f, 0x31" : "=A" (x));
    return x;
}

#elif defined(__x86_64__)

/* 64-bit x86: rdtsc puts the counter in edx:eax; reassemble it. */
static __inline__ unsigned long long rdtsc(void)
{
    unsigned hi, lo;

    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long)lo) | (((unsigned long long)hi) << 32);
}

#endif

int
main(void)
{
    long i;
    unsigned long long d;

    /* Time one million pairs of back-to-back rdtsc reads and
       report the average difference. */
    d = 0;
    for (i = 0; i < 1000000; i ++) {
        unsigned long long b, e;

        b = rdtsc();
        e = rdtsc();
        d += e - b;
    }
    printf("average : %.3f\n", (double)d / 1000000.0);
    return 0;
}
On a non-virtual machine, with the "true" rdtsc, this should report a value between 10 and 100, depending on the CPU brand. If the reported value is 0, or if the program crashes, then rdtsc is disabled. If the value is in the thousands, then rdtsc is emulated, which means that the entropy gathering may not work as well as expected.
Note that even getting a value between 10 and 100 is not a guarantee that rdtsc is not emulated, because the VM manager, while maintaining its virtual counter, may subtract from it the expected execution time of the exception handler. Ultimately, you really need to have a good look at the manual and configuration of your VM manager.
Of course, the whole premise of HAVEGE is questionable. For any practical security, you need only a few "really random" bits, no more than 200, which you then use as the seed of a cryptographically secure PRNG. The PRNG will produce gigabytes of pseudo-randomness indistinguishable from true randomness, and that's good enough for all practical purposes.
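A minimal sketch of that "seed once, then expand" architecture (illustration only: it hashes seed || counter with SHA-256 through OpenSSL's EVP_Digest, which demonstrates the principle but is not a vetted construction such as a NIST SP 800-90A DRBG or a stream cipher; the toy_drbg type and function names are invented for this example; link with -lcrypto):

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <openssl/evp.h>

/* Toy "seed once, expand forever" generator: each 32-byte output block
   is SHA-256(seed || counter).  Illustration of the principle only;
   use a vetted DRBG or stream cipher in real code. */
typedef struct {
    unsigned char seed[32];   /* the ~256 bits of real entropy */
    uint64_t counter;
} toy_drbg;

static void toy_drbg_init(toy_drbg *g, const unsigned char seed[32])
{
    memcpy(g->seed, seed, 32);
    g->counter = 0;
}

static void toy_drbg_block(toy_drbg *g, unsigned char out[32])
{
    unsigned char buf[40];
    unsigned int len = 32;

    memcpy(buf, g->seed, 32);
    memcpy(buf + 32, &g->counter, 8);   /* counter, native byte order */
    g->counter ++;
    EVP_Digest(buf, sizeof buf, out, &len, EVP_sha256(), NULL);
}

int main(void)
{
    static const unsigned char seed[32] = { 1, 2, 3 };  /* pretend this came from a real entropy source */
    toy_drbg g;
    unsigned char block[32];
    int i;

    toy_drbg_init(&g, seed);
    toy_drbg_block(&g, block);
    for (i = 0; i < 32; i ++) {
        printf("%02x", block[i]);
    }
    printf("\n");
    return 0;
}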
Insisting on going back to the hardware for every bit looks like yet another outbreak of that flawed idea which sees entropy as a kind of gasoline, which you burn up when you look at it.