7
alg: drbg: could not allocate DRNG handle for ...

I only see this error on the console during the boot process of virtual machines we create. EDIT: 2/5/16 - I see it on some bare-metal installations, too. (It does proceed to boot completely.) I assume it has something to do with the virtualized hardware and lack of a (compatible) random number generator. The problem is I can't assess the severity. Is the encryption strength compromised? (Should I even care about this error?) How can I fix it?

We are using QEMU/KVM under CentOS 6.7. I can do a virsh dumpxml of an example system if you really think it will help. We are using the Anaconda default cipher and key size (aes-xts-plain64/512).

This is the earliest reference I found on the linux-crypto mailing list. Unfortunately, it's a bit over my head.

http://www.mail-archive.com/linux-crypto%40vger.kernel.org/msg10398.html

alg: drbg: could not allocate DRNG handle for ...

Aaron Copley
  • Typically you encrypt the _storage_ that the VMs use, and this is transparent to the VM. This is easy enough to do if you run the infrastructure. – Michael Hampton Jan 04 '16 at 23:40
  • I don't normally encrypt VMs, but I have run into this edge case. If LUKS is unsupported in virtual environments, I'll accept that with citation. – Aaron Copley Jan 04 '16 at 23:43
  • Hmm. An errno gets passed up from that function, but those kernel messages don't bother to print it. – Michael Hampton Jan 04 '16 at 23:48
  • Red Hat/CentOS team might not have back ported exactly as shown in the mailing list? I'd have to check. – Aaron Copley Jan 04 '16 at 23:55
  • No, I'm talking about the code in the actual kernel (as of today!). – Michael Hampton Jan 04 '16 at 23:58
  • I installed a fresh encrypted VM. I see the same messages, but then the VM continues to boot normally. – Michael Hampton Jan 05 '16 at 00:02
  • Yes, sorry, I wasn't clear about that. It does boot normally. My main concern is that it's possibly insecure. I don't understand the error well enough to make that interpretation of it. Glad it's not just me, though. Thanks for confirming. – Aaron Copley Jan 05 '16 at 00:08
  • I expect the best thing to do is file a bug against the kernel. I would think that the tests ought to pass in that configuration, even if the configuration is a bad idea. – Michael Hampton Jan 05 '16 at 00:10
  • Ok, thanks. Feel free to clean-up these comments. I'll still leave this up for a few days in case anyone else has some info. – Aaron Copley Jan 05 '16 at 00:16

2 Answers

8

In short, I do not believe it affects the strength of your encryption.

I've checked the source code and, as long as I'm interpreting what I read correctly, you don't necessarily have to worry about this.

This code belongs to the module 'stdrng'. At least on Fedora 23 this is built into the kernel rather than exported as a kernel module.

When stdrng is initialized for the first time the following calls occur.

In crypto/drbg.c initialization starts here.

    module_init(drbg_init);

This registers all the DRBGs known to the system.

    for (j = 0; ARRAY_SIZE(drbg_cores) > j; j++, i++)
            drbg_fill_array(&drbg_algs[i], &drbg_cores[j], 1);
    for (j = 0; ARRAY_SIZE(drbg_cores) > j; j++, i++)
            drbg_fill_array(&drbg_algs[i], &drbg_cores[j], 0);

It then passes it to a helper function that performs the initialization:

    return crypto_register_rngs(drbg_algs, (ARRAY_SIZE(drbg_cores) * 2));

In crypto/rng.c this just iterates through each RNG to register it.

    for (i = 0; i < count; i++) {
            ret = crypto_register_rng(algs + i);
            if (ret)
                    goto err;
    }

crypto_register_rng does a bunch of initialization steps, then calls another function to do the actual registration.

    return crypto_register_alg(base);

What's not so obvious is what happens during registration.

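The crypto manager (cryptomgr), also built into the kernel, receives notifications of new algorithms being registered. Once it sees a newly registered algorithm it schedules a self-test of it; the tests themselves live in crypto/testmgr.c, which is where this message is printed (tcrypt is a separate test module). This is what produces the output you see on your screen.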

When the test is finished, the algorithm goes into a TESTED state. If the test fails, I imagine (I couldn't find the bit that produces this behaviour) it isn't selectable by searches that pass the right flags.

Whether or not the test passes is definitely internally stored.

In addition to this, requesting the pseudo-random number generator causes the list of registered PRNGs to be iterated in order of strength, as dictated by this note in crypto/drbg.c:

    /*
     * The order of the DRBG definitions here matter: every DRBG is registered
     * as stdrng. Each DRBG receives an increasing cra_priority values the later
     * they are defined in this array (see drbg_fill_array).
     *

Since the strongest one (HMAC SHA-256) does not fail, it's unlikely you are using the failed ones even if they could be selected.
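To make that selection rule concrete, here is a rough userspace sketch. It is not the kernel's lookup code; it only models the rule as described above (highest priority wins, entries that did not pass their self-test are skipped) by reading /proc/crypto. The function name `preferred_stdrng` is made up for illustration, and the field names assume the usual /proc/crypto layout.

```shell
# Print the stdrng driver with the highest priority among those whose
# self-test passed. This only models the rule described in this answer.
# $1: optional path to a file in /proc/crypto format (default: /proc/crypto)
preferred_stdrng() {
    awk -F ' *: *' '
        # Compare the entry just read against the best candidate so far.
        function consider() {
            if (name == "stdrng" && st == "passed" &&
                (best == "" || prio + 0 > best_prio)) {
                best = driver
                best_prio = prio + 0
            }
            name = ""; driver = ""; prio = ""; st = ""
        }
        $1 == "name"     { name = $2 }
        $1 == "driver"   { driver = $2 }
        $1 == "priority" { prio = $2 }
        $1 == "selftest" { st = $2 }
        NF == 0          { consider() }   # a blank line ends an entry
        END              { consider(); if (best != "") print best }
    ' "${1:-/proc/crypto}"
}
```

Running `preferred_stdrng` with no argument inspects the live system. If it prints one of the HMAC variants, that is consistent with the reasoning above.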

To summarize -

  • This happens when the stdrng module is required for something.
  • It loads all its known algorithms.
  • All loaded algorithms get tested. Some can fail (why is not considered in this answer).
  • Algorithms that failed their test shouldn't be available for selection later.
  • The PRNGs are ordered by strength, and strong PRNGs that do pass are tried first.
  • Things that rely on stdrng hopefully should not use the failed algorithms as the basis for their PRNG source.

You can see which algorithms have passed their self-tests using the following command:

 grep -EC5 'selftest.*passed' /proc/crypto

You can also see the selection priority with the 'priority' field. The higher the value the stronger the PRNG according to the module author.
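If the grep output is too noisy, a small awk filter can show just the stdrng entries with their priority and self-test result, highest priority first. This is only a convenience sketch: `list_stdrng` is a made-up name, and it assumes the usual "key : value" layout of /proc/crypto with a blank line between entries.

```shell
# List stdrng entries as "priority driver selftest", highest priority first.
# $1: optional path to a file in /proc/crypto format (default: /proc/crypto)
list_stdrng() {
    awk -F ' *: *' '
        $1 == "name"     { name = $2 }
        $1 == "driver"   { driver = $2 }
        $1 == "priority" { prio = $2 }
        $1 == "selftest" { st = $2 }
        # A blank line ends an entry: print it if it was an stdrng one.
        NF == 0 {
            if (name == "stdrng") print prio, driver, st
            name = ""
        }
        # Handle a final entry that is not followed by a blank line.
        END { if (name == "stdrng") print prio, driver, st }
    ' "${1:-/proc/crypto}" | sort -rn
}
```

Call it as `list_stdrng` on the affected machine. Entries whose last column is not "passed" are the ones that did not complete their self-test.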

So, I'm happy to be wrong here as I don't consider myself a kernel programmer, but in conclusion:

When stdrng loads it appears to select other algorithms from the list of acceptable algos which are considered stronger than the failed ones, plus the failed ones aren't likely selected anyway.

As such, I believe that this poses no additional risk to you when using LUKS.

Matthew Ife
  • Thanks for the very thorough analysis. Breaking it down really helped. Is there any known reason that these tests fail consistently in a QEMU/KVM guest? (I really should try in VirtualBox/VMware just for curiosity.) – Aaron Copley Jan 07 '16 at 16:56
1

How can I fix it?

Per the Red Hat Knowledge Base, you must add the 'ctr' kernel module to your initrd. Their instructions also say to include 'ecb', though it seems the issue is with the 'ctr' module not being loaded.

dracut -f -v --add-drivers "ctr ecb"
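To confirm that the rebuilt image really contains the modules, you can scan its file listing. A minimal sketch, assuming `lsinitrd` (shipped with dracut) is available and lists the image for the running kernel when called without arguments; `missing_modules` is a made-up helper name, and the pattern assumes module files are named like ctr.ko, optionally with a compression suffix.

```shell
# Read a file listing on stdin (for example the output of lsinitrd) and
# print each module name given as an argument that has no matching
# <name>.ko (or <name>.ko.<suffix>) file in the listing.
missing_modules() {
    listing=$(cat)
    for mod in "$@"; do
        printf '%s\n' "$listing" |
            grep -Eq "/${mod}\.ko(\.[a-z0-9]+)?\$" || echo "$mod"
    done
}
```

Typical use would be `lsinitrd | missing_modules ctr ecb`. No output means both modules were found. A module that is built into the kernel rather than shipped as a .ko file would be reported as missing, so treat the result as a hint only.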

Subscribers can see the full information. I am not sure if I am permitted to republish the rest here so I have paraphrased the full solution.

https://access.redhat.com/solutions/2249181

Edit 9/29/2016:

You can also add these drivers to /etc/dracut.conf so that they are added to the new initramfs on Kernel upgrades. Otherwise, your symptoms mysteriously reappear many months later. ;)

add_drivers+="ctr ecb"
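If you script your builds, the line can be added idempotently so that repeated runs do not pile up duplicates. This is just a sketch: `ensure_dracut_drivers` is a made-up helper name, it only checks for an exact match of the line it would write, and it needs write access to the file (root for /etc/dracut.conf).

```shell
# Append an add_drivers line to a dracut config file unless that exact
# line is already present.
# $1: config file path; remaining arguments: module names
ensure_dracut_drivers() {
    conf=$1
    shift
    line="add_drivers+=\"$*\""
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}
```

For the case above that would be `ensure_dracut_drivers /etc/dracut.conf ctr ecb`, followed by a dracut rebuild for the change to take effect on the current kernel.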
Aaron Copley