I feel like I'm asking a fairly obvious question here, but with it being so easy to make mistakes in this space, here goes.

From Wikipedia:

DK = PBKDF2(PRF, Password, Salt, c, dkLen)

dkLen is the desired length of the derived key

How do I decide which is the "desired length"?

user50849
  • What are you using it for? – CodesInChaos Mar 10 '14 at 21:47
  • Sorry, was aiming for a generic question. Encryption, do you want me to add more details than that as well? – user50849 Mar 10 '14 at 21:51
  • In theory the desired length is exactly what the protocol using PBKDF2 requires. In practice I recommend a [combination of PBKDF2 and HKDF](http://crypto.stackexchange.com/questions/5976/how-to-salt-pbkdf2-when-generating-both-an-aes-key-and-a-hmac-key-for-encrypt-t) – CodesInChaos Mar 10 '14 at 21:55

2 Answers

The output length for PBKDF2 should be whatever length you need. But there are details.

PBKDF2 is a Key Derivation Function: it produces a sequence of bytes of configurable length, intended to be used as a key for some symmetric encryption algorithm (or a MAC). So a first response is: if you want to use a symmetric encryption algorithm which expects, say, a 16-byte key (as AES-128 does), then you want 16 bytes out of PBKDF2.
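
As a minimal sketch of that first response, here is how the requested length maps onto Python's standard `hashlib.pbkdf2_hmac`, whose `dklen` argument plays the role of dkLen. The password, the iteration count, and the choice of HMAC/SHA-256 with a 16-byte key are illustrative assumptions, not recommendations from the question.

```python
import hashlib
import os

password = b"correct horse battery staple"  # example password (assumption)
salt = os.urandom(16)                        # fresh random salt for this password
iterations = 100_000                         # illustrative cost; tune for your hardware

# dklen is PBKDF2's dkLen: request exactly the key size the cipher expects.
# 16 bytes fits inside a single 32-byte HMAC/SHA-256 block.
key = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=16)
```

The salt and iteration count have to be stored alongside the ciphertext so the same key can be derived again later.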

Now for the details:

  • PBKDF2 has a drawback: it produces data in blocks, and its cost is proportional to the number of blocks you ask for. For instance, if you use PBKDF2 with HMAC/SHA-1 as the inner PRF (that's the usual case), then you get bytes in blocks of 20. If you want, say, 24 bytes, then PBKDF2 will need to produce two blocks, and this will be twice as expensive as if you wanted 20 bytes or less.

    This is unfortunate because PBKDF2 is intentionally slow, in order to defeat exhaustive search on the password. However, depending on how you use the PBKDF2 output, the attacker might be able to "test" a password on the first block of PBKDF2 output only. You pay for every block while the attacker pays for one, which in effect may give the attacker a 2x or 3x performance boost; that is counterproductive.

    The suggested work-around is to produce one block of output with PBKDF2 (say, 20 bytes), then expand these 20 bytes with another, fast KDF, into as many bytes as you need. HKDF is a fast KDF with good repute.

  • PBKDF2 is often abused for password hashing. With password hashing, you don't really use the output, you just store it for later comparison. For such a use, you need to store sufficiently many bytes so that attackers don't "get lucky": finding an output matching what was stored should be overwhelmingly improbable. If you use PBKDF2 for password hashing, then 12 bytes of output ought to be fine, with a non-negligible security margin.
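
The work-around from the first bullet (one slow PBKDF2 block, then a fast expansion) can be sketched as follows. This is only an illustration under stated assumptions: `hkdf_expand` is a hand-written version of the HKDF-Expand step from RFC 5869 using the standard `hmac` module, `derive_keys` and the `b"encryption"` / `b"mac"` labels are hypothetical names, and HMAC/SHA-256 (32-byte blocks) stands in for the SHA-1 example. In real code a vetted HKDF implementation would be preferable.

```python
import hashlib
import hmac

def hkdf_expand(prk: bytes, info: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """HKDF-Expand (RFC 5869): stretch a pseudorandom key to `length` bytes."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested length too large for HKDF-Expand")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i), with T(0) empty
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]

def derive_keys(password: bytes, salt: bytes, iterations: int):
    # Exactly one slow PBKDF2 block: 32 bytes is the HMAC/SHA-256 output size.
    master = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
    # Cheap expansion into separate keys, distinguished by the info label.
    enc_key = hkdf_expand(master, b"encryption", 32)
    mac_key = hkdf_expand(master, b"mac", 32)
    return enc_key, mac_key
```

With this shape, asking for more key material costs a few extra HMAC calls rather than another full iteration count, so the defender and the attacker pay the same price per password guess.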

Thomas Pornin

If you are using the output for password hashing, then the output length:

  • Must be no more than the native hash's output size
    • SHA-1 is 20 bytes, SHA-224 is 28 bytes, SHA-256 is 32 bytes, SHA-384 is 48 bytes, SHA-512 is 64 bytes
  • Must be no less than your risk tolerance allows. In practice, I'd say anything less than 20 bytes (SHA-1 native output size) is too small.
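
For the password hashing case, a rough sketch might look like the following. The function names, the 16-byte salt, the iteration count, and the choice of HMAC/SHA-256 with a 32-byte output (exactly one native block, which sits inside both bounds above) are assumptions made for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative cost; tune for your hardware
HASH_LEN = 32         # one native HMAC/SHA-256 block, no more

def hash_password(password: str):
    """Return (salt, digest) to store for later comparison."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=HASH_LEN
    )
    return salt, digest

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=len(stored)
    )
    return hmac.compare_digest(candidate, stored)
```

Usage would be `salt, stored = hash_password("hunter2")` at registration and `verify_password(attempt, salt, stored)` at login.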

If you are using the output directly as only a single encryption key:

  • Should be equal to the size of the encryption key you need
    • Ideally is also no more than the native hash's output size (see above)

If you are using the output as both an encryption key and a MAC, or any other case where you're using the output for more than one purpose:

The reason for "no more than the native hash's output size" is that RFC 2898 section 5.2, which defines PBKDF2, specifies what happens when you request more output bytes (dkLen) than the native hash function supplies: the full iteration count is run for the first block of native hash size, then run again for the second block, and so on until enough blocks exist. The last block is truncated if the remainder you need is shorter than the native output size.
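
To make the block arithmetic concrete, here is a small sketch. `pbkdf2_block_count` is a hypothetical helper that computes the number of blocks as the ceiling of dkLen divided by the hash output size; the password, salt, and iteration count are throwaway values. It also checks that a longer output starts with the shorter one, which is why an attacker may only need to compute the first block.

```python
import hashlib
import math

def pbkdf2_block_count(dk_len: int, hash_name: str) -> int:
    """Number of full-cost blocks PBKDF2 computes for dk_len output bytes."""
    h_len = hashlib.new(hash_name).digest_size
    return math.ceil(dk_len / h_len)

print(pbkdf2_block_count(20, "sha1"))    # 1 block
print(pbkdf2_block_count(24, "sha1"))    # 2 blocks, so twice the work
print(pbkdf2_block_count(32, "sha256"))  # 1 block

# Each block is computed independently of dkLen, so the first 20 bytes
# of a 40-byte output are identical to the 20-byte output.
short_key = hashlib.pbkdf2_hmac("sha1", b"password", b"salt", 1000, dklen=20)
long_key = hashlib.pbkdf2_hmac("sha1", b"password", b"salt", 1000, dklen=40)
first_block_matches = long_key[:20] == short_key
print(first_block_matches)  # True
```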

Anti-weakpasswords