PBKDF2 doesn't seem to have a standardised implementation?

Question

I recently read about the bug in Django with regards to PBKDF2 causing Denial of Service with large passwords:

https://www.djangoproject.com/weblog/2013/sep/15/security/

This is because PBKDF2 "mixes" in the input password with each iteration, therefore amplifying the effect of a large input. Django originally solved this by setting a password character limit of 4096 bytes.

I tried to replicate this in Go, only to find that their implementation isn't vulnerable. This is because they HMAC the input password before performing the iteration loops, hence the iterations are always on a constant size input. Here's the source in question:

https://github.com/golang/crypto/blob/master/pbkdf2/pbkdf2.go

I then found that the Django "fix" which set a maximum input size was reversed, and they instead fixed it by implementing a similar solution to Go:

https://github.com/django/django/commit/68540fe4df44492571bc610a0a043d3d02b3d320

This is baffling to me. Does the PBKDF2 spec say that it's supposed to be HMACed first? As far as I can tell, it doesn't. Does that mean that the Go and current Django implementations are non-compliant to the spec? Was the previous Django implementation more correct, but they've modified PBKDF2 to fix the DoS issue?

The OpenBSD implementation doesn't seem to perform the initial HMAC either: http://bxr.su/OpenBSD/lib/libutil/pkcs5_pbkdf2.c

But the PHP implementation does: https://github.com/php/php-src/blob/1c295d4a9ac78fcc2f77d6695987598bb7abcb83/ext/hash/hash.c#L600-L723

Why does there seem to be no standardization in this regard? Which is the "correct" implementation?

PS: I found this discussion regarding it, which came to the conclusion that PBKDF2 is not meant to be HMACed first, and that a "user friendly" wrapper should perform the HMAC before handing to the PBKDF2 algo: https://github.com/defuse/php-encryption/issues/230

That makes sense to me based on my reading of the RFC. Is that correct?

score 2 · Accepted Answer · edited Nov 09 '19 at 20:56

Saying that the Golang implementation computes the HMAC of the input password before performing the iteration loop, therefore making PBKDF2 non-standard compliant, is not fully accurate.

The Golang implementation of PBKDF2 is fully standard compliant and interoperable.

First note that - strictly speaking - PBKDF2 requires a PRF (Pseudo Random Function). The PBKDF2 in the Golang crypto library is limited in that it only works if the PRF is implemented as an HMAC (which is fine, since everybody does that in practice).

Having said that, assuming the PRF is an HMAC, the inner and most time consuming loop of PBKDF2 does this:

U_1 = HMAC(password, S || INT (i)) ,
U_2 = HMAC(password, U_1) ,
...
U_c = HMAC(password, U_{c-1}) .

where c is a very high number and all U_x items are short and matching in length the output of the underlying hash function, such as 32 bytes for SHA-256.

For very long passwords, HMAC is actually a composition that looks like this:

Kp = SHA-256(password)
HMAC(password, data) = SHA-256(Kp xor constant1 || SHA-256(Kp  xor constant2 || data))

You may notice that a naive implementation could compute the very same Kp = SHA-256(password) value for every iteration of the PBKDF2 loop (so, c times), with a cost that grows linearly with the length of the password, and that may lead to a DoS if an attacker is free to pick any password (as it happened with Django).

Most implementations (such as Golang's) are more careful and cache the intermediate computation of the SHA-256 function up until data, which is the only value that changes from iteration to iteration, that is:

HMAC(password, data) = SHA-256(Kp xor constant1 || SHA-256(Kp  xor constant2 || data))
                       ^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^
                                         CACHED ACROSS ITERATIONS

In such optimized implementations, every iteration consists of completing the HMAC starting from the cached value by feeding the small data component. The cost of the algorithm is now almost fixed and independent of the password length (apart from the step required to create the cached state).

Conclusion: the Golang implementation of PBKDF2 is fully standard compliant and interoperable, although it only works for the case where the PRF is an HMAC. The implementation is secure against DoS attacks because it precomputes the intermediate HMAC state which is always the same for all iterations, including a contribution from the secret password. However, that is not an HMAC of the password, and it is just an optimization of the standard algorithm.

It should be noted that the PRF used for PBKDF2 is a parameter of PBKDF2, so if you use a different PRF, you will of course get different results. — , Oct 15 '19 at 11:26
Correct, and with another PRF you may also need a different countermeasure against a DoS attack. — SquareRootOfTwentyThree, Oct 15 '19 at 11:32
The simplest countermeasure I can imagine is to limit the password length to 512 characters. There is no measurable security benefit from using a password longer than 512 characters. — , Oct 15 '19 at 11:34
I never responded at the time, but this comment was super helpful. I actually wrote a POC for the vulnerability here for those interested in learning more: https://github.com/rsheasby/PBKDF2-DoS-POC — Ryan Sheasby, Nov 11 '21 at 09:28

PBKDF2 doesn't seem to have a standardised implementation?

1 Answers1