As a password cracker, I encourage all of my targets to use this technique.
This (understandably!) seems like a good idea, but it turns out that against real-world attacks, wrapping an unsalted hash with bcrypt is demonstrably weaker than simply using bcrypt.
(EDIT: First, to be clear up front, bcrypt(md5($pass)) is much better than md5($pass) alone - so none of this should be taken to mean that this scheme should be left as is.)
Wrapping an unsalted hash is problematic from a real-world attack perspective because attackers can do this:
- Acquire existing MD5 passwords from leaks - even MD5s that haven't been cracked yet
- After simpler attacks have been exhausted, run these MD5s as a "wordlist" against your bcrypt(md5($pass)) corpus, to identify uncracked bcrypts with known MD5s
- Crack those MD5s outside of bcrypt at much higher speed
And yes - you do have to discover the MD5 inside the bcrypt first. But the crucial point is that that MD5 can be an otherwise uncracked MD5 that happens to be present in some other leak, which you can then attack at massively increased speeds.
This is not a theoretical attack. It is used all the time by advanced password crackers to successfully crack bcrypt hashes that would otherwise be totally out of reach for the attacker.
How this attack works is very non-intuitive for non-specialists, so I strongly encourage skeptics to experiment with a real-world scenario to understand how it works:
- Hash a 6-character random password with MD5.
- Presume that this MD5 is already present in some other list of leaked passwords, proving that it has been used as a password at some point.
- Try to attack the MD5 directly with brute force.
- Wrap the MD5 in bcrypt and try to attack it directly with brute force.
- Attack the same bcrypt-wrapped MD5 again, but this time pretend that you haven't cracked the MD5 yet, and instead use a "dictionary" of leaked MD5s that includes your MD5.
- Once you've "discovered" that you have an MD5 in hand that is inside one of your bcrypts, attack the MD5, then pass the resulting plaintext to your bcrypt(md5($pass)) attack.
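The experiment above can be sketched in a few lines of Python. This is only a rough illustration under loud assumptions: PBKDF2 from the standard library stands in for bcrypt (it is not bcrypt, just a deliberately slow salted hash that needs no third-party package), the password is 3 lowercase letters instead of 6 random characters so the brute-force step finishes quickly, and the names `slow_wrap`, `shuck`, and `crack_md5` are made up for this example.

```python
import hashlib
import itertools
import string

ITERATIONS = 20_000  # assumed stand-in cost; real bcrypt uses a work factor


def md5_hex(password: str) -> str:
    """Inner, unsalted hash: the 32-character hex MD5, as PHP's md5() returns."""
    return hashlib.md5(password.encode()).hexdigest()


def slow_wrap(inner_hex: str, salt: bytes) -> bytes:
    """Stand-in for bcrypt(md5($pass)). PBKDF2 is NOT bcrypt; it only plays
    the role of the slow, salted outer hash in this sketch."""
    return hashlib.pbkdf2_hmac("sha256", inner_hex.encode(), salt, ITERATIONS)


def shuck(target: bytes, salt: bytes, leaked_md5s):
    """Run leaked MD5s as a 'wordlist' against the slow hash.
    This costs one slow hash per leaked MD5, not one per possible password."""
    for candidate in leaked_md5s:
        if slow_wrap(candidate, salt) == target:
            return candidate
    return None


def crack_md5(target_md5: str, alphabet: str, length: int):
    """Brute-force the bare MD5 at fast-hash speed, outside the slow wrapper."""
    for chars in itertools.product(alphabet, repeat=length):
        guess = "".join(chars)
        if md5_hex(guess) == target_md5:
            return guess
    return None


# The defender's stored hash for one user.
salt = b"per-user-salt-01"
secret = "qzx"
stored = slow_wrap(md5_hex(secret), salt)

# MD5s from some other leak; none of them needs to be cracked yet.
leaked = [md5_hex(p) for p in ("hunter2", "letmein", secret, "correcthorse")]

shucked = shuck(stored, salt, leaked)                       # find the inner MD5
plaintext = crack_md5(shucked, string.ascii_lowercase, 3)   # attack it fast
confirmed = slow_wrap(md5_hex(plaintext), salt) == stored   # one final slow hash
```

The point to notice is where the slow hash is spent: a handful of times in `shuck` and once for confirmation, while every password guess in `crack_md5` costs only a single MD5.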
Again, very non-intuitive, so play with it (and don't feel bad that it takes work to understand it; I argued vigorously against it with Jeremi Gosney for an hour straight before I finally got it!)
I don't believe that this technique has an "official" name, but I've been calling it "hash shucking" or just "shucking."
Depending on the use case, it's totally understandable why wrapping a hash in bcrypt can be attractive: for example, to get beyond the 72-byte bcrypt input maximum (though this can be tricky for other reasons, including the 'null byte' problem), or to migrate existing hashes.
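As a rough illustration of the 'null byte' problem: if the inner hash is handed to bcrypt as raw binary, a bcrypt implementation that treats its input as a C string will stop reading at the first null byte. The sketch below only demonstrates that such digests are easy to find; `c_string_view` is a hypothetical helper that mimics C-string truncation, not a call into any real bcrypt library.

```python
import hashlib
import itertools


def first_password_with_null_digest():
    """Search simple candidates until one has a raw MD5 digest that starts
    with a null byte (roughly 1 in 256 candidates will)."""
    for n in itertools.count():
        candidate = f"password{n}"
        raw = hashlib.md5(candidate.encode()).digest()
        if raw[0] == 0:
            return candidate, raw


def c_string_view(data: bytes) -> bytes:
    """Hypothetical helper: the bytes a C-string-based implementation
    would see, i.e. everything before the first null byte."""
    return data.split(b"\x00", 1)[0]


candidate, raw = first_password_with_null_digest()

# Raw digest: truncated to nothing, so every such password looks the same.
seen_raw = c_string_view(raw)

# Hex encoding never contains a null byte, so it survives intact.
hex_form = hashlib.md5(candidate.encode()).hexdigest().encode()
seen_hex = c_string_view(hex_form)
```

Encoding the inner hash as hex (or base64) before wrapping avoids this particular trap, but it does nothing about the shucking weakness described above.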
So if someone needs to wrap a hash in bcrypt, the mitigation for this weakness should be clear by now: your inner hash must never appear in any other password storage system that might ever become available to an attacker. This means that you must make the inner hashes globally unique.
For your specific use case - in which you need to preserve existing hashes - there are a few options, including:
- adding a global pepper within your web or DB framework - so, bcrypt($md5.$pepper). This allows you to easily migrate existing MD5s, but that global pepper is still subject to being stolen (though if your web tier is segmented from your DB tier/auth, this might be an acceptable risk, YMMV);
- adding a global pepper using HSM infrastructure (storing the pepper in such a way that not even the web app can see it, so it can't be stolen);
- adding an extra per-hash salt (but you'd have to store it outside of the hash somehow, which starts to get tricky and verges into 'roll your own crypto' territory);
- hashing the MD5s with a slow, salted hashing algorithm or HMAC inside the bcrypt layer (not recommended: I'm not even vaguely qualified to advise on how that might be done properly, but it is possible. Facebook is doing it, but some very smart people designed that).
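To make the first option concrete, here is a minimal sketch of the bcrypt($md5.$pepper) migration, again with assumptions stated up front: PBKDF2 stands in for bcrypt so the example runs on the standard library alone, the pepper value is a placeholder, and `migrate`, `verify`, and `shuck_without_pepper` are hypothetical names rather than any framework's API.

```python
import hashlib
import hmac
import os

ITERATIONS = 20_000  # assumed stand-in cost; real bcrypt uses a work factor
PEPPER = b"hypothetical-site-wide-secret"  # placeholder; keep it outside the DB


def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode()).hexdigest()


def slow_hash(data: bytes, salt: bytes) -> bytes:
    """Stand-in for bcrypt (PBKDF2 is NOT bcrypt)."""
    return hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)


def migrate(existing_md5_hex: str, salt: bytes) -> bytes:
    """Wrap an already-stored MD5 without knowing the password:
    the equivalent of bcrypt($md5.$pepper)."""
    return slow_hash(existing_md5_hex.encode() + PEPPER, salt)


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """At login, rebuild md5(password).pepper and compare in constant time."""
    candidate = slow_hash(md5_hex(password).encode() + PEPPER, salt)
    return hmac.compare_digest(candidate, stored)


def shuck_without_pepper(stored: bytes, salt: bytes, leaked_md5s):
    """What an attacker holding only the DB can try: leaked MD5s, no pepper."""
    for candidate in leaked_md5s:
        if slow_hash(candidate.encode(), salt) == stored:
            return candidate
    return None


legacy_md5 = md5_hex("hunter2")   # what the old system had stored
salt = os.urandom(16)
stored = migrate(legacy_md5, salt)

login_ok = verify("hunter2", salt, stored)
shucked = shuck_without_pepper(stored, salt, [legacy_md5])
```

Because the inner value is now the MD5 plus a secret, the same MD5 appearing in another leak no longer matches. With real bcrypt, the 32-character hex MD5 plus the pepper would also need to fit within bcrypt's input limit.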
For more details, including some specific scenarios to illustrate why this is weaker than bcrypt alone, see my SuperUser answer here, this OWASP guidance on "pre-hashing" passwords which supports my assertion with more clarity, and this talk by Sam Croley discussing the technique.
Password upgrading in general can be tricky; see this answer and Michal Špaček's page on password storage upgrade strategies.