MD5 Preimage Vulnerability in 2017

Question

I've recently discovered that one of our (strictly internal facing - no external risk unless other systems are completely compromised) platforms is storing passwords by salting and then MD5 hashing them. According to the Wikipedia article, MD5 is very vulnerable to collisions, but only theoretically vulnerable to preimage attacks (cost being ~2¹²³). The article cites the preimage attack being created in 2009, however - quite a long time ago! I am unable to find any significantly more recent information about MD5 vulnerability to preimage attacks.

As a result, my question is this: Have there been any vulnerabilities found in MD5 that would allow a preimage attack on a hashed password, either to find a collision or determine the password, with a cost of less than 2¹²³?

As a follow-up question, how important is it that I attempt to pressure the vendor into moving off MD5 for password hashing? Is this a major and immediate vulnerability, or more of a theoretical future one (should the database be breached)?

When saving a password verifier, just using a hash function is not sufficient, and just adding a salt does little to improve the security. Instead, iterate with a random salt for about 100 ms and save the salt with the hash. Use a function such as `PBKDF2`, `Rfc2898DeriveBytes`, `Argon2i`, `password_hash`, `Bcrypt` or similar functions. The point is to make the attacker spend substantial time finding passwords by brute force. — zaph, Nov 03 '18 at 20:53
The effective attack cost against MD5 is much less than "2^123" given that the inputs are passwords. The general attack uses lists of frequently used passwords ordered by frequency of use, see [SecLists](https://github.com/danielmiessler/SecLists/tree/master/Passwords). There are also readily available cracking programs, see [password-cracking-tools](http://resources.infosecinstitute.com/10-popular-password-cracking-tools/). — zaph, Nov 04 '18 at 08:32
To support what @zaph said above: [*What is the specific reason to prefer bcrypt or PBKDF2 over SHA256-crypt in password hashes?*](https://security.stackexchange.com/questions/133239/what-is-the-specific-reason-to-prefer-bcrypt-or-pbkdf2-over-sha256-crypt-in-pass) — Franklin Yu, May 21 '19 at 21:16
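The advice in the comments above (random salt, iterate for roughly 100 ms, store the salt with the hash) can be sketched with Python's standard library. This is a minimal illustration, not a vetted implementation: the function names, the 16-byte salt, the choice of HMAC-SHA-256, and the probe-based calibration are all assumptions made for the example.

```python
import hashlib
import hmac
import os
import time


def calibrate_iterations(target_seconds=0.1, probe_iterations=50_000):
    """Estimate a PBKDF2 iteration count costing about target_seconds here.

    Hypothetical helper: it times one probe run and scales linearly,
    never returning fewer iterations than the probe itself used.
    """
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"probe-password", b"probe-salt-16byt",
                        probe_iterations)
    elapsed = time.perf_counter() - start
    scaled = int(probe_iterations * target_seconds / elapsed)
    return max(probe_iterations, scaled)


def hash_password(password, iterations):
    """Return (salt, iterations, digest); all three must be stored."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Recompute the digest and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                    salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

Storing the iteration count next to the salt lets the count be raised later without invalidating existing records.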

Conor Mancone · Accepted Answer · 2018-11-04T12:11:31.370

MD5 should be considered completely compromised for password use, and has been "deprecated" for passwords for a long time. It doesn't even have to involve preimage attacks or explicit vulnerabilities. It is as simple as the fact that the hash rate for modern GPU stacks against MD5 is so fast that you can feasibly brute force nearly any password (okay, I exaggerate slightly).

This is an extreme setup, but it can run through almost 200 billion hashes per second. That means it can guess about 200 billion passwords every second if they are hashed as MD5. I don't have an exact translation, but as you can imagine, being able to try 200 billion password guesses every second means that even strong passwords can get cracked easily. The article suggests that it can crack a 14-character Windows XP password (a slightly weaker scheme, with roughly double the hash rate of MD5) in just 7 minutes.

More realistic hashing setups might hash passwords at roughly a tenth of that rate, but even still it is quite possible to brute force any realistic password that is hashed with MD5.
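To put those rates in perspective, here is a rough back-of-the-envelope sketch. It assumes the 200 billion guesses per second quoted above and a single target hash (a salt forces the attacker to repeat the work per hash, but does not slow each guess). The character sets and lengths are illustrative choices rather than figures from the article. The last line shows why the ~2¹²³ preimage cost from the question is beside the point next to plain guessing.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY
GUESSES_PER_SECOND = 200_000_000_000  # rate quoted for the extreme GPU setup


def days_to_exhaust(alphabet_size, length, rate=GUESSES_PER_SECOND):
    """Days needed to try every password of exactly `length` characters."""
    return alphabet_size ** length / rate / SECONDS_PER_DAY


# Illustrative keyspaces: (description, alphabet size, password length)
examples = [
    ("lowercase only, 12 chars", 26, 12),
    ("lowercase + digits, 11 chars", 36, 11),
    ("all printable ASCII, 8 chars", 95, 8),
    ("all printable ASCII, 9 chars", 95, 9),
]

for description, alphabet, length in examples:
    days = days_to_exhaust(alphabet, length)
    print(f"{description}: {days:,.1f} days")

# The generic preimage attack on MD5 costs about 2**123 operations.
preimage_years = 2 ** 123 / GUESSES_PER_SECOND / SECONDS_PER_YEAR
print(f"2^123 preimage work at the same rate: {preimage_years:.2e} years")
```

Real attacks are usually faster than exhaustive search, because they try likely passwords (dictionaries, leaked lists, common patterns) first.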

Edit to address the second half of your question

Is this an immediate threat? Yes and no. In practical terms it is a theoretical threat, as your passwords will only be vulnerable in the event that this internal system is breached. However, the more you read about the very involved kind of hacks that criminals go through when they want something, the more you realize how important it is to have thorough security at every level of the system. I personally believe that internal systems should be as secure as external systems. Here is a good example of a time that bad security led to expensive breaches:

https://gizmodo.com/hackers-found-a-new-way-to-rip-off-atms-1818859798

Moreover, the other issue is that MD5 has been "out-of-fashion" for password storage for quite a long while now. I would be very concerned that the rest of their security is equally out-of-date, and that this internal system of yours is full of security holes.

Another edit

An important thought to keep in mind: with these things, a key consideration is the potential damage done in the event of a breach. I don't know what this internal system does, but there is one important bit of sensitive information it definitely stores: your users' passwords. Even if that is all it stores, it is potentially dangerous. Here is a very plausible worst-case scenario. What are the odds that you have an administrator who has a user account on this internal system? If so, what are the odds that that person used the same password for the internal system as they did to administer, say, your company-wide email system? If so, it is a short hop, skip, and jump from cracking an MD5 password to taking control of your email system, and from there to probably any aspect of your company that is web-facing.

While you try to work things out with the people who manage your internal system, you can think through this thought process yourself and take appropriate steps: "If a malicious user managed to get a hold of the password to email account X, how much trouble could they cause?". You would be surprised how many companies out there have effectively their entire system dependent upon the security of a single email account, and that without any 2FA. If that is the case for your company you should fix that, regardless of what this third-party vendor says. Otherwise your worst-case scenario is very bad, and this internal system is just one of many ways in which a malicious attacker may be able to cripple your company.

+1 for "I would be very concerned that the rest of their security is equally out-of-date, and that this internal system of yours is full of security holes.". Regarding current password brute-forcing capabilities: https://gist.github.com/epixoip/ace60d09981be09544fdd35005051505 (8*GTX1080 Ti slightly surpassed 300 billion MD5 hashes per second). — ATo, Oct 10 '17 at 06:35
Thank you for your response. I have contacted the vendor to try and figure out the best path forward. — Cowthulhu, Oct 10 '17 at 16:14
@Cowthulhu Sounds like a good starting point. FYI, I added a couple more paragraphs about what is (potentially) at risk. — Conor Mancone, Oct 10 '17 at 19:27
To translate the 200 billion per second: in 30 days or less, you could exhaustively search the keyspace for an 11-character lowercase+digit password, a 12-character lowercase-only password, a 9-character password using lower+upper+digits+number-key symbols, or an 8-character password (and nearly all 9-character ones) using lower+upper+digits+normal keyboard symbols. — Anti-weakpasswords, Jan 27 '18 at 05:05
It is not true that "MD5 should be considered completely compromised"; it depends on usage. Certainly it is compromised for signing, but it can still be used as a checksum to verify data integrity, though only against unintentional corruption. However, it is best to choose a current hash function, such as one from the SHA2 or SHA3 series. — zaph, Nov 04 '18 at 08:37
@zaph I actually meant that only in the context of this question, e.g. for passwords. Looking back though I can see that wasn't clear, so I'll update my answer. I agree with you - there are some use cases for which MD5 is maybe not the best choice, but not a terrible choice (checksums for data verification is the only use that comes to mind for me as well). — Conor Mancone, Nov 04 '18 at 12:10
This answer contains logical fallacies, e.g. dictionary attacks on weak passwords have nothing at all to do with a *preimage attack* (as was actually asked) or the overall cryptographic strength of a hash function for that matter. BTW, SHA-256 (with no practical exploits against the function itself) may be particularly susceptible due to the availability of insanely fast ASICs used for cryptomining. There are *no practical and published preimage attacks* on MD5, period. As for cryptanalysis that the NSA or maybe the Chinese don't tell us about, anyone's guess is as good as mine. — Arne Vogel, Jun 18 '19 at 14:29
Well, I guess to some extent you wanted to emphasize the importance of [key stretching](https://en.wikipedia.org/wiki/Key_stretching) which is fair enough but is mostly orthogonal to the choice of hash function. — Arne Vogel, Jun 18 '19 at 14:37
@ArneVogel I think you're reading the question too narrowly. Yes, the OP specifically asked about preimage attacks, and you are certainly correct that such weaknesses have not been found in MD5. However, it's clear that the OP was concerned about the overall security of MD5 and the implications for its use with passwords in a secure system. In that case the answer is clear: due to the ease of brute-forcing, MD5 hasn't been approved for use with passwords in a long time, and using it is without question a serious security risk. — Conor Mancone, Jun 18 '19 at 14:39
@ArneVogel So yes, it's true that I did not answer the exact question asked (does MD5 have preimage vulnerabilities), but I answered the much more important (for the OP) question: is it safe to use MD5 for passwords? Definitely not. Call it a frame challenge if you would like. — Conor Mancone, Jun 18 '19 at 14:40

score 1 · Answer 2 · answered Nov 03 '18 at 18:10

Please consider using a more modern password hash, either Scrypt or Argon2id.

GPU-based dictionary attacks are brutally effective against salted hashes, even if the hash is SHA512. PBKDF2 is a very poor hashing algorithm, with trivial collisions and poor defense. Bcrypt is OK, but you'd be safer switching to a modern memory-hard password hashing algorithm that is inefficient on GPUs. Scrypt is pervasive and still quite good, but if you have access to Argon2id, I'd use that.
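As a rough illustration of what a memory-hard hash looks like in practice, here is a minimal sketch using `hashlib.scrypt` from Python's standard library (available when Python is built against OpenSSL 1.1 or later). The helper names and the cost parameters `n=2**14, r=8, p=1` (roughly 16 MiB of memory per hash) are assumptions for the example, not tuned recommendations.

```python
import hashlib
import hmac
import os

# Assumed scrypt cost parameters; memory use is about 128 * N * R bytes.
SCRYPT_N = 2 ** 14
SCRYPT_R = 8
SCRYPT_P = 1


def scrypt_hash(password, salt=None):
    """Return (salt, digest) for a password using scrypt."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for a new password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                            n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)
    return salt, digest


def scrypt_verify(password, salt, expected):
    """Recompute with the stored salt and compare in constant time."""
    _, candidate = scrypt_hash(password, salt)
    return hmac.compare_digest(candidate, expected)
```

Each guess now costs the attacker the same memory, which is what makes massively parallel GPU guessing less efficient than it is against a plain hash.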

Send me a random sample of your MD5 hashes and salts, and I'll send back 90% of the corresponding passwords within 24 hours. I'll crack them without a GPU: just using my laptop. Really. Just download HashCat, and it's just plug and chug.

Bill Cox

Keep in mind that while Argon2id is better than PBKDF2, Argon2 is not available on all platforms. Characterizing PBKDF2 as a *"very poor hashing algorithm, with trivial collisions and poor defense"* is disingenuous and just plain wrong. PBKDF2 is still recommended by NIST and, since it relies on strong hash algorithms, is not collision prone. – zaph Nov 03 '18 at 19:31
Note: PBKDF2 is not a hashing algorithm, it is a Password Based Key Derivation Function. It uses an HMAC, which in turn uses a hashing function such as SHA-256, a salt, and a repetition count. A reasonable repetition count is one that consumes ~100ms of CPU time. – zaph Nov 04 '18 at 07:59
PBKDF2 itself doesn't create trivial collisions. It's PBKDF2 with HMAC that causes them. HMAC has pairs of equivalent keys. Another PRF choice isn't necessarily collision vulnerable. – Future Security Nov 04 '18 at 16:14
For password based authentication the collisions have no effect on security. With HMAC it just means that if you have a very long password then you can enter that or another long, random-looking password at a log in prompt. It doesn't make password cracking easier. It's safe for key derivation too if it's your password and your key. Maybe you can exploit a collision pair in an unusual scenario, but anyone that doesn't know your password can't. – Future Security Nov 04 '18 at 16:39
@zaph You should still not choose to use PBKDF2 in new applications unless you're forced to. Definitely try to add support for Argon2. Argon2 > scrypt > bcrypt > PBKDF2. Never request an output length from PBKDF2 that exceeds the output length of its PRF. Do rehash passwords in your database, immediately after the user enters their password, with a stronger algorithm if such an algorithm becomes available. – Future Security Nov 04 '18 at 16:59
@FutureSecurity Note that I state "while Argon2id is better than PBKDF2 Argon2 is not available on all platforms" which is essentially stating to use the best available method. It is important information that collisions are not an issue with password hashing. – zaph Nov 05 '18 at 09:07
Are you serious? I know MD5 is not collision-resistant, but a preimage attack like this? Anyway, just in case you meant it, here are a few salts + MD5 hashes: https://pastebin.com/raw/zfSudWG5 I have included the first password for reference. I'd be *very* surprised if you can find any of the others. – RocketNuts Feb 05 '19 at 00:25
The median user password strength at a major web service has been measured against CMU's neural net based attack-dictionary generator, and found to be 32 bits strong. Your example password appears to be 2.4 million times stronger than that. You should not assume users use unique random 11-character passwords on all their web sites. Anyway, my son's 1080 card could crack your passwords, assuming 7 upper/lower letters followed by 4 digits, in around a week. He is not interested in loaning me his gaming computer for that long :) – Bill Cox Feb 07 '19 at 01:33
@RocketNuts did anyone ever take you up on your offer? – Cowthulhu Apr 16 '19 at 14:21
@Cowthulhu Never. As I already expected, to be honest. – RocketNuts Apr 18 '19 at 11:15
@bill-cox Well, an 11-character password, especially with the assumption that it's 7 letters followed by 4 digits (which just randomly happened to be the case here; that assumption cannot be made in general), can in fact already be considered quite weak. If people use passwords that are millions of times weaker, then *that's* the vulnerability, not the usage of MD5. Also, don't many people use password managers nowadays? – RocketNuts Apr 18 '19 at 11:32
bcrypt is memory-hard, so memory-hard in fact that it does not allow efficient concurrent hashing on modern GPUs; all GPU cores spend the majority of the time waiting for memory to become available for writing. – user1067003 Sep 21 '19 at 13:08
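The "rehash immediately after the user enters their password" suggestion from the comments above can be sketched as follows. Everything specific here is an assumption for illustration: the legacy format is taken to be `MD5(salt + password)`, the record is a plain dict with hypothetical field names, and PBKDF2-HMAC-SHA-256 stands in for whichever stronger algorithm is actually available on the platform.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 200_000  # assumed value; tune to ~100 ms on your hardware


def legacy_md5(password, salt):
    """Assumed legacy scheme: MD5 over salt followed by the password."""
    return hashlib.md5(salt + password.encode("utf-8")).digest()


def pbkdf2(password, salt, iterations):
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, iterations)


def login(record, password):
    """Verify a password; upgrade a legacy MD5 record on success.

    `record` is a hypothetical dict with keys "scheme", "salt", "hash"
    and, for upgraded records, "iterations". It is modified in place.
    """
    if record["scheme"] == "md5":
        ok = hmac.compare_digest(legacy_md5(password, record["salt"]),
                                 record["hash"])
        if ok:
            # The plaintext is only available now, so rehash immediately.
            new_salt = os.urandom(16)
            record.update(
                scheme="pbkdf2-sha256",
                salt=new_salt,
                iterations=PBKDF2_ITERATIONS,
                hash=pbkdf2(password, new_salt, PBKDF2_ITERATIONS),
            )
        return ok
    expected = pbkdf2(password, record["salt"], record["iterations"])
    return hmac.compare_digest(expected, record["hash"])
```

Accounts that never log in again keep their MD5 hash under this approach, so a migration plan also needs a policy for those records, such as forcing a password reset.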
