Reversing Password Hashes

Question

If someone got hold of the hashes of all passwords for a website:

1) How do they get passwords from the hashes? As far as I can tell, everything seems to say do a dictionary attack/brute force attack, but if you did that, how could you know that you extracted the password? Wouldn't you still have to test all the passwords by entering them onto the website?

2) If they reverse one hash, does that mean they can easily reverse all the hashes? E.g. if they manage to brute force a 6 digit password, would reversing its hash mean they could then easily reverse a hash for an 80 digit password?

score 6 · Answer 1 · edited Mar 17 '17 at 13:14

"Hashes" are one-way, deterministic functions. If you hash a password, you get some value. If you try again to hash the same password, you again get the same value. Therefore, a hash value can serve as a password verification token: to verify a password, just hash it and see if it matches the hash value.
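A minimal sketch of that verification step, assuming unsalted SHA-256 purely for illustration (the function names `hash_password` and `verify` are made up here, and this is not a recommended way to store passwords):

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return the hex SHA-256 digest of the password (unsalted, illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def verify(candidate: str, stored_hash: str) -> bool:
    """Hash the candidate and compare the result with the stored value."""
    return hmac.compare_digest(hash_password(candidate), stored_hash)

stored = hash_password("correct horse")

# Deterministic: the same input always yields the same hash value.
print(hash_password("correct horse") == stored)  # True
print(verify("correct horse", stored))           # True
print(verify("wrong guess", stored))             # False
```

The attacker with a leaked hash runs exactly the same comparison offline, which is why no login attempts against the website are needed.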

The attacker does just that: he tries passwords until a match is found.

The attack is a more or less efficient process; if the hash function is unsalted, then various optimizations are possible (precomputed tables, parallel attacks...).
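As a rough illustration of the precomputed-table optimization, here is a sketch assuming unsalted SHA-256 and a tiny hypothetical candidate list. The table is built once and then reused against every leaked hash; a per-user salt makes the same table useless.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Hypothetical candidate passwords; a real attacker would use millions.
candidates = ["123456", "password", "letmein", "qwerty"]

# Built once, reusable against any unsalted SHA-256 leak.
table = {sha256_hex(p.encode("utf-8")): p for p in candidates}

# Simulated leak of unsalted hashes (the last password is not in the table).
leaked_unsalted = [sha256_hex(b"letmein"), sha256_hex(b"qwerty"), sha256_hex(b"x9!fQ2")]
recovered = [table.get(h) for h in leaked_unsalted]
print(recovered)  # ['letmein', 'qwerty', None]

# With a salt mixed in, the precomputed entry for "letmein" no longer matches.
# (Salt plus a single fast hash is still not a good storage scheme; this only
# shows why the table stops working.)
salt = b"\x01\x02\x03\x04"
salted_hash = sha256_hex(salt + b"letmein")
print(table.get(salted_hash))  # None
```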

More on the subject in this answer.


answered Jul 03 '13 at 21:32

Tom Leek


So basically even if someone got all the hashes for a website it would be (practically) impossible for them to extract an 80 digit password with upper case, lower case, digits, symbols & altcodes? Thanks for the answer – Blue Jul 03 '13 at 21:40

It is not the length that matters, but the entropy. 80 _random_ digits will be "unbreakable" because there are a lot of possible sequences of 80 digits, and the attacker has only a negligibly small chance of hitting the right one, even if trying many times. – Tom Leek Jul 03 '13 at 23:01
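To put rough numbers on the entropy point, here is a small sketch. It assumes every character is chosen uniformly at random from the alphabet, which is exactly the condition the comment stresses; a human-chosen 80-character password would have far less entropy than this formula suggests. The helper name `entropy_bits` is made up for the example.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a string whose characters are picked uniformly at random."""
    return length * math.log2(alphabet_size)

print(round(entropy_bits(10, 6), 1))   # 19.9  (6 random digits: about a million possibilities)
print(round(entropy_bits(10, 80), 1))  # 265.8 (80 random digits: far beyond any brute force)
```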

score 3 · Answer 2 · edited Mar 17 '17 at 10:46

I'm not going to say anything new, but I'll try another explanation.

Look at hash functions this way: they're consistent random number generators.

Consistent in the sense that if you give them the same input, they always give you the same output.

Random in the sense that the output can not be predicted without using the hashing algorithm, no matter how many other numbers you've seen or generated. There are no similarities in the hashes, even when you can easily see similarities in the passwords that were used.

Number in the sense that just like 79DAE4F8931 can also be written as 8373815707953, you can also convert the output of an MD5 hash to a number (or SHA, or any other hashing algorithm). It's all ones and zeros to a computer.
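The hex-to-number conversion can be checked directly. The second half of this sketch uses the MD5 of "password1" only as a sample input:

```python
import hashlib

# The hex string from the paragraph above, read as a base-16 number.
print(int("79DAE4F8931", 16))  # 8373815707953

# Any hash output can be treated the same way: hex digits are just a number.
digest_hex = hashlib.md5(b"password1").hexdigest()
as_number = int(digest_hex, 16)
print(len(digest_hex))       # 32 (hex digits, i.e. 128 bits)
print(as_number < 2 ** 128)  # True
```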

Some examples

Let's say we have a hashing algorithm that outputs 20 bits (e.g. 5FFD1) and is good in the sense that it's collision resistant and whatnot. If we give it any input, we know the output corresponding to that input. This will always be the same when you give it the same input, else we can't verify the output later when we run the hashing algorithm again.

If we give it the input 9, it outputs A93BA. If we give it the input 10, it outputs CB3F2. If we give it the input 11, it outputs AFFF1. As you can see, when counting upwards in input, there is no way to determine the next output. It's random.

If we give it the input 9, it (still) outputs A93BA. If we give it the input 99, it outputs DED1C. If we give it the input 999, it outputs 01903. Again, there is no correlation between any of the values, besides of course that the 9 from before still generates the same output when we run it again. Thus, the output is considered random.
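If you want to play with something like this 20-bit algorithm, one way to fake it is to keep only the first 5 hex digits of SHA-256. That construction (and the name `toy_hash`) is an assumption for this sketch, so the values it prints will not match the made-up outputs above, but the behaviour is the same: repeated inputs give repeated outputs, and related inputs give unrelated-looking outputs.

```python
import hashlib

def toy_hash(text: str) -> str:
    """Hypothetical 20-bit hash: the first 5 hex digits of SHA-256, uppercased."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:5].upper()

# "9" appears twice on purpose: it must produce the same output both times.
for value in ["9", "10", "11", "9", "99", "999"]:
    print(value, "->", toy_hash(value))
```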

Let's say we determine that this is a safe hashing algorithm, and instead of using something tested and proven like bcrypt, we foolishly decide to store passwords after hashing them. If a user registers and uses 9 as their password, we will store A93BA in the database. If a user registers and enters aBadPassword, we store 6ED19 in the database.

Now our database has leaked and the attackers want to crack the hashes. Let's pretend we're the attackers (or the NSA if that makes you feel any better, it's cracking in the name of national security!).

We have a user whose password is stored as A93BA. We can see that one of the things we've previously hashed (namely 9) corresponds to that output, so we know the password was 9. By the way, this is my favorite way of cracking password hashes: google them. Here's the MD5 hash of "password1": 7c6a180b36896a0a8c02787eeafb0e4c. Go ahead, google it.

So now we want to find the input for output C7B7C, something we haven't previously found. How would you do it? Well, since the output is random and there is no correlation between any of the outputs even when there is a correlation between the inputs, there is no shortcut. You just start with "a" and continue to "ZZZZZZZZ". At some point you'll find the correct input (the user's password), and the output will be C7B7C.
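A minimal brute-force sketch of that "start at a and keep counting" loop. To keep it fast and unambiguous it assumes full SHA-256 instead of the 20-bit toy algorithm, lowercase letters only, and a maximum length of 3; all of those are choices made for the example.

```python
import hashlib
import itertools
import string

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def brute_force(target_hash: str, alphabet: str = string.ascii_lowercase, max_length: int = 3):
    """Try every string over the alphabet, shortest first, until one hashes to the target."""
    for length in range(1, max_length + 1):
        for combo in itertools.product(alphabet, repeat=length):
            guess = "".join(combo)
            if sha256_hex(guess) == target_hash:
                return guess
    return None  # not found within the search space

target = sha256_hex("cab")  # pretend this came from a leaked database
print(brute_force(target))  # cab
```

The search space grows as alphabet size to the power of the length, which is why this only works against short or predictable passwords.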

You can do this more smartly, instead of using "a" through "ZZZZZ", you can use a dictionary. Or a dictionary plus a number (password1 is a dictionary word plus one digit). Or something even more sophisticated (professionals do this) to predict what passwords might be chosen. You can even look at previously cracked passwords to see what common passwords are, then use that in your prediction model. But basically, if your password was randomly chosen (not something that can be predicted), then your hash will be completely safe.
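The "dictionary word plus one digit" idea might look like the sketch below. The word list is a hypothetical three-word stand-in for a real dictionary, and the target is the MD5 value quoted earlier in this answer for "password1".

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def dictionary_attack(target_hash: str, words):
    """Try each word on its own, then with a single digit 0-9 appended."""
    for word in words:
        candidates = [word] + [f"{word}{digit}" for digit in range(10)]
        for candidate in candidates:
            if md5_hex(candidate) == target_hash:
                return candidate
    return None

words = ["letmein", "dragon", "password"]    # hypothetical word list
target = "7c6a180b36896a0a8c02787eeafb0e4c"  # MD5 quoted above for "password1"
print(dictionary_attack(target, words))
```

That is 33 hash computations instead of the billions a blind search over the same length would need.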

Note: Actually, in our case, you can easily find an input for any given output, and it may not be the original input. Because our output is only 20 bits, there are just 1048576 (2^20) possible outputs, so trying a few million inputs will almost certainly turn up one that works. However, this is just a fictional example; modern hashing algorithms usually output 128 bits or more (32 or more hexadecimal digits). I used 20 bits instead because it makes the answer much easier to read.
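To see how weak 20 bits is, here is a sketch that finds some input matching a given 20-bit output by simply counting upwards. It assumes the toy algorithm is "first 5 hex digits of SHA-256" (a stand-in, not the fictional algorithm from the examples), and it may take a few seconds to run.

```python
import hashlib

def toy_hash(text: str) -> str:
    """Hypothetical 20-bit hash: the first 5 hex digits of SHA-256, uppercased."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:5].upper()

def find_any_preimage(target: str, limit: int = 20_000_000):
    """Count upwards and return the first number (as a string) that hashes to the target."""
    for n in range(1, limit + 1):
        if toy_hash(str(n)) == target:
            return str(n)
    return None

target = toy_hash("aBadPassword")
preimage = find_any_preimage(target)
if preimage is not None:
    # Almost certainly not "aBadPassword", yet it would pass the hash check.
    print(preimage, toy_hash(preimage) == target)
```

With a 128-bit or larger output the same loop would never finish, which is the point of the note above.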

I think the above explanation should answer your questions.
To summarize:

1) how do they get passwords from the hashes

By running the hashing algorithm on lots of inputs, and each time checking whether the output is equal to the output that they are looking for. They may optimize the way they look through outputs (calculate a bunch, then compare each hashed password in the database, or use a dictionary, etc.), but they can never "easily" crack them all if you had a strong password.

2) If they reverse one hash, does that mean they can easily reverse all the hashes

No, not if you have a good hashing algorithm. There is no correlation between outputs, even when there is a correlation between inputs (such as repeating characters or incrementing input). This is called the avalanche effect (see Wikipedia).
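The avalanche effect is easy to observe. This sketch assumes SHA-256 and counts how many of the 256 output bits differ between two inputs; the helper name `differing_bits` is made up.

```python
import hashlib

def differing_bits(a: str, b: str) -> int:
    """Number of bit positions in which the SHA-256 digests of a and b differ."""
    digest_a = int.from_bytes(hashlib.sha256(a.encode("utf-8")).digest(), "big")
    digest_b = int.from_bytes(hashlib.sha256(b.encode("utf-8")).digest(), "big")
    return bin(digest_a ^ digest_b).count("1")

print(differing_bits("password1", "password1"))  # 0
# Inputs that differ in one character: expect roughly half of the 256 bits to flip.
print(differing_bits("password1", "password2"))
```

Because about half the bits change for any change in input, cracking one hash tells you nothing about the input behind a different hash.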

Examples of broken algorithms: Windows' LM hashes, MD4, and most things people made at home. Examples of algorithms that are reasonably okay in most cases: MD5, SHA1, SHA2 (MD5 and SHA1 have known collision attacks, but collisions don't help with reversing a given hash back to its input). Note that "SHA2" is actually a set of algorithms that work much the same: SHA-224, SHA-256, SHA-384 and SHA-512.

Warning: Don't just use SHA-512 or any other 'raw' hashing algorithm to store user passwords. There are things like user-specific salts, iteration count, possibly an initialization vector... just don't touch it. Instead, use one of these: https://security.stackexchange.com/a/1164/10863
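As an illustration of leaving salts and iteration counts to a vetted function rather than wiring them up by hand, here is a sketch using PBKDF2 from Python's standard library. PBKDF2 is assumed here as one example of such a function; the iteration count and the helper names `make_record` and `check` are choices made for this sketch, not recommendations from the linked answer.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this sketch; tune for your hardware

def make_record(password: str):
    """Return (salt, digest) for storage; the salt is random and unique per user."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def check(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)

salt, digest = make_record("hunter2")
print(check("hunter2", salt, digest))  # True
print(check("hunter3", salt, digest))  # False
```

Every guess now costs the attacker 200,000 hash computations instead of one, and the per-user salt rules out precomputed tables.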

Hope this helped! Actually this was all covered in the other question that is marked as "possible duplicate", but it's explained differently. Maybe this explanation works better for you, maybe not. Or maybe it works better for others that google this question. In any case, I hope this helped you understand!

score 2 · Answer 3 · edited Jul 03 '13 at 22:21

Every password has a single hash, and (leaving hash collisions out of scope) a given hash corresponds to only that one password. If you brute-force the hash, you get the resulting password. You won't need to test all the passwords against the website, as the username is normally kept in the same table as the password hash.

Even setting that aside, you don't try to obtain hashes to reuse them on the same site you got them from; rather, you hope the user has also used that password for his Facebook, email,... so you can get in there as well.

For the second part of your question, refer to Tom Leek's answer.
