53

Storing the hash of users' passwords, e.g. in a database, is insecure since human passwords are vulnerable to dictionary attacks. Everyone suggests that this is mitigated via the use of salts, but the salt is considered non-sensitive and does not need to be protected.

In the event that the attacker has the salt how has his dictionary attack become more difficult than before? Does not having access to the salt, effectively, remove its usefulness?

Andrei Botalov
  • 5,267
  • 10
  • 45
  • 73
Jim
  • 1,395
  • 4
  • 13
  • 18

4 Answers4

54

Salt doesn't protect you against a lone attacker who is only after one password. An attacker who just wants to break one password will calculate hash(salt + guess) instead of hash(guess) (if the password scheme is hash(salt+password)).

Salt helps if the attacker wants to break many passwords. This is usually the case. Sometimes the attacker is attacking a site and wants to break into an account on that site, any account. Without a salt, most of the attacker's work can be used for all accounts, so she can test each of her attempts against all accounts at once. With a correctly-chosen salt (i.e. if no two accounts have the same salt), the attacker has to start over for each hashed password.

Furthermore, in a sense, all password cracking attempts are attempting to crack all accounts passwords at once. That's because hashes can be precomputed; all it takes is for someone to generate a table of hashes — or more efficiently, a rainbow table — and that initial work can be distributed to multiple attackers who can use it on any account database that uses the same password hashing algorithm. Again, salt makes these precomputations useless.

A brute-force password attack can be summarized like this:

  1. Make any precomputation that the attacker deems useful, such as building a rainbow table (which is an efficient way to represent a table mapping hashes to common passwords).
  2. For every one of the n accounts that the attacker is interested in breaking in, and for every one of the p password guesses that the attacker includes in her dictionary, test whether hash(guess[i]) = hashed_password[j].

In a naive approach, the second step requires n × p hash computations to try all guesses against all accounts. However, if the first step already calculated all the possible hashes, then the second step requires no hash computation at all, just testing whether each hashed_password is in the precomputed database, so the attack requires just n table lookups (this can even be sped up, but we've already gone from n x p slow computations¹ down to n table lookups).

If each password has a different salt, then in order to be helpful, the precomputation would have to include an entry for every possible salt value. If the salt is large enough, the precomputation is infeasible. If the precomputation doesn't take the salt into account, it won't be useful to speed up the second step, because any cryptographic hash function “mixes” its input: knowing the hash of UIOQWHHXpassword doesn't help compute the hash of NUIASZNApassword. Even to attack a single account, the attacker needs to perform p hash computations to try all guesses, already an improvement on the single table lookup that would be sufficient if the attacker has a precomputed dictionary.

¹ A password should not be stored as a hash such as SHA-1, but using a slower hash function such as bcrypt or scrypt or PBKDF2.

Gilles 'SO- stop being evil'
  • 50,912
  • 13
  • 120
  • 179
  • 2
    Salts prevent easy dictionary lookups. This is the case whether the attacker is after one password or a thousand. – Polynomial Apr 21 '12 at 20:42
  • @Polynomial No, dictionary lookup itself is just as easy whether there is salt or not. The point of salt is that each salt value requires its own dictionary. – Gilles 'SO- stop being evil' Apr 21 '12 at 20:43
  • @Gilles:To be honest your answer, the way I understand it, seems a little inconsistent.You say:`Salt helps if the attacker wants to break many passwords`.Ok so far.Then:`wants to break into an account on that site, any account`.If that's the purpose, since the salt is cleartext and unprotected, why can't he achieve exactly his purpose?I.e `break into an account on that site, any account`.One account/password would be enough – Jim Apr 21 '12 at 20:43
  • 1
    @Jim The point is that without salt, the attacker can check `hash("password")` against every account. If every account has salt, the attacker needs to check `hash(salt1+"password")` against account1, then `hash(salt2+"password")` against account2, and so on. – Gilles 'SO- stop being evil' Apr 21 '12 at 20:51
  • @Gilles:Ok, so what does that mean?More CPU time?He can still get a match.Am I missing a point here? – Jim Apr 21 '12 at 20:55
  • @Gilles By "dictionary lookups" I mean hash dictionaries - whether they be stored in a relational database or a rainbow table. Salted hashes prevent those lookups. – Polynomial Apr 21 '12 at 20:55
  • @Jim I've updated my answer to show more precisely how much work the attacker needs to do with and without salt. – Gilles 'SO- stop being evil' Apr 21 '12 at 21:14
  • @Polynomial No, salt doesn't prevent these lookups, but it makes it useless to precompute such dictionaries since you'd need a separate dictionary for each salt. – Gilles 'SO- stop being evil' Apr 21 '12 at 21:15
  • @Gilles:+1.One last question.In your side note suggesting `PBKDF`, you mean to use `PBKD` first on password and then hash the resulting key? – Jim Apr 21 '12 at 21:28
  • 1
    @Jim PBKDF2 and bcrypt are in fact hash functions; the basic principle is to make each hash computation slower by taking a hash function and computing `hash(hash(hash(…(hash(password)))))`. Just repeating the hash wouldn't be so good, though, so PBKDF2 and other similar functions throw kinks in the works at each iteration, to make sure that no precomputation can help the attacker. I recommend reading [Thomas's answer](http://security.stackexchange.com/a/6415) (and before that the definitions of PBKDF2 and Bcrypt on Wikipedia). – Gilles 'SO- stop being evil' Apr 21 '12 at 21:50
  • 1
    @Gilles It prevents them from being feasible. If you're computing a table for a particular salt, it's useless unless another hash uses that salt. You might as well not bother storing the data in the table, since you're effectively bruteforcing. – Polynomial Apr 21 '12 at 23:30
17

I'm going to explain each level of security for database stored password, maybe this will help clarify the importance of salt.

Level 1: Password stored in clear in the database.

This is just pure stupidity. But sometimes, big company get their database compromised and, oh surprise, all password are stored in clear. What a shame!

Level 2: Password stored using an hashing algorithm.

This is a step toward more security. If an attacker get the hashed password, he can't still login using this credential to the service. He will first have to "unhash" the password using a Rainbow Table or by brute forcing it. That means it require for the attacker a bit more work, but it can still be possible to retrieve the original password.

Level 3: Password stored using an hashing algorithm with a salt.

This starts to be interesting. As we saw on Level 2, an attacker that has access to the hashed password can either brute force it or use a rainbow table. Adding a salt to the original password make the rainbow table totally useless because they don't take into consideration the salt. It still possible to get the original string using a rainbow table, but it would mean that in the rainbow table, the password+salt exists. Since a salt is generally very long, the odd having a hashed value of a string(40+) is almost impossible. The only solution left is brute force.

Level 4: Password stored using an hashing algorithm with a salt and a peper.

This is very interesting (it's the reason why I'm posting my answer since the actual ones are already interesting).

I recommend you to use an hashing algorithm like BCrypt, SCrypt or PBKDF2 since they aren't beek broken so far, and are very slow to hack, increasing the time to break one, and then a complete database of passwords.

The difference between salt and pepper is their location. The salt is generally stored in the database but the pepper is located in the code. This can seems odd but the original idea is that a database can be compromised, but not the code, leaving the attacker a salt and a list of useless hashed password. Moreover, it add even more complexity to the hashed password, making rainbow tables completely useless (if we suppose they were still a bit useful with a salt!).

In the case that only the database is compromised, even the brute force is then useless since the attacker is not looking for one value (the original password) but for two values (the original password AND the pepper).

Cyril N.
  • 2,649
  • 2
  • 18
  • 28
  • 3
    Level 4 is not enough - one also needs to use a slow hash (like bcrypt, scrypt, or PBKDF2). – D.W. Jul 24 '12 at 05:56
  • You're right, and I'm surprised I didn't mentionned it, I always do! I updated my answer! Thanks for pointing this out! – Cyril N. Jul 24 '12 at 08:08
  • I endedup here starting from microservice authentication check. So then I learned the concept of "salt". Then with this answer I learned the concept of "pepper". What seems to me an issue for implementing till level 4 is in term of infrastructure and for example in term of microservice. Indeed how credential encrypted via peper could be retrived back from microservices? – Carmine Tambascia Nov 12 '19 at 13:44
  • For example how to produce JWT(JSON Web Token) when level 4 ( a "peper" in the code) is implemented? – Carmine Tambascia Nov 12 '19 at 14:30
8

If your hashes are unsalted I can generate a whole dictionaries worth of hashes and store them in a database (special kind called a rainbow table, it's more efficient for large numbers of hashes) , then all I need to do is look them up. Basically one giant memory vs. CPU time optimization. Also, if I see two users with the same hash I know they are using the same password.

A salt makes every users password hash unique, and if you do it right it'll likely be unique even if they use that same password on some other site (not uncommon), and so I would actually have to take word, apply salt and hash it, repeat for next entry in dictionary.

ewanm89
  • 2,043
  • 12
  • 15
  • I should add, the same applies to brute force (though these are massive rainbow tables). And there are hash tables too, which store more for even faster checking. – ewanm89 Apr 21 '12 at 20:40
  • But if I know the salt, I could still use the dictionary along with the salt to guess the password – Jim Apr 21 '12 at 20:45
  • Yes, but you have to hash every entry in the dictionary one at a time, and check it. – ewanm89 Apr 21 '12 at 20:51
  • But wouldn't the attacker do that anyway (salt or no salt)? – Jim Apr 21 '12 at 20:53
  • 1
    Yes, but he has to do it real time, not use a precomputed table. I would also point out precomputed tables can be shared between attackers. – ewanm89 Apr 21 '12 at 20:58
  • If an attacker is just trying to get access to any account, they could compute a hash of a bunch of words and see if they match any account record. With a salt, they have to compute the hash for every password for each account, using the account's salt. It also prevents lookups in hash tables such as rainbow tables, because you will find "bagel" in such a table but not "6$hGk45!bagel". – Polynomial Apr 21 '12 at 20:58
  • 1
    @Polynomial, hash tables and rainbow tables are technically totally distinct ;) – ewanm89 Apr 21 '12 at 21:03
6

Unsalted hashes are vulnerable to lookup attacks. There are freely available databases containing millions of password hashes (mainly MD5 and SHA1) which can be used to look up the plaintext of a hash. These may be based on standard relational databases, or on specialized database formats like rainbow tables.

So, here's an example:
7c6a61c68ef8b9b6b061b28c348bc1ed7921cb53 (unsalted)
2d67062d67b4d0eaca31a768e901d04982de7352 (salted with "RxB6aE" prefix)

A quick lookup on the first hash will reveal the plaintext to be "passw0rd". However, the latter isn't very likely to be found. In the case where the salt isn't known by the attacker, it makes dictionary and bruteforce attacks difficult.

If you're looking to secure passwords in a database, you should use something like bcrypt. It uses a large number of iterations of a hash algorithm to make each computation slow. As such, bruteforce attacks, dictionary attacks and the building of rainbow tables are highly infeasible.

Polynomial
  • 132,208
  • 43
  • 298
  • 379
  • LM_hash tables a dime a dozen ;) – ewanm89 Apr 21 '12 at 21:01
  • So the idea is that you get the text of a hash from the dictionary using a lookup of the hash?You don't each word in a dictionary, hash it and see if that hash exists in the target data? – Jim Apr 21 '12 at 21:16
  • You make a big table containing hashes that map to their plaintexts. You search by hash, then get the plaintext directly. They trade CPU time for disk space, making lookup very fast for hashes the database contains. – Polynomial Apr 21 '12 at 23:29