Why is it always `HASH( salt + password )` that we recommend?

Question

Browsing this site, many forums, and online articles, I keep seeing one specific way suggested for storing a password hash:

function (salt, pass) {
   return ( StrongHash(salt + pass) );
}

But why this exact way? Why aren't we suggesting this instead?

function (salt, pass) {
   return StrongHash( StrongHash(salt) + StrongHash(pass) );
}

Or even something like this?

function (salt, pass) {
   var data = salt + pass;
   for (i=0;i < 1000; i++) {
       data += StrongHash(salt + data)
   };
   return (data);
}

Or some other crazy combination? Why are we specifically saying to hash the concatenation of the raw salt and the raw password? Hashing the two hashes together seems to be a fairly high-entropy alternative, as does hashing 1000 times as in my third example. Why don't we hash the first one a few more times for entropy's sake?

What's so amazing about the first way?

By request, examples of this:

Where are you seeing that advice here? The best question (referred to from the password tag wiki) is [How to securely hash passwords? - IT Security](http://security.stackexchange.com/questions/211/how-to-securely-hash-passwords) — nealmcb, Jun 15 '11 at 15:25
Thanks. I certainly agree that many other sites have horrid advice on password hashing, and many products do it very badly. But on this site, the best answers for the two questions you point to agree with you that the hash needs lots of iterations, as Thomas clarifies. — nealmcb, Jun 15 '11 at 18:02
Doesn't that third example leave the password in plaintext near the start of the output string? Or does `+` mean something other than "concatenate" when used on strings in the language you're using? — Ajedi32, Jan 21 '15 at 14:34
Oops. I thought this was a new question. Thankfully internet resources are a little less bad now than they used to be. Use Argon2id, everyone. If you're worried, then A) Be prepared to update when something better comes along, B) Don't be worried. - Argon2 > scrypt > bcrypt > PBKDF2 . — Future Security, May 10 '19 at 17:07

score 45 · Accepted Answer · edited Oct 07 '21 at 06:47

Actually, "we" are not recommending any of what you show. The usual recommendations are PBKDF2, bcrypt or the SHA-2 based Unix crypt currently used in Linux.

If the hash function you use is a perfect random oracle, then it does not really matter how you feed it the salt and the password; what matters is the time it takes to process them, and we want that time to be long so as to deter dictionary searches, hence the use of multiple iterations. However, being a perfect random oracle is a demanding property for a hash function: it is not implied by the usual security properties that secure hash functions must provide (resistance to collisions and to preimages), and it is known that some widely used hash functions are not random oracles. For example, the SHA-2 functions suffer from the so-called "length extension attack", which does not make them less secure as hash functions, but does call for some care when using them in funky password-hashing schemes. PBKDF2 is often used with HMAC for that reason.
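As a rough illustration of leaning on a standard construction rather than a hand-rolled one, here is a minimal sketch using the PBKDF2 implementation in Node.js's built-in `crypto` module, which runs HMAC internally. The iteration count, salt size, output length, and the helper names `hashPassword` and `verifyPassword` are assumptions made for this example, not values taken from the answer.

```javascript
const crypto = require("node:crypto");

// Assumed parameters for illustration only; tune the iteration count to your hardware.
const ITERATIONS = 100000;
const KEY_LENGTH = 32;   // bytes of derived output
const DIGEST = "sha256"; // PBKDF2 applies HMAC-SHA-256 internally

// Derive a verifier from a password, using a fresh random salt per password.
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, DIGEST);
  return { salt, hash, iterations: ITERATIONS };
}

// Recompute with the stored salt and iteration count, then compare in constant time.
function verifyPassword(password, stored) {
  const candidate = crypto.pbkdf2Sync(
    password, stored.salt, stored.iterations, stored.hash.length, DIGEST);
  return crypto.timingSafeEqual(candidate, stored.hash);
}
```

The salt and iteration count are stored next to the hash, so the count can be raised later without invalidating existing records.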

You are warmly encouraged not to feel creative with password hashing schemes or cryptography in general. Security relies on details which are subtle and which you cannot test by yourself (during tests, an insecure function works just as well as a secure one).

answered Jun 15 '11 at 14:52 by Thomas Pornin · edited Oct 07 '21 at 06:47 by Community

+1, but to be fair `StrongHash()` could very well be a wrapper function for bcrypt et al ;) – AviD Jun 15 '11 at 15:48
@AviD that's why I used StrongHash() rather than bcrypt() or sha1() or sha2() or even... uhh, you know, that one digest we treat as a swear word. – Incognito Jun 15 '11 at 17:15
Being a nitpicker here, but your Wikipedia link says it: "No real function can implement a true random oracle."; still +1 because what you intend to say is right and PBKDF2 is always a useful hint and to discourage self-made crypto is never wrong :P – freddyb Jun 15 '11 at 17:20
@freddyb: strictly speaking, a random oracle is defined as part of a family of functions; the experiment being: I pick one function from the family, and one random function among all possible functions, I give you both and you cannot tell which is which on average with probability better than 0.5. With a given specific hash function, there is only one picking and thus "average" makes no sense. You can distinguish SHA-256 from another function because it outputs the same values as SHA-256. But such details tend to be confusing; the main idea here is: existing hash functions are not ideal. – Thomas Pornin Jun 15 '11 at 18:45
Thomas - In February, you said "PBKDF2 has been published as a KDF and has been analyzed as such. Whether it can be subverted into a password storage mechanism remains to be proven." (http://security.stackexchange.com/questions/2051/is-pbkdf2-based-system-cryptology-rfc2898derivebytes-better-for-unicode-pass). Has PBKDF2 now been proven? – Robin M Oct 25 '11 at 10:25
@Robin M: no. I am not sure anybody specifically works on it, and, as a general rule, "not being proven" is the normal state for most cryptographic algorithms. Proofs are _hard_. – Thomas Pornin Oct 25 '11 at 10:36
This also highlights one of the reasons I'm against the frequent use of "we" in programming. – Pharap Oct 28 '14 at 08:22

score 10 · Answer 2 · answered Jun 15 '11 at 16:48

On the two proposed alternatives:

function (salt, pass) {
   return StrongHash( StrongHash(salt) + StrongHash(pass) );
}

There's no bonus here. A hash renders information into something random. So on the first pass, you made two pieces of data into two random strings and then combined two random strings. Why? Security through obscurity offers no benefit. The critical elements of the generally proposed solution are:

Combine the salt and password, so that the salt adds an extra chaos factor to each password's hash.
One-way hash the combination to produce a seemingly random string.

You can't be more random than random - and with security it's good to avoid work that doesn't have a purpose.

function (salt, pass) {
   var data = salt + pass;
   for (i=0;i < 1000; i++) {
       data += StrongHash(salt + data)
   }
   return (data)
}

The first thing happening here is that the data is given the salt and the password. So your output will look like:

<Salt><Password><randomString0><randomString1>....<randomString999>

<Salt> = the salt in cleartext
<Password> = the password in cleartext
<randomString> = a series of different random strings.

The code as written just exposed the password, so all the random strings serve no purpose.
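To make the leak concrete, here is a small runnable sketch of the loop as written. `StrongHash` is a hypothetical stand-in (SHA-256 rendered as hex) chosen only so the code executes, and the salt and password values are made up for the demonstration.

```javascript
const crypto = require("node:crypto");

// Hypothetical stand-in for the question's StrongHash(): SHA-256 as a hex string.
function StrongHash(input) {
  return crypto.createHash("sha256").update(input, "utf8").digest("hex");
}

// The loop as written in the question, keeping the `+=`.
function appendingVersion(salt, pass) {
  let data = salt + pass;
  for (let i = 0; i < 1000; i++) {
    data += StrongHash(salt + data); // appends, so the original prefix is never replaced
  }
  return data;
}

const salt = "c0ffee";  // made-up example values
const pass = "hunter2";
const out = appendingVersion(salt, pass);

console.log(out.startsWith(salt + pass)); // true: the cleartext leads the output
console.log(out.length);                  // 13 + 1000 * 64 = 64013 characters
```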

A fix to this problem would be:

function (salt, pass) {
   var data = salt + pass;
   for (i=0;i < 1000; i++) {
       data = StrongHash(salt + data)
   }
   return (data)
}

Note removal of the += and change to a simple assignment. It's a small change, but it means that now your final data is a single random string of the length that your hash algorithm outputs.
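A quick way to check that claim is to run the assignment version with a placeholder hash. In this sketch `StrongHash` is again a hypothetical stand-in (SHA-256 as hex), and `iteratedVersion` and its `rounds` parameter are names invented for the example; it illustrates the output shape only and is not a recommended password-hashing scheme.

```javascript
const crypto = require("node:crypto");

// Hypothetical stand-in for StrongHash(): SHA-256 as a hex string.
function StrongHash(input) {
  return crypto.createHash("sha256").update(input, "utf8").digest("hex");
}

// Assignment instead of `+=`: each round replaces data with a fresh digest.
function iteratedVersion(salt, pass, rounds = 1000) {
  let data = salt + pass;
  for (let i = 0; i < rounds; i++) {
    data = StrongHash(salt + data);
  }
  return data;
}

const digest = iteratedVersion("c0ffee", "hunter2"); // made-up example values
console.log(digest.length);              // 64 hex characters, regardless of input length
console.log(digest.includes("hunter2")); // false: the cleartext no longer appears
```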

That's probably fine from a security perspective, but there's, again, no real improvement over the simple original version. The repetition of many recursive hashes adds no security - your first pass with a hash algorithm should produce a random result. Hashing the same thing over and over again does nothing in the best case, and worst case could end up reducing the value of the hash algorithm.

The first way offers a very significant benefit - the KISS principle. Keep It Simple. When repeating the function offers no benefit, there's no reason to make your logic more complicated, longer to process, more open to error and harder to debug.

Also - with cryptography, all algorithms should come with a user manual that would fill your average cubicle. The discussion of weak keys, problems with repetition, exposure and other mathematical input/output debates will keep mathematicians stocked with thesis topics for the rest of eternity. Until there's a specific algorithm in play, it's hard to discuss the real repercussions, but it's generally good policy to leave the repetitions and permutations up to the design of the algorithm rather than trying to help out. Many hash algorithms already have cycles of repetition and recursion built in, which means the user has no reason to do more.

"The repetition of many recursive hashes adds no security" is not *entirely* true. It slows down computation of the hash which makes dictionary attacks less feasible. — Cameron Skinner, Jun 17 '11 at 09:40
Why not XOR the password with the salt instead of concatenating it ? Can it make a difference on the strength of the resulting hash for certain hash function (MD5 for example) ? — Shadok, Jul 21 '11 at 10:19
A couple thoughts - salts are sometimes used to extend the size of the input so as to meet the crypto function's requirements. XORing isn't going to help there. I don't see any big security awfulness in doing it - but here's the caveat - I don't design crypto algorithms for a living. It'd take me a while to brush up on the math involved in MD5 to tell you for sure whether XOR was just as good as concatenation. — bethlakshmi, Jul 21 '11 at 18:27
The answer is incorrect: hashing the result of a hash prevents length extension attacks, for instance. So there is a definite benefit for hashing twice, but it isn't terribly relevant to this discussion. — Nakedible, Aug 09 '11 at 21:03
If you modify the loop to include the password in every iteration like `data = StrongHash(salt + data + pass)`, it would be very close to what some of the recommended algorithms are actually doing. — kasperd, Oct 29 '14 at 11:16
@Shadok it increases the input length, which considerably increases the amount of work needed to crack passwords. Not sure if XORing some data would introduce more randomness. — bunyaCloven, Nov 07 '16 at 07:41
