Theoretical - Password salting with concatenation vs. salting with HMAC

Question

I was looking through different methods of salting and tried to compare which is more secure for password storage. I know HMAC wasn't originally meant for password hashing, but it is widely used for that purpose. I also know the more secure method is key stretching. I looked through CrackStation, the old "don't hash secrets" article, a question on this topic, and the "3 wrong ways to store passwords" article. I still was not able to conclude which of the following methods is more secure than the others, and why:

sha256($pass.$salt)
sha256($salt.$pass)
HMAC-SHA256 (key = $pass, $salt)
HMAC-SHA256 (key = $salt, $pass)
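
For concreteness, the four candidates above could be written as follows. This is a minimal sketch using Python's standard `hashlib` and `hmac` modules; the function names are made up for illustration, and none of these is suitable for real password storage.

```python
import hashlib
import hmac

# Hypothetical helper names; each returns a 32-byte SHA-256-sized digest.

def sha256_pass_salt(password: bytes, salt: bytes) -> bytes:
    # sha256($pass.$salt): salt appended to the password
    return hashlib.sha256(password + salt).digest()

def sha256_salt_pass(password: bytes, salt: bytes) -> bytes:
    # sha256($salt.$pass): salt prepended to the password
    return hashlib.sha256(salt + password).digest()

def hmac_key_pass(password: bytes, salt: bytes) -> bytes:
    # HMAC-SHA256 (key = $pass, $salt): password is the key, salt is the message
    return hmac.new(password, salt, hashlib.sha256).digest()

def hmac_key_salt(password: bytes, salt: bytes) -> bytes:
    # HMAC-SHA256 (key = $salt, $pass): salt is the key, password is the message
    return hmac.new(salt, password, hashlib.sha256).digest()
```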

WARNING: None of these are secure for password storage. The question is about marginal theoretical differences between these methods of password hashing and salting.

I was considering length extension attacks, which probably are not a factor here, or are they (which would make HMAC more secure)? On the other hand, would HMAC maybe be less secure because the key is known? How would you sort those methods by security?

Putting aside for a moment that there are other properties possible/needed (iterations, memory hardness, parallelism, all of which are addressed in PHC's Argon2 or at least scrypt), the third method on your list is the most secure: HMAC is specifically designed to protect the key from being revealed/reversed, which is why putting the password there (and not a public seed) is the best fit. Both the hash and the HMAC have pre-image protection, but only the HMAC (and non-MD hashes) withstand extension attacks, which is a thing if you concatenate — eckes, Apr 26 '17 at 22:16

score 4 · Answer 1 · answered Apr 27 '17 at 06:41

The four methods you listed are essentially equivalently secure for the purpose of hashing passwords.

The HMAC construction H(k1 ++ H(k2 ++ m)) (where ++ is concatenation) is designed specifically to prevent attacks against MACs that are used to authenticate messages as written by someone possessing the secret key K (with k1 = K ⊕ opad and k2 = K ⊕ ipad, where opad = 0x5c5c...5c and ipad = 0x3636...36). Because common hash functions like md5, sha1 and sha2 are vulnerable to length-extension attacks, you can't use a construction like H(k ++ m) as a MAC: an eavesdropper who observed a signed message m could use a length-extension attack to make a valid MAC for a message m' = m ++ tampered_msg. Similarly, the construction H(m ++ k) is insecure if an attacker can generate a collision (find two messages m1 and m2 with H(m1) = H(m2), where m1 and m2 line up on a block boundary); if he can then trick someone into creating a MAC for m1, he can use it as a valid MAC for m2.
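
To make the construction concrete, here is a minimal sketch that builds HMAC-SHA256 by hand from the ipad/opad definition above and can be compared against Python's standard `hmac` module. The name `manual_hmac_sha256` is made up for illustration; the key preparation (hash if longer than a block, then zero-pad to the block size) follows RFC 2104.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes input in 64-byte blocks

def manual_hmac_sha256(key: bytes, message: bytes) -> bytes:
    # Keys longer than one block are replaced by their hash (RFC 2104).
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    # Shorter keys are zero-padded to exactly one block.
    key = key.ljust(BLOCK_SIZE, b"\x00")
    k1 = bytes(b ^ 0x5C for b in key)  # K xor opad
    k2 = bytes(b ^ 0x36 for b in key)  # K xor ipad
    inner = hashlib.sha256(k2 + message).digest()  # H(k2 ++ m)
    return hashlib.sha256(k1 + inner).digest()     # H(k1 ++ H(k2 ++ m))

# Compare against the library implementation for one sample input.
sample_matches = hmac.compare_digest(
    manual_hmac_sha256(b"key", b"message"),
    hmac.new(b"key", b"message", hashlib.sha256).digest(),
)
```

The outer hash only ever sees a fixed-length inner digest, which is why appending data to m does not let an attacker extend the final MAC.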

However, being susceptible to length-extension or pre-collision attacks is not relevant for the use of storing and validating user passwords. All that is relevant is the length of the returned hash, the fact that the salt is unique for every stored password in your system (so multiple passwords cannot be attacked in parallel), the time to calculate the hash (intrinsically slow hashes are more secure against brute force), and the strength of the password.

If I use H(random_salt ++ password) with a hash that's vulnerable to length-extension attacks, there's no equivalent vulnerability. Who cares if someone who doesn't know my password but has seen the hash could potentially calculate H(random_salt ++ password ++ tampered_pw) for whatever tampered_pw they want to add onto my password without figuring out what the password is?

That said, the HMAC construction may be a tad safer in that it calls the hash function twice, so theoretically should take double the time to brute force; though typically when you perform multiple rounds of hashing you do many thousands of rounds of hashing (that makes brute forcing many thousands of times harder) instead of just 2 rounds of hashing.
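
As a rough illustration of "two rounds versus many thousands", the sketch below contrasts a single HMAC call with PBKDF2-HMAC-SHA256 from Python's standard `hashlib`, which chains HMAC calls. The function names and the default of 100,000 iterations are assumptions chosen for illustration, not a recommendation from this answer.

```python
import hashlib
import hmac

def single_hmac(password: bytes, salt: bytes) -> bytes:
    # One HMAC call, i.e. two calls to the underlying hash function.
    return hmac.new(password, salt, hashlib.sha256).digest()

def stretched_hash(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    # PBKDF2 chains `iterations` HMAC calls, so each guess costs an
    # attacker roughly `iterations` times as much as single_hmac.
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
```

With `iterations=1` and the default output length, PBKDF2 reduces to a single HMAC over the salt followed by a 4-byte block counter, which shows that the extra cost comes purely from the iteration count.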

Would there be even a small difference between using HMAC-SHA256 (key = $salt, $pass) and HMAC-SHA256 (key = $pass, $salt)? — fsacer, Apr 27 '17 at 14:34
@fsacer - No, the only significant difference I could imagine was if one method truncated long passwords (e.g., passphrases) sharply reducing the entropy of the password (which is your main strength against being brute forced). However, the [HMAC construction explicitly allows arbitrary length keys](https://tools.ietf.org/html/rfc2104#section-3), so this should not be an issue. — dr jimbob, Apr 27 '17 at 20:38
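
A small sketch of the key-length behavior, assuming Python's standard `hmac` module and a made-up helper name: a password longer than the 64-byte SHA-256 block is not truncated when used as the HMAC key. Per RFC 2104 it is first replaced by its hash, so the long password and its SHA-256 digest act as the same key.

```python
import hashlib
import hmac

def hmac_password_as_key(password: bytes, salt: bytes) -> bytes:
    # HMAC-SHA256 (key = $pass, $salt)
    return hmac.new(password, salt, hashlib.sha256).digest()

salt = bytes(16)  # all-zero salt, for illustration only
long_pw = b"correct horse battery staple " * 4  # 116 bytes, longer than one block

full = hmac_password_as_key(long_pw, salt)
truncated = hmac_password_as_key(long_pw[:64], salt)
prehashed = hmac_password_as_key(hashlib.sha256(long_pw).digest(), salt)

not_truncated = full != truncated      # the bytes past position 64 still matter
same_as_prehashed = full == prehashed  # long keys are hashed down first
```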

score 2 · Answer 2 · answered Apr 27 '17 at 05:29

(None of the examples you give should be used for password hashing. They're much too fast!)

The first general principle I would mention here is that we want our cryptographic operations to be high level interfaces; there should be a common, reusable abstraction that defines:

What preconditions the caller should fulfill before calling them;
What arguments the operation takes;
What values it returns, or what postconditions it fulfills when correctly called;
What security properties the concept should offer.

In this case we're talking about password hashing. A password hashing function, in its simplest incarnation, should:

Accept two arguments, a salt and a password;
Return a hash code for that password/salt combination;
Inherently consume a lot of resources so that it slows down a password-guessing attacker substantially, but not so much that an honest user

So SHA-256 fails the first and third points. It only accepts one argument, and is much too fast!

And actually, the interface described above is arguably not even the best one for password hashing. A better interface is as a pair of functions written on top of the password hash (in Python-ish pseudocode):

def generate_new_verification_code(password):
    salt = crypto_random(16)     # 16 byte random salt
    hash = password_hash(salt, password)
    verification_code = salt + hash   # store the salt and the hash together
    return verification_code

def verify_password(putative_password, actual_verification_code):
    actual_salt = actual_verification_code[0:16]   # the first 16 bytes are the salt
    actual_hash = actual_verification_code[16:]    # everything after the salt is the hash
    putative_hash = password_hash(actual_salt, putative_password)
    return putative_hash == actual_hash

So proper abstraction says that for password verification we should be using two functions like these, written on top of a resource-intensive password_hash function, which in turn would likely be written on top of a low-level function like SHA-256.
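
A runnable variant of that pair of functions might look like the sketch below. It assumes Python's `hashlib.scrypt` (available when Python is built against OpenSSL 1.1 or later) as the resource-intensive `password_hash`, and the cost parameters n, r and p are illustrative assumptions, not tuned values.

```python
import hashlib
import hmac
import secrets

SALT_LEN = 16  # bytes

def password_hash(salt: bytes, password: bytes) -> bytes:
    # Illustrative scrypt parameters (about 16 MiB of memory per call).
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)

def generate_new_verification_code(password: bytes) -> bytes:
    salt = secrets.token_bytes(SALT_LEN)  # fresh random salt per password
    return salt + password_hash(salt, password)

def verify_password(putative_password: bytes, verification_code: bytes) -> bool:
    salt = verification_code[:SALT_LEN]
    stored_hash = verification_code[SALT_LEN:]
    putative_hash = password_hash(salt, putative_password)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(putative_hash, stored_hash)
```

The caller never handles the salt directly, which is the point of the abstraction: the verification code is an opaque value to store and later pass back.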

Note that PHP's password hashing interface actually follows this latter design:

The password_hash function is meant to be called with a password but no salt; it picks a salt randomly, and as the page says: "The used algorithm, cost and salt are returned as part of the hash. Therefore, all information that's needed to verify the hash is included in it."
The password_verify function is used to check whether a putative password matches the verification string.

Second problem: one key idea in the design of security protocols or other security constructs is that attackers will often break the rules, and cause your code to be called in unexpected manners and contexts. For example, when people write something like this:

hash($salt.$pass)
hash($pass.$salt)

...they often implicitly assume that salt will always be the same fixed length, and don't stop to consider whether some attacker might be able to:

Find a way to "break the rules" so that they can cause the length of salt to vary across multiple calls to the code;
Find some unexpected way to exploit this to their advantage.

For example, you might think it's impossible for an attacker to find a collision for passwords hashed this way, but in fact if they can control the salt and pass it's trivial to do it:

hash("0001"."Passsword1!")
hash("0001P"."asssword1!")
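
The same collision can be checked directly. This is a minimal sketch using SHA-256 from Python's `hashlib` as the stand-in for `hash`; the helper name `naive_hash` is made up for illustration.

```python
import hashlib

def naive_hash(salt: str, password: str) -> str:
    # Plain concatenation: the boundary between salt and password is lost.
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()

# Both calls feed the identical byte string "0001Passsword1!" to SHA-256.
collides = naive_hash("0001", "Passsword1!") == naive_hash("0001P", "asssword1!")
```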

This might sound farfetched, and well, to tell you the truth, I can't actually think of a scenario where this might be exploitable. But we really would like our security to be founded on grounds more solid than "I can't think of any way to break this." So as a general philosophy, it's safer to design things in such a way that ambiguities cannot arise. In this case, we would like the following rule to hold:

If we hash two different passwords with two different salts, the inputs we provide to the low-level hash function should be different.

So this means that to produce the input that we feed to the underlying hash, we should prefer to combine them with an injective function—a function such that combine(salt1, password1) == combine(salt2, password2) if and only if salt1 == salt2 and password1 == password2.

hash(combine($salt, $password))

Some ways of doing this:

Pad one of the inputs to a fixed size, and prepend it to the other. HMAC does this internally for keys.
Hash one of the two inputs and prepend it to the other. HMAC does this for keys that are longer than the underlying hash function's block size.
Put an unambiguous delimiter between the inputs (possibly requires you to escape occurrences of the delimiter in the input values).
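
One more way to get an injective `combine`, closely related to the delimiter idea, is to prefix the salt with its own length. This variant is not in the list above; the sketch below is an illustration with made-up names, using an 8-byte big-endian length field as an assumed encoding.

```python
import hashlib
import struct

def combine(salt: bytes, password: bytes) -> bytes:
    # The 8-byte length prefix says exactly where the salt ends, so two
    # different (salt, password) pairs can never produce the same bytes.
    return struct.pack(">Q", len(salt)) + salt + password

def hash_combined(salt: bytes, password: bytes) -> bytes:
    return hashlib.sha256(combine(salt, password)).digest()

# The earlier ambiguous pair now yields two different hash inputs.
distinct = combine(b"0001", b"Passsword1!") != combine(b"0001P", b"asssword1!")
```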

Dedicated password hashing functions like PBKDF2, bcrypt, scrypt and Argon2 adhere to most of these ideas.

Good job explaining this from another perspective :) – fsacer Apr 27 '17 at 13:42

score 1 · Answer 3 · answered Apr 24 '17 at 23:27

HMAC, by itself, is not an appropriate password hash. You are working at way too low a level for your cryptographic expertise. Constructs designed specifically for foolproof verification of passwords already exist; use them! Choose one of bcrypt, scrypt, or Argon2.

answered Apr 24 '17 at 23:27 by Stephen Touset

This comment is correct. Hashing with a salt isn't enough, you need to consider the work involved in cracking the hashes if they're disclosed. The hashing mechanisms you listed all have a work factor of 1. Standard algorithms like scrypt or bcrypt will have a work factor of 10-12 while still retaining reasonable times for computation. – u2702 Apr 24 '17 at 23:36
I know that none of these methods is appropriate for password storage by today's standards, but that was not the question. The question still stands: which of those methods is more secure than the others (in what order), and why? – fsacer Apr 25 '17 at 05:27
None of them are secure, and they are equally bad for the same fundamental reason: most passwords are weak, and an attacker with a list of salts and hashes can attempt *billions* of guesses per second or more. It doesn't matter how unpredictable the output is when the inputs are so predictable. Modern password hashes have customizable work factors that increase the CPU and/or memory necessary to verify a guess. When properly tuned, an attacker can try only a handful per second. – Stephen Touset Apr 25 '17 at 05:42
But still there must be some difference (ignoring weak passwords) between them that's what I'm looking for. – fsacer Apr 25 '17 at 05:45
Why must there be some difference? None of these methods can be attacked directly by reversing the hash. In that way they are all equally secure. However all of these methods are extremely fast on modern hardware, so all of these methods when used for password storage will allow 50% or more of all the passwords in your database to be guessed within minutes. Attackers don't ever attack the hash directly by trying to reverse it. They use a dictionary of common passwords and just hash each one, looking for matches in your database. So there is no difference in the security of these methods. – Ben Apr 25 '17 at 13:55
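
The dictionary attack described in the comment above can be sketched in a few lines. This is an illustration only: `dictionary_attack` is a made-up name, the wordlist is a tiny hypothetical stand-in for a real list of common passwords, and sha256($salt.$pass) is used as the example scheme.

```python
import hashlib
from typing import Iterable, Optional

def dictionary_attack(stored_hash: bytes, salt: bytes,
                      wordlist: Iterable[bytes]) -> Optional[bytes]:
    # Hash every candidate with the known salt and look for a match.
    for guess in wordlist:
        if hashlib.sha256(salt + guess).digest() == stored_hash:
            return guess
    return None

# Hypothetical leaked record and wordlist.
salt = b"\x01" * 16
stored = hashlib.sha256(salt + b"letmein").digest()
recovered = dictionary_attack(stored, salt, [b"123456", b"password", b"letmein"])
```

Swapping in any of the other three constructions changes only the line that computes the hash, which is why their resistance to this attack is essentially the same.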
What does "You are working at way too low a level for your cryptographic expertise" mean? It's unclear since you don't explain why it's too low level, whether "low level" means digging too much into underlying implementation or just too simplistic, or how you know his level of expertise. I'm also not sure how that sentence addresses the question. – Cody P Apr 27 '17 at 22:35

score -1 · Accepted Answer · edited Dec 23 '20 at 23:06

Based on additional research and conversation on reddit, I guess the order would be:

(more secure)

1. HMAC-SHA256 (key = $salt, $pass) - best because it doesn't have the drawback of limiting password length; it also seems favorable because the key is more random, which might make the hash even more random, and you don't have to sanitize the password in some cases (thanks to @Nat)
2. HMAC-SHA256 (key = $pass, $salt) - possibly limited password length depending on the sha256-hmac implementation; more secure because of a slightly higher work factor
3. sha256($pass.$salt) - based on some specifics of the SHA implementation, it is more secure to append the salt
4. sha256($salt.$pass) - prepending the salt is not as secure

(less secure)

Disclaimer: This difference is more of a theoretical nature; in practice you should not use any of these methods but rather one of those named in Stephen's answer.

answered Apr 25 '17 at 19:22 by fsacer · edited Dec 23 '20 at 23:06 by Tsundoku

No, the password is the secret to protect. There is no length problem, HMAC defines an additional hash around the key if it exceeds the blocklength. So would swap 1+2 but otherwise fully agree. – eckes Apr 26 '17 at 22:19
But isn't HMAC key supposed to be random https://security.stackexchange.com/a/95977/124532? Well it's best to not have key to protect anywhere as hash does not include the message. I also see that length shouldn't be problem. – fsacer Apr 27 '17 at 14:06
What do you mean by "secure"? This is, what security property are you trying to get? – Nat Apr 27 '17 at 14:10
@Nat well probably how difficult is to calculate hash and if there are vulnerabilities with any of these approaches. – fsacer Apr 27 '17 at 14:13
Vulnerabilities against what, though? – Nat Apr 27 '17 at 14:15
@Nat ones that would make it easy to get password from hash or calculate hashes even easier probably. – fsacer Apr 27 '17 at 14:19
The main attack against passwords is brute-force hashing a dictionary of possible passwords, then checking which matches the observed hash. It's really hard to defend against this attack because the attack itself is so simple. In practice, the main defense is to make the hash algorithm really slow (expensive) so that attackers aren't inclined to waste their time. The algorithms others recommended were specifically designed to be slow. – Nat Apr 27 '17 at 14:22
If you want to reproduce this behavior with SHA hashes, you'd probably just do something like heavily nested hashing. Like, instead of HASH(pass || salt), you'd do HASH(HASH(HASH(HASH(pass || salt)))), except lots more, 'cause SHA's really fast. The four recommendations that you give above are all very fast, such that most reasonable attackers could break any of them for pretty minimal cost. Beyond that, I'm not sure what sort of security notion you're going for. – Nat Apr 27 '17 at 14:24
Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/57833/discussion-between-fsacer-and-nat). – fsacer Apr 27 '17 at 14:26
Why even ask the question if you're just going to ignore all the responses to peddle your own nonsense with zero real-world meaning? Have a -1. – Ben Apr 27 '17 at 15:44
@Ben I'm not ignoring them, but they do not answer the question I was having. As you see, I was looking for a theoretical difference, not a practical one. I already knew these were not for use in practical applications for password storage. I upvoted all good answers, but they do not answer the question I was having. So I do not see the reason for the downvote. You can edit the answer if you know more details, or post a new one, and I might accept that one if it has better reasoning around the order of security. – fsacer Apr 27 '17 at 16:40
@Ben Theory by nature does not have much of a real world meaning. – fsacer Apr 27 '17 at 16:50
@Nat I guess I should have expressed my question more clearly, and I should learn to do so. – fsacer Apr 27 '17 at 17:20
