
Context: A website is hosted on multiple dedicated servers. There are frontend webservers, backend DB servers, and other servers between them. The DB holds users' account passwords. All connections are secure and the registration/login pages cannot be brute-forced (out of the question's scope).

Threat model: A hacker manages to get a copy of the DB and wants to crack the passwords. The hacker has very powerful hardware at his disposal (CPU/GPU/RAM/ASIC) and only needs to crack one single password to be considered the winner (protecting a maximum of users is not enough; all users must be protected).

Question: Which of the cases below would be considered the most secure against the hacker efforts?

  1. Traditional iteration-based hashing with salt (scrypt/bcrypt)
  2. Handmade multi-peppered hashing with very long secrets held at each "stage" (webserver/middle/DB).
    • Webserver operation: Hash1 = SHA512(password + secret1)
    • Middle operation: Hash2 = SHA512(Hash1 + secret2)
    • DB operation: Hash3 = SHA512(Hash2 + secret3)
    • DB storage: Hash3 with salt
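
For concreteness, case 2 could be sketched like this in Python. The secret values are hypothetical placeholders, and mixing the salt in at the first stage is an assumption made for illustration (the question only says Hash3 is stored with a salt):

```python
import hashlib

# Hypothetical long per-tier secrets; in the proposal each one lives on a
# different server (webserver / middle / DB) and never leaves it.
SECRET1 = b"very-long-webserver-secret-0123456789abcdef"
SECRET2 = b"very-long-middle-tier-secret-0123456789abcdef"
SECRET3 = b"very-long-db-tier-secret-0123456789abcdef"

def hash_password(password: bytes, salt: bytes) -> bytes:
    # Assumption: the per-user salt is mixed in at the first stage.
    h1 = hashlib.sha512(salt + password + SECRET1).digest()  # webserver
    h2 = hashlib.sha512(h1 + SECRET2).digest()               # middle tier
    h3 = hashlib.sha512(h2 + SECRET3).digest()               # DB server
    return h3  # stored in the DB alongside the salt
```

Each tier only ever sees the output of the previous stage, never the password itself (except the webserver, which necessarily receives it).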

My opinion is that case 2 is by far the most resistant, because iterations will not protect poor (low-entropy) passwords such as "123456" for more than a few hours/days at best. As long as the secrets are very long and at least one of them remains unknown, I think the increased entropy will secure everyone for decades.

I do not see such a scheme rolled out in production very often though; in general the consensus is "iterations+salt or get shamed", so maybe I am missing something.

Of course the implementation would need to be secure (handmade vs the more standard scrypt) and the secrets must remain unknown. Moreover, if a secret is revealed and must be changed, the update process would be a pain. On the other hand, the login process would be more hardware-friendly (no iterations).

Edit (based on Matthew's comment): My question is not about the benefit of a pepper in an iteration-based method. It is the security difference between iteration-based and secret-based methods.

Edit 2: I searched more and managed to find a similar question after all, my bad (maybe the results improved based on my activity on the current question, though). It has a nice answer (by Thomas Pornin) similar to here.

HashDoe
  • There are loads of questions asking very similar things around. The key problem is that if you're trying to protect against weak passwords, an attacker won't bother trying to crack the passwords. They'll get a botnet and try weak passwords against the usernames, using your normal system. They've got the usernames - they're in the DB... – Matthew Feb 02 '17 at 15:56
  • I agree about the number of questions on this topic but I think mine is quite specific and has not been covered by any other. Moreover, while your mention of "botnet attacking the normal system" could be valid in a normal context, I asked to keep login-process abuse off-topic here. – HashDoe Feb 02 '17 at 16:08
  • @Matthew: I do not think my question is a duplicate of the one you linked. What you linked is "iterations+salt+pepper" vs "iterations+salt" (all cases considered include the iterations method only). My question is different: it is "iterations+salt" vs "no iteration+many peppers". Anyway, from the answers of what you linked, as long as the pepper can remain secret, it is considered the most secure option, which leaves my question unanswered: why is the industry standard based on iterations instead of peppers? – HashDoe Feb 02 '17 at 16:28
  • Because there is some question as to whether there really is such a thing as a pepper in this context, or if it is just a fancy salt protected via security by obscurity. If the hacker could get the DB, why couldn't he get the pepper too? – John Wu Feb 02 '17 at 16:44
  • @JohnWu: I guess it mainly depends on how the website is hosted/configured. In the current context, the hacker could have got the DB through an SQL injection or by stealing the DB server from the datacenter, but the webserver/middle secrets should still remain safe/unknown. Which is the point of my question: as long as at least one secret is safe, does it mean iterations would be useless (because they are less secure than the remaining secret itself)? – HashDoe Feb 02 '17 at 16:53
  • Why not do both? Use a slow hash algorithm and many peppers? The pepper is not a replacement for the "slowness" of the algorithm, it is a complement. – Anders Feb 02 '17 at 17:03
  • @Anders: I fully agree that would be the ultimate method. Removing iterations is a convenient way to reduce hardware load, though. Could you please elaborate on why several secrets cannot replace (or even improve on, as I believe) the slowness of the algorithm? That's exactly what I fail to understand: why we almost never see the slowness replaced by secrets in production. – HashDoe Feb 02 '17 at 17:13
  • Say an attacker manages to get a db dump. That means there is at least one security vulnerability (whether that is a bug in your application or a disgruntled employee who deliberately exposes your data). Can you guarantee that there are no vulnerabilities that may expose the pepper? Worst-case scenario, vulnerabilities in your application expose the database and your secret peppers. At that point, your only defence is a salt and a slow hashing algorithm. – knbk Feb 02 '17 at 19:29
  • @knbk: A vulnerability or an employee leading to a DB leak is quite unlikely to be able to impact the other layers (different applications and access rights), so the other secrets should be safe. If the webserver's secret is compromised this way, hell's gates are already wide open because he can probably replace the frontend pages with malicious code to steal the users' passwords in the clear when they are submitted at login. But you still have a point: such a critical, far-reaching vulnerability can happen in theory. – HashDoe Feb 02 '17 at 20:23
  • 1
    The attacker getting a dump of the db is also quite unlikely, but it does happen. The point is that you want to defend against those scenario's anyway, you don't want to assume that they just don't happen. – knbk Feb 02 '17 at 20:25

2 Answers


A message authentication code with a secret key that's unavailable to the attacker will indeed resist many attacks that resource-intensive password hashing won't. There's no question there. You don't even need the excessively complex three-tier key architecture; just one 128-bit secret key would be plenty.

(Side note: I highlight "message authentication code" because that's the cryptographic construct that's appropriate for your proposal. Your "SHA512(password + secret)" betrays that you don't seem to know what construct is appropriate for what you've proposed; HMAC-SHA512(key, password) would be the better choice.)
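
The difference between the two constructions can be shown in a few lines with Python's standard library (the key value below is a hypothetical placeholder for the server-side secret):

```python
import hashlib
import hmac

KEY = bytes(range(16))  # hypothetical 128-bit secret key ("pepper")
password = b"123456"

# Ad-hoc keyed hash from the question; not a vetted MAC construction.
adhoc = hashlib.sha512(password + KEY).hexdigest()

# Standard construction: HMAC-SHA512 keyed with the secret.
tag = hmac.new(KEY, password, hashlib.sha512).hexdigest()

def verify(key: bytes, candidate: bytes, stored_tag: str) -> bool:
    # Constant-time comparison avoids leaking information via timing.
    computed = hmac.new(key, candidate, hashlib.sha512).hexdigest()
    return hmac.compare_digest(computed, stored_tag)
```

Both are one `hashlib`/`hmac` call, so there is no implementation-cost excuse for the ad-hoc variant; HMAC is the construction with an actual security proof behind it.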

But here's the big, difficult topic that your question completely sidesteps: how do you protect those secret keys? You say this:

Of course the implementation would need to be secure (handmade vs more standard scrypt) and the secrets must remain unknown.

...but those two are precisely where laypersons who attempt to implement your proposal will fail, again and again! You already have questionable crypto in your proposal (handrolled keyed hash instead of HMAC, three layers of MAC computation when just one would do), and you say nothing about how you'll make sure your secret keys don't get disclosed—which is the point on which the security of the scheme hinges!

Moreover if a secret is revealed and must be changed, the update process would be a pain.

This can be solved with a hash-then-encrypt approach, like Dropbox has adopted (with bcrypt, however). The equivalent in your no-iterations approach would be something like encrypt(key, hash(password)), or (scarcely) better encrypt(key, hmac(salt, password)), which would allow you to later rekey the encrypted hashes.
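
A rough stdlib-only sketch of that rekeying idea follows. The XOR "cipher" here is a stand-in purely so the example is self-contained; a real implementation would use an authenticated cipher such as AES-GCM, as in the Dropbox design mentioned above:

```python
import hashlib
import hmac

def inner_hash(salt: bytes, password: bytes) -> bytes:
    # hmac(salt, password): the keyed hash that gets encrypted at rest.
    return hmac.new(salt, password, hashlib.sha512).digest()

def xor_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Illustration-only cipher: XOR with a SHA-512-derived keystream.
    # Do NOT use this in production; use AES-GCM or similar instead.
    stream = hashlib.sha512(key + nonce).digest()
    return bytes(a ^ b for a, b in zip(data, stream))

def rekey(old_key: bytes, new_key: bytes, nonce: bytes, stored: bytes) -> bytes:
    # Re-encrypt a stored record under a new key, without needing
    # any password: decrypt with the old key, encrypt with the new one.
    return xor_encrypt(new_key, nonce, xor_encrypt(old_key, nonce, stored))
```

The point of the construction is visible in `rekey`: if the encryption key leaks, every record can be migrated to a fresh key in one offline pass, which is exactly the "update process" that a baked-in pepper makes painful.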

I do not see such a scheme rolled out in production very often though; in general the consensus is "iterations+salt or get shamed", so maybe I am missing something.

Insecure storage of long-term secret keys is really, really common, and nearly everybody who suggests using a pepper hasn't really thought the key management problem through. This is what makes key-less schemes like resource-intensive password hashing so valuable and such a strong recommendation; if we can avoid or reduce reliance on secret keys, that is generally a plus. Basically, key management is hard, so the fewer long-term cryptographic keys we need to store and defend, the better, and the more we can mitigate the consequences of failure, the better as well.

Secret keys should therefore only be considered for password hashing after:

  1. You've adopted resource-intensive password hashing, which serves at least as a fallback line of defense if the keys are disclosed.
  2. You have a well-thought-out and well-engineered system for protecting and managing your systems' secret keys—ideally one where the keys are stored in just one secure key management system whose only purpose is to store long-term secret keys and never leak them to any external system. A KMS shouldn't disclose the keys even to its clients, who instead must ask the KMS to perform cryptographic operations on their behalf. Some examples: AWS KMS, Google Cloud KMS, HashiCorp Vault, or a dedicated hardware security module (HSM).
Luis Casillas
  • Your answer's last part seems to be the conclusion everyone else shares: "secret keys are much more secure but key management is hard". I do not agree with your first part, though; HMAC will not provide any added security over a plain hash (unless you expect SHA512 to have collisions on a field as short as a password). Hacking 3 layers of service, each on dedicated hardware with different running applications and access rights, is probably much more difficult than hacking a single one too, for obvious reasons. – HashDoe Feb 02 '17 at 22:30
  • The idea that hacking three different layers is harder is wishful thinking. You highlight the ways in which they're *different*, but they will likely also share just as many commonalities—e.g., they'll all be running the same OS, receiving the same patches, the same third-party monitoring software, etc. Even the custom software that differs between them will often be written by the same team, who may well introduce the same security flaws into all layers. And in any case, if the solution worked, the fact that it would require three layers of dedicated hardware is a big disadvantage! – Luis Casillas Feb 03 '17 at 00:13
  • Also, the fact that you're talking about "collisions" when discussing HMAC vs. SHA-512 does not reassure me that you understand the issue here: what are the *necessary* properties for your scheme to be secure, and what functions or construction supply those properties? What you want here is a *message authentication code*, because (a) it's a secret keyed function, and (b) it provides existential unforgeability under adaptive chosen message attack. HMAC is a very secure MAC; `hash(msg + key)` might not be. – Luis Casillas Feb 03 '17 at 00:21
  • 1
    To clarify that last comment, in this context, "existential unforgeability under adaptive chosen message attack" means that even an attacker who can submit passwords of their choice for you to hash, observe the results, and adapt their attack based on what they learn from these results, still cannot feasibly forge *any* message + hash pair that they haven't tried already. HMAC provably has this property, and HMAC-SHA512 likely has it. `sha512(password + key)` doesn't. – Luis Casillas Feb 03 '17 at 00:25
  • We may be going off-topic, but when you wrote "message authentication" and "forge message", it should have rung a bell on your side: we are not trying to verify that a random message comes from a specific sender, we are simply verifying that the provided password is correct (always the same). To clarify the difference I suggest you have a look at the HMAC Wikipedia page or at "curiousguy"'s comment here: http://security.stackexchange.com/a/16811/138146 (the same comment by Terry Chia at the top of the page under the OP's question). It is OK unless SHA512 has collisions within password lengths. – HashDoe Feb 03 '17 at 00:49
  • PS: and even if it has collisions in such short ranges (which would have been found by now), it might only allow you to log in to your own account with "password2" instead of "password1". Fine with me; I do not see any potential security issue with that. – HashDoe Feb 03 '17 at 01:10
  • 1
    "Message authentication code" is just a name, what matters is what it *promises*, not what it's called. A MAC promises that an attacker who doesn't know the key, even if they can learn lots of `(message, tag)` pairs, even if these pairs are of their choice, can't forge even a single other one. Applied to your context this means that an attacker who doesn't know the key, even if they can learn lots of `(password, hash)` pairs, can't learn any other one, much less one for a password or hash of their choice. This is *exactly* what you want and HMAC provides it. – Luis Casillas Feb 03 '17 at 01:24
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/52971/discussion-between-luis-casillas-and-hashdoe). – Luis Casillas Feb 03 '17 at 01:24

Assume that your system is fully compromised. An attacker has all of your peppers, all required salts, and knowledge of how you're combining them. This is the equivalent situation for an attacker who gets a table of bcrypted passwords.

At this point, for your scheme, breaking passwords is fast: SHA-512 can be calculated at millions of hashes per second. For the current general advice, bcrypt, breaking passwords is still slow. By tuning the work factor, standard hardware can be restricted to hundreds of attempts per second, with specialist hardware still only managing tens of thousands per second.
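
The tuning knob can be illustrated with a standard-library example (using scrypt rather than bcrypt, since Python ships `hashlib.scrypt`; the parameter values are a common interactive-login setting, not a universal recommendation):

```python
import hashlib
import os

password = b"123456"
salt = os.urandom(16)

# Fast hash: one cheap, trivially parallelizable operation per guess.
fast = hashlib.sha512(salt + password).digest()

# Memory-hard hash: n is the CPU/memory cost, r the block size, p the
# parallelism. n=2**14, r=8, p=1 costs ~16 MiB of RAM per guess; doubling
# n doubles the cost of every guess the attacker makes.
slow = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1)
```

Raising `n` until a login takes tens of milliseconds on your own hardware is what caps an attacker at the hundreds-to-thousands of guesses per second described above.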

Now, if you can be absolutely sure that no one can steal your pepper data, that's fine. However, if you do lose them, your passwords are almost instantly vulnerable - fast hashing lets an attacker try common passwords easily. On the other hand, the iterations don't have that issue.

The iterations are also easy to implement - one, usually built-in, function. That's very little effort for developers, which makes it easy to implement correctly. On the other hand, your scheme requires hashing code in multiple places, on different servers. You need to make sure that the intermediate hashes aren't exposed at any point. You need to ensure that the code at each layer is correct. You might need to run extra software on some systems (does the DBMS natively support the hashing method?). There are lots of moving parts. It's a lot easier to make a mistake somewhere.

In short, iterations retain security in the event of complete compromise, and are easier to develop. In an ideal world, multilayer pepper would help against the failure mode where the database is compromised, and no other systems are. It would be suitable for adding to a system with high security requirements, in addition to using slow (high iteration) hashing, but not instead, unless you have complete confidence that the peppers can never be compromised.

Matthew
  • Thanks, this is a nicely written summary of what most people seem to be thinking. Key management is hard and can totally fail, yet it can offer superior security in some cases... it's sad that it is not used more often. – HashDoe Feb 02 '17 at 22:38