Why is password strength often underestimated and uncertain in the context of password hashing?

Question

An aspect of security has bugged me for a long time: Why is there positive certainty about the importance of hashing algorithms and salts, but password strength is either never mentioned at all or considered a rather "philosophical" matter?

To me, there are three components of a correct hash implementation:

Slow and collision-free algorithm.
Good random salt.
Strong, dictionary-attack-proof password.

A failure in any component renders the whole hashing scheme useless. So, to me, there is no point in making a scene about someone using MD5 while allowing users to pick 12345 as a password. To me, password strength is inseparable from the other two aspects and is a strictly technical matter of no less importance.

Yet, when it comes to the simple particular question, there is no particular answer at all!
Want a hashing algorithm suggestion? At your service!
Want to know how to get a good salt? There are a thousand ways!
Want to know what should be the minimum acceptable password strength? Err... you know, there is a trade-off and some political affairs of the sort, so "use the best you can".

But is there a technical, practical definition for the "use the best you can" technology? Just like one about hashing algorithms?

Or, in other words, why is there always a recommendation for the hashing algorithm, and a certain one ("use this"), but never a recommendation for password strength ("allow at least such strength or understand the risk")? Why is there always an enormous emphasis on the algorithm that should be used and no concern for the strength of the password at all? Can't this leave the impression that the hashing algorithm alone is enough, giving the poor programmer a false sense of security?

[Salts need to be unique but not random.](http://security.stackexchange.com/q/41617/539) — Gumbo, Feb 15 '14 at 11:16
Yes, yes - that's what I am talking about. There is always certainty about salts. Everyone is up with the good practice about salt, but no one about password. Though, random would be enough, because good password will make all these trifle differences negligible. — Your Common Sense, Feb 15 '14 at 12:59
The problem with ‘password strength’ is that you can only measure it in respect to a certain attack technique. One password can be hard to crack using brute-force but can be quite easy using a dictionary and vice versa. In most cases the brute-force technique is assumed, hence the emphasis on entropy. But an attacker would rather try different techniques: first the computationally easy ones (dictionary, certain patterns), then the harder ones (brute-force). So a ‘password strength’ needs to reflect the more realistic approach. Let alone the re-use of passwords … — Gumbo, Feb 15 '14 at 13:24

score 2 · Answer 1 · answered Feb 15 '14 at 07:23

Password hashing gets a lot of coverage in security guides, and there is no good reason for this. Stopping SQL injection and executable file uploads is far more important. I think the reason for so much discussion is that it uses interesting-sounding technologies (like bcrypt) and few authentication libraries do it right by default.

The password storage best practice you mention is not universally agreed on, although it is heavily promoted by users on this site. In particular, using a slow hash algorithm is not usually possible on high-traffic sites, as the processing load is too great. There's an interesting document here

Password storage is generally invisible to users. Whether you use unsalted MD5 or scrypt, a user won't see any difference. Password strength is highly visible to users, and in particular, it is part of the registration process that most sites want to keep as simple as possible.

Something that is pretty universal is that having more security costs you something. And here's the difference between password storage and strength. Doing stronger password storage costs the site admins - but you can always buy more hardware, etc. Enforcing stronger password complexity costs the site users - and they will go away if they don't like it.

All security choices should be based on an understanding of risk. You don't need the same level of security for Angry Birds as you do for online banking, or indeed for nuclear weapons. In commercial environments, this kind of risk-based thinking is generally not done for defences that only cost the site admins, but it is heavily done for anything that affects the user experience.

For password strength, my advice is:

Low sensitivity - at least 6 characters
Medium sensitivity - either at least 8 characters with a mix of letters, numbers and punctuation, OR at least 16 characters (at the user's option)
High sensitivity - use multi-factor authentication
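
As a rough sketch, the low and medium tiers above could be checked like this in Python. The function name `meets_policy` and the tier labels are made up for illustration; the high tier is left out because multi-factor authentication is not something a password rule can express.

```python
import string


def meets_policy(password: str, sensitivity: str) -> bool:
    """Check a password against the length/character-mix tiers listed above.

    `sensitivity` is a hypothetical label: "low" or "medium".
    """
    if sensitivity == "low":
        return len(password) >= 6
    if sensitivity == "medium":
        # Option 1: long enough that no character mix is demanded.
        if len(password) >= 16:
            return True
        # Option 2: at least 8 characters with letters, digits and punctuation.
        has_letter = any(c.isalpha() for c in password)
        has_digit = any(c.isdigit() for c in password)
        has_punct = any(c in string.punctuation for c in password)
        return len(password) >= 8 and has_letter and has_digit and has_punct
    raise ValueError("unknown sensitivity; the high tier needs MFA, not a password rule")
```

Note that `meets_policy("Password1!", "medium")` passes, so a rule like this only sets a floor and says nothing about how guessable the password actually is.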

It doesn't just depend on the sensitivity of the site though. There's a difference between online and offline attacks. A web site password doesn't need to be that strong, because an attacker can only try to guess it online, and you can use a lockout policy to slow these attacks. However, a TrueCrypt password can be attacked offline (if the attacker has the drive), so it needs to be much stronger.
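
To put rough numbers on the online/offline difference, here is a back-of-the-envelope sketch in Python. The guess rates are illustrative assumptions, not measurements: about 10 guesses per hour for an online attack throttled by a lockout policy, and one billion guesses per second for an offline attack against a fast hash.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly `length` characters."""
    return alphabet_size ** length


def seconds_to_exhaust(candidates: int, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate at a fixed guess rate."""
    return candidates / guesses_per_second


# Assumed rates, chosen only for illustration.
ONLINE_RATE = 10 / 3600      # 10 guesses per hour under a lockout policy
OFFLINE_RATE = 1e9           # 1 billion guesses per second against a fast hash

space = keyspace(26, 6)      # six random lowercase letters
online_years = seconds_to_exhaust(space, ONLINE_RATE) / (3600 * 24 * 365)
offline_seconds = seconds_to_exhaust(space, OFFLINE_RATE)

print(f"{space} candidates")
print(f"online:  about {online_years:.0f} years")
print(f"offline: about {offline_seconds:.2f} seconds")
```

Under these assumptions the same six-letter password would take thousands of years to exhaust online and well under a second offline. This only models blind enumeration; a dictionary attack against a human-chosen password would need far fewer guesses in both cases.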

Password lockout policies are another important part of doing password authentication well, but they do not get the same coverage as password storage.

The more you look at passwords in detail, the more you realise they are a pretty flawed authentication mechanism. All this best practice advice for making passwords as secure as possible is kind of like trying to re-attach a plane's wing using sticky tape.

"The password storage best practice you mention is not universally agreed on, although it is heavily promoted by users on this site"..very true. — Shurmajee, Feb 16 '14 at 18:19

score 0 · Answer 2 · answered Feb 15 '14 at 00:09

Long story short, you have complete control over how you hash passwords. But beyond a very limited point, you don't have control over how strong your users' passwords will be.

Best practices exist in password hashing so that you can minimize the effect of weak passwords. Salts should be unique so that successful effort cracking one password doesn't contribute toward the effort in cracking others. Slow hashes like bcrypt or scrypt should be used to slow down an attacker iterating through password guesses. These algorithms increase the difficulty of cracking passwords by the equivalent of at least a few bits of entropy.
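
A minimal sketch of both ideas, a unique salt per password and a deliberately slow hash, using `hashlib.scrypt` from Python's standard library (it needs a Python build linked against OpenSSL 1.1 or later). The cost parameters and the helper names are assumptions for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import os

# Illustrative scrypt cost parameters (assumed, not tuned for any real system).
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, dklen=32)


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using a fresh random salt for every call."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, digest)
```

Because each call draws a new salt, hashing the same password twice gives two different digests, so cracking one stored hash does not reveal other users who chose the same password. As for the "bits of entropy" comparison: if the slow hash makes each guess 2^k times more expensive than a fast hash would, that is roughly equivalent to adding k bits to every password.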

On the other hand, the best you can practically do with passwords themselves is limit their minimum length. Requiring symbols likely results in minimal extra entropy, as an overwhelming number of users will simply add an exclamation point to the end of their password. At the end of the day, what makes a good password is entropy, and users are notoriously bad at coming up with passwords with all but minimal amounts of it. Worse, you can't actually test for entropy — it's a measure of the means by which the password was generated, and not directly measurable on the password itself. "Password strength checkers" are as a result typically quite bad, and typically only frustrate users who end up trying to game the system in order to use their preferred password.
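
To make the point about entropy concrete: for a password drawn uniformly at random, entropy is simply length times log2 of the alphabet size, and it is known only because the generator is known. The sketch below (helper names are hypothetical) generates such a password with Python's `secrets` module and shows why the same formula is misleading when applied to a string a human chose.

```python
import math
import secrets
import string


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a password drawn uniformly at random from the alphabet."""
    return length * math.log2(alphabet_size)


def generate_password(length: int, alphabet: str = string.ascii_lowercase) -> str:
    """Draw each character independently and uniformly from `alphabet`."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


random_pw = generate_password(8)
# Valid: we know random_pw came from a uniform generator over 26 letters.
print(random_pw, round(entropy_bits(26, len(random_pw)), 1), "bits")

# Misleading: "password" has the same length and alphabet, so the formula
# returns the same number, yet it is among the first guesses in common wordlists.
print("password", round(entropy_bits(26, len("password")), 1), "bits")
```

Both lines print the same bit count, which is exactly the problem: the number describes the generator, and a checker that sees only the final string cannot tell which generator produced it.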

So we do the best we can on the parts of the process we can control and try to limit the downsides of the parts we can't.

Answer 3 · Anti-weakpasswords · answered Mar 15 '14 at 02:12

It bugs me a long time already, why there is positive certainty about hashing algorithm and salt importance, but password strength either never mentioned at all or considered rather "philosophical" matter?

Or, in other words, why there is always a recommendation for the hashing algorithm, and a certain one ("use this"), but there is never a recommendation for the password strength ("allow at least such strength or understand the risk")?

It is possible to define and meaningfully measure the flaws in hashing algorithms (including salts) independently of all other factors
Hashing techniques can be published and peer-reviewed
Algorithm choices are made by us, as professionals building systems.
The business/owner/requirements setter/check signer:
- does not care about algorithms at all
- cares only that the algorithm isn't so slow as to incite user complaints
- cares only that the password rules don't incite user complaints
- (rarely) cares only that the algorithm meets legal, regulatory, certification, standards or similar requirements
The first standard deviation or two's worth of users:
- care only that the algorithm isn't so slow as to annoy them
- care only that the password rules aren't so complex as to annoy them
  - having to remember anything annoys them
  - failing to type their password right annoys them
  - failing to log in the first time annoys them
  - having to type too many characters annoys them
  - not being able to use their favorite password annoys them
  - having rules they're not already used to annoys them
Unlike algorithms, published passwords are rendered instantly weak, so they cannot be peer reviewed
A plaintext password in and of itself (i.e. knowing only the string the user entered) can sometimes be judged "weak", but it can never be judged "strong".
- See "Should I reject obviously poor passwords?"
- "password" is obviously weak.
- "5f4dcc3b5aa765d61d8327deb882cf99"* is almost exactly as weak, yet it looks strong (long, lower case [a-f] and numbers).
  - note that it's long enough that length would overcome the limited character set were it actually a cryptographically random string.
- "NWY0ZGNjM2I1YWE3NjVkNjFkODMyN2RlYjg4MmNmOTk="** is also almost exactly as weak, yet it looks even stronger (long, upper case, lower case, numbers, symbols).
You will never, ever know the precise wordlists and rules that a rules-based dictionary attack will use. You will also never be as up to date on the latest in passwords leaked from other breaches as the attacker that did the other breach (or, probably, their forum buddies).

P.S. Using a great hashing algorithm on a password of 12345 conveys little added security, but using a 1000 byte purely random passphrase conveys little added security if it's stored in cleartext, too. The chain is only as strong as its weakest link!

* MD5("password")

** Base64(MD5("password")), taken over the hexadecimal digest text shown above rather than the raw digest bytes
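
The two footnoted strings can be reproduced with Python's standard library, which shows how little work separates them from the plain word. Note that the Base64 string is the encoding of the hexadecimal digest text, not of the raw 16-byte digest.

```python
import base64
import hashlib

# MD5 of the plain word, as 32 hexadecimal characters.
md5_hex = hashlib.md5(b"password").hexdigest()

# Base64 of that hex text (not of the raw digest bytes).
md5_hex_b64 = base64.b64encode(md5_hex.encode("ascii")).decode("ascii")

print(md5_hex)
print(md5_hex_b64)
```

A rule set that applies these two transformations to an ordinary wordlist covers both strings, which is why their length and character mix say nothing about their strength.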
