25

I've given myself the task of writing code that determines the strength of a password, and really want to break out of a lot of already established ways we do that, as they're often lacking, not designed the right way, or quickly become irrelevant.

Generally, we'll see patterns where you enter a password, and a boolean "you need 7 chars with letters and numbers" is presented. Later, we had a reasonless graphic that shows a "strong to weak" scale ranking (and variations where this is a percentage). We've been getting better at this pattern, a lot better actually, where we think about showing user time to brute-force, explicitly showing reasons for why that score was assigned,

Now, I really like the presentation of these last two ideas, but that's not my primary concern here, and my question has nothing to do with presentation of this tool.

My question is what should a program that generates an estimated range of password security take into consideration?

After looking through the source of howsecureismypassword.net, we see it does a few neat things, like keep a list of top 500 common passwords, groups passwords by character classes that are involved, and relates a meta-score to an estimated time to crack a bunch of hash values in that range. There's some problems with this, such as the top 500 passwords don't reflected present "top 500," but rather the ones at time of writing the code, same goes for password cracking speeds, and attacks that may not end up using brute-force.

I see similar issues with passwordmeter.com, but more in the sense that it runs best-case rather than worst-case scenarios.

So my question is

What do I need to think about to measure a password's security score at time "now"?

AviD
  • 72,138
  • 22
  • 136
  • 218
Incognito
  • 5,204
  • 5
  • 27
  • 31
  • 1
    Very closely related, but not duplicate: http://security.stackexchange.com/questions/2687/how-reliable-is-a-password-strength-checker – AviD Jun 18 '11 at 23:03
  • 1
    Just a cautionary note about usability of passwords. Strong passwords are good only to the point at which people remember them and keep them confidential. – this.josh Jun 19 '11 at 03:08
  • One of the mathematically cool things about true randomness is that you can't measure it accurately. Because the kolmogorov complexity of a string is noncomputable, you can never say how random a string is. All you can say is how random a string *isn't*, based on the best compression of it you've found so far. – user502 Jun 20 '11 at 12:56
  • 1
    @user502 Thanks for that, searching for untruths is often a powerful problem solving tool that so many people forget to think about. – Incognito Jun 20 '11 at 13:21
  • 1
    @user502 And with true randomness you could have a key of all ones or all zeroes, which we specifically try to avoid. You could even improbably have a hundred keys of all ones or zeroes. Thats why we like pseudorandomness: the great taste of randomness with half the calories! – this.josh Jun 23 '11 at 07:03
  • Showing uset time to brute force an attack really depends in how mich computing power an attacker can access. This could potentially deliver a false sense of security. – Daniel Jan 15 '16 at 08:39

3 Answers3

18

The best work in this area I've seen is by Matt Weir in Reusable Security: New Paper on Password Security Metrics (2010). He describes the difference between "Shannon entropy" and "guessing entropy". He also has an interesting method of taking a password for a user, analyzing it, and offering suggestions to make it better:

....other methods for password creation policies, including our proposed method to evaluate the probability of a human generated password by parsing it with a grammar trained on previously disclosed password lists. This allows us to build a more robust reject function compared to a simple blacklist, while attempting to provide the most user freedom possible, given the security constraints of the system, when selecting their passwords.


Update: as user185 notes, Appendix A of the NIST Electronic Authentication Guideline from 2006, revised in 2013, is also very helpful. It goes into detail on calculating these two terms:

” As applied to a distribution of passwords the guessing entropy is, roughly speaking, an estimate of the average amount of work required to guess the password of a selected user, and the min-entropy is a measure of the difficulty of guessing the easiest single password to guess in the population.


Note that this question is closely related:

nealmcb
  • 20,544
  • 6
  • 69
  • 116
  • 1
    There is a new revision of the NIST document: http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-63-2.pdf – Alexis Wilke Jan 13 '16 at 05:22
6

Appendix A of the NIST Electronic Authentication Guideline details the method they use to construct the entropy vs. password length table A.1, including a few references for further reading.

  • That's an awesome resource, do you happen to know of any posts where people scrutinize any potential problems with that section? – Incognito Jun 20 '11 at 13:23
  • No, I don't :-( –  Jun 21 '11 at 16:09
  • Your link currently states that the document has been superseded. This seems to be the updated document: http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-63-2.pdf – Ben Jan 13 '16 at 16:47
  • @Ben Which has again been superseded by https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-63b.pdf – Luc Nov 22 '18 at 14:30
2

I think you should account for the various ways in which a password is actually attacked, which is going to take some research. Obviously a password that is wholly composed of an exact match to a common password should have a "very weak" strength (or be disallowed outright). You should probably expand this by searching out "default" or commonly used word lists that the script kiddies will definitely try when cracking a password; there are readily available lists of tens of thousands of passwords (or more) that will definitely be tried against your users if your database is ever leaked. So include those in your "common password" check.

But crackers certainly won't limit themselves to a simple "exact match" search, so neither should your strength meter. Research common patterns used by crackers, such as combining two words from the password dictionary, or substituting letters for numbers or other "1337 speak" types of substitutions (e.g. "p@ssw0rd$ 4r3 4w3$0m3!1"). The howsecureismypassword.net site you mention fails here: it rates "passwordpassword" as taking "345 thousand years" to crack, which is absurdly wrong. I'd guess that will fall in less than a second. These are not the only rules to consider; many passwords follow very simple patterns like {capital letter}{6 lowercase letters}{number}! which are far less secure than 9 random characters (but still slightly better than a simple dictionary match). A variety of such common patterns will also be tried before resorting to brute force.

How you handle these transformations or word matches is up to you; but regardless they should be accounted for somehow. One thing to look into, is how have open-source tools handled this?

As an example, the quality estimation function for KeePass password manager reportedly handles this by calculating an entropy based on how many patterns are used to create the password, and the strength of that pattern, rather than using character counts, if patterns are detected. In older versions of the software, a naive count-based entropy was simply penalized based on recognized patterns. You could probably do well with either method. The trick will be to keep the patterns up-to-date with advances in cracking, but at the very least accounting for the really basic stuff could have drastic improvements in the strength of your users' passwords, especially if you explain in the interface what common patterns their password is going to be guessed by.

Ben
  • 3,846
  • 1
  • 9
  • 22