5

In the question Source code as password I suggested using a combination of entropy and, as @Lawtonfogle put it, 'meatspace difficulty'. As the xkcd comic suggests, correct horse battery staple is a better password than Tr0ub4dor&3, since it both has more entropy and is easier to remember.

Are there password strength metrics which take both entropy and ease of memorization into account?

Clarification: I'm interested in an "information theoretical" answer. In other words, assume that the password is converted to a form where the entropy is 1 bit/bit and then stored using a cryptographically secure perfect hash.

COL Wotohice
  • 1
    probably depends on what you mean by "strength". Ease of memorization isn't usually considered a password "strength". – hft Apr 10 '15 at 19:04
  • @hft Any password that you need to write down is a weak password in this context. – COL Wotohice Apr 10 '15 at 19:22
  • So, you are asking how one quantifies the possible need to write down a password versus the difficulty to crack a password. Sounds like apples to oranges to me, but maybe someone else can help.... – hft Apr 10 '15 at 21:09
  • as @hft said, memorizability is not a component of password strength, it is in fact orthogonal. As I mentioned [in my answer to the question on that xkcd](http://security.stackexchange.com/questions/6095/xkcd-936-short-complex-password-or-long-dictionary-passphrase/6116#6116), a **good** password has two INDEPENDENT components: strength and rememberability. – AviD Apr 13 '15 at 20:39
  • It's also worth noting that only strength can have an "information theoretical" answer, for rememberability you need to go into psychology, neuroscience, etc... – AviD Apr 13 '15 at 20:40

3 Answers

2

If you use Diceware, you can generate a memorable, secure passphrase.

For the technically inclined, each word in your Diceware passphrase yields about 12.9 bits of entropy (log2 of the 7,776 words in the list), which is the way passphrase security is measured.

If you want a passphrase to be uncrackable, ever, using today's technology (and the technology of the foreseeable future), you need 128 bits of entropy in your passphrase.

Therefore you need 10 dicewords to achieve this level of entropy:

128/12.9 = ~9.9
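The arithmetic above can be reproduced with a short sketch. It assumes the standard Diceware setup of five throws of a six-sided die per word; the function names are illustrative, not part of any Diceware tool.

```python
import math

# Assumption: each word is chosen by five throws of a six-sided die,
# so the word list has 6**5 = 7776 equally likely entries.
WORDLIST_SIZE = 6 ** 5


def bits_per_word(wordlist_size: int = WORDLIST_SIZE) -> float:
    """Entropy contributed by one uniformly chosen word, in bits."""
    return math.log2(wordlist_size)


def words_needed(target_bits: float, wordlist_size: int = WORDLIST_SIZE) -> int:
    """Smallest whole number of words that reaches the target entropy."""
    return math.ceil(target_bits / bits_per_word(wordlist_size))


print(round(bits_per_word(), 1))  # 12.9
print(words_needed(128))          # 10
```

Rounding up matters here: nine words fall just short of 128 bits, so the tenth word is what crosses the threshold.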

You can use a list memorising technique to connect the words in a memorable fashion. You could also use this technique to link it to the purpose of the passphrase. This is the type of technique that artists such as Derren Brown use in their stage shows. For example, the "tricks" where they remember whole telephone directories.

If the passphrase is stored hashed using a key stretching algorithm, you could get away with less entropy. This is because for every password an attacker tries, they have to run the hashing algorithm a number of times, known as the iteration count. The iteration count slows any brute-force attack on the passphrase by that factor. If you want 10 bits less, then this is the equivalent of 1024 hash iterations (2^10) using, say, bcrypt. For 20 bits less, this is 1,048,576 iterations (2^20). So if you went for the latter, you could make your password slightly easier to remember by using 9 dicewords. However, if you're using a memory linking technique, an extra word won't hinder memorability of the passphrase; the shorter passphrase would just be slightly easier to type in each time it is used.
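As a rough sketch of this trade-off, the bits "bought" by key stretching are log2 of the iteration count, and they can be subtracted from the entropy target before counting words. The helper names below are hypothetical, and the model assumes the attacker pays the full iteration cost for every guess.

```python
import math

DICEWARE_BITS_PER_WORD = math.log2(6 ** 5)  # about 12.9 bits per word


def stretching_bits(iterations: int) -> float:
    """Extra work per guess, expressed in bits (log2 of the iteration count)."""
    return math.log2(iterations)


def words_needed_with_stretching(target_bits: float, iterations: int) -> int:
    """Diceware words needed once key stretching covers part of the target."""
    remaining_bits = target_bits - stretching_bits(iterations)
    return math.ceil(remaining_bits / DICEWARE_BITS_PER_WORD)


print(words_needed_with_stretching(128, 2 ** 10))  # 10
print(words_needed_with_stretching(128, 2 ** 20))  # 9
```

With 2^10 iterations the target drops to 118 bits, which still needs ten words; only at 2^20 iterations does the ninth word become sufficient.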

Are there password strength metrics which take both entropy and ease of memorization into account?

If you're measuring password strength in an objective way, the only metric is the amount of entropy. Entropy does take into account the passphrase generation method. If the average English word length is five letters, a password generated using Diceware would be 50 characters long on average if we're going for 10 words (for simplicity, we're discounting the character symbols used in Diceware). A completely randomly generated 50 character lower case password would normally have 235 bits of entropy. As we can see, though, because the method of generation is to use whole words, each selected by five throws of a six sided die, we only have about 129 bits.
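The two figures in this comparison can be checked directly. This is a minimal sketch under the same simplifying assumptions as the paragraph above (26 lower case letters, 7776 equally likely Diceware words); the variable names are illustrative.

```python
import math

# 50 characters, each drawn uniformly from 26 lower case letters.
random_lowercase_bits = 50 * math.log2(26)

# 10 words, each drawn uniformly from a 6**5 = 7776 word list.
diceware_bits = 10 * math.log2(6 ** 5)

print(round(random_lowercase_bits))  # 235
print(round(diceware_bits))          # 129
```

Both strings are about the same length on the page, but the entropy depends on how many strings the generation method could have produced, not on the number of characters.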

With the Diceware approach, the aim is to remove the memorisation portion from the equation, so we're left with the mathematical calculation of entropy only. We make memorisation of the passphrase a solved problem so we're just left with an objective "strength" value, based on the time required to brute force it.

You don't have to go for the full 128 bits. Just be aware that a weaker passphrase subjected to a targeted offline attack will be cracked eventually. A 128 bit passphrase is more important where the passphrase protects something that can be accessed directly, such as in password-based encryption, where cracking the password will enable an attacker to read the encrypted data. It is less important where the passphrase is only used as an authentication mechanism, for example where it protects an online account. It is advisable to regularly change your online passwords if they are of lesser strength, because if an attacker has managed to retrieve hashed passwords, they will be cracked given enough time (although I disagree with the 90 day period usually mandated).

Edit:

So I would recommend a (separate) memorable passphrase with sufficient entropy for situations where you need to be able to type the passphrase. For example, as the master password for your password manager, for Operating System logins, or to authenticate you for full disk encryption. For online accounts, I would recommend the use of a password manager to generate a completely random 128 bit password that you can't remember.

SilverlightFox
  • You've come close enough to what I'm thinking of with your eloquent explanation. I guess the term "strength" is a misnomer, as what I'm looking for is a combination of strength against brute force and ease of use. In your last paragraph, you point out an important use. For everyone who uses a password vault a la KeePass, this is important. You want a password strong enough to resist attack, yet easy to remember. – COL Wotohice Apr 12 '15 at 13:52
1

Ease of memorization and the criteria for a cryptographically secure password contradict each other by their respective definitions. In fact, we can show that the information-theoretic measure of password strength is equivalent to entropy.

What is Entropy Again?

Remember that Shannon entropy is the information content of a discrete random variable X, which we will take to be the password string. If X is drawn uniformly at random from 2^n possible values, then H(X) = log2(2^n) = n bits of entropy. According to this definition, a brute-force strategy that makes no a priori assumptions about the nature of X will require at most 2^n rounds before exhausting all distinct values in the search space.

The exponential time complexity makes the problem intractable for arbitrarily large values of n and alphabets. Any other strategy must therefore hope to find exploitable knowledge about the factors that influence X in order to reduce the size of the problem. Intuitively, we know this to be the case, since without knowledge of X any guess we make would be just as likely to be correct as any other.
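A small sketch makes the definition concrete: for a uniform choice among 2^n values the Shannon entropy is exactly n bits, and any bias toward particular values lowers it. The `shannon_entropy` helper and the example distributions are illustrative assumptions, not taken from the answer.

```python
import math


def shannon_entropy(probabilities):
    """H(X) = -sum(p * log2(p)) over outcomes with non-zero probability."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Uniform choice among 2**8 = 256 values: every guess is equally likely.
uniform = [1 / 256] * 256

# Skewed choice: one value is picked half the time, the rest share the
# remaining probability evenly.
skewed = [0.5] + [0.5 / 255] * 255

print(shannon_entropy(uniform))                            # 8.0
print(shannon_entropy(skewed) < shannon_entropy(uniform))  # True
```

The skewed case is the situation an attacker hopes for: the search space has the same size, but guessing the likely value first pays off half the time.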

Password Strength Depends on Conditional Entropy

Analysis of problem hardness can be expressed in the following way. Let the conditional entropy H(X|K) quantify the entropy of X given a known value of K.

As an example, in the context of a pseudorandom number generator we might define K in terms of the statistical correlations and inferences that condition an underlying cryptographic protocol. The notion of conditional entropy can be further generalized, through the chain rule for conditional entropy, so that K is a function of one or more variables.

Password Strength = Entropy = Zero Mutual Information

Now what happens if the variable K is random, that is, independent of X? Then knowing K tells the attacker nothing about X, so H(X|K) = H(X) (the mutual information between X and K is zero). X therefore remains random and its entropy is maximized. The final conclusion: a password X with maximal entropy H(X) is information-theoretically as strong as it can be. With no exploitable knowledge K, brute forcing the problem has exponential time complexity that is asymptotically bounded by the entropy of X.
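The effect of side knowledge K on the entropy of X can be checked numerically on a toy example. The sketch below computes H(X|K) = H(X,K) - H(K) from a joint distribution; the two distributions (a 2-bit password with either an unrelated K or a K that leaks one bit) are made-up illustrations.

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy in bits of a {outcome: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def conditional_entropy(joint):
    """H(X|K) = H(X,K) - H(K) for a joint mapping {(x, k): probability}."""
    marginal_k = defaultdict(float)
    for (_x, k), p in joint.items():
        marginal_k[k] += p
    return entropy(joint) - entropy(marginal_k)


# X is a uniform 2-bit password (4 values), so H(X) = 2 bits.
# Case 1: K is an unrelated coin flip, independent of X.
independent = {(x, k): 1 / 8 for x in range(4) for k in range(2)}

# Case 2: K leaks the high bit of X.
leaky = {(x, x >> 1): 1 / 4 for x in range(4)}

print(conditional_entropy(independent))  # 2.0 -> nothing learned
print(conditional_entropy(leaky))        # 1.0 -> one bit of X is gone
```

In the independent case the attacker still faces the full 2 bits; in the leaky case the knowledge factor halves the search space.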

Humans are Dumb and Try to Reduce Entropy Through Associative/Pattern/Mnemonic Techniques

So, hopefully it should be clear that entropy is essentially the only metric of password hardness. Ease of memorization is a concession made to address a user's inability or reluctance to recall random data.

The only information-theoretically secure encryption method that might reconcile these two definitions is one-time encryption that uses (say) random key characters from pages of a book. The user could potentially associate this in such a way that the substitution remains random.

bw_0x7c6
1

I gave this answer to another question, but I think it addresses your question as well (in a roundabout way).

What is "password strength"? In most people's minds, it's the difficulty factor malicious actors would have when they are trying to guess your password. Password strength meters generally answer a slightly different question: how many iterations would it take someone to guess your password if they were guessing every possible character combination in order starting with 'A' through 'z', then 'AA' through 'zz' and so on, a technique known as brute forcing.

What's the difference? Those two things aren't the same because you don't know how fast the bad guys are guessing nor what technique they're using to guess. "Your password would take 1 year to guess" is utterly meaningless without context.

How fast they're guessing is usually dependent on the method the server used to store passwords because the most common time they're guessing passwords is when they've hacked a site and downloaded the database of user info. If the hacked server was storing passwords in plaintext, then bad guys can figure it out regardless of your password "strength". Commonly, servers won't store passwords directly and will instead store the output of one way functions like MD5 and SHA. At that point the bad guys can't take the output and figure out the input; they have to guess the input and run it through the function to see if the outputs match.

Modern password cracking software can leverage GPUs and make billions of guesses every second, and if you see a site that calculates "years to crack your password" it's generally assuming this scenario with bad guys who are brute forcing their guessing. Sites with better security, like updated versions of WordPress, will use these functions on the password and then use them again on the output and again and again, so that it may take a few milliseconds extra for the server to log you in but the bad guys will only be able to guess thousands of passwords a second in these sorts of attacks.
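The difference between hashing once and hashing "again and again" can be sketched with Python's standard library. PBKDF2 is used here only as a readily available stand-in for repeated hashing; it is not a claim about what WordPress or any particular site uses, and the salt and iteration count are illustrative assumptions.

```python
import hashlib


def single_hash(password: bytes, salt: bytes) -> bytes:
    """One pass of a fast hash: cheap for the server, cheap for an attacker."""
    return hashlib.sha256(salt + password).digest()


def stretched_hash(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Many chained passes via PBKDF2-HMAC-SHA256: every guess costs
    roughly `iterations` times more than a single hash."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)


salt = b"example-salt"  # illustrative only; real salts are random per user
fast = single_hash(b"correct horse", salt)
slow = stretched_hash(b"correct horse", salt)

print(len(fast), len(slow))  # 32 32
```

Both functions return a 32-byte digest, so the stored value looks the same either way; what changes is the cost of each guess, which is why the iteration count pushes attackers from billions of guesses per second toward thousands.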

And what if they're not brute forcing? A password of "passwordpasswordpasswordpasswordpassword" would take a very long time if bad guys were guessing every single letter combination, but if they were using a modern password cracking tool they'd be able to guess concatenations of words from their dictionary and would probably be able to guess that one a lot faster than "ogNeTJeB6w5YhRsy972c". I haven't seen a password strength meter that actually uses a good dictionary of leaked passwords, yet that's exactly what your password will be up against if a site you've registered with gets compromised.
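To put rough numbers on this, the size of the guess space can be compared under each attack model. The sketch below is a back-of-the-envelope estimate: the 10,000-word attacker dictionary is an assumed figure for illustration, and the function names are hypothetical.

```python
import math


def brute_force_bits(length: int, alphabet_size: int) -> float:
    """log2 of the number of strings of this length over the alphabet."""
    return length * math.log2(alphabet_size)


def dictionary_bits(word_count: int, dictionary_size: int) -> float:
    """log2 of the number of concatenations of word_count dictionary words."""
    return word_count * math.log2(dictionary_size)


repeated = "password" * 5  # 40 lower case letters

# Letter-by-letter brute force over 26 lower case letters.
print(round(brute_force_bits(len(repeated), 26), 1))  # 188.0

# Assumption: the attacker concatenates up to 5 words from a 10,000-word list.
print(round(dictionary_bits(5, 10_000), 1))           # 66.4

# 20 random characters from upper case, lower case, and digits (62 symbols).
print(round(brute_force_bits(20, 62), 1))             # 119.1
```

Under the brute-force model the repeated-word password looks far stronger than the 20 random characters, but under the dictionary model it is far weaker, which is the gap a naive strength meter misses.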

None of that background directly answers your question, but I think it's necessary to understand in order to answer whether or not a password with those 4 characteristics (upper case, lower case, number, special character) is "strong". The answer is somewhere between "it depends" and "it doesn't matter". If a vulnerability in your site has allowed a malicious actor to download your list of hashed passwords, whether there's a special character in your admin password isn't actually all that important. But if your 50 character password with high ASCII characters is the same password you use on every site, and you've registered on a site that stores passwords in plaintext, and that site gets hacked, your password strength (by our first definition) has just become very weak. And if you're asking this question, my guess is that you are reusing your password and you're worried that this password you've used all over the internet might not be as great as other password strength meters have led you to believe. That habit is really what's making your password "weak", not whether it contains capital letters.

I gave a TEDx talk on this last year, which I'd recommend watching if you want the 14 minute version of this post. If you want "strong" passwords, use password management software like KeePass, LastPass, or 1Password to have a different password on every site, and lock your database with a Diceware password.


Additional information: Mark Burnett and Dave Kleiman's Perfect Passwords: Selection, Protection, Authentication is mostly devoted to listing tricks you can use to memorize passwords. They are numerous and clever, and you can look through some parts of the book on Amazon to get an idea. As it is a 10 year old book, almost all of the passwords it recommends are insufficient today to protect your credentials if they're hashed with MD5 and then stolen, but combining techniques could make some good, easy to memorize passwords.

I also remember reading an academic book (my google-fu has failed at finding it) where the author suggested a different form of password: presenting users with a large body of text, like the Declaration of Independence or an ASCII art picture, and their "password" would be how they modified that block of text. Deleting a word or adding a number in a specific location would be easy to remember but relatively difficult for an attacker to guess.

Aron Foster
  • Aron, I'm referring to entropy, the amount of information encoded per alphabetic character, as the default measure of strength. Password rules reduce the entropy of a string since they limit the amount of information in the password, e.g. 49823025702 isn't a valid password in my example. The entropy of a password is directly related to the compressibility of the password. Assume that the password is converted to a form where the entropy is 1 bit/bit and then stored using a cryptographically secure perfect hash: how strong is the password? This is a variation on brute force. – COL Wotohice Apr 10 '15 at 19:35
  • 1
    Password entropy only comes into play when 1) a malicious actor is guessing your password, and 2) their guessing method limits them to only passwords that fit your scheme. Today, malicious actors tend to brute force shorter passwords, use dictionaries of leaked passwords, and use common modifications on those dictionaries, but any password memorization strategy that becomes a popular way of reducing entropy will get added to that toolkit. If there are n possible passwords using a scheme, malicious actors need at most n guesses once they know your scheme, without exception. – Aron Foster Apr 10 '15 at 19:44
  • I agree that there are much better ways than brute forcing passwords. I don't actually think any mechanism that uses only passwords is secure where the data protected by the password has significant value. There are too many side channel attacks on password systems. For example, keyloggers installed via malware are a relatively cheap method of bypassing all passwords. This is a theoretical question rather than a practical one. – COL Wotohice Apr 10 '15 at 19:59
  • If you want to focus only on the theoretical, then I think you're trying to compare two impossible to quantify concepts. Both the "strength" of passwords and "ease of memorization" are almost completely dependent on the situation and the person. However, there are some novel authentication methods that have been suggested in academia; it's impossible to rank them and they're not workable in practice, but I'll add a few links to my answer. – Aron Foster Apr 10 '15 at 20:21