11

For a while now I have been interested in the passphrase concept as a potentially more secure replacement for classical passwords. My interest stemmed from a gut feeling that passphrases, given their nature, would have lower entropy than passwords.

Here is my reasoning. From my Google-based research, most of the blog posts a typical computer user might read hail passphrases as amazingly secure, and one site even claimed that existing password-cracking software is incapable of cracking passphrases because they are too long. Taken literally this is true, but a simple change to the dictionaries that feed your password cracker fixes that. Nevertheless, here is my logic:

Passwords:

Users are now somewhat used to the idea that passwords must be complicated, with symbol and number substitutions and a minimum length. I personally use passwords based on uncommon dictionary words with symbol substitutions and random numbers. Technically, a user who picks an 8-character password truly at random from the 94 printable ASCII characters (excluding space) has 94^8 = 6.1 x 10^15 possible passwords, or about 52.4 bits of entropy. That is a crazy huge number. Granted, most people do not have truly random passwords, but I will explain shortly why I think this cancels out with a similar human error in passphrases.

Passphrases:

Everyone I have seen talk about passphrases consistently gives examples of all-lowercase phrases with no symbols or numbers, such as the classic xkcd strip with "correct horse battery staple". According to the Oxford Dictionary, there are approximately 170,000 words in current use in English. Assuming users pick a three-word passphrase on average (any longer seems to exceed user laziness), that gives 170000^3 = 4.9 x 10^15 possible passphrases, a search space roughly 20% smaller than that of the 8-character randomly chosen password. Now, before you argue that no one uses randomly generated passwords, it will be equally rare for people to choose totally random words from the English dictionary. Instead, users will frequently use common sayings. Even if people avoid common sayings, I am no linguistics expert, but there are limited options that can logically follow a given word in English, which seriously cuts down the possible combinations used in passphrases.
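To make the comparison concrete, here is a minimal sketch of the arithmetic. It assumes exactly the figures used in this question (94 symbols, 8 characters, a 170,000-word dictionary, 3 words) and that every choice is uniformly random; the helper name `bits` is just for illustration.

```python
import math

# Figures taken from the question; both assume uniformly random choices.
password_space = 94 ** 8          # 8 characters from 94 printable symbols
passphrase_space = 170_000 ** 3   # 3 words from a 170,000-word dictionary

def bits(search_space: int) -> float:
    """Entropy in bits of one uniformly random pick from search_space options."""
    return math.log2(search_space)

print(f"password:   {password_space:.2e} options, {bits(password_space):.1f} bits")
print(f"passphrase: {passphrase_space:.2e} options, {bits(passphrase_space):.1f} bits")
print(f"passphrase/password ratio: {passphrase_space / password_space:.2f}")
```

Expressed in bits the two come out within a fraction of a bit of each other, which is why the 20% difference in raw counts matters much less than how randomly the words or characters are actually chosen.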

With that in mind, I argue that at the very least the failure to comply with randomness in passphrases will equal if not exceed the failure to comply with randomness in password creation as users are lulled into a false sense of safety with the concept of passphrases.

So I present my formal question: am I overlooking some key consideration here, or am I correct that passphrases are at best as secure as passwords, if not less secure, and are not some revolutionary new way to make your password/passphrase much more difficult to guess?

dFrancisco
  • 1
    perhaps. but "correct horse battery st@ple" blows both out of the water, especially realistically when considering how tools like john the ripper and hashcat operate. if you have to transliterate, your 179k words turn into, idk, quadrillions? also, longer inputs take longer to hash, so there's that... – dandavis Jan 19 '18 at 21:24
  • 3
    Random passphrases aren't necessarily supposed to be stronger than random passwords, they're supposed to be easier to remember. Comparing bad passphrases to bad passwords is difficult. – AndrolGenhald Jan 19 '18 at 21:27
  • 1
    @dandavis longer inputs taking longer to hash is only applicable to the first round, and if you're using a decent number of iterations the difference will only be measurable if your password is at least a megabyte. – AndrolGenhald Jan 19 '18 at 21:47
  • 3
    You should carefully read [Thomas Pornin's answer to a question about XKCD #936](https://security.stackexchange.com/a/6096/112339), because it will show you how you can't say that either passphrases or passwords intrinsically have more entropy than the other. – Luis Casillas Jan 20 '18 at 02:32
  • Those who recommended passphrases would also scoff at the idea of stopping at only 3 words. I never use fewer than 5. – Ben Jun 30 '18 at 18:44
  • I anti-recommend "correct horse battery st@ple". Things like that only add a few bits strength while adding more than a few yes or no questions you have to ask yourself if you're trying to remember a password you haven't used in a long time. Was the nth character capitalized? Did I deliberately misspell that word? Was this letter in this position in this word replaced with a 1337speak character? Add an extra word. It's only one additional unit of information you need to be able to remember. If that doesn't sound easier, then just use the "denser" full-character-space password format instead. – Future Security Jan 27 '20 at 19:55

2 Answers

10

Two general points.

First, the benefit of Passphrases is that they make it easier for users to generate entropy while still remembering their key. Generating entropy through randomized characters is hard - and the harder something is, the less people will do it.

The reason that XKCD comic is so cited isn't just the math - it's that people who haven't seen the comic in years can still tell you what that password is. That's the point of the comic: that those bits of entropy were easy for a human to remember.

Second, you might want to take a look at:

https://wpengine.com/unmasked/

The main highlight? The average password entropy is 21.6 bits. In other words, it's off by a factor of about 2 billion from your 94^8 number (94^8 / 2^21.6 ≈ 1.9 x 10^9). The reason? Users don't choose random ASCII characters. They don't choose random letters with randomized capitals. They don't choose randomized letters. They simply choose a word and decorate it.
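As a rough check of that gap, here is a small sketch. It assumes the 21.6 figure is in bits and reuses the question's 94^8 search space; the variable names are only illustrative.

```python
import math

random_space = 94 ** 8            # the question's 8 random printable characters
observed_bits = 21.6              # average entropy figure quoted above
observed_space = 2 ** observed_bits

# How many times larger the idealized search space is than the observed one.
factor = random_space / observed_space

print(f"random 8-char password: {math.log2(random_space):.1f} bits")
print(f"observed average:       {observed_bits} bits")
print(f"gap: about {factor:.2e}x")
```

The gap is roughly 31 bits, which is where the "billions" comes from.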

Basically, it's easier to get a lazy person to generate entropy through a 15-character passphrase than it is to get them to generate entropy through an 8-character password.

Kevin
  • 10^9 is billion, not trillion -- in US; in UK 10^9 is thousand million and 10^12 is billion. – dave_thompson_085 Jan 20 '18 at 08:12
  • @dave_thompson_085 the long scale hasn't been used officially since 1974: https://en.wikipedia.org/wiki/Billion – anthonyrisinger Jun 29 '18 at 19:07
  • @anthonyrisinger - it's my bad. What dave is saying is that (94^8) / (2^21.6) has 9 digits instead of 12 (which is what my post was implying.) Am fixing. – Kevin Jun 29 '18 at 20:19
  • "The average password entropy is 21.6." What's the average passphrase entropy, over a similar sample? – LarsH Aug 03 '18 at 15:40
2

While you get the numbers right, your logic seems to be based on some rather weird assumptions.

First off, random passwords. You arbitrarily compare 3 random words to 8 random characters and hand-wavingly point to "laziness". You don't really explain why someone would be too lazy to memorize 4 words, but would memorize 8 random characters, which are harder to remember and harder to type. (4 words would give you roughly the same entropy as 8 characters even with the standard 7776-word diceware dictionary: about 51.7 bits versus 52.4 bits.)
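Here is a quick sketch of that comparison, assuming uniformly random picks from a 7776-word list and from 94 printable characters. The function name `passphrase_bits` is made up for this example.

```python
import math

DICEWARE_WORDS = 7776  # 6**5: one word per five rolls of a die

def passphrase_bits(n_words: int, list_size: int = DICEWARE_WORDS) -> float:
    """Entropy in bits of n_words picked uniformly and independently."""
    return n_words * math.log2(list_size)

# Baseline: 8 characters drawn uniformly from 94 printable symbols.
password_bits = 8 * math.log2(94)

print(f"8 random characters: {password_bits:.1f} bits")
for n in range(3, 7):
    print(f"{n} diceware words:    {passphrase_bits(n):.1f} bits")
```

Each extra word adds about 12.9 bits, so going from 4 to 5 words costs one more thing to remember but clearly overtakes the 8-character password.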

On the other hand, if you compare user-picked passwords, both will be vulnerable to dictionary attacks. The person picking the phrase will pick some kind of sentence. The person picking the 8-character password will pick a single short word, maybe with some symbol added in for fun. A 3-word English sentence still has more entropy than a single English word.

At the end of the day, human-selected passwords are not really secure, no matter the method. That's why you should always generate random ones. With a password manager it makes no difference which type they are, but a random combination of words can be memorized; a random string, not so much.
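For illustration, generating a random word-based passphrase could look something like the sketch below. The tiny `WORDS` list and the `random_passphrase` helper are hypothetical stand-ins; a real setup would load a much larger list, such as the 7776-word diceware list mentioned earlier in this answer.

```python
import secrets

# Hypothetical stand-in word list, far too small for real use.
WORDS = ["correct", "horse", "battery", "staple",
         "orbit", "lantern", "maple", "quartz"]

def random_passphrase(words, count=4, sep=" "):
    """Join `count` words picked uniformly and independently via the OS CSPRNG."""
    return sep.join(secrets.choice(words) for _ in range(count))

print(random_passphrase(WORDS))
```

The point of using `secrets` rather than picking words yourself is that the selection stays uniform, so the entropy estimate (words times bits per word) actually holds.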

But if you assume that your users will pick a password by hand, longer ones will still be more secure.

averell