14

If you've got a website aimed at people that are not necessarily technically proficient, letting new users choose their own password isn't very secure because of the simple fact that a lot of people don't choose secure passwords. Some methods of dealing with this problem are:

  • Most common: have a 'password strength checker' or demand that the password contains a certain number and different kinds of characters. These often aren't very good and result in passwords that are not actually very secure and/or difficult to remember (of course this doesn't mean you can make a very good password checker, although that is pretty difficult).
  • Generate a random sequence of characters: very impractical and insecure because people tend to write these down because they can't remember them.
  • Two phase authentication through SMS or e-mail: not very convenient for the user.

I've thought of a different approach that is inspired by this XKCD comic. It works like this:

When a new user registers, they are presented with 20 randomly picked dictionary words. They are asked to make a secret sentence (let's call it a passphrase) that contains at least five of these words. They are encouraged to create a nonsensical sentence that is hard to guess but easy for them to remember. They have an option to refresh the 20 random words multiple times if they can't think of anything.

This passphrase will simply function as their passwords. You only have to check they use at least five of the provided words and take the usual security measures you would for regular passwords.

I personally think this is a pretty user-friendly way of letting people choose strong passwords; and the problems I mentioned for the other methods appear to be avoided. Of course I could be completely wrong and might be overseeing something important. So I'd really like to hear what the smart people here think about this system.

techraf
  • 9,141
  • 11
  • 44
  • 62
AardvarkSoup
  • 577
  • 2
  • 10
  • A number of years ago I wrote a password generator that basically works on the concept of generating nonsensical but somewhat meaningful phrases mixed with other randomness and word alterations to prevent the limits imposed by a fixed wordlist: http://xato.net/windows-security/pafwert-smarter-passwords/. It sounds similar in concept to what you are talking about. – Mark Burnett Jun 09 '12 at 22:59
  • @AardvarkSoup - So you want want to replace non-dictionary passwords with dictionary passphrases. It would be a simple matter of trying every single 5 dictionary combination would not exactly take that long. – Ramhound Jun 15 '12 at 10:58
  • I wonder if you guys have tried this [XKCD based generator](https://xkpasswd.net/s/) – Jedi Jul 01 '16 at 23:02

3 Answers3

12

That sounds great to me.

The entire point of generating passwords is to pick one that is extremely hard for an attacker to guess, and easy for the authorized users to remember and type in. Ideally a password generating process should generate such extremely hard-to-guess passwords even if the attacker knows exactly which password generating process you used ( Kerckhoffs's principle ). If the "perfect knowledge" attacker finds it hard to guess the password, then other attackers will find it even harder.

I'm going to assume that you use a /dev/random that is not broken (it's connected to a hardware random number generator or a cryptographically secure pseudorandom number generator).

Many people seem to feel that a so-called "10-random-character password" formed from a randomly selected uppercase letter, randomly selected lowercase letters, followed by a randomly-chosen digit, a total of 10 characters long, is more than adequate for many purposes. That gives ln2( 26^9 * 10 ) = 45.6 bits of password strength.

If you completely randomly pick words from any list of 7776 words ( 6^5 words ), such as the the standard Diceware list or the easier-to-type-on-a-phone Dialdice list, and the attacker knows exactly what wordlist you use, and your user chooses the first 4 words you select, in the order you select them, the user gets ln2( (6^5)^4 ) = 51.7 bits of password strength.

increasing randomness?

If the user hits the refresh key a random number of times, and then rolls the dice to randomly selects 4 words from among the ones you present, and then rolls the dice to shuffle those words into a random order to generate the 4-word Diceware passphrase, then (again assuming /dev/random is not broken) I can a priori calculate that it has no effect -- the user still has 51.7 bits of password strength.

If the user thinks that's not enough password strength, then I think your system should allow the user to add additional letters, symbols, words, etc.

effects of non-random selection

Your user is unlikely to make a completely random selection, but that's OK. In the worst case -- your user puts things in alphabetical order or chooses some other order, and the attacker somehow knows exactly what order that is, you've lost -ln2( 4! ) = -4.6 bits of password strength from a 4-word Dialdice passphrase.

If we assume the user always rejects certain words, and keeps refreshing until he can make a passphrase without those words, and those words make up about 1/6 of the dictionary, and somehow the attacker knows which words those are, then for a 4-word Dialdice passphrase we've lost about -1 total bit of password strength. (The worst case is a loss of N bits of password strength for every 2^N times the user pushes the refresh button -- but that doesn't seem to be realistic to me).

The combination of these two effects results in your users having 46.1 bits of password strength with a 4-word Dialdice passphrase -- stronger than the 10-random-character password, and much easier to memorize.

If you later think that's not enough password strength, it's pretty straightforward to increase the minimum number of words in the passphrase, or switch to a wordlist with more words in it, or both. (You could use the approach in "Migrate old md5 passwords to bcrypt passwords" to force everyone with old, low-strength passwords to upgrade to new, higher-strength passwords).

That sounds great to me.

David Cary
  • 2,720
  • 4
  • 19
  • 20
  • 1
    +1 Good analysis. Its worth noting that ~46 bits of entropy with a simple hash (e.g., sha256) is still in the range brute forceable by a [GPU in a day with an electricity cost of ~$0.50](http://security.stackexchange.com/questions/12994/whats-the-practical-limit-for-rainbow-table-based-bruteforce/13016#13016). So you still should use a per-user salt, and a little bit of [key strengthening](http://en.wikipedia.org/wiki/Key_stretching) depending on the maximum value of the cracked resource (e.g., 100 rounds makes it cost $50 in electricity; a million rounds would cost $500 000) – dr jimbob Jun 14 '12 at 21:31
  • @drjimbob: Yes, per-user salt and key stretching is the right thing to do, for exactly the reasons you mention. – David Cary Jun 17 '12 at 06:22
  • I know this is old, but I thought I'd point out an important issue: if you let the user rearrange the words and get new ones until they like it, the result will have much less entropy, since an attacker could try dictionary attacks using only commonly used words and avoid hard-to-type or old ones (most users will choose "dog" over "hypothesis" – Hankrecords Apr 27 '17 at 14:39
  • @Hankrecords: Yes, as the "effects of non-random selection" section of this answer already points out, letting the user get new words *may* reduce entropy, but even if it does reduce entropy the resulting easy-to-type 4-word phrase is still "stronger than the 10-random-character password". – David Cary May 16 '17 at 14:26
  • Sorry, I only noticed it now. Apparently I thought I had read all the post while in fact I had missed out the last section ^^' – Hankrecords May 16 '17 at 15:07
6

The idea of the xkcd comic is to generate a passphrase based on 4 words from a small dictionary of about 2000 to 3000 extremely common words.

If you let users pick words from a refreshable list, you significantly reduce entropy. The same is true for letting them change the order. Have a look at XKCD #936: Short complex password, or long dictionary passphrase?

Therefore you need to provide a much larger dictionary with less common words to compensate.

On the practical side: Users don't like typing in long passwords, especially on devices without a proper keyboard.

Hendrik Brummermann
  • 27,118
  • 6
  • 79
  • 121
  • You have a good point regarding the refreshing list and reordering; it does reduce entropy. I didn't think of it that way but giving the users more choices is actually a bad thing because they will have common preferences regarding word choice. But, looking at the sheer size of the Oxford dictionary I doubt that dictionary size or common words will be the problem. :) – pyrocumulus Jun 08 '12 at 22:07
  • It's not a given that letting users pick words from a refreshable list reduces entropy. If the words are chosen by a predictable process, refreshing and reordering might be the only entropy in the system. System generated passwords have routinely been cracked by cracking the generator. – Major Major Jun 09 '12 at 00:03
  • 1
    @MajorMajor, well, we assume a working random number generator. This is not about a poorly designed algorithm for password generation, but about a specified one: Pick four words in order from a list of 2000-3000 common words. For this algorithm, those actions obviously reduce entropy. – Hendrik Brummermann Jun 09 '12 at 07:39
  • @Hendrik, how does letting the user change the order of the words and expand the presented list of words form 20 to 40 or 60 _reduce_ entropy? – Major Major Jun 09 '12 at 15:39
  • @Hendrik, BTW ***assuming*** a working random number generator has been the downfall of many security schemes, especially password and key generaton schemes. See http://www.developer.com/tech/article.php/616221/How-We-Learned-to-Cheat-at-Online-Poker-A-Study-in-Software-Security.htm and http://www.suspekt.org/2008/08/17/mt_srand-and-not-so-random-numbers/ and http://en.wikipedia.org/wiki/Random_number_generator_attack#Prominent_examples_of_random_number_generator_security_issues – Major Major Jun 09 '12 at 16:27
  • @MajorMajor, well, when looking at an algorithm, it is reasonable to assume, that it is implemented correctly, especially those parts for which free and audited off-the-shelf libraries are available. Including implementation bugs created by people with little experience, will eventually end up at trojan horses infected computers and finally at [xkcd 538](http://xkcd.com/538/). But lets focus on the algorithm instead. – Hendrik Brummermann Jun 09 '12 at 18:03
  • @MajorMajor, the entropy is reduced although the number of words to pick from is increased, because all those words are from the 3000 word dictionary. Let me give an example that is more intuitive: Let a person do 1 role with an ideal dice. The result is a perfect randomness with a chance of 1/6 for every number. Now let the person do 2 or 3 roles and let them pick a number of their choice. The size of the dictionary is still 6. Showing 3 numbers does not add entropy. But the opposite is true: Humans have preferences that are less than perfect random, for example picking the highest number. – Hendrik Brummermann Jun 09 '12 at 18:12
  • @Hendrik, your entropy example in not a good one. The users are being asked to pick five words and to reorder them. You cannot _a priori_ determine if users will add or reduce entropy in this process. Clearly if the RNG is compromised and the users are not allowed any choices, the attacker can recover 100% of the passwords via the cracked RNG alone, so I argue that user choice adds entropy. As to the security of the RNG, I think when someone asks "is this process secure" and its security depends on the security of the RNG, it is dangerous to omit a warning about the prevalence of bad RNGs. – Major Major Jun 09 '12 at 18:28
  • @MajorMajor, if the RNG is broken, the user might add entropy. On the assumption that it is okay, it will produce the highest possible entropy, so whatever the user does, it cannot be better. I am not aware of any broken number generator that was audited based on the requirements for secure random number generators. I just went through the list in your links. All of them are about home brewn RNG or modifications to well known RNG. Stick a note to the algorithm saying: Use a well known random number generator that was audited for security. – Hendrik Brummermann Jun 09 '12 at 18:41
0

While I like the idea of assisting users with a strong password (especially inspired by the XKCD comic), I see a problem. Your dictionary. How big of a problem this is, depends on what your site is for obviously.

First of all your dictionary has to be huge, and needs to contain uncommon words. Still you should assume that whatever dictionary you have, someone else will too. I'm not a cryptographer, but you really need a lot of words because those five picks are all that makes the password. The other part of the problem is quite obvious; you have to secure it really well. If some evil genius gets hold of your exact dictionary cracking the hashes or trying to gain access to the account, will be a lot more easy.

That's where the importance of your site comes in. Since you provide the password for the user, it is unlikely (but not impossible) that the user will choose to use the same password somewhere else. So should someone obtain your DB, brute forcing the hashes won't provide them with something really useful.

Apart from that making sure that you block accounts after X failed login attempts (and making login attempts really slow) will help make sure that attackers have trouble gaining access to accounts using a stolen dictionary.

Just some thought on this interesting subject, I hope you find them useful.

pyrocumulus
  • 173
  • 1
  • 6
  • Thanks, those are some good points. However, I don't think the dictionary is such a problem: the Oxford dictionary contains about 170000 words, which would translate to a word list of about 1MB (you can reduce its size by e.g. using a smaller dictionary of more common words and not including short words). That's not very difficult to do lookups in at the server-side. If someone where to use only five words (without anything added to make it into a valid sentence) you'd still have a massive amount of possible combinations. – AardvarkSoup Jun 08 '12 at 21:55
  • 1
    Additionally, the possibility of two people choosing the same sentence shouldn't be any problem if you salt the hashes with a unique random string for each user, so that each user has a unique hash. This is already good practice for regular passwords, so I didn't mention it. – AardvarkSoup Jun 08 '12 at 21:58
  • That is true I guess. English is not my primary language, I forgot how vast the set of English words actually is. I would opt to use words no shorter than four characters (three bare minimum). The total character count is important. Make sure it exceeds 12 at the most minimum case possible. Preferably higher. – pyrocumulus Jun 08 '12 at 22:03