How much is the entropy of a randomly generated password reduced if I regenerate until I get a password I like?

Question

Many of the answers and comments on the well-known XKCD #936: Short complex password, or long dictionary passphrase? question stressed the importance of generating the password randomly rather than just making one up. In particular, the current second highest voted answer on that question says:

Random choices are random and uniform. This is hard to achieve with human users. You must convince them to use a device for good randomness (a coin, not a brain), and to accept the result. This is the gist of my original answer (reproduced below). If the users alter the choices, if only by generating another password if the one they got "does not please them", then they depart from random uniformity, and the entropy can only be lowered (maximum entropy is achieved with uniform randomness; you cannot get better, but you can get much worse).

This got me wondering: what if I don't accept the result and instead generate a new random password? How big of an impact would that have on the entropy of the final generated password?

Presumably the more times I allow myself to reject a password, the more the entropy of the final password could potentially be reduced, so let's say for example that I generate 8 passwords and pick the one I like best. How much could that potentially reduce the effective entropy of the password I select? (Maybe log_2(8) = 3 bits as a worst-case scenario?)

How much entropy does the complete set of passwords you will like have? — Phil, Jul 01 '15 at 18:12
@Phil Good point. I guess that's kind of a hard thing to nail down. That plus variability in tastes between individuals would make it pretty difficult to give a definite answer to this question. That's why I suggested "generate 8 passwords and pick the one you like best" as an example to try and make things a bit easier. — Ajedi32, Jul 01 '15 at 18:24
As is explained at http://security.stackexchange.com/q/6949/971, if you look at N candidate passphrases and keep the one you like best, the entropy has been diminished by at most lg N bits. Since in practice N will be small, this causes a very small harm to entropy -- so yes, if you don't like the result, you can instead generate a new password, and that will be safe, as long as you don't do that a crazy number of times. — D.W., Jul 01 '15 at 20:12
@Ajedi32, no, not a mis-click. Have you read the question and answers there carefully? Don't stop at just the title -- make sure to read the question and answers. Your "Huh?" comment arrived only 30 seconds after I provided the link; you'll probably need to spend more time absorbing the material there. — D.W., Jul 01 '15 at 20:13
@D.W. Yeah, I see now that point 1 in the linked question does cover this one. Just going off the title though it seems that question isn't even remotely related to this one, hence the "huh?" — Ajedi32, Jul 01 '15 at 20:25

score 5 · Answer 1 · answered Jul 01 '15 at 16:40

Your rejection/acceptance of a truly randomly generated password, and how many times you reject until you accept, will not affect the entropy as the entropy is determined in isolation from generation to generation (just like each toss of a coin doesn't affect the entropy of the next toss). All strings have the same entropy, it's only the context of evaluation that determines the 'subjective entropy' (which isn't relevant).

But, the strength of a password isn't directly linked to its randomness - as you suggest - if you ask for a random sequence of 11 characters from letters and numbers then you could get password123 as your first random sequence. It would be completely random, just not very strong. It would have the same entropy as the next generation.

Similarly, although the entropy isn't less or more, if you keep generating random strings until you 'like' one you are more likely to select a weaker random string of characters (well, if your 'liking' is based on typical human considerations such as memorableness or lyrical quality or pronounceability etc.)

So entropy and human liking aren't connected, but liking and password strength most certainly are.

David Scholefield

If you perform rejection until you "like" the password, then you are actually drawing from another distribution, with different (typically smaller) entropy. For instance, if you draw from 8-char passwords (64 bits of entropy) until the first char is 'a', then you are in fact drawing from a distribution with only 56 bits of entropy. – Alexandre C. Jul 01 '15 at 18:49
Does this answer imply that the entropy of the password generated by the following process is nonzero: 1. Generate a random 11-character string. 2. If the result is "password123", exit. Otherwise, go to 1. – JiK Jul 01 '15 at 18:50
@JiK: You are drawing from a zero-entropy distribution in this case. – Alexandre C. Jul 01 '15 at 18:50
@JiK I actually think that's a good point. David's assertion that "your rejection/acceptance of a truly randomly generated password, and how many times you reject until you accept, will not affect the entropy" is contingent upon the number of rejections being independent of the passwords chosen, which isn't the case for a human. E.g. If my password generation algorithm is that I always reject the first 5 generated passwords and accept the 6th, then obviously that doesn't reduce entropy. If, however, I reject passwords until I get one that "I like" that's a different matter entirely. – Ajedi32 Jul 01 '15 at 19:02
As the other comments are hinting... this answer is simply factually incorrect. Rejecting passwords you don't like *does* affect the entropy of the resulting password. Fortunately, the size of the effect is relatively small -- but the statements in this answer are not correct. – D.W. Jul 01 '15 at 20:15
I stand by my answer. If a person rejects 3 passwords and then chooses the 4th, that is the same result as if the password they eventually chose had randomly come up first and they had chosen it then. The password 'hacker' (effectively the entropy) has no knowledge of whether the password chosen was the result of prior rejections or not. – David Scholefield Jul 02 '15 at 06:45

score 3 · Answer 2 · answered Jul 01 '15 at 17:21

In order for user rejection of specific random word selections to have a meaningful impact on passphrase security, those rejections would need to be predictable. Otherwise there's no way for an attacker to eliminate certain words or word pairings and save time in their attempts to guess the passphrase, which is the primary way this practice would benefit them. As David Scholefield points out, one particular random word selection doesn't have any less entropy than another.

If we had data that 90% of people will not choose a randomly generated passphrase that contains the word "asymptote" (either because it is meaningless to them or they worry about misspelling it), then an attacker could either eliminate all potential passphrases containing that word or deprioritize them so they are tried last in the guessing process. Excluding that one word eliminates hundreds of millions or billions of guesses (depending on the size of word dictionary being used to generate the passphrases, and passphrase length).
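To put rough numbers on the "asymptote" example, here is a minimal sketch that counts how many passphrases contain one specific word, and how many bits of search an attacker would save by skipping all of them. The dictionary sizes (2048 and 7776 words) and passphrase lengths are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def passphrases_containing(dictionary_size: int, num_words: int) -> int:
    """Count passphrases of num_words words that contain one specific word at least once."""
    total = dictionary_size ** num_words
    without_word = (dictionary_size - 1) ** num_words
    return total - without_word

def bits_saved_by_excluding(dictionary_size: int, num_words: int) -> float:
    """Bits of search space saved by skipping every passphrase that contains that word."""
    total = dictionary_size ** num_words
    without_word = (dictionary_size - 1) ** num_words
    return math.log2(total / without_word)

# Assumed example parameters: (dictionary size, words per passphrase).
for size, length in [(2048, 4), (7776, 5)]:
    skipped = passphrases_containing(size, length)
    saved = bits_saved_by_excluding(size, length)
    print(f"{size} words, {length} per passphrase: "
          f"{skipped} guesses skipped, {saved:.5f} bits saved")
```

Under these assumptions the absolute number of skipped guesses is large, but it is a small fraction of the whole space, so a single excluded word saves the attacker very little. The saving only becomes significant if many words can be excluded or deprioritized.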

So, as you can imagine, being able to gather data on typical user word rejections for an entire dictionary of words would be very valuable to an attacker, since it would probably allow them to eliminate significant portions of the work normally needed to brute force a passphrase. I'm not aware of any public research regarding the predictability of user acceptance of specific random words for passphrase use.

But there is data out there on words that people know and commonly use, which might correlate with their preference to then use those words in passphrases. Similar research exists on words that commonly appear in passwords. I also wouldn't be surprised to hear that there is unpublished research on passphrase word preferences that one or more intelligence agencies are keeping to themselves.

So to be most secure you should resist the temptation to reject a particular passphrase since you may be falling into a predictable pattern that is known to attackers. However, it is hard to quantify whether rejecting the occasional random passphrase in preference for another will impact your security enough to outweigh the potential usability benefits.

score 1 · Accepted Answer · answered Jul 01 '15 at 18:05

Your assumption is reasonable. If we repeatedly generate truly random passwords and pick the one we like most, then a good assumption is that it is the simplest among the generated list. For simplicity of the argument, let us assume that the passwords are numbers between 1 and N, inclusive, with small numbers being "simpler", that is: the attacker will attempt the numbers in order 1, 2, 3, ... and your pick of "most liked" password is also the lowest number among the ones generated. (You may imagine that 1 corresponds to "1234", 2 corresponds to "password", 3 corresponds to "summ3r", ..., 1000000 corresponds to "zg_uP%bwG", ...)

The expected value of a single random number is about N/2 (exactly (N+1)/2), so the attacker will need about N/2 attempts on average. But the expected value of the minimum of k random numbers is about N/(k+1), so now the attacker will need only about N/(k+1) attempts on average, a speedup by a factor of roughly (k+1)/2. Thus if k is not very small compared to N, the attacker will get a significant advantage.
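As a rough check of this model, the expected minimum can be computed exactly and compared with the N/(k+1) approximation. This is only a sketch of the worst-case model above (the user always keeps the password the attacker would try first); the values N = 2**16 and k = 8 are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def expected_min(n: int, k: int) -> float:
    """Exact expected minimum of k independent uniform draws from 1..n."""
    # E[min] = sum over m of P(min >= m), and P(min >= m) = ((n - m + 1) / n) ** k
    return sum(((n - m + 1) / n) ** k for m in range(1, n + 1))

def bits_lost(n: int, k: int) -> float:
    """Drop in log2(expected attacker attempts) when the lowest of k draws is kept."""
    single_draw = (n + 1) / 2  # expected attempts against one uniform draw
    return math.log2(single_draw / expected_min(n, k))

n, k = 2**16, 8  # assumed example: 16-bit password space, best of 8
print("exact E[min]:      ", expected_min(n, k))
print("N/(k+1) estimate:  ", n / (k + 1))
print("effective bits lost:", bits_lost(n, k))
print("log2(k) for comparison:", math.log2(k))
```

In this model the loss measured in expected attacker effort is about log2((k+1)/2) bits, which stays below log2(k).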

Hagen von Eitzen

+1 This seems like a pretty reasonable way to model the "generate 8 and pick the one you like most" method. So with this model, if `n` passwords are generated and the weakest password from that group chosen, the *effective* entropy (in terms of the number of attempts required on average to brute force) is reduced by `log_2(n)` bits, right? – Ajedi32 Jul 01 '15 at 18:39
If that's true, then it seems like this does create some advantage for the attacker, but not a huge one, since the bits of entropy lost grows only logarithmically with the number of passwords generated. E.g. You'd have to generate and reject 2047 passwords before accepting one in order to reduce the effective entropy by 11 bits. And if your password generation method is to choose random words from a 2048-word dictionary, you can easily counter that loss of entropy by simply adding another word. Most users would likely give up and accept a password long before rejecting more than a few dozen. – Ajedi32 Jul 01 '15 at 18:51
Accepting this answer since it is the only one that provides an actual mathematical approximation of what impact the method described in the question would have on an attacker attempting a brute force. (Which is what I was looking for in my question.) – Ajedi32 Jul 01 '15 at 20:29

score 0 · Answer 4 · answered Jul 01 '15 at 18:23

I believe the randomness would be reduced, skewed toward whatever human intuition makes the password you chose seem "better".

Linguistic, phonetic, logical, and other patterns can be used to seed a dictionary for attacking such passwords.

If such a dictionary can be created, then it's not "perfect entropy", so choosing a password will reduce it somewhat. But given the constraint of 8 passwords, I don't think the security reduction is significant.

score -1 · Answer 5 · answered Jul 01 '15 at 18:52

It depends on the parameters of what you like. Say you prefer a password that you can type on your mobile device without switching keyboards. You can compute the entropy of passwords composed of only the characters on your main keyboard. If what you like is more subjective, the question becomes much harder to answer.
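For a concrete preference like this one, the calculation is short: a uniformly random password of a given length over an alphabet of a given size has length * log2(alphabet size) bits of entropy. The sketch below assumes, purely for illustration, that the "main keyboard" means the 26 lowercase letters and that the unrestricted alphabet is the 94 printable ASCII characters; real mobile keyboards differ.

```python
import math
import string

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random string of `length` symbols."""
    return length * math.log2(alphabet_size)

# Assumed alphabets, for illustration only.
full_alphabet = string.ascii_letters + string.digits + string.punctuation  # 94 symbols
main_keyboard = string.ascii_lowercase                                     # 26 symbols

length = 12  # assumed password length
print("full alphabet:", entropy_bits(len(full_alphabet), length))
print("main keyboard:", entropy_bits(len(main_keyboard), length))

# Extra lowercase characters needed to match the full-alphabet entropy.
target = entropy_bits(len(full_alphabet), length)
needed = math.ceil(target / math.log2(len(main_keyboard)))
print("lowercase length with at least the same entropy:", needed)
```

The same function applies to any restriction that can be expressed as a smaller alphabet, as long as the password is still drawn uniformly at random from that alphabet.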
