Entropy is a standard measure of the complexity of a password-generation scheme. The standard way to calculate it is to take the probability p of each possible password being generated and sum -p*log2(p) over every password in the set of possible passwords. This entropy measures complexity based on how the password is generated, i.e. against a worst-case attack (e.g. we're not trying to calculate the entropy of rolling 6, 2, 6, 3 on a loaded die, but the entropy of the distribution that led to those numbers being rolled). Entropy here is not measuring how the password itself would stand up to brute force, etc.
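To make that calculation concrete, here's a minimal Python sketch of the formula (the loaded-die weights in the second example are made up purely for illustration):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair six-sided die: log2(6) ≈ 2.585 bits per roll.
print(shannon_entropy([1/6] * 6))          # ~2.585

# A hypothetical loaded die that lands on 6 half the time: less entropy.
print(shannon_entropy([0.5] + [0.1] * 5))  # ~2.161
```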
However, for passwords with patterns (a non-uniform distribution), Shannon entropy is not an accurate measure of guessability: two schemes can have the same entropy, yet one can behave much more predictably if their distributions differ. When the entropies are similar, the less uniform distribution (the less random-looking one) is always less secure.
For example, consider two schemes with the same Shannon entropy but different security levels. Scheme one: flip a coin repeatedly, counting the number of tails you get, and stop the first time heads comes up. Looking at the probability distribution, the numbers generated this way have a Shannon entropy of 2 bits. Great! That means this scheme is as secure as two independent coin flips, right? Not quite. If you think about it, 50% of the numbers generated by the first scheme are 0 (heads on the first flip), and overall there is a 1/3 chance of two users getting the same result. Two independent coin flips, on the other hand, produce each of four outcomes with probability 1/4, for a 25% chance of two users getting the same result. (The math for calculating the probability of such a collision is not complicated, but it requires knowing the frequency of each possibility, as explained here.) Entropy gives a misleading figure for the security of the first scheme.
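Here's a short Python check of those numbers, truncating the infinite geometric distribution at a point where the omitted probability mass is negligible:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def collision_prob(probs):
    """Chance that two independent users get the same output: sum of p^2."""
    return sum(p * p for p in probs)

# Scheme 1: count tails before the first heads (geometric, p = 1/2).
# Truncated after 60 terms; the missing probability mass is about 2^-60.
geometric = [(1/2) ** (k + 1) for k in range(60)]

# Scheme 2: two independent fair coin flips (uniform over 4 outcomes).
uniform = [1/4] * 4

print(shannon_entropy(geometric))  # ~2.0 bits
print(shannon_entropy(uniform))    # exactly 2.0 bits
print(collision_prob(geometric))   # ~0.333 (1 in 3)
print(collision_prob(uniform))     # 0.25   (1 in 4)
```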
What happens when a password is generated by a random process whose distribution is not uniform, or is not easy to describe with, or map onto, equally likely outcomes? For example (the loaded-die case is sketched in code after this list):
- A scheme to pick long phrases taken from a large English corpus
- A scheme to generate a mnemonic-based password
- Rolling a loaded die 15 times
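To make the loaded-die case concrete, here's a rough Python sketch; the die weights are hypothetical. It also computes min-entropy, which parts of the literature use as a more pessimistic measure, alongside the Shannon figure:

```python
import math

# Hypothetical weights for the loaded die, made up purely for illustration.
die = [0.30, 0.10, 0.10, 0.10, 0.10, 0.30]
rolls = 15

# Shannon entropy per roll; independent rolls add, so multiply by 15.
shannon_per_roll = -sum(p * math.log2(p) for p in die)

# Min-entropy looks only at the single most likely outcome, i.e. the
# pessimistic "attacker tries the most common result first" view.
min_entropy_per_roll = -math.log2(max(die))

print(f"Shannon entropy: {shannon_per_roll * rolls:.1f} bits")      # ~35.6
print(f"Min-entropy:     {min_entropy_per_roll * rolls:.1f} bits")  # ~26.1
print(f"Fair die:        {math.log2(6) * rolls:.1f} bits")          # ~38.8
```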
How do you compare the strength of random but non-uniformly generated passwords like these to passwords generated from uniform distributions? Does entropy still apply? Is there any widely accepted substitute? In my experiments entropy isn't completely accurate, but it still gives a rough estimate of password strength.
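For instance, here's the kind of rough estimate I mean for the word-list case, using a made-up frequency table in place of a real corpus. Note that it assumes independent draws, which real English phrases violate badly, so it overstates the strength of actual phrases:

```python
import math
from collections import Counter

# Hypothetical frequency counts standing in for a real corpus like COCA;
# a real estimate would load the corpus's actual frequency table.
counts = Counter({"the": 500, "time": 120, "people": 90, "water": 60, "zephyr": 2})
total = sum(counts.values())

# Entropy of one word drawn with its corpus frequency.
per_word = -sum((c / total) * math.log2(c / total) for c in counts.values())

# For 4 *independent* draws, entropies add. Real phrases are not independent
# draws (grammar makes the next word far more predictable), so this is an
# upper bound on strength, not the true guessability.
print(f"~{4 * per_word:.1f} bits for a 4-word passphrase (independence assumed)")
```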
I'm also not asking how entropy is calculated (I covered that above), whether entropy applies to the individual password or to the distribution, or about the entropy of passwords made up non-randomly by users. Unlike the answer claiming it all comes down to "what dictionary is your password in" (which is cited as a reason for marking this a duplicate), this question assumes the distribution is not uniform; in other words, each password is not equally likely. I'm also not asking about the security of these systems, and I'm well aware that patterns make passwords weaker. I'm trying to understand the best way to quantify the security of these schemes, to answer questions like this one.
Note: This question was closed as a duplicate, although parts of it aren't accurately addressed by the linked answers (e.g. how do you calculate the strength of a 4-word passphrase where the phrase is taken from the COCA corpus?). I've edited the question to address these points. If you think this question should be reopened, please vote to reopen it; if you think it would be better made more specific, please say so.