Entropy is a standard measure of the complexity of a password-generation scheme. The standard way to calculate it is to take the probability p of each possible password being generated and sum -p*log2(p) over every password in the set of possible passwords. This entropy measures complexity based on how the password is generated, i.e. against a worst-case attack (e.g. we're not trying to calculate the entropy of rolling 6, 2, 6, 3 on a loaded die, but the entropy of the distribution that led to those numbers being rolled). Entropy here is not measuring how the password itself would stand up to brute force, etc.
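To make that calculation concrete, here's a minimal Python sketch of the formula (the loaded-die weights in the second example are made up purely for illustration):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair six-sided die: log2(6) ≈ 2.585 bits per roll.
print(shannon_entropy([1/6] * 6))          # ~2.585

# A hypothetical loaded die that lands on 6 half the time: less entropy.
print(shannon_entropy([0.5] + [0.1] * 5))  # ~2.161
```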
However, for passwords with patterns (a non-uniform distribution), Shannon entropy is not an accurate measure of guessability: two schemes can have the same entropy, yet one can behave much more predictably if their distributions differ. When the entropies are similar, the less uniform distribution (the less random-looking one) is always less secure.
For example, consider two schemes with the same Shannon entropy but different security levels. Scheme one: flip a coin repeatedly, counting the number of tails you get, and stop the first time heads comes up. Looking at the probability distribution, the numbers generated this way have a Shannon entropy of 2 bits. Great! That means this scheme is as secure as two independent coin flips, right? Not quite. If you think about it, 50% of the numbers generated by the first scheme are 0 (heads on the first flip), and overall there is a 1/3 chance of two users getting the same result. Two independent coin flips, on the other hand, produce each of four outcomes with probability 1/4, for a 25% chance of two users getting the same result. (The math for calculating the probability of such a collision is not complicated, but it requires knowing the frequency of each possibility, as explained here.) Entropy gives a misleading figure for the security of the first scheme.
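Here's a short Python check of those numbers, truncating the infinite geometric distribution at a point where the omitted probability mass is negligible:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def collision_prob(probs):
    """Chance that two independent users get the same output: sum of p^2."""
    return sum(p * p for p in probs)

# Scheme 1: count tails before the first heads (geometric, p = 1/2).
# Truncated after 60 terms; the missing probability mass is about 2^-60.
geometric = [(1/2) ** (k + 1) for k in range(60)]

# Scheme 2: two independent fair coin flips (uniform over 4 outcomes).
uniform = [1/4] * 4

print(shannon_entropy(geometric))  # ~2.0 bits
print(shannon_entropy(uniform))    # exactly 2.0 bits
print(collision_prob(geometric))   # ~0.333 (1 in 3)
print(collision_prob(uniform))     # 0.25   (1 in 4)
```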
What happens when a password is generated by a random process whose distribution is not uniform, or is not easy to describe with, or map onto, equally likely outcomes? For example (the loaded-die case is sketched in code after this list):
- A scheme to pick long phrases taken from a large English corpus
- A scheme to generate a mnemonic-based password
- Rolling a loaded die 15 times
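To make the loaded-die case concrete, here's a rough Python sketch; the die weights are hypothetical. It also computes min-entropy, which parts of the literature use as a more pessimistic measure, alongside the Shannon figure:

```python
import math

# Hypothetical weights for the loaded die, made up purely for illustration.
die = [0.30, 0.10, 0.10, 0.10, 0.10, 0.30]
rolls = 15

# Shannon entropy per roll; independent rolls add, so multiply by 15.
shannon_per_roll = -sum(p * math.log2(p) for p in die)

# Min-entropy looks only at the single most likely outcome, i.e. the
# pessimistic "attacker tries the most common result first" view.
min_entropy_per_roll = -math.log2(max(die))

print(f"Shannon entropy: {shannon_per_roll * rolls:.1f} bits")      # ~35.6
print(f"Min-entropy:     {min_entropy_per_roll * rolls:.1f} bits")  # ~26.1
print(f"Fair die:        {math.log2(6) * rolls:.1f} bits")          # ~38.8
```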
How do you compare the strength of random but non-uniformly generated passwords like these to passwords generated from uniform distributions? Does entropy still apply? Is there any widely accepted substitute? In my experiments entropy isn't completely accurate, but it still gives a rough estimate of password strength.
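For instance, here's the kind of rough estimate I mean for the word-list case, using a made-up frequency table in place of a real corpus. Note that it assumes independent draws, which real English phrases violate badly, so it overstates the strength of actual phrases:

```python
import math
from collections import Counter

# Hypothetical frequency counts standing in for a real corpus like COCA;
# a real estimate would load the corpus's actual frequency table.
counts = Counter({"the": 500, "time": 120, "people": 90, "water": 60, "zephyr": 2})
total = sum(counts.values())

# Entropy of one word drawn with its corpus frequency.
per_word = -sum((c / total) * math.log2(c / total) for c in counts.values())

# For 4 *independent* draws, entropies add. Real phrases are not independent
# draws (grammar makes the next word far more predictable), so this is an
# upper bound on strength, not the true guessability.
print(f"~{4 * per_word:.1f} bits for a 4-word passphrase (independence assumed)")
```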
I'm also not asking how entropy is calculated (I covered that above), whether entropy applies to the individual password or to the distribution, or about the entropy of passwords made up non-randomly by users. Unlike the answer claiming it all comes down to "what dictionary is your password in" (which is cited as a reason for marking this a duplicate), this question assumes the distribution is not uniform; in other words, each password is not equally likely. I'm also not asking about the security of these systems, and I'm well aware that patterns make passwords weaker. I'm trying to understand the best way to quantify the security of these schemes, to answer questions like this one.
Note: This question was closed as a duplicate, although parts of it aren't accurately addressed by the linked answers (e.g. how do you calculate the strength of a 4-word passphrase where the phrase is taken from the COCA corpus?). I've edited the question to address these points. If you think this question should be reopened, please vote to reopen it; if you think it would be better made more specific, please say so.