
I recently saw this:

[Image: xkcd 936, "Password Strength"]

I can't figure out how he computes 28 bits of entropy for a password like "Tr0ub4dor&3". That seems really low.

PwdRsch
snoob dogg
  • XKCD is usually explained on their wiki site. This comic is explained here: https://www.explainxkcd.com/wiki/index.php/936:_Password_Strength – bhavya.work Aug 10 '17 at 20:50
  • To explain what @ConorMancone means: some of the answers to the question Conor linked to include a good explanation that answers the question here (especially the second answer, by Thomas Pornin). However, that question focuses a lot on the wide usability and applicability arguments raised by the comic. Over half of the answers, including the accepted one, don't directly address this question. – Cody P Aug 10 '17 at 21:59
  • Interesting question for many crypto students, I'm sure. Could you post what the entropy would be by your calculation? One motivation for this request is that you might discover the answer by yourself :) – Sas3 Aug 11 '17 at 05:02

2 Answers


He's modeling the password as the output of a randomized algorithm similar to this one:

  1. Pick one word uniformly at random out of a dictionary with 65,536 words (= 16 bits). (We assume the dictionary is known to the attacker.)
  2. Flip a coin (= 1 bit); if heads, flip the capitalization of the first letter of the word.
  3. For each vowel in the word, flip a coin; if it lands heads, substitute the vowel with its "common substitution". Munroe is simplifying here by assuming that words in the dictionary typically have three vowels (so we get ~ 3 bits total).
  4. Pick a numeral (~ 3 bits) and a punctuation symbol (~ 4 bits) at random. Flip a coin (= 1 bit); if heads, append the numeral to the password first and the symbol second; if tails, append them in the other order.
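The steps above can be sketched in Python. This is only an illustration of the model, not anything Munroe published: the word list, the substitution table, and the sets of 8 numerals and 16 symbols are all made-up stand-ins chosen so that each choice matches the bit counts in the list.

```python
import secrets

# Hypothetical stand-ins: a real dictionary would hold 65,536 words.
WORDS = ["troubador", "correct", "horse", "battery", "staple"]
# Assumed table of "common substitutions"; "u" has none here.
SUBSTITUTIONS = {"a": "4", "e": "3", "i": "1", "o": "0"}
DIGITS = "01234567"            # 8 numerals  -> 3 bits (assumption)
SYMBOLS = "!@#$%^&*()-_=+?."   # 16 symbols  -> 4 bits (assumption)


def coin() -> bool:
    """Fair coin flip, worth exactly 1 bit."""
    return secrets.randbits(1) == 1


def generate_password(words=WORDS) -> str:
    word = secrets.choice(words)                  # step 1: pick a word
    if coin():                                    # step 2: flip first letter's case
        word = word[0].swapcase() + word[1:]
    chars = []
    for ch in word:                               # step 3: one coin per vowel
        if ch.lower() in SUBSTITUTIONS and coin():
            ch = SUBSTITUTIONS[ch.lower()]
        chars.append(ch)
    digit = secrets.choice(DIGITS)                # step 4: numeral, symbol, order
    symbol = secrets.choice(SYMBOLS)
    suffix = digit + symbol if coin() else symbol + digit
    return "".join(chars) + suffix


print(generate_password())
```

Every call to `secrets.choice` or `coin()` is one of the random choices that the entropy estimate counts.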

The entropy is a function of the random choices made in the algorithm; you calculate it by identifying what random choices the algorithm makes, how many alternatives are available for each random choice, and the relative likelihood of the alternatives. I've annotated the numbers in the steps above, and if you add them up you get about 28 bits total.
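As a rough check of that sum, here is a small Python sketch. It assumes every choice is uniform and independent, and it rounds the numeral and symbol sets to 8 and 16 options so that they come out at exactly 3 and 4 bits; with all 10 digits the numeral would contribute about 3.3 bits instead.

```python
import math

# Number of equally likely alternatives for each random choice.
# The 8 and 16 are assumptions picked to match "~3 bits" and "~4 bits".
choices = {
    "dictionary word": 65_536,
    "capitalize first letter": 2,
    "vowel substitutions (3 coin flips)": 2 ** 3,
    "numeral": 8,
    "punctuation symbol": 16,
    "order of numeral and symbol": 2,
}

# For a uniform choice among n alternatives, entropy is log2(n) bits.
bits = {name: math.log2(n) for name, n in choices.items()}
total_bits = sum(bits.values())

for name, b in bits.items():
    print(f"{name}: {b:.0f} bits")
print(f"total: {total_bits:.0f} bits")  # total: 28 bits
```

Adding the bits is the same as multiplying the numbers of alternatives: the model allows 2^28 (about 268 million) equally likely passwords.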

You can see that Munroe's procedure isn't hard science by any means, but it's not an unreasonable estimate either. He's practicing the art of the quick-and-dirty estimate, which he very often demonstrates in his work—not necessarily getting the right number, but forming a quick idea of its approximate magnitude.

Luis Casillas

Each small square is one bit of entropy that's being accounted for.

  • 16 bits for the word alone
  • 1 for the first letter: caps or not?
  • 1 for each substitution of O and 0, A and 4
  • 4 for using a symbol that's not that common
  • 3 for using a number
  • 1 for the unknown order of symbol + number or number + symbol.

There is some reasoning behind it. For example, when a password requires capital letters, almost everybody capitalizes the first letter. So you don't get much more than one bit of entropy out of it.

woliveirajr
  • Could you elaborate a bit? I get that it should be 1 bit for caps or not, but why only 16 bits for the word alone? How did you get that? – snoob dogg Aug 10 '17 at 22:06
  • The typical number of words in an English dictionary is about 100,000, which is about 16 to 17 bits (log2 of 100,000 is roughly 16.6). – Lie Ryan Aug 11 '17 at 01:44
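
To turn a dictionary size into bits, take the base-2 logarithm of the number of words. The short Python sketch below does this for the two sizes mentioned in this thread, assuming the word is picked uniformly at random:

```python
import math

# Dictionary sizes mentioned in the answers and comments.
for size in (65_536, 100_000):
    print(f"{size:>7,} words = {math.log2(size):.2f} bits")
```

A 65,536-word dictionary gives exactly 16 bits, and a 100,000-word one gives about 16.6, so both are consistent with the comic's 16 squares for the base word.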