4

Amid all the discussion about password length vs. complexity (summarized by the famous xkcd strip and the follow-up discussion), I am trying to make up my mind about passphrases made up of dictionary words.

I understand the various calculations for brute-force attacks. Do you know of a reasonable study of passphrases made from dictionary words?

Specifically, I was wondering about a policy with a minimum length of, say, 13 characters, all lowercase. I expect to end up with phrases like volleyballisfantastic, which is 3 words. For phrases in English or French, the average would be about 4 words.

Since one of the reasons to move to passphrases would be to drop throttling mechanisms on some systems (locking the account for 10 minutes after a failed login, or exponentially increasing the delay between login prompts), I wonder whether dictionary attacks would be far more successful against such passphrases.

Thanks for any thoughts or pointers to existing studies (I googled around, and what I found were discussions of pure brute force).

logicalscope
WoJ
  • Could you be a smidge more specific about your question? Are you asking about finding a study, or about whether this specific policy is secure? Or both? :) – Steve Dec 12 '11 at 16:43
  • Both, actually. I found quite a few discussions on these subjects (but, as I said in my post, mostly related to classical brute force). I am looking for a way to quantify this policy (by reading and then hopefully understanding some studies on the subject) so that I can decide between (a) an old-school password scheme with a lockout policy, which may be dangerous (accounts can be blocked through automated login attempts), and (b) a passphrase that is longer but made of common words (users will choose it...), with no lockout policy. Apologies for not having made myself clear. – WoJ Dec 12 '11 at 18:14

5 Answers

5

I would say that this isn't strictly a dictionary attack, since you're not testing words in a dictionary, but strings of words. It's really a brute force with a different "character" set. That lets you use regular entropy calculations.

If the attacker brute-forces the password as if it were a string of lowercase characters, then it's 13 long and each token is one of 26 choices. That's 61 bits of entropy.

If the attacker brute-forces it as if it were a series of lowercase words from the English Diceware list, then it's 4 long with 7,776 choices per word, so just under 52 bits of entropy. Or, if the attacker brute-forces it as lowercase words from the OED, it's 4 long with 181,000 choices per word, which gives you 70 bits.
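The figures above follow from the usual formula, entropy = length × log2(number of choices per token). A minimal Python sketch to reproduce them, assuming every token is chosen uniformly at random (the function name and labels are illustrative, not from any library):

```python
import math

def entropy_bits(choices_per_token: int, length: int) -> float:
    """Entropy in bits of `length` tokens, each drawn uniformly at random
    from `choices_per_token` possibilities."""
    return length * math.log2(choices_per_token)

cases = {
    "13 random lowercase letters": (26, 13),
    "4 words from a 7,776-word Diceware list": (7776, 4),
    "4 words from a 181,000-word dictionary": (181_000, 4),
}
for label, (choices, length) in cases.items():
    print(f"{label}: {entropy_bits(choices, length):.1f} bits")
```

These numbers are upper bounds: they only hold if the letters or words really are picked at random, which user-chosen phrases are not.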

Since most people would not use obscure words, in your example, you'll be closer to 50 bits than 60, so a dictionary attack would give better results. 13 characters of random letters would be theoretically much stronger than four words.

(Non-theoretically, of course, a password rule requiring 13 characters of gibberish just means post-it notes on monitors.)

Graham Hill
  • Your analogy about the "character set" is very interesting. I therefore need to find out whether it is easier to brute-force i) an 8- to 10-character password with a character set of 24+24+10 symbols or ii) a 4- or 5-token one with a "character set" of words. I checked Scrabble word lists and there are 1200 3-letter, 5400 4-letter and 9000 5-letter words. Since most of them are quite obscure, I would say that the "character set" in the second case is, say, 6000. – WoJ Dec 12 '11 at 19:44
  • After quick, highly scientific calculations, it looks like the complexity is roughly the same (58^8+58^9+58^10 vs. 6000^4+6000^5). This is just an estimate based on an awful lot of assumptions (the ones from my previous comment, plus the assumption that people would choose 8- to 10-character passwords vs. 4 or 5 concatenated words) – WoJ Dec 12 '11 at 20:20
3

> Since most people would not use obscure words, in your example, you'll be closer to 50 bits than 60, so a dictionary attack would give better results. 13 characters of random letters would be theoretically much stronger than four words.

@Graham,

It's true that most people don't use obscure words; however, wouldn't a simple use of the numerical substitution for a single letter be enough to then bump up the complexity? As an example, instead of houseonfire you'd write h0useonfire. It'd still be easy to remember and the dictionary would have to include the 'leet speak' in order to more effectively brute-force it.

Would this be accurate, do you think?

  • "Most people" is the operative phrase here, and they aren't necessarily discussing how to increase entropy with different character classes (see question title). What Graham and Wojtek are claiming is that simple but otherwise lengthy **lowercase** phrases are just as effective and can be adopted by a large segment of the computer-using population. Whether that means lowercase passphrases are effectively useless or effectively secure is what is up for debate. – logicalscope Dec 13 '11 at 02:31
  • Yes, you're right, I glossed over that part and instead focused on the 50 vs. 60 bits of entropy part. – Appropriate Sound Dec 14 '11 at 07:52
1

If you have a look through our question on that XKCD strip you will see a lot of discussion on the calculation of entropy, which is all that is important here, because a concatenation of 3 or 4 words requires either a normal brute force procedure, or a brute force using concatenations of dictionary words.

The problem you get into is this:

A standard brute force will break the passphrase. The only question is time. And if you make that time big enough, then you can treat the passphrase as unbreakable by this method.

A dictionary-based brute-force attack will only break the passphrase if all the words in the passphrase exist in the dictionary used, so you already start off with an uncertain outcome. Then you need to try every word in the dictionary in each of positions 1, 2, 3 and 4 of the passphrase (assuming you restrict it to 4 words), possibly including the same word in all 4 positions.

Of course, a good throttling mechanism will make all online brute-force attacks impossible, so all you need is a passphrase "good enough" to force an offline attack to take longer than your safety period (which could be twice your password expiry time, or 5 times, or whatever factor you require).

Rory Alsop
  • Yes, this is obviously a matter of time. If my calculation is correct, we are talking, worst case, about something on the order of 10^12 possibilities (actually much more, but I take worst-case scenarios). At a rate of 1000 attempts per second, exhausting them takes decades. – WoJ Dec 13 '11 at 13:05
  • About throttling: I want to avoid it as it carries the risk of a simple denial of service on a given account (a script constantly trying "aaa" as the password) – WoJ Dec 13 '11 at 13:06
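As a rough check of the estimate in the comments above, here is a minimal Python sketch that converts a search-space size and a guess rate into years. The 10^12 candidates and 1000 attempts per second come from the comment; the function name is illustrative, and the expected time to a hit assumes the passphrase is equally likely to be anywhere in the search order.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_exhaust(candidates: float, guesses_per_second: float) -> float:
    """Years needed to try every candidate at a constant guess rate."""
    return candidates / guesses_per_second / SECONDS_PER_YEAR

full_search = years_to_exhaust(1e12, 1000)
print(f"full search: {full_search:.1f} years")
print(f"expected time to a hit: {full_search / 2:.1f} years")
```

The result is linear in the guess rate, so the figure only holds for an attacker limited to that rate; an offline attack against a leaked hash would use a much larger `guesses_per_second`.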
1

You should always implement some form of throttling on login attempts — even with the random words trick, it's not really practical for many people to memorize a passphrase with enough entropy to be secure against an unthrottled dictionary attack, especially if they need a separate one for each site they use.

(There are ways around that on the client side, like using a secure password wallet to store the per-site passwords, with a single strong passphrase to access the wallet, but for various reasons they haven't caught on widely yet.)

It's also worth noting that a determined attacker can DOS your site just fine even without login throttling. The throttling, if carelessly implemented, just makes a particular kind of targeted DOS attack easier. To mitigate this effect, I'd suggest at least the following steps:

  • Set up a fairly low per-IP login limit, as well as a higher per-site one. This means that an attacker must employ multiple computers with different IP addresses to effectively DOS your site, which, while certainly possible, still presents a speed bump. (You may or may not also want a per-user limit; it makes sense if you expect dictionary attacks targeted at single users, but in many cases a typical attacker would be just as happy with the password to any account, in which case their best strategy is to attack them all in parallel.)

  • When the limit is hit, log it. If it keeps being hit repeatedly, alert the site admins — whether it's a DOS attempt or a genuine dictionary attack, they'll want to know about it.

  • Make sure you provide an alternative way for users to contact the admins if they can't log in due to throttling.

  • Last but not least, require an anti-CSRF token on all login attempts. You'll want to do this anyway to protect against login CSRF, but as a useful side effect, it also stops certain kinds of simplistic distributed attacks (such as crafting a direct link to the login script and including it as an image URL in widely read forum posts).
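The first two steps above (a low per-IP limit, a higher per-site limit, and a hook for logging) can be sketched as a small in-memory sliding-window limiter. This is only an illustration under stated assumptions: the class name, the default limits and the 60-second window are hypothetical, and a real deployment would need shared storage across servers and eviction of idle IP entries.

```python
import time
from collections import defaultdict, deque

class LoginThrottle:
    """Sliding-window limiter: a low per-IP cap plus a higher site-wide cap."""

    def __init__(self, per_ip_limit=5, site_limit=500, window_seconds=60.0):
        self.per_ip_limit = per_ip_limit      # assumed value, tune per site
        self.site_limit = site_limit          # assumed value, tune per site
        self.window = window_seconds
        self._by_ip = defaultdict(deque)      # ip -> timestamps of allowed attempts
        self._site = deque()                  # timestamps of all allowed attempts

    @staticmethod
    def _prune(stamps, cutoff):
        # Drop timestamps that have fallen out of the window.
        while stamps and stamps[0] <= cutoff:
            stamps.popleft()

    def allow(self, ip, now=None):
        """Return True if a login attempt from `ip` may proceed."""
        now = time.monotonic() if now is None else now
        cutoff = now - self.window
        ip_stamps = self._by_ip[ip]
        self._prune(ip_stamps, cutoff)
        self._prune(self._site, cutoff)
        if len(ip_stamps) >= self.per_ip_limit or len(self._site) >= self.site_limit:
            return False  # caller should log this and alert admins if it repeats
        ip_stamps.append(now)
        self._site.append(now)
        return True
```

A login handler would call `allow(client_ip)` before checking the password and reject the attempt (and log it) when it returns `False`. Rejected attempts are not counted here, so a blocked client is let back in once its earlier attempts age out of the window.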

Ultimately, IMO password length and form requirements are a red herring, at least from the site admin's viewpoint: if you let your users choose their own password, some of them will choose weak ones no matter what. If you require alphanumeric passwords with punctuation, some users will choose "abc&123"; if you require four-word phrases, they'll choose "I love my mom".

You can either accept that and find ways to live with it, or, if you can't or won't do that, take the choice away from your users and generate random passwords for them. In which case I'd very strongly suggest using the "four (or more) common words" method, since it yields much more memorable passwords for a given amount of entropy than, say, picking random characters or syllables.

Ilmari Karonen
  • Thanks for some interesting points. As for the DoS: I am looking at avoiding a simplistic DoS on the login. A DDoS will in practical terms always be a threat, and that is another story. The per-IP login limit is something we will be putting in the design, as well as the anti-CSRF measures (which are part of the OWASP adherence we design around). – WoJ Dec 13 '11 at 19:39
  • About the password choice: "ilovemymum" would not meet the length restriction, "ilovemymumverymuch" would be fine and would be a good passphrase. I actually **want** people to choose easy to remember passphrases, thus the initial post. – WoJ Dec 13 '11 at 19:41
  • There's nothing wrong with choosing easily remembered passphrases -- quite the opposite. But the point I was trying to make is that humans, besides being fundamentally lazy creatures, are not very good at making truly random choices and often tend to think alike. Thus, you'll probably find that, whatever restrictions you put on your passphrases, many of your users will end up choosing the same (or very similar) passphrases out of the permitted set. And then attackers will figure that out and try those passphrases first. – Ilmari Karonen Dec 13 '11 at 19:49
  • If you don't believe that, take a look at [some lists](http://blog.jimmyr.com/Password_analysis_of_databases_that_were_hacked_28_2009.php) of [most common passwords](http://www.whatsmypass.com/the-top-500-worst-passwords-of-all-time) chosen by real users. I'm not aware of any published lists of most common user-chosen multi-word passphrases, but I doubt they'd be very much harder to guess. – Ilmari Karonen Dec 13 '11 at 19:53
  • I do believe that. Please have a look at my other comments with an estimate for passphrases based on common Scrabble words. There is always the possibility of someone choosing "passwordpasswordpassword" as their passphrase (to take an example of a commonly used password), but this is also a matter of education: they can also write down their passwords on their whiteboards if they get too complex. Finding the right balance (complexity, ease of cracking, user choice of passwords, technical complexity to implement the restrictions, ...) is what I am looking for. – WoJ Dec 13 '11 at 20:39
0

If you pick words from a dictionary randomly (not pick and choose), this would be a fine password for most people. It would be even better, as it would be easier to remember and type. You don't have to capitalize anything or add weird symbols or do other stupid things, which would just make it harder to remember.

If you have a dictionary of 30,000 words, each randomly chosen word adds log2(30,000) ≈ 14.87 bits of entropy.

Three random words: 3 × 14.87 ≈ 44.6 bits.

Five random words: 5 × 14.87 ≈ 74.4 bits.

I would say 3 random words from a dictionary of 30,000 words would be fine for an online account where brute force is detectable. For an offline account (or as a master password for a password manager), 5 words would be enough for most users, especially as most modern software does some kind of key stretching/hashing (PBKDF2 or scrypt, etc).
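To illustrate what key stretching means here, a minimal sketch using PBKDF2 from Python's standard library (`hashlib.pbkdf2_hmac`). The iteration count, salt length and helper names are assumptions chosen for illustration, not values taken from any particular product.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune to the hardware in use

def derive_key(passphrase: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Stretch a passphrase into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations)

def verify(passphrase: str, salt: bytes, expected: bytes,
           iterations: int = ITERATIONS) -> bool:
    """Re-derive the key and compare in constant time."""
    return hmac.compare_digest(derive_key(passphrase, salt, iterations), expected)

salt = os.urandom(16)  # random per-password salt, stored next to the derived key
stored = derive_key("ghost iron physical serious shame", salt)
print(verify("ghost iron physical serious shame", salt, stored))
```

Every guess costs the attacker the same number of iterations, which is what multiplies the effort of an offline search over the passphrase space.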

5 words (74 bits) with PBKDF2 or scrypt/bcrypt as most software would do with a password would be pretty close to impossible to brute force even for the US govt (would cost them several million dollars and some years, at least).

In other words, something like this:

"ghost iron physical serious shame"

(when words are picked randomly without human bias)

is a fine password for an offline account where brute force is possible (like wifi): easy to type and remember, yet pretty secure.
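Picking words "without human bias" is easiest to get right by letting a cryptographically secure random generator do it. A minimal Python sketch using the standard `secrets` module; the function name and the tiny `demo_words` list are placeholders for illustration, and a real list would need to be large (for example the 30,000 words assumed above) for the entropy figures to apply.

```python
import secrets

def make_passphrase(wordlist, n_words=5):
    """Join n_words picked uniformly at random (with replacement) from wordlist,
    using the operating system's CSPRNG via the secrets module."""
    return " ".join(secrets.choice(wordlist) for _ in range(n_words))

# Stand-in list for illustration only; far too small for real use.
demo_words = ["ghost", "iron", "physical", "serious", "shame",
              "volleyball", "house", "fire"]
print(make_passphrase(demo_words))
```

Sampling with replacement keeps every position independent, which is what the per-word entropy calculation assumes.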

user12480