15

I was thinking about correct horse battery staple. I am not a native English speaker. In my language, the above could be written as:

vqrno kon bateriq telbod OR

вярно кон батерия телбод.

Keep in mind that the Latin variant can have several variations, depending on how I choose to represent the sounds from my language with English letters. This is, however, very easy for me to remember, as I do it consistently.

Roughly what would be the security value of the above two pass-phrases*?

*ignoring the fact that they are now published

TildalWave
Vorac
  • 2
    Appropriate to this question too: http://i.stack.imgur.com/AehwB.jpg – Dan Is Fiddling By Firelight Apr 08 '13 at 17:42
  • 2
    My password is five+9=XIV, I think it is much better than yours and it can be remembered easily. – Sulthan Apr 08 '13 at 20:23
  • Your password is very predictable because it uses an easy scheme. It is only 10 characters long, taken from a relatively small charset. The one above admittedly uses only lowercase letters, but it is really long and STILL easy to remember. Yours has only the latter benefit. A simple and effective modification would be five+9='everythingbutXVI' – refex Apr 13 '16 at 10:44

6 Answers

14

It entirely depends on your attack model. A 16-character word is likely to resist brute-force, but might fail immediately against dictionary attacks. How do you define the strength of that password? There's no single answer.

With the scheme you mentioned, a brute-force attack would be largely infeasible assuming your password is long enough. Even then, it's entirely reliant on whether the attacker is targeting your particular Unicode range, or if they're just going for ASCII.

In general, attackers in the "low hanging fruit" model are likely to guess passwords only in the ASCII range, because they're the most common. When 30%+ of passwords are in the top 10k list, why bother doing anything fancy?

However, if you're worried about attackers who know your language and are actively attacking passwords using the same model as you used to create your password, then you might run into trouble. Even so, if your words are sufficiently randomly chosen, and one of them isn't strictly a dictionary word (e.g. a word like "pwned") then you may well be safe. It's quite difficult to judge the strength of passphrases, since their security is largely reliant on unknowns such as the dictionary in use and the potential for mutations.

The truth is that we don't really know if they're secure against in-the-wild attackers in general, because we don't have any solid statistics to back up our intuition. The best advice I can give you regarding non-English passwords is to assume that they were English, and ask yourself whether you'd still trust them.

For further reading, we have a few related questions on the site:

Polynomial
12

When it comes to assessing the security of a password generation algorithm, it is customary, and also prudent, to assume that the attacker knows your method perfectly. Why do we assume that? Because it's often true! This is particularly so in organizations (companies, administrations...) where the password generation rules are decided and published (and sometimes automatically enforced) by the local sysadmin.

In the "correct horse" method, you select some words at random in a given list of words (uniform selection for each word). The assumption above means that the exact list is known to the attacker. At that point, it changes nothing whatsoever that the words are in English, Russian or Sumerian. Nitpicking: if the words are "hard to type" on a given keyboard, then this might imply usability issues (especially since the password is typed "blindly" to thwart shoulder surfers).
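As a rough illustration of why the language does not matter once the list is known, here is a minimal sketch of the entropy calculation for uniformly chosen words. The function name and the list size of 2048 are assumptions made up for this example; they are not taken from any particular word list.

```python
import math

def passphrase_entropy_bits(num_words: int, list_size: int) -> float:
    """Entropy in bits of num_words words, each drawn uniformly and
    independently from a list of list_size words known to the attacker."""
    return num_words * math.log2(list_size)

# Hypothetical list sizes, chosen only for illustration.
english_list_size = 2048
other_language_list_size = 2048

# Only the list size and the number of words enter the formula,
# so both calls give the same result regardless of language.
print(passphrase_entropy_bits(4, english_list_size))         # 44.0
print(passphrase_entropy_bits(4, other_language_list_size))  # 44.0
```

The calculation only holds if the words really are picked at random; a phrase chosen by a human because it is meaningful has less entropy than this.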

Using Russian words instead of English words will bring some security only against attackers who are not serious enough to do their homework, and who try a generic list of common words instead of your list of common words. In some practical situations, this might give a gain; but don't count on it. Since using Russian words instead of English words won't lower the base security, there is no problem in using Russian words (except possible usability issues, which may or may not apply to your case); but it would be foolhardy to believe that it does substantial good either.

Edit: for more on the topic of calculating password entropy, see this answer. The important point is that password entropy is a property of the process by which the password is generated; and we assume that the whole process is known to the attacker, except the actual random choices, because it would be very hard to quantify how unknown the process is. Could you say that "guessing that the words are Russian will cost the attacker an effort of X CPU-months worth of computations", for even an approximate value of X?

Thomas Pornin
  • 1
    What if we combine multiple languages? Even if the attacker knew he has to use a Russian-English dictionary, the dictionary itself got doubled. Would this be more secure than just writing in (non-English)? – VPeric Apr 08 '13 at 20:06
  • @VPeric: Yes, doubling the size of the dictionary increases the entropy per word by one bit. – Ilmari Karonen Apr 08 '13 at 20:10
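A worked version of the one-bit claim, assuming each word is picked uniformly from a list of N words and that merging two equal-sized lists with no overlap gives 2N words (N and k are symbols introduced here for illustration, with k the number of words in the passphrase):

```latex
H_{\text{word}} = \log_2 N, \qquad
H'_{\text{word}} = \log_2 (2N) = \log_2 N + 1
\\
H'_{\text{phrase}} = k\,(\log_2 N + 1) = H_{\text{phrase}} + k
```

Under these assumptions a four-word passphrase gains 4 bits in total, which is the same as multiplying the attacker's search space by 16.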
4

I guess it will depend on the situation itself.

If an attacker obtains a password dump from a mostly English site like LinkedIn, my guess is that he will most likely run through the hashes using an English dictionary. My best guess is that your password will be impervious to dictionary attacks.

If an attacker obtains a password dump from a site in another language, it is very likely that he will run through the hashes using that language's dictionary instead.

If your threat model is a targeted attack against you, you can be very sure that the attacker will be using a dictionary in your native language. However, as the XKCD comic suggests, four randomly chosen words from a dictionary of a few thousand words is still incredibly strong. So you are probably safe.

0

The longer the password, typically the stronger. The time to directly brute force the password goes up very quickly as the length increases. Once you start using a phrase, dictionary attacks are not really practical. Even on an upper/lower Latin set, building a dictionary of phrases would provide no real benefit over 10 characters, unless the phrase was extremely common.
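To put numbers on how quickly pure brute force grows with length, here is a small sketch. The function name is hypothetical, and the charset size of 52 assumes upper- and lower-case Latin letters only. It gives an upper bound that applies to characters chosen uniformly at random; it says nothing about how a phrase built from dictionary words fares against a word-based attack.

```python
import math

def search_space_bits(charset_size: int, length: int) -> float:
    """Size of the brute-force search space, expressed in bits, for a
    string of `length` characters each drawn from `charset_size` symbols."""
    return length * math.log2(charset_size)

# 52 = 26 lowercase + 26 uppercase Latin letters (assumption for this example).
for length in (8, 10, 16, 25):
    print(length, round(search_space_bits(52, length), 1))
```

Each extra character adds the same number of bits, so the number of candidates an attacker must try is multiplied by the charset size every time the length grows by one.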

At the same time, clearly do not write it down in Russian on paper assuming no one will figure out the transliteration in English. I would also recommend storing your passwords in KeePass or another secure storage and just using completely random character sets.

Ultimately, increased length increases the security and decreases the ability to guess or brute force, but completely random is almost always better.

Eric G
0

Using your native language is still using dictionary words. Languages for dictionary attacks are often selected using information about the account, e.g., your name. English has a very large vocabulary, so in some circumstances, it might actually be weaker to use your native language.

http://oxforddictionaries.com/words/is-it-true-that-english-has-the-most-words-of-any-language

mgjk
-4

Anything over 8 characters with 1 UPPER, 1 lower, and 1 number is good. ESPECIALLY in plain text, if the password contains words found in dictionaries, stay away...

chrisc