8

This is more of a philosophical question.

Suppose that you are trying to choose a good password for a particular online service, say your bank's e-banking service. Now the bank has some restrictions on the passwords that it allows: you can only use (lowercase) letters and (decimal) digits, and the password can be up to 6 characters long. So what is a "good" way to select such a password?

One could do the following (call this scheme 1): select randomly 6 characters from the set [a-z0-9] with uniform probability. This scheme has the disadvantage that it may produce "weak" passwords such as those containing only letters or only digits (there is a 14.2% [1] chance of this happening).

So here's another idea (call this scheme 2): select randomly (and uniformly) one letter, one digit, and 4 characters from the set [a-z0-9]. Then use a random permutation of these 6 characters as the password. This guarantees that the password will have characters from both classes (letters and digits).

So the question is, which of the two scheme produces "better" passwords?

On the one hand, the second scheme produces passwords that will likely be more resilient to simple brute force attacks. On the other hand, the first scheme allows strictly more combinations than the second one: 2^31.0 [2] vs 2^30.8 [3]; in fact the set of passwords of scheme 1 is a superset of those of scheme 2.

Note: I'm looking at this from the perspective of the party that generates the password and not the party that sets the password policy.

[1] (26^6+10^6)/36^6 = 0.142

[2] log2(26^6) = 31.0

[3] log2(36^6-26^6-10^6) = 30.8 i.e. all combinations of 6 characters excluding the ones with only letters (26^6) and the ones with only digits (another 10^6)

pavlak11
  • 83
  • 4
  • The second scheme will not be more resilient to brute force than the first one, it just ensures that users won't be able to get really bad ones. If the implementation is leaked it would even be magnitudes easier to brute force. Guaranteeing that a password has at least x of a certain type has to be a property of the char set, not a decision. – AdHominem Mar 18 '16 at 18:18
  • 2
    Guaranteeing that a password has chars from both groups does nothing to resist brute forcing attacks. If the attackers are brute forcing and try the all numbers first, then all letters first, then the rest, they are simply ordering the way they exhaust the password space, not make it any easier to get to the end (which statistically is the only point). Why not just use the first method and throw away any output that doesn't have at least 2 letters and at least 2 numbers? – Jeff Meden Mar 18 '16 at 20:53
  • The average entropy decreases, but the entropy of the weakest passwords may increase. – user253751 Mar 18 '16 at 22:34
  • The second scheme can produce passwords like 'pass12'. A [top 250 password](http://www.thetechherald.com/articles/Do-you-use-any-of-these-passwords-Change-them-if-you-do/4002/). Very bad. – Neil Smithline Mar 19 '16 at 02:37
  • There _is_ a password with a good reason for not being allowed: the empty string. ​ That way, if the password field is left empty, then the provided "username" should not be stored. ​ ​ ​ ​ –  Mar 19 '16 at 12:19

3 Answers3

15

tldr; Restricting the password space does not help against brute force attacks, but may against intelligent attacks, but there is an even better way.

First off, you're asking the right question, but using the wrong terminology here:

...the second scheme produces passwords that will likely be more resilient to simple brute force attacks.

(I think you mean intelligent attacks.)

A brute force attack is one that attempts every possible combination without added intelligence. Brute force in your example would be something like:

a, b, c, ... y, z, 0, 1 ... 9,  
aa, ab, ac, ... ay, az, a0, a1 ... a9, ba, bb, bc ... 99,  
aaa ... 999, 
aaaa ... 9999,
aaaaa ... 99999,
aaaaaa ... 999999

Anything other than that would be considered an intelligence strategy. For example, before you try brute force, you may be able to save some time with this strategy:

  1. Attempt only numbers.
  2. Try words found in a dictionary.
  3. Try the top X most popular passwords.
  4. Try combinations of the username or user's initials, current year or last year, etc.
  5. Try brute force.

Your goal, as the creator of your password, is to make sure that the first 4 options always fail to discover your password. In other words, you need to make sure that the only way to discover your password is by brute force, and also make sure that if brute force is used, that it will take a long time before finding your password. However you do that is up to you, given the constraints of what types of passwords are allowed.

As for your particular question, I would approach it more like a random password generator- it generates one for you, you look at it, and if you believe that it could be guessed with a method other than brute force, you simply generate another random one until you're happy with it. (This would be Scheme 1 with a twist.) If you really wanted to quantify this, you could have your intelligence strategies built and ready to go (all strategies except for brute force). Then you can test them against your randomly generated password and if it is found, you generate another one until all intelligent strategies fail to discover your password. You then would be left with a password which is only discoverable by brute force. Using this method you can achieve better entropy than by restricting the password space in the way you described.

Note: with the password rules you supplied, my twist to Scheme 1 may end up looking somewhat like Scheme 2, however, as the number of allowed characters and password length increases, the Scheme 1 twist should provide more entropy.

TTT
  • 9,122
  • 4
  • 19
  • 31
  • 1
    It _might_ be worth clarifying that trying `9`...`a`, `99`...`aa`, etc. is also a possible brute-force attack, otherwise it looks like `999999` is a much more secure password than `aaaaaa`! – David Z Mar 19 '16 at 10:18
  • @DavidZ - Great point. The order of brute force attempts shouldn't matter. Though under normal circumstances your password length should be long enough that it's irrelevant. For example, without the length constraint, if you were worried that aaaaaa might be discovered much sooner than 999999, then just add another character and make it 7 chars long: aaaaaaa. – TTT Mar 19 '16 at 13:51
  • (continued) To really hit your point home, in the above case, it's certainly possible that there are 36 different threads of the brute force attack running simultaneously, each one beginning with a different initial character. – TTT Mar 19 '16 at 13:53
11

I think you have summarized the complexity of the problem nicely, and illustrated why both have pros and cons. The correct answer is of course Scheme 3: Use a longer password, or Scheme 4: Switch to a bank that allows more than 6 character passwords.

You have actually raised an age-old question, which you can find sprinkled around the questions on this site if you browse through the tag : with some (small) probability, a completely random password generator could come up with "password". By putting restrictions on randomness, you are increasing the strength of your worst-case password, at the expense of your entropy. If you can make a good argument for a restriction that does not reduce the entropy significantly, then I think it can be acceptable. Though the general answer is always: if you're worried about the entropy of your password space, use more characters.

Mike Ounsworth
  • 57,707
  • 21
  • 150
  • 207
0

answer to title question:

No, under no scenario but one does restricting the password space increase security. The one scenario where it does is when your restriction is the exclusion of known weak passwords, e.g. a blacklist. This is the (2018) recommended way of doing it, by the way. Forget all the complexity nonsense, that was IT securities equivalent to alchemy.

answer to the full-text question:

Your scheme two is actually weaker due to a common human fallacy. You may think that a random generator that can theoretically spit out "aaaaaa" is weaker than one that cannot. However, you are mistaken. Your generator that cannot generate this "weak" output actually has a reduced search space and any attacker who knows the algorithm behind the scheme has a big advantage over the entirely random scheme.

Note that "aaaaaa" might be the first try one a stupid brute-force attack, but the probability is really tiny and easily offset by the fact that the search space is so much larger. In fact, in a scheme that makes it impossible for the generator to create this output, an attacker aware of the scheme would not test for it, either.

"aaaaaa" is actually not a bad password and in any realistic attack scenario is likely to be tested by the attacker long, long after "passwd", "123456" and even all their possible permutations. An actual real-world attacker is much more likely to run or at least start off with a dictionary attack before he uses actual brute-force enumeration.

Tom
  • 10,124
  • 18
  • 51
  • Might want a better example than "aaaaaa" vs "passwd", [pwned passwords](https://haveibeenpwned.com/Passwords) currently returns 375,079 usages of "aaaaaa" and only 12,251 usages of "passwd". – AndrolGenhald Aug 02 '18 at 14:25
  • rofl, that's right. I got carried away a little there. You can use ":-):-)" which has been seen just 8 times, or ":-)(-:" which is really easy to remember, has special characters, and just one hit. :-) – Tom Aug 02 '18 at 15:16
  • The main point is that things that seem "not really random" to out stupid intuition actually are. If you see a long string of random numbers, you see pattern in there that you THINK should not be. But the opposite is true: They should. It is exactly because of randomness that they can spontaneously emerge. – Tom Aug 02 '18 at 15:17