
Say we have `amount` random entry codes, each of `length` characters drawn from an alphabet of size `alphabet`. The number of possible codes is then easily calculated as `keyspace = alphabet ^ length`.

Now take an attacker who is trying to gain entry by brute-force random guessing of codes. At guess rate `rate`, the time in which all codes can be checked (and entry is thus guaranteed) is easily calculated as `keyspace / rate`.

However, what I would really like to know is how to calculate the time required for the attacker to have a given chance of finding any valid entry code. This is a system where a valid code is all you need; there is no secondary requirement like a username (think coupons). Once again, there are `amount` valid codes in the keyspace.

E.g., how long would it take the attacker to have a 1% chance of finding a valid code? And 10%? And 25%? Etc. Or the inverse, given a specific amount of time, what are the chances?

2 Answers


For the expected (average) case, the number of attempts has a linear relationship to the probability of success being considered.

expected_time = probability * keyspace / (rate * 2)

So if the keyspace consisted of 1 billion codes and the attacker could brute-force 1 million codes per second, then to have a 10% chance of success the attacker would need 50 seconds:

expected_time = 10% * 1000000000/(1000000 * 2)
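As a quick sanity check, the arithmetic above can be reproduced directly (a minimal sketch; the variable names simply mirror the formula):

```python
keyspace = 1_000_000_000  # 1 billion possible codes
rate = 1_000_000          # 1 million guesses per second
probability = 0.10        # desired 10% chance of success

# Expected time for the attacker to reach the given success probability
expected_time = probability * keyspace / (rate * 2)
print(expected_time)  # 50.0 seconds
```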

Now, in your case you state there are multiple valid codes (`amount`). Can these codes be attacked in parallel? Generally we try to avoid scenarios where that is possible. This is one reason why we add salt to passwords: it prevents an attacker from attacking all possible hashed passwords in parallel. Another example would be tying a code to a given userid. Even if the userid is known, this limits the attacker to searching for one specific code, not any valid code. If no parallel attack is possible then the above formula holds, but if the attacker can test against all possible valid codes with a single attempt, then the expected time is reduced by a factor of the number of valid codes (`amount`).

expected_time = probability * keyspace / (rate * amount * 2)
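For illustration, take the same keyspace and rate as before with a hypothetical `amount` of 100 valid codes, all attackable in parallel (a sketch, not a measurement):

```python
keyspace = 1_000_000_000
rate = 1_000_000
probability = 0.10
amount = 100  # hypothetical number of valid codes, all attackable in parallel

# Parallel attack: expected time shrinks linearly with the number of valid codes
expected_time = probability * keyspace / (rate * amount * 2)
print(expected_time)  # 0.5 seconds, down from 50
```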

Another way to look at it is that the number of valid codes reduces the effective size of the keyspace (assuming attacks can be done in parallel and any valid code is acceptable). Note that in neither case does the birthday problem apply, so there is only a linear reduction in expected time when the attacker can perform a parallel attack; in hashing terms this would be a preimage attack, not a collision attack. Still, we generally try to avoid scenarios where parallel attacks are possible, because we don't want to increase the efficiency of the attack (even linearly).

Gerald Davis
    The user wants the expected time for a given chance. So the expected time for a 10% "chance" of breaking a code, 25% chance, etc. – Gerald Davis Jul 13 '15 at 12:30
  • The codes are distributed uniformly, so if the keyspace is searched in order, not randomly, the probability of having found a code after having traversed half the space seems to me to be closer to 100% than 50%. That's just a guesstimate, but an educated one. Assuming `amount` is a decent number and not 1 or 2. – Bart van Heukelom Jul 13 '15 at 12:32
  • You are correct. I was thinking worst case. For the expected case it would be reduced by half. It doesn't matter if amount is a large number or just one (other than a linear reduction in expected time if and only if parallel attacks are possible). Answer updated. – Gerald Davis Jul 13 '15 at 12:37
  • @BartvanHeukelom Also I would just clarify random vs in order doesn't change the expected time unless the attacker is aware of some bias which means the valid codes are not randomly distributed. – Gerald Davis Jul 13 '15 at 12:48
  • But, how can that not matter? As an analogy, take a beach littered with a few soda cans (distributed uniformly). If you sweep the beach from one end to the middle, you're bound to have picked up at least 1 can. But if you just grab in the sand at random locations, the chance of that is much, much lower, right? – Bart van Heukelom Jul 13 '15 at 13:11
  • I just thought about it again, and you're right, it doesn't matter. Doesn't feel right though. :p – Bart van Heukelom Jul 13 '15 at 13:12
  • If codes aren't distributed randomly, but in "chunks", then you shouldn't search randomly. If you search in one spot and don't find a chunk, it's unlikely that the code is "close" to that spot. If you know the exact size of the chunks, be sure to search location+chunksize/2 away from it. This reduces your search space. It's a lot like the strategy in Battleship. – Steve Sether Jul 13 '15 at 17:59

Going back to high school probability math, the following should be close.

Given amount valid keys in a space of size space, the chance of any random guess being correct is winChance = amount / space.

Given a number of guesses (easily calculated as guesses = rate * time), the chance of finding 0 valid keys is (1-winChance) ^ guesses. That makes the chance of finding at least 1 valid key:

1 - ((1-winChance) ^ guesses)
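A small sketch of this calculation (the specific numbers are made up purely for illustration):

```python
amount = 10            # valid codes
space = 1_000_000_000  # total keyspace
rate = 1_000_000       # guesses per second
time = 60              # seconds of attacking

win_chance = amount / space
guesses = rate * time

# Chance of finding at least one valid code after `guesses` random attempts
p_success = 1 - (1 - win_chance) ** guesses
print(p_success)  # ≈ 0.451
```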

What I haven't taken into account is that after each failed guess, the probability of the next guess being correct slightly increases (assuming the attacker never repeats a guess). I'm assuming that with a large enough space and a small enough amount, this increase is negligible.
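That assumption can be checked against the exact without-replacement probability, counted via `math.comb`. The numbers below are deliberately tiny so the difference is visible; for realistic keyspaces it shrinks further:

```python
from math import comb

space = 1000   # deliberately small keyspace
amount = 5     # valid codes
guesses = 100  # distinct guesses, no repeats

# With-replacement approximation from above
win_chance = amount / space
approx = 1 - (1 - win_chance) ** guesses

# Exact: 1 minus the probability that none of the `guesses`
# distinct codes tried is one of the `amount` valid ones
exact = 1 - comb(space - amount, guesses) / comb(space, guesses)

print(approx, exact)  # ~0.394 vs ~0.410: close even for this tiny space
```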