10

In the real world, millions of PGP keys are created every day, what is the probability (chance) of creating two identical keys? In different places, by different people?

the
  • 1,841
  • 2
  • 16
  • 33
coinalty
  • 103
  • 3

2 Answers2

11

If we assume that your random number generator isn't completely borked, then

what's the chance of two PGP keys being exactly identical?

can be answered with approximately something raised to the power of negative infinity. Or in other words: it's not going to happen.

We can work this out by assuming that an "average" PGP public key uses a 2048-bit RSA modulus. Because at least the first bit of the modulus must always be 1, the effective modulus length is one bit shorter. (The exact technical and mathematical discussion behind this is a little more involved, but this should be good enough for now.)

I don't know how common semiprimes (products of two prime numbers) are in that range, but let's be generous and put the probability of any random number in that range being a semiprime at 10^-10. 10^-10 is about 2^-33. This is the ratio of natural numbers in the desired range to actual valid RSA moduli in that range.

Hence, the combined effect is to effectively reduce the number of possible moduli by some 34 bits compared to if we were just to pick a number of sufficient length at random. (Because of the magnitude of these numbers, the decimal fractions simply aren't relevant here.)

Hence, we go from 2^2048 to 2^2014 (or thereabouts) possible RSA moduli.

Let's say ten million keys are generated every day. This is probably on the high side, but again, we're playing generous. Again, that's about 2^33 keys. Let's also say that we're doing this for 100 years, and only care whether two keys come out to have the same moduli (we don't need to store them, or compare them to each other, or anything else like that, because all of that adds overhead to our process that is customarily ignored in cryptographic theory). 2^33 * 365 (days) * 100 (years) ~ 2^48. We have shaved an additional effective 48 bits, coming out to 2^1966. Even if we assume billions of keys per day, that only shaves of an additional three orders of magnitude (corresponding to 30 bits) or so.

We conclude that the probability of, over 100 years and assuming that we properly generate ten million keys per day, leading to two keys randomly being generated to have the same modulus thus comes out to about 2^-1966 or 10^-592. The number 2^-1966 can be compared to that by the year 2040, we might just barely be able to count up to 2^138 in ten years. It's a long way up the exponential curve from 2^138 to 2^1966, and we'd still have to actually do something with each counted number (like, say, generate a RSA key).

Even just storing all 2048-bit RSA moduli, assuming the 10^-10 ratio mentioned above, would take about 2^2015 * 2048/8 ~ 10^609 bytes (10^610 bits) of storage. If we could somehow come up with a way of storing one bit per baryon in the entire observable universe and access all that storage instantaneously, then we would need only a measly about 10^530 universes to do this.

If you don't want to take the easy route, then even as hard as it is, factoring the modulus is going to be much, much easier than relying on random chance to give you a duplicate key. And besides rubber hose cryptanalysis, factoring rather than pure random coincidence is the threat you need to worry about. Figures vary between different estimates, but at present, 2048-bit RSA is estimated to give about 100-130 bits of security (2^100 ~ 2^130 work factor) against known methods of attack, which for the time being is sufficient unless your threat model includes long-term efforts by highly funded adversaries, in which case you might want to go with a modulus of 3072-4096 bits (4096 bits RSA providing about 110-140 bits of security, depending on who you ask).

user
  • 7,670
  • 2
  • 30
  • 54
  • thanks!, it makes much more sense to me right now, i always wondered how the internet and those large companies rely on Public key encryption for their daily activities, without encountering any problem,especially collision, and now i have a much broader view on this subject – coinalty Nov 06 '15 at 11:44
  • @coinalty A collision would also only be a problem if you know about it and have a way to exploit it for your purposes. Example: we know that, given the ability to pick two strings longer than 512 bits, there exists the possibility to pick two such strings such that the SHA512 hash of the two is the same, because SHA512 effectively maps *any* input to a 512-bit output (so there must exist inputs longer than 512 bits which give the same output or hash). We don't *worry* about this because in practice, *finding* those particular two strings is so hard as to make the attack completely infeasible. – user Nov 07 '15 at 14:16
5

The accepted answer is probably true, but I think it deserves some more attention from a different point of view.

  1. Here is a critique of PGP key generation well worth reading.

The whole security of the system depends on the fact that the program must be capable of generating each and every one of these 2**254 numbers with more-or-less equal probability.

(The specific number here is "outdated", the relevant part is that a safe key generation requires an even distribution.)

  1. As a follow up it's interesting to learn more about random number generator attacks.

  2. And then there's this question: Example of a backdoor submitted to an open source project?

  3. And finally on the CloudFlare blog: How the NSA (may have) put a backdoor in RSA’s cryptography: A technical primer:

  • A flaw in a random number generator allowed people to hijack Hacker News accounts.
  • A broken random number generator in Android allowed attackers to hijack thousands of dollars worth of bitcoins.
  • The version of OpenSSL on the Debian distribution had a random number generator problem that could allow attackers to guess private keys created on these systems

Combining the above it's not unlikely that some organisation(s) may in fact have the capability to generate a key identical to a key generated before in another place by someone else. It's hard to put a number on this chance but it's surely higher than "approximately something raised to the power of negative infinity".

the
  • 1,841
  • 2
  • 16
  • 33
  • 2
    As for point #1, you do realize that the number 2^254 as mentioned on the linked web page stems from generating a *512-bit* RSA key, right? 512-bit keys are *trivially* breakable these days, no need to resort to random chance of generating identical keys; RSA-155, a 512-bit RSA key, [was factored](https://en.wikipedia.org/wiki/RSA_Factoring_Challenge#The_prizes_and_records) in 1999. I will grant that the situation probably was somewhat different [18 years ago](https://web.archive.org/web/19970804151351/http://www.cs.man.ac.uk/~chl/critique.html); the relative effort then was much greater. – user Nov 07 '15 at 11:23