Given the various lengths of RSA key pairs (1024, 2048, 4096) what are the odds of two users having generated the exact same private key?
-
2Duplicate on crypto.se: [How many RSA keys before a collision?](http://crypto.stackexchange.com/questions/2558/how-many-rsa-keys-before-a-collision) – CodesInChaos Oct 15 '14 at 14:38
1 Answers
First, what matters is not having two identical private keys, if either of the prime numbers is the same in the modulus (which is part of the public key), it is easy to quickly recover the full private key.
(Aside -- Brief RSA primer: In RSA for an n-bit private key, the public key is composed of an exponent, typically e=65537 and a modulus, N, which is the product of two n/2-bit prime numbers. The private key consists of the private exponent d (calculated by d = e-1 mod (p-1)(q-1) which is efficiently found using the extended Euclidean algorithm) and the modulus, N. The public pair allows encryption of a message, m, into a ciphertext, c, via modular exponentiation: c = me mod N, and the private key allows decryption via m = cd mod N. This all works due to a fundamental part of number theory -- Euler's theorem and how to calculate totients).
If two moduli share a factor, then you can take the greatest common divisor of the two moduli efficiently (using Euclid's algorithm) and find the shared factor. Let, the two moduli share a prime factor q, so N1 = p1 q and N2 = p2 q, then gcd(N1, N2) = q. With the found prime factor, it is trivial to find the other prime factor (p1 = N1/q) and to recreate the private exponent. A 2012 study looked at 4.7 million distinct RSA moduli and found shared factors in 12720 of them (that is 0.3% of the 4.7 million RSA keys trawled from the web are broken from their analysis). (The paper also found several other problems with RSA in practice, like shared keys used on seemingly different entities.)
Now, choosing identical prime numbers is extremely low probability if your machine has a high amount of initial entropy and there is no flaw in the random number generation. Both of these assumptions aren't always true; see the Debian OpenSSL problem keys from 2006-2008 and note that many people create keys for systems on brand new (virtual) machines that haven't had enough time to establish a lot of entropy.
Roughly, the strength of a 1024-bit key relies on the choice of the two 512-bit prime numbers that comprise the modulus. By the prime number theorem there are roughly (2512/ln 2512 - 2511/ln 2511) ~ 10151 distinct 512 bit prime numbers. From the birthday paradox, you should expect to have a collision of a prime number showing up more than once with about 50% probability, after generating sqrt(10151) ~ 1075 distinct prime numbers. That is if some government generated a billion 512-bit prime numbers per second per computer and used a billion computers and had this run for a billion years they would only have generated about 1034 prime numbers, and the probability of a collision with some external key would be essentially zero, the same chance as playing powerball exactly 10 times on consecutive weeks and buying exactly one ticket and winning the jackpot every single time without cheating. (It doesn't get significant chance of a single prime collision until nearly 1040 = 10000000000000000000000000000000000000000 times more prime numbers are generated).
However, as the 2012 paper shows, this is not true in the real world as RSA keys are generated by flawed methods in practice at a rate of at least 0.3%.
-
I would like to add that in practice two certificates with different RSA keys yet colliding fingerprints (which are typically much shorter than the keys themselves) are problematic in their own way. – David Foerster Oct 14 '14 at 22:55
-
You seem to think that 10^34+10^40 is nearly 10^75 - this isn't true; 10^75=10^34*10^40*10. The government you describe would actually have vastly more difficulty generating all 512-bit prime numbers than you described. – Brilliand Oct 14 '14 at 23:09
-
1@DavidFoerster - Agree colliding fingerprints are a problem, though with SHA-2 signed certificates (the minimum recommendation these days) you have 256-bit which provides 128-bits of security against collision attacks, assuming SHA-2 doesn't have major weaknesses above brute force. This is the reason SHA-1 are being deprecated as it is feasible for a collision attack on a 160-bit SHA-1 key from rogue government-type attackers to [fake an intermediate certificate authority via a collision attack](http://security.stackexchange.com/questions/65275/why-is-a-csr-hashed/65279#comment105972_65279)) – dr jimbob Oct 14 '14 at 23:23
-
@Brilliand - Yes that was my intention -- apparently I left out the word times. 10^34 * 10^40 != 10^75, but 10^75 isn't some magic spot but roughly where you start to worry; sqrt(N) is easy to remember point for collision attacks for mental math. Doing the math, at 3x10^34 primes, there's a 10^-81 chance of collision (similar to chance of winning powerball jackpot 10 times in a row playing only 1 ticket each time), at 10^74 a 0.03% chance, at 10^75 a 2.6% chance, at 10^75.5 there's a 23% chance, at 10^76 there's a 92.9% chance, and at at 10^77 its 100% chance (off from 1 by 10^-114 %). – dr jimbob Oct 14 '14 at 23:56
-
@drjimbob: I was not thinking of intended collision, since the question is about random collision. – David Foerster Oct 15 '14 at 00:11