What is an OpenPGP Key ID collision?

Question

Recently an Evil32 project group announced that they have found collisions for all OpenPGP keys from public keyservers (https://evil32.com/).

What does the term collision refer to?

score 21 · Accepted Answer · edited Oct 07 '21 at 06:58

They're referring to short key IDs, which are considered vulnerable to collision attacks for quite some time now. The problem is already referenced in RFC 4880, OpenPGP, 3.3. Key IDs:

A Key ID is an eight-octet scalar that identifies a key. Implementations SHOULD NOT assume that Key IDs are unique.

OpenPGP Fingerprints and Key IDs

Each OpenPGP key has a fingerprint attached, calculated mainly from its public key packet which also contains the creation time. The calculation is defined in RFC 4880, OpenPGP, 12.2. Key IDs and Fingerprints.

There are short and long key IDs, which resemble the lower 32 respective 64 bits of the fingerprint. For example, looking at the IDs of my OpenPGP key:

fingerprint: 0D69 E11F 12BD BA07 7B37  26AB 4E1F 799A A4FF 2279
long id:                                    4E1F 799A A4FF 2279
short id:                                             A4FF 2279

Fingerprints and key IDs are used, as sharing and comparing a whole key with usually 1024 to 8096 bits (adding some more for headers like the creation date) is very impractical.

Collision Probability

While shorter IDs are easier to share and compare, they also increase the chance of collisions. This is growing exponentially with the lenght, in numbers. This is the number of possible key IDs for each length:

2^32  =                                        4294967296
2^64  =                              18446744073709551616
2^160 = 1461501637330902918203684832716283019655932542976

The collision probability is the inverse of the numbers, being the chance of generating two different keys having the same ID.

Finding collisions for all publicly available OpenPGP key data requires finding collisions for barely 4 million keys. The project actually limited itself to the strong set (which contains the largest set of keys, that are all linked together, possibly with transient edges), and this currently contains about 55.000 keys.

Key collisions are still possible for long key IDs and fingerprints, but unlikely to very unlikely. The number of possible fingerprints is even larger than the number of possible UUIDs and these are usually expected to be unique. The same number of IPv6-adresses is available, techtarget.com has some examples to get an impression of numbers that large.

Enumerating Keys

Generating keys with reasonable security takes some time, making it impractical to enumerate enough keys to find those four million collisions. But that's why I already mentioned the key generation time above: with a single key, you can create lots of different OpenPGP keys from a single (lets say, RSA) key without the expensive key generation, you simply iterate over the time. This was already proposed by Micah Lee in his Talk on "Trolling the Web of Trust".

Regarding the UNIX timestamp to be stored as a 32 bit number, and assuming SHA-1 as a good hash function has nearly equal distribution of result values, iterating of a single RSA key with different timestamps should suffice to generate all possible short key IDs (in reality, you will have to repeat this for a number of keys). If you want to only generate reasonable keys created within some reasonable time span (say, the last few years, definitely not before PGP was released and not in the future) you will have to use some more keys, but generating some more keys is still doable in a rather short time.

What to do now?

Don't trust short key IDs, never use them for verification. You never should've done.
Use long key IDs instead (or even better, the whole fingerprint), if you want to share your key. To be very sure at verifying a key, better even compare the full fingerprint.
If you're developing software that deals with OpenPGP keys, even refer to the whole fingerprint (a computer doesn't really care about whether to compare 32, 64 or 160 bits, but remember the collision probabilities!).
Check your own software on whether it follows these recommendations.

The purpose of a key ID is to assist in *searching* for a key, not in validating it. I might have my key ID printed on my business cards. Putting the key ID there notifies people that I'm prepared to deal with PGP crypto and also provides an easy way to search for my key on key servers. Even if there are duplicates, only one will (probably, unless something nefarious is going on) match my email address. In the case of nefariousness, presumably one would a) trust none of the keys and b) tell me. (People can find keys by searching on email addresses, too, but that doesn't give notice.) — Bob Brown, Nov 30 '14 at 23:58
You're right, I should have emphasized that a little bit more. I added a minor edit to my answer. — Jens Erat, Dec 01 '14 at 17:59
"unless something nefarious is going on" if noone did anything nefarious we wouldn't need openpgp in the first place. — Peter Green, May 30 '16 at 20:46
Seems like even long key IDs should be avoided as any form of 'sharing' ..? — user2864740, Jan 14 '20 at 20:54
For now, the long key ID should usually be sufficient and provide plenty of entropy. This might change in futue though (for example, if new attack vectors are found). The whole fingerprint should generally be used if automated systems compare IDs (a computer doesn't care about the length) and at least referencing the whole fingerprint (for example on a business card, add the whole fingerprint and maybe highlight the long key ID part) might be a good thing to do. — Jens Erat, Feb 09 '20 at 10:35

score 1 · Answer 2 · answered May 30 '16 at 21:09

Openpgp uses key ids to identify keys. A "key ID" is a portion of the hash of the public key. There are two types, "short" (32-bit) and "long" (64-bit).

For short key IDs it's fairly easy to construct a new key that has the same short key ID, and same user visible details as an existing key. It's common (bad but common) practice to identify keys by their short key ids.

If you are building a system around openpgp you need to carefully check that short key IDs are NEVER used as a means of indicating which keys are trusted.

Even gpg itself messed this up, when requesting a key from a keyserver it would request by short key ID (even if the user specified a long one). If you use a "keyring" file as a list of keys trusted for a given purpose this makes it really easy to end up trusting a key generated by an attacker.

Even if you have an updated gpg it's almost certainly a good idea to only do "recv" and "refresh" operations on a keyring where the presense of untrusted keys is not a problem, then carefully inspect them before copying them to a "trusted keys only" keyring.

http://www.asheesh.org/note/debian/short-key-ids-are-bad-news.html

What is an OpenPGP Key ID collision?

2 Answers2

OpenPGP Fingerprints and Key IDs

Collision Probability

Enumerating Keys

What to do now?

Linked