I have two pieces of ciphertext encrypted with a stream cipher using the same key.
How do I recover the plaintext of both ciphertext messages without knowing the key used?
I have two pieces of ciphertext encrypted with a stream cipher using the same key.
How do I recover the plaintext of both ciphertext messages without knowing the key used?
If the two encrypted messages are using the same stream cipher and the same key,
C1 xor C2
results in M1 xor M2
where C1 and C2 are the respective ciphertext and M1 and M2 are the corresponding plaintext.
You can then recover the plaintext using a technique known as crib dragging. You take a common word or phrase that may appear in the plaintext (such as " the ") and xor that against the result of M1 xor M2
. If one of the plaintexts had the text of the crib (" the " in our example), then the result of the xor is what the other plaintext had in that position. If neither plaintext contains the text of the crib, it is very likely that the result of the xor is just gibberish.
You just continue this technique until you recover enough of the plaintext to intelligently fill out the rest.
This is known as OTP key reuse attack; you can find the answer ("cribtext drag") in here. The more messages you have (the more the key has been reused), the better. With a large enough corpus you may not even need cribtext dragging at all.
That is, you take a guess of a common phrase that may appear in one of the plaintexts (the classical example against ASCII english is the 5 letter " the "), and exclusive-or that against the XOR of the two original messages in various locations. If one of the plaintexts had the text of the crib (" the " in our example), then the result of the exclusive-or is what the other plaintext had in that position; if neither plaintext had that, it's likely that the result of the exclusive-or is just gibberish. And, once you have a plausible short section, you can extend it (for example, if you know that one of the plaintexts is " na**", you can go through the dictionary of all words that start with "na", use those as cribs, and see which makes the other plaintext make sense).
Of course, you can only use the shortest common length of several messages: if you have one 1500 bytes, one 1000, and one 500, you have a three-reuse for the first 500 bytes, a two-key for the next 500, and the last 500 can't be attacked.
Unless the OTP is also reused "in time", i.e. periodically (it's no longer a OTP, but then one might argue that it wasn't in the first place, since it is being reused...). This kind of error was done on a brand of encrypted hard disks, every one of which had a different OTP key that was then used for every single sector (including zeroed ones - an extreme form of crib), leading to an effective encryption strength of nil. Then, if the original OTP sequence is all included in messages for which you have duplication, you can recover the key from those, and then happily go on decrypting everything else.