Why is one-time pad informationally secure?

Question

I am reading "Serious cryptography" and he wrote the following:

Informational security is based not on how hard it is to break a cipher, but whether it’s conceivable to break it at all. A cipher is informationally secure only if, even given unlimited computation time and memory, it cannot be broken. Even if a successful attack on a cipher would take trillions of years, such a cipher is informationally insecure.

Then, he proceeded to write that the one-time pad is informationally secure.

I don't understand this at all. If we see a cyphertext, such as 00110, we know that the corresponding plaintext has 5 bits as well, and the cypher key will also have 5 bits, thus 2^5*2^5=1024 possible combinations. Bruteforcing 1024 will yield a result. Even if the cyphertext is huge and bruteforcing won't be practical, it is still theoretical possible, no?

If it is theoretical possible, wouldn't it deem the one-time pad as informationally insecure?

What am I missing here?

In Cryptography, usually, the length of the message is not assumed to be secret. — kelalaka, May 06 '22 at 18:49
If the message length reveals information, that's the domain of _traffic analysis_ rather than cryptography. The usual defence is to pad messages to a fixed length (and perhaps to send messages at a constant rate, using empty messages as needed). — Toby Speight, May 06 '22 at 21:27

score 95 · Accepted Answer · 2022-05-05T14:06:58.640

95

Bruteforcing all 1024 possibilities will yield a result.

Yes, a result. But even if you iterate all 32 5-bit combinations, you still have no idea which one the "correct" one is. That's what makes it informationally secure: Even if you iterate every key (and thus, every possible message), you don't know which of these messages is the one that was sent.

For example, imagine the following message:

ATTACK AT 8

It's 11 characters long, and you could conceivably iterate the space of all 11-character strings. This will yield results such as:

PIZZA TIME!
YOZNACKS :)
ATTACK AT 3
ATTACK AT 8
IN UR BASE!

Now you look at all these messages and you still are none the wiser. In fact, you don't even need the ciphertext at all. Knowing the length of the message, you can simply iterate the entire message-space and all you will know for sure is that one of these messages must be the correct one.

To break it down even further, imagine your message is just a single bit, either 1 or 0. To encrypt it, you either decide to "flip" it or not (meaning, you XOR it with 1 or 0).

This leaves us with the following 4 states:

Message 0, Key 0 => Ciphertext 0
Message 0, Key 1 => Ciphertext 1
Message 1, Key 0 => Ciphertext 1
Message 1, Key 1 => Ciphertext 0

You then present the ciphertext to the attacker, while keeping the key secret. Say, you present 1. According to the table, the message was either 0 with a key of 1, or the message was 1 with a key of 0. But since the attacker does not know the key, they only have a 50% chance of guessing the correct bit - which is the best they can do in a perfect system.

For every bit in the message, the chance of successfully guessing the message is divided by half. And again, you can only guess the message - there is no indication whether your guess is correct or not.

edited May 05 '22 at 14:06

answered May 05 '22 at 12:30

1

Ah I understand that now. I guess I didn't understand what the author meant by a "successful" attack; I guess I thought that a successful attack just meant "seeing" the result, but that doesn't make much sense. Thanks! – YozNacks May 05 '22 at 12:48
45

@YozNacks Think about it like this: If I picked a card out of a 52 card deck, then handed you the deck and you looked at every card, you could also "see" my card - but you wouldn't know which one of these would be *my* card. (It was the 6 of Spades by the way :P) – May 05 '22 at 12:51
6

Minor nitpick: Given a 5 bit ciphertext, you don't need to iterate through all 1024 possible message-key pairs, you just need to iterate through the 32 possible keys. Of course, it doesn't really matter either ways. – nobody May 05 '22 at 13:19
@MechMK1 yeah, that makes sense ahah – YozNacks May 05 '22 at 13:23
1

@nobody Somehow in my mind I was like 2^5 = 1024. I shouldn't write answers when tripping on mushrooms. – May 05 '22 at 14:07
If you know that the short plain text is human language, it’s right over all one time pad keys and narrow things down to plain text that looks like human language. IIRC Bunny Huang used this approach when decrypting the Xbox encrypted ROM (or similar). I suppose this argues for compressing before encrypting. and also for not signing the plaintext before encrypting. - not doing anything that would allow you to detect valid decrypted plain text. – Krazy Glew May 06 '22 at 03:39
9

@KrazyGlew That doesn't quite help with a one time pad. You just end up with a bunch of human language plaintext's that are all equally likely, and there is no way to tell them apart. It does work for other real world ciphers, because normal ciphers are not information theoretically secure. – nobody May 06 '22 at 05:29
6

@KrazyGlew If you brute-forced all possible OTP keys you will get all possible plaintexts, including every possible sequence of English words. A signature doesn't help either, since the results will include every possible message with a valid signature. The OTP ciphertext alone gives you zero information about the plaintext. You might as well just iterate over every possible string. – eesiraed May 06 '22 at 05:33
1

Yes, I was not thinking correctly. – Krazy Glew May 06 '22 at 05:34
4

If you know that a certain phrase must be in the message, then your decoded message is the one containing that phrase. That's how Alan Turing broke enigma - he looked for messages ending in "Heil Hitler"! – Oscar Bravo May 06 '22 at 09:38
10

@OscarBravo No, the Enigma was broken because they re-used the same key for two different messages. Even if you know my message ends in "Heil Hitler", you still get every possible message before that. – May 06 '22 at 10:57
2

Ah yes, the Library of Babel... https://en.m.wikipedia.org/wiki/The_Library_of_Babel – Oscar Bravo May 06 '22 at 12:38
@OscarBravo, or if you know the message is a gzip encrypted file, but it still doesn't help you, since the message can be _any_ gzip encrypted file of the given length. Same for any structure, including the HH trailer. Analyzing the message length to tell the difference between some plausible options is a different issue and perhaps needs to be worked around with padding, whatever the cipher. (Also, Enigma isn't/wasn't an ideal OTP cipher, but more like steam cipher with weaknesses.) – ilkkachu May 06 '22 at 12:38
4

@OscarBravo This may work for a non-OTP encryption scheme but it does not work for OTP. In OTP you will find "Heil Hitler" at every single position you look, and you will also find "Heil Churchill" (at positions where there is enough space for that to fit) and "Allahu Akbar" and "In God We Trust" and "WARNING FEB 2020 WUHAN PANDEMIC PREVENT TRAVEL IMMEDIATELY" and all sorts of other messages, and none of these will give you any information about any other part of the message. (For example, you cannot guess that if you happen to see "Heil Hitler" the rest of the message is in German.) – user253751 May 06 '22 at 13:23
@user253751 You will also see an explanation as to why you will see every possible message among one of the messages, in every language - some with typos, some without, and some with unnecessarily juvenile puns as well. – May 06 '22 at 15:21
8

This is a good explanation, and from an information-theoretic standpoint, the most important part is hidden in the note "in fact, you don't even need the ciphertext at all". In other words, having the ciphertext without the OTP *does nothing to change your estimate of the probability of a given plaintext*, or in other other words: it gives you zero information. – hobbs May 06 '22 at 16:45
1

@hobbs: You do want the message length, typically from intercepting the ciphertext. Otherwise your set of possible messages won't include the actual one. But you could imagine cases where you e.g. detect radio activity but it's too weak to get an accurate copy, just detect how long it was transmitting for. So yeah, if you know the messages you're interested in used OTPs correctly, only traffic analysis is useful, not the ciphertext. – Peter Cordes May 07 '22 at 04:49
2

@MechMK1 The Enigma was broken various ways at different but the Turing bombe indeed was a known plaintext attack. – richardb May 07 '22 at 07:00
@MechMK1 "*No, the Enigma was broken because they re-used the same key for two different messages*" -- I think you're thinking of Lorenz, not Enigma? – canton7 May 08 '22 at 10:05

Drunix · Answer 2 · 2022-05-09T14:55:33.070

MechMK1 answer is a very good explanation and I just want to make some additions:

Yes, if the cipher text is 5 Bits your adversary has learned that the message was 5 bits. Especially, he has learned, that there was communication at all, which maybe something you might want to avoid. That is why in such cases one performs dummy communication to keep the line busy.

The term "informationally secure" refers to the content of your message and what the adversary might theoretically learn about the plain text by observing the cipher text. To quantify this, one can use the concept of mutual information. Mutual information can be thought of as follows (I find the definition I(X;Y)=H(X)-H(X|Y) being the most convenient for this purpose) : You have some uncertainty, what the message is (usually measured by its entropy). Now you observe the cipher text. Afterwards your uncertainty about the message is reduced to the conditional entropy given you know the cipher text. The difference between those twos, namely the mutual information, quantifies how much learned. For one time pads, this difference is 0, which means you haven't learned anything. Note that we have no requirement that the plain text message are somehow equidistributed, only the key.

Regarding you brute force method: You could do that, but you still haven't learned anything. All possible decryptions are equally likely.

score 1 · Answer 3 · answered May 08 '22 at 03:01

Instead of thinking of decrypting a single message, think of that idea of "breaking" the code. Suppose we have an algorithmic cypher, a hundred messages encrypted using it, and a mega-powerful science fiction computer (simulating "unlimited computation time and memory"). It blasts through gagillions of cyphers and finds one with reasonable results for all 100 messages. We think we've "broken" the cypher, in the sense that we can quickly decode the 101st message using our guessed algorithm. We could give our solution to anyone to let them easily decode future messages.

Now try that on a 1-time pad and you'll see it won't work. Let's go nuts and say a spy has given us everything -- the real contents of the first 100 messages together with confirmation of the first 100 entries in the pad. How can we use that to decode the 101st message? Well, not at all. The 101st entry in the pad is independently random. It's like looking at a sequence of 100 numbers, being asked to guess the next, and being told "BTW, these are all random". This is what the author means by writing it's not conceivable to break a 1-time pad in the sense of breaking an algorithmic one.

Why is one-time pad informationally secure?

3 Answers3