38

It's been recommended to use OAEP when padding messages to be encrypted via RSA, to prevent known plain text attacks.

Can someone elaborate this in better detail? I'd specifically like to know the weakness in the previous scheme, both from a theoretical as well as practical/real-world perspective.

DeepSpace101
  • 2,143
  • 3
  • 22
  • 35

3 Answers3

31

The operation at the core of RSA is a modular exponentiation: given input m, compute me modulo n. Although in general this is a one-way permutation of integers modulo n, it does not fulfill all the characteristics needed for generic asymmetric encryption:

  • If e is small and m is small, then me could be smaller than n, at which point the modular exponentiation is no longer modular, and can be reverted efficiently.

  • The operation is deterministic, which allows for exhaustive search on the message: the attacker encrypts possible messages until a match is found with the actual encrypted message.

  • The modular exponentiation is malleable: given the "encryption" of m1 and m2, a simple multiplication yields the encryption of m1m2. This is akin to homomorphic encryption, which can be a good property, or not, depending on the context.

For these reason, the integer m which is subject to RSA must not be the data to encrypt alone, but should be the result of a transform which ensures that m is "not small", contains some random bytes, and deters malleability.

The "v1.5" padding in PKCS#1 does the job reasonably well, subject to two (known) caveats:

  • A decryption engine can be turned into a padding oracle if the attacker can submit malicious messages to decrypt, and observe whether the decryption engine found a correct padding structure or not. This is Bleichenbacher's attack, which could work against SSL server, requiring about one million aborted handshakes to recover the SSL symmetric key. SSL servers have since been patched (if decryption fails to find a padding, the engine nevertheless continues with a random blob instead of the "pre-master secret"), but this highlights a potential problem with PKCS#1 v1.5.

  • This padding scheme was never "proven". Security proofs are a neat thing to have. In my opinion, this padding got something better, which is that it has been deployed in the field and widely used for almost 20 years; but many people prefer it when there are some mathematics which corroborate experience.

OAEP aims at making things better on these two points. It turned out that for padding oracles, things are not that clear; and the first proof was shown to be somewhat wrong by Shoup. The proof was "repaired" by Fujisaki, Okamoto, Pointcheval and Stern in the case of RSA (OAEP was designed as a generic padding scheme, but we now have a proof of security only when used with RSA).

To sum up, OAEP tries to improve over the previous padding for resistance against chosen ciphertext attacks (not chosen plaintext attack: since the public key is public, anybody can encrypt anything at will), but the v1.5 padding was already heuristically good at it, up to a million or so decryption, which is good enough for many purposes, and can be patched for others (as was done for SSL). OAEP does better if correctly implemented, which is not as easy as usually believed.

David
  • 15,814
  • 3
  • 48
  • 73
Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
  • I'd like to understand your comment about OAEP not being easy to implement. What are the caveats? – user239558 Oct 17 '13 at 19:22
  • The tricky point is that code must react properly when, upon decryption, a non-conforming structure is found. Otherwise Bleichenbacher's attack applies. Since OAEP implies more structure than v1.5, this extends the scope of the code which must be free of timing-related leaks. – Thomas Pornin Oct 17 '13 at 19:33
  • @ThomasPornin "requiring about one million aborted handshakes to recover the private key" I think you mean the plaintext (which could be the private key used in the following exchange, I think that's what you meant) – David 天宇 Wong Jun 22 '15 at 20:25
  • Yes, the attack results in the decryption of one encrypted message, i.e., in a SSL context, the "pre-master secret" for one connection, resulting in decryption of that connection (actually, all connections that use the same SSL session). – Thomas Pornin Jun 22 '15 at 22:18
  • At least for the *chosen plaintext attack* it seems like adding a simple hash/hmac would fix PKCS#1 v1.5 since you could check that to see if the encrypted blocks had been tampered with. – Xeoncross May 24 '16 at 20:50
  • 1
    @Xeoncross That implies that both parties have somehow agreed upon a symmetric authentication key for the HMAC, which is often not the case; in TLS you would use RSA to exchange such a value (or at least to provide authenticity to the exchange, in the case of DH key exchange), which brings us back to square one. – Polynomial Jul 11 '16 at 16:13
11

I wanted to dig deeper, so ended up reading Chosen Ciphertext Attacks Against Protocols Based on the RSA Encryption Standard PKCS #1 by Daniel Bleichenbacher.

The main weakness exists because PKCS#1 padding enabled some assumptions to be made. Those assumptions then can be exploited to design an attack. Check the paper, it's a clever attack! The attack is built in 4 stages, each stage progressively extracting more information than the previous. It's highly iterative but within the scope of practicality (1 day for typical SSL session key).

  • Stage 1: You blindly try some cipher text that simply passes the padding check on the remote side after decryption
  • Stage 2-4: Latch onto the results of the above and progressively narrow down the search space.

The latching/narrowing for stage 2-4 is possible due to two reasons:

  1. Exponentiation can be manipulated very comfortably, especially with inverses of integers in a finite field. This is a property of RSA/modular exponentiation itself - not padding per se. For example, one quote in the paper:

    An attacker who wishes to find the decryption m = c^d (mod n) of a ciphertext c can chose a random integer s and ask for the decryption of the innocent-looking message c' = (s^e)*c mod n. From the answer m' = (c')^d, it is easy to recover the original message, because m = m's^−1 (mod n).

  2. One can map direct mathematical relationships between chosen ciphertext and the decryption result at the remote side. This is homomorphism where an operation on the input produces the same operation on the output. You don't know the exact result but can deduce that "I don't know what the value is but it's now 2x it's previous value". The actual value itself is found via drawing a range and then narrowing the range iteratively.

OAEP changes things a bit because it adds some hashing (unbalanced feistel hashing) in between the RSA decryption (modular exponentiation) and the padding check. This basically disrupts the #2 point above and the attack cannot proceed to stage 2-4 because the knowledge from stage 1 cannot be passed over - the hashing in between from OAEP wildly distorts the ability to understand what the remote party sees upon OAEP decoding. The only way possible would be if you can deterministically invert the hash function to proceed with the informational gather stages. But then that means the hash is broken.

DeepSpace101
  • 2,143
  • 3
  • 22
  • 35
1

So RSA is maleable, meaning that an attacker can modify messages in a controlled way, namely multiplying them by known constants. As a result we need some kind of padding to hopefully prevent this: we don't want to be asked to by the attacker "What is this message?" and give him information he needs to decrypt another message. (There are other attacks: a survey is http://crypto.stanford.edu/~dabo/papers/RSA-survey.pdf)

Ergo padding is required. The first padding scheme put a constant byte first, then random nonzero bytes, then a zero byte, and the remainder of the number was the message. This looked like it was good.

Sadly this implies that an attacker can learn if the first byte of rM, M any message, r any number is a given constant. This turns out to be enough to factor the key, as show way back in the late 90's. www.bell-labs.com/user/bleichen/papers/pkcs.ps is the paper.

SSL v3.0 was insecure, as were several other deployed protocols, making this a rather serious issue.

Watson Ladd
  • 136
  • 1