
I have a number of files encrypted with a key derived from a password. In line with standard practice, I use a random salt and password and do many PBKDF2 iterations to obtain an encryption key and IV. I then use this to encrypt the data with AES-256 in CBC mode.

In what ways could the security of my data be compromised if a hypothetical attacker knows that the first bytes of every plaintext file are a fixed string such as "abcdef"?
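For concreteness, the key derivation described above might look like the following minimal sketch. It uses only the Python standard library; the choice of SHA-256 as the PBKDF2 hash, the iteration count, the salt length, the sample password and the helper name `derive_key_and_iv` are all assumptions made for illustration, not details taken from the question.

```python
import hashlib
import os
from typing import Tuple

KEY_LEN = 32           # AES-256 key size in bytes
IV_LEN = 16            # AES block size in bytes
ITERATIONS = 100_000   # assumed work factor; tune it for your hardware


def derive_key_and_iv(password: bytes, salt: bytes) -> Tuple[bytes, bytes]:
    """Stretch the password with PBKDF2 and split the output into key and IV."""
    material = hashlib.pbkdf2_hmac(
        "sha256", password, salt, ITERATIONS, dklen=KEY_LEN + IV_LEN
    )
    return material[:KEY_LEN], material[KEY_LEN:]


password = b"correct horse battery staple"  # hypothetical password

# Each file gets its own fresh random salt, stored in clear next to the file.
salt_a = os.urandom(16)
salt_b = os.urandom(16)

key_a, iv_a = derive_key_and_iv(password, salt_a)
key_b, iv_b = derive_key_and_iv(password, salt_b)

# Same password, different salts: the two files end up with unrelated
# keys and IVs, even if both plaintexts start with b"abcdef".
print(key_a != key_b, iv_a != iv_b)
```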

Jonathan Garber
user21203
  • Also you could guess the first bytes of the file by the file extension. For instance ".class" files are prefixed with `0xCA 0xFE 0xBA 0xBE`... – tiktak Feb 11 '15 at 09:08

2 Answers


If knowing part of the plaintext gives an advantage to the attacker in its efforts to guess or recompute other encrypted bytes, or the key itself, then this is considered a serious weakness of the encryption algorithm. No such weakness is known for AES.

The paragraph above needs some clarification. Indeed, if I, as an attacker, am given the knowledge that some plaintext bytes are "Pope Bened.ct XVI is resigning", with the "." being a byte unknown to me, then I can guess with relatively high probability that the unknown byte actually encodes an "i". So the correct definition would be: if the attacker, given knowledge of some plaintext bytes and the whole encrypted file, can guess the missing bytes with a higher probability of success than the same attacker who knows the same plaintext bytes but not the encrypted file, then the encryption algorithm can be considered broken. AES is not considered broken, so that's OK.

Another point is about active attacks. An attacker may want to modify the data so as to induce an honest system to work over fake data; and the behaviour of that system can give a lot of information about the unknown bytes (a variant of this attack is what is used in the BEAST attack on SSL/TLS). Knowledge of some plaintext bytes makes such attacks easier. AES-CBC, by itself, does not protect against active attackers. To defeat active attackers, you need to apply a MAC. Combining encryption and MAC is not easy; you'd better replace CBC with an encryption mode which includes a MAC and handles the hard work (e.g. GCM or EAX).
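To illustrate why known plaintext helps an active attacker against unauthenticated CBC, here is a minimal sketch of the classic bit-flipping trick. Two loud assumptions: the block cipher is a toy 4-round Feistel construction built from HMAC-SHA256, standing in for AES so that the example runs with the Python standard library only (the malleability comes from the CBC mode, not from the cipher), and the IV is assumed to be stored with the file where the attacker can edit it. All function names and the sample message are hypothetical.

```python
import hashlib
import hmac
import os

BLOCK = 16  # block size in bytes, same as AES


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Feistel round function: 8 bytes of HMAC-SHA256(round index || half-block).
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy stand-in for AES: NOT a real cipher, for illustration only.
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(plaintext) % BLOCK == 0  # padding omitted to keep the sketch short
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(_xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


key = os.urandom(32)
iv = os.urandom(BLOCK)
plaintext = b"abcdef: pay 100 to Alice, thanks"  # 32 bytes, known 6-byte prefix
ciphertext = cbc_encrypt(key, iv, plaintext)

# The attacker knows only that the file starts with b"abcdef". XORing the
# difference between the known and the wanted bytes into the IV rewrites
# the prefix, without any knowledge of the key.
delta = _xor(b"abcdef", b"ABCDEF") + bytes(BLOCK - 6)
forged_iv = _xor(iv, delta)

tampered = cbc_decrypt(key, forged_iv, ciphertext)
print(tampered)
```

If the IV is derived from the password rather than stored, the same XOR trick still applies to any later block by editing the preceding ciphertext block, at the cost of garbling the decryption of that preceding block.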

(The definition above is about "known plaintext attacks", where the attacker knows part of the plaintext. For active attacks, we would talk about "chosen plaintext attacks", where the attacker gets to choose part of the plaintext, and "chosen ciphertext attacks", where the attacker alters the ciphertext and observes more or less directly the result of decryption. Properly applied and verified MAC gives reliable protection against chosen ciphertext attacks.)
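As a rough illustration of "properly applied and verified", here is a minimal encrypt-then-MAC sketch using HMAC-SHA256 from the Python standard library. The ciphertext is treated as opaque bytes; the helper names `seal` and `open_sealed`, the file layout (IV, then ciphertext, then tag) and the use of a separate MAC key are assumptions for this example. In practice an authenticated mode such as GCM or EAX from a vetted library does this work for you.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def seal(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Append a MAC computed over the IV and the ciphertext."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def open_sealed(mac_key: bytes, blob: bytes, iv_len: int = 16):
    """Verify the MAC first; return (iv, ciphertext) only if it matches."""
    if len(blob) < iv_len + TAG_LEN:
        raise ValueError("blob too short")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    # Constant-time comparison, so the check itself leaks nothing useful.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed; refusing to decrypt")
    return body[:iv_len], body[iv_len:]


mac_key = os.urandom(32)     # assumed to be independent of the encryption key
iv = os.urandom(16)
ciphertext = os.urandom(32)  # stand-in for real AES-CBC output

blob = seal(mac_key, iv, ciphertext)

# Flip one bit in the stored IV, as a bit-flipping attacker would.
tampered = bytes([blob[0] ^ 0x20]) + blob[1:]

try:
    open_sealed(mac_key, tampered)
    rejected = False
except ValueError:
    rejected = True

print(rejected)
```

The important design point is the order of operations: the tag covers the IV as well as the ciphertext, and nothing is decrypted until the tag has been verified.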

Tom Leek
  • Aren't active attacks used against "streaming" data such as through an active network connection? How useful are active attacks against a *given encrypted file* (assuming still that we know the "abcdef" of the original)? – user21203 Feb 28 '13 at 15:00
  • Active attacks suppose that the attacker can modify data and _then_ observe what happens when one of the target systems processes that data. When the attacker just gets a read-only peek at the data and then has to work alone on his own machines, then that's a context where only passive attacks are possible -- it just happens that such contexts are rather rare. – Tom Leek Feb 28 '13 at 15:30
  • Can you elaborate on whether what you mention is just theoretical or has actually been successfully carried out in practice? For instance, if *given* an encryption program that encrypts files with AES-256, CBC mode, and a file encrypted with this program, are there active attacks against that file? (The cracker has access to the program that created the file, of course.) If this is the case, wouldn't this imply that AES-256 CBC is in fact a non-secure, crackable algorithm that shouldn't be used anymore? Why do people use it? – user21203 Feb 28 '13 at 15:50
  • Another question I have is that if I use HMAC SHA-1 in PBKDF2 to generate the key <-- this HMAC is totally unrelated to the one you mention in the context of CBC,GCM,EAX? – user21203 Feb 28 '13 at 15:51
  • People use CBC out of tradition; better modes are newer and not as widespread in cryptographic libraries. Active attacks on data files or streams which do not have proper integrity checks are a reality. – Tom Leek Feb 28 '13 at 16:23
  • The HMAC used internally in PBKDF2 is indeed totally unrelated to a MAC computed over the encrypted data. HMAC just happens to be a convenient cryptographic element for building algorithms (i.e. HMAC is a reasonably good emulation of a "random oracle"). – Tom Leek Feb 28 '13 at 16:24
  • @user21203: Just to clarify Tom's point, the [HMAC](http://en.wikipedia.org/wiki/HMAC) construction (as commonly used in PBKDF2) is indeed a MAC, and can be used to verify the integrity of ciphertext. It's just that PBKDF2 uses it for a completely different purpose. (Specifically, PBKDF2 needs a keyed [pseudorandom function family](http://en.wikipedia.org/wiki/Pseudorandom_function_family) with arbitrary-length inputs. Any such PRF family can be used as a MAC, but not all MACs are PRFs. As it happens, though, HMAC, used with a good hash function, _is_ indeed a PRF.) – Ilmari Karonen Feb 28 '13 at 18:07

This is a case of a known-plaintext attack. Such a situation can be exploited when older ciphers are used, but there is no known vulnerability of this type in AES.

Even if the plaintext is the same every time, the IV is not. Brute-forcing would not be any faster in this particular case.

Tinned_Tuna
Dinu