23

I've been reading up on Authenticated Encryption with Associated Data. The linked RFC states:

Authenticated encryption is a form of encryption that, in addition to providing confidentiality for the plaintext that is encrypted, provides a way to check its integrity and authenticity.

My understanding is that simply encrypting the data, even using a symmetric shared key, with something like AES or 3DES should be sufficient to verify the data has not been tampered with in transit. If it had been, the message simply would not decrypt.

So, why is a separate authentication necessary? In what situation would you need to have a separate authentication (even using plain HMAC with encryption)?

Jonathan
    If I remember my cryptography correctly, encryption without authentication does not protect communications from a replay attack. – Peter Smith Apr 01 '13 at 14:26

6 Answers

30

Encryption DOES NOT automatically protect the data against modification.

For example, let's say we have a stream cipher that is simply a PRNG (pseudo-random number generator), where the key is the seed. Encryption works by generating random numbers in sequence (the keystream) and exclusive-or'ing them with the plaintext. If an attacker knows some plaintext and ciphertext bytes at a particular point, he can xor them together to recover the keystream for those bytes. From there, he can simply pick some new plaintext bytes and xor them with the keystream to produce a forged ciphertext.
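A minimal sketch of that known-plaintext forgery, assuming a toy cipher built on Python's `random` module (not a secure generator, used here only because it matches the "PRNG seeded with the key" description). The names `SECRET_SEED`, `keystream` and the sample messages are hypothetical.

```python
import random

def keystream(seed: int, n: int) -> bytes:
    """Toy keystream: n bytes from a PRNG seeded with the key (NOT secure)."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(n))

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(seed: int, data: bytes) -> bytes:
    # XOR with the keystream; decryption is the very same operation.
    return xor_bytes(data, keystream(seed, len(data)))

decrypt = encrypt

SECRET_SEED = 123456  # hypothetical key, known only to sender and receiver

known_plaintext = b"PAY 0100 TO BOB"
ciphertext = encrypt(SECRET_SEED, known_plaintext)

# Attacker: sees the ciphertext and knows the plaintext, but never the seed.
recovered_keystream = xor_bytes(known_plaintext, ciphertext)
forged_plaintext = b"PAY 9999 TO EVE"
forged_ciphertext = xor_bytes(forged_plaintext, recovered_keystream)

# Receiver: decryption succeeds silently and yields the attacker's message.
received = decrypt(SECRET_SEED, forged_ciphertext)
print(received)
```

Nothing in the decryption step can tell the forged ciphertext from a genuine one, which is the gap a MAC is meant to close.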

Often the attacker need not know the plaintext to achieve something. Let's take an example where an attacker simply needs to corrupt one particular field in a packet's internal data. He does not know what its value is, but he doesn't need to. Simply by replacing those bytes of ciphertext with random numbers, he has changed the plaintext.

This is particularly interesting in block ciphers where padding is used, as it opens us up to padding oracle attacks. These attacks involve tweaking ciphertext in a way that alters the padding string, and observing the result. Other attacks such as BEAST and the Lucky Thirteen Attack involve modification of ciphertext in a similar way. These tend to rely on the fact that some implementations blindly decrypt data before performing any kind of integrity checks.

Additionally, it may be possible to re-send an encrypted packet, which might cause some behaviour on the client or server. An example of this might be a command to toggle the enabled state of the firewall. This is called a replay attack, and encryption on its own will not protect against it. In fact, integrity checks often don't fix this problem either.
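To illustrate why an integrity check alone does not stop a replay, here is a rough sketch using HMAC-SHA256 from the Python standard library. The sequence-number tracking in `Receiver` is an assumption added for contrast (one common countermeasure, not something the answer prescribes), and `MAC_KEY`, `make_packet` and the packet body are hypothetical.

```python
import hashlib
import hmac

MAC_KEY = b"hypothetical-shared-key"  # assumption: a pre-shared secret

def make_packet(seq: int, body: bytes):
    """Sender: the MAC covers both the sequence number and the body."""
    message = seq.to_bytes(8, "big") + body
    return seq, body, hmac.new(MAC_KEY, message, hashlib.sha256).digest()

def mac_is_valid(seq: int, body: bytes, tag: bytes) -> bool:
    message = seq.to_bytes(8, "big") + body
    expected = hmac.new(MAC_KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

class Receiver:
    """Accepts a packet only if its MAC is valid and its sequence number is new."""

    def __init__(self) -> None:
        self.highest_seq_seen = -1

    def accept(self, seq: int, body: bytes, tag: bytes) -> bool:
        if not mac_is_valid(seq, body, tag):
            return False
        if seq <= self.highest_seq_seen:
            return False  # replayed or stale packet
        self.highest_seq_seen = seq
        return True

packet = make_packet(0, b"toggle-firewall")

# MAC check alone: the captured packet verifies every time it is re-sent.
mac_only_results = [mac_is_valid(*packet), mac_is_valid(*packet)]

# MAC check plus sequence tracking: the second copy is rejected.
receiver = Receiver()
tracked_results = [receiver.accept(*packet), receiver.accept(*packet)]

print(mac_only_results, tracked_results)
```

The replayed packet is authentic in every cryptographic sense; it is only the receiver's extra state that lets it notice the duplicate.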

There are, in fact, three primary properties that are desirable in a secure communications scheme:

  • Confidentiality - The ability to prevent eavesdroppers from discovering the plaintext message, or information about the plaintext message (e.g. its Hamming weight).
  • Integrity - The ability to prevent an active attacker from modifying the message without the legitimate users noticing. This is usually provided via a Message Integrity Code (MIC).
  • Authenticity - The ability to prove that a message was generated by a particular party, and prevent forgery of new messages. This is usually provided via a Message Authentication Code (MAC). Note that authenticity automatically implies integrity.

The fact that the MAC and MIC can be provided by a single appropriately chosen HMAC hash scheme (sometimes called a MAIC) in certain circumstances is completely incidental. The semantic difference between integrity and authenticity is a real one, in that you can have integrity without authenticity, and such a system may still present problems.

The real distinction between integrity and authenticity is a tricky one to define, as Thomas Pornin pointed out to me in chat:

There's a tricky definition point there. Integrity is that you get the "right data", but according to what notion of "right" ? How comes the data from the attacker is not "right" ? If you answer "because that's from the attacker, not from the right client" then you are doing authentication...

It's a bit of a grey area, but either way we can all agree that authentication is important.

An alternative to using a separate MAC / MIC is to use an authenticated block cipher mode, such as Galois/Counter Mode (GCM) or EAX mode.

Polynomial
  • There are actually four primary properties. The fourth is *availability*. It doesn't matter how well you do on the other three pillars if an attacker can take your service down at will so no one can use it. Excellent answer though! – Rein Henrichs Apr 02 '13 at 02:35
  • Thank-you for the incredibly detailed response. In my particular use case, the data is not *particularly* sensitive, so an HMAC to verify authenticity (integrity?) is likely sufficient. This is great information though. – Jonathan Apr 02 '13 at 11:17
3

Encryption and decryption just transform bytes. You say that when the password is wrong "the message simply would not decrypt", but that's not true: the result will just not be the same as the original. Try some online encryption tool like this one. If you encrypt "example" with the password testonetwothree1, you get Rg2iS8PvYsIUgmEynHP62g== as the result. If you now decrypt the same ciphertext with the password testonetwothree4, you get "JÙ] i.¦WÆÏ*q" as the result.

So why does that matter? If the decryption is garbage, how is that useful to the attacker?

Imagine you have a message that says "attack at 10:00". Encrypted with Caesar encryption, it is something like aWJiaWtzIGliIDEzOjAw. Your enemy might know that you are sending the attack message, but does not know the password. What they can do, though, is change a byte. If the message is changed to aWJiaWtzIGliIDazOjAw (the E is replaced by a), it suddenly says: "attack at 03:00". Similar scenarios could be encrypted commands, which an attacker can use to change the behaviour of the receiving computer in certain ways.

As you see, modifying the ciphertext, even blindly, can be an advantage to the attacker. You want to have authenticated encryption, where a modification is detected.

This does not just apply to Caesar: some encryption methods allow you to change individual bytes (stream ciphers or a block cipher in CTR mode). Others allow you to change only a whole block (most other block cipher modes). This is not a flaw in the encryption, but a result of not authenticating your encryption. If you can't check what it should have been, you can't blame the encryption (which just provides confidentiality, not integrity) if the result turns out to be different. If you want both, you need to use an algorithm that does both.

There are two ways of doing this:

  • Authenticated encryption algorithms do everything at once. You give them a key and some data, and out comes a piece of ciphertext that, when changed, will be detected. Examples include GCM mode and OCB mode.
  • Adding an authentication code, such as an HMAC. Note that there is a certain order in which you should do things, and that you should be careful with timing attacks, so this is a little trickier than the dedicated algorithms. The reason I mention it is because it's the traditional way of doing things and newer algorithms might not be available in your favourite crypto libraries.
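The second option can be sketched roughly as follows, assuming the encrypt-then-MAC order (the tag is computed over the ciphertext and checked before anything is decrypted) and a constant-time comparison to sidestep the timing issue mentioned above. `toy_cipher` is a hypothetical placeholder built from SHA-256 so the example runs with the standard library alone; a real system would use a vetted cipher from a crypto library. `seal` and `open_sealed` are illustrative names.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
TAG_LEN = 32  # HMAC-SHA256 output size in bytes

def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Placeholder stream cipher (SHA-256 in counter mode). Illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then MAC the nonce and ciphertext with a separate key."""
    nonce = os.urandom(NONCE_LEN)
    ciphertext = toy_cipher(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def open_sealed(enc_key: bytes, mac_key: bytes, blob: bytes):
    """Return the plaintext, or None if the blob was modified."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        return None
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    # Constant-time comparison; verify BEFORE decrypting anything.
    if not hmac.compare_digest(tag, expected):
        return None
    return toy_cipher(enc_key, nonce, ciphertext)

enc_key, mac_key = os.urandom(32), os.urandom(32)
blob = seal(enc_key, mac_key, b"attack at 10:00")

tampered = bytearray(blob)
tampered[NONCE_LEN] ^= 0x01  # flip one bit in the first ciphertext byte

print(open_sealed(enc_key, mac_key, blob))
print(open_sealed(enc_key, mac_key, bytes(tampered)))
```

Using two independent keys and rejecting before decryption are the parts that are easy to get wrong by hand, which is much of the appeal of the dedicated authenticated modes.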
Luc
2

My understanding is that simply encrypting the data, even using a symmetric shared key, with something like AES or 3DES should be sufficient to verify the data has not been tampered with in transit. If it had been, the message simply would not decrypt.

Your understanding is wrong.

Why would a message not decrypt if someone flipped a bit?

Even this Wikipedia page clearly says:

The block cipher modes ECB, CBC, OFB, CFB, CTR, and XTS provide confidentiality, but they do not protect against accidental modification or malicious tampering.

I'm linking to that page because the answer does depend on the mode of operation, and some (like GCM) do have built-in MACs.

More specifically, unless there is a checksum or signature, an encryption algorithm is basically converting bytes into other bytes. Just as the Caesar cipher would turn A => K and B => L, modern encryption does basically the same thing, only in a more complex way. And if someone changes the K in the ciphertext to an L, then the decryption will happily decrypt it to a B instead of the original A.
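The A => K, B => L mapping corresponds to a Caesar shift of 10, so the substitution described above can be reproduced in a few lines. This is only a toy sketch; `SHIFT` and the function names are chosen for illustration.

```python
import string

ALPHABET = string.ascii_uppercase
SHIFT = 10  # maps A => K and B => L, as in the text

def caesar_encrypt(text: str) -> str:
    """Shift each uppercase letter forward by SHIFT positions."""
    return "".join(ALPHABET[(ALPHABET.index(c) + SHIFT) % 26] for c in text)

def caesar_decrypt(text: str) -> str:
    """Shift each uppercase letter back by SHIFT positions."""
    return "".join(ALPHABET[(ALPHABET.index(c) - SHIFT) % 26] for c in text)

ciphertext = caesar_encrypt("A")        # the sender encrypts "A"
tampered = "L"                          # the attacker swaps K for L in transit
decrypted = caesar_decrypt(tampered)    # the receiver gets a different letter

print(ciphertext, tampered, decrypted)
```

Decryption raises no error and gives no hint that the letter was swapped.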

Malicious attacks of this kind are often infeasible simply because, without knowledge of the plaintext and the key, it is very hard for Eve to know which bits to flip in order to get a predictable change in the decrypted message. Without a MAC, however, nothing stops her from flipping bits at random and hoping for the best.

Tom
1

Encryption makes sure the data can't be read or tampered with. Authentication makes sure you know where the data came from.

A more practical view: My bank uses an encrypted link (HTTPS) between my web browser and its web server, but without authentication it doesn't let me see what isn't in my bank account.

jwernerny
    That's a different kind of authentication. –  Apr 01 '13 at 13:54
  • 1
    The authentication is what allows you to know _for sure_ that you are talking about your account to your bank, and not actually to some random cybercriminal. You might want to know that sort of thing… – Donal Fellows Apr 01 '13 at 16:57
1

As far as I know, AES and 3DES are block ciphers used in conjunction with some mode, such as CTR or CBC, and those modes alone are semantically secure only against adversaries that can merely eavesdrop. If an adversary can tamper with the data being transmitted, we must ensure integrity, and the above constructions cannot guarantee it. Authenticated encryption guarantees that an adversary who can tamper with your data won't be able to create encrypted data that properly decodes to a valid plaintext. In other words, the output of decryption in an authenticated encryption scheme is either a valid plaintext or a symbol that represents invalid output. If you get the symbol that represents invalid output, you know the data was modified.

Lehrling
-1

The addition of authentication would prevent an unauthorized user from submitting data encrypted with the (stolen) symmetric shared key, basically adding an additional secret to the mix.