Authenticated Encryption vs. contained and encrypted checksum/hash?

Question

So I read through http://en.wikipedia.org/wiki/Authenticated_encryption and http://www.cryptopp.com/wiki/Authenticated_Encryption and I don't seem to be following the concept.

From the simple examples provided, it seems that Authenticated Encryption aims to prove that the message has not been altered - i.e. that an attacker has not modified the payload of an encrypted container by way of a manipulation that doesn't require the key.

My question is this - if the goal is simply to ensure that the payload/contents haven't been modified (again, the attacker doesn't have the key, they are modifying the message at the binary level) then why wouldn't the inclusion of a trailing hash of the payload within the encrypted container be sufficient for verification?

In the case of an AES-encrypted message, the message is received, the pre-shared symmetric key is used to decrypt it, and the message (data) and hash are evaluated. The payload/data is hashed and compared to the received hash - if they match, then we have an extremely high probability that the message was not tampered with.

So... my question is this. That seems really simple, and yet people far more experienced and intelligent than myself have spent a lot of time on Authenticated Encryption - what am I missing?

Thanks in advance.

score 9 · Answer 1 · edited Apr 13 '17 at 12:48

Overview. Your proposal is not secure (see below for cryptanalysis). That's why it is not a reasonable alternative to authenticated encryption.

A little more background. Why do we need authenticated encryption? It is because encryption without authentication is not secure. Many developers don't know this, so they end up with an insecure use of cryptography.

It is possible to manually combine an encryption algorithm and an authentication algorithm. For instance, you can use the encrypt-then-authenticate construction with a secure encryption scheme and a secure message authentication code (MAC) algorithm. However, this requires some extra effort from the developer.
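As a rough illustration of the encrypt-then-authenticate construction, here is a minimal sketch using only the Python standard library. The function names, the two separate keys, and the `nonce || ciphertext || tag` layout are assumptions made for this example. The keystream is a toy stand-in built from SHA-256 so the sketch runs without extra packages; it is not a vetted cipher, and real code should use a real one such as AES.

```python
import hashlib
import hmac
import os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || nonce || counter). Illustration only, NOT a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def encrypt_then_mac(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt first, then compute a MAC over the nonce and the ciphertext."""
    nonce = os.urandom(16)
    stream = keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def verify_then_decrypt(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    """Check the MAC before touching the ciphertext; reject on any mismatch."""
    if len(blob) < 16 + 32:
        raise ValueError("message too short")
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("authentication failed")
    stream = keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))
```

The point of the sketch is the ordering: the tag is keyed, it covers everything that is transmitted, and the recipient verifies it before decrypting, so a modified or truncated ciphertext is rejected without the key.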

Authenticated encryption was designed as a single primitive that is easy for developers to use, and that provides all the necessary authentication (so you don't have to do some extra stuff to provide security). Thus, it is helpful for security. Some authenticated encryption schemes also have the benefit of better performance than separately encrypting and then authenticating, but that is a secondary consideration.

Please read "Don't use encryption without authentication" for more detail on this subject.

Cryptanalysis of your scheme. You proposed that we append a hash of the message before encrypting: in other words, C = Encrypt(K, M||H(M)), where || represents concatenation of bit strings. With many encryption algorithms, this scheme is not secure against chosen-plaintext attacks. For instance, I'll show an attack against your proposal when CBC mode encryption is used (though the attack applies to many other modes of encryption as well).

Note that if a man-in-the-middle truncates a ciphertext that was generated with CBC mode (at a block boundary), the recipient won't notice the truncation, and after decryption will receive a truncated version of the message that the sender was trying to send. So, with that background, here's the attack. Let Alice be the sender, Bob the recipient, and assume a chosen-plaintext setting.

1. The attacker chooses some message M that he wishes Alice would send, but which Alice refuses to send (maybe M says "transfer $100 from my account and give it to the attacker", or something).
2. The attacker constructs some other value X such that Alice is willing to send M' = M||H(M)||X (maybe X says "just kidding! don't you dare").
3. The attacker convinces Alice to encrypt and send M'. This means that Alice is going to transmit the ciphertext C' = Encrypt(K, M'||H(M')) = Encrypt(K, M||H(M)||X||H(M')).
4. The attacker plays man-in-the-middle, captures C', and truncates it while it is in flight. Let's call C'' the truncated ciphertext.
5. Bob will receive C'', decrypt it, and then check the hash. If the attacker chose the truncation point correctly, then after decryption Bob gets M||H(M), checks the hash, sees that the hash is correct, and concludes that Alice must have sent M.

In other words, at the conclusion of this attack, Bob concludes that Alice authorized transmission of the message M -- but she never did. (She authorized the transmission of some other message, but not M.)
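The truncation attack above can be reproduced in a few lines. The sketch below makes several simplifying assumptions that are not in the answer: H is SHA-256, messages are block-aligned so no padding is used, and AES is replaced by a toy 16-byte Feistel cipher built from SHA-256 so that the example runs with the standard library alone. All function names are hypothetical. The attack relies only on the CBC chaining structure, not on the block cipher underneath.

```python
import hashlib
import os

BLOCK = 16  # bytes, the same block size as AES

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Feistel round function; toy construction, NOT a vetted cipher.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, plaintext: bytes) -> bytes:
    assert len(plaintext) % BLOCK == 0  # assumption: block-aligned, no padding
    prev = os.urandom(BLOCK)            # random IV, sent as the first block
    out = [prev]
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    return b"".join(_xor(toy_decrypt_block(key, c), prev)
                    for prev, c in zip(blocks, blocks[1:]))

def seal(key: bytes, message: bytes) -> bytes:
    """The proposed scheme: C = Encrypt(K, M || H(M))."""
    return cbc_encrypt(key, message + hashlib.sha256(message).digest())

def open_(key: bytes, ciphertext: bytes) -> bytes:
    plain = cbc_decrypt(key, ciphertext)
    message, digest = plain[:-32], plain[-32:]
    if hashlib.sha256(message).digest() != digest:
        raise ValueError("hash mismatch")
    return message

key = os.urandom(32)                          # known only to Alice and Bob
M = b"pay the attacker one hundred USD"      # 32 bytes; Alice refuses to send this
X = b"just kidding!!!!"                      # 16 bytes; makes M' acceptable to Alice
M_prime = M + hashlib.sha256(M).digest() + X  # computable by anyone, no key needed

c_prime = seal(key, M_prime)                  # Alice encrypts and sends M'
forged = c_prime[:BLOCK + len(M) + 32]        # attacker keeps IV + Enc(M || H(M))
assert open_(key, forged) == M                # Bob accepts M, which Alice never sent
```

The attacker never uses `key`: because the hash is unkeyed, H(M) can be computed in advance and smuggled into the plaintext that Alice agrees to encrypt. With a padded CBC implementation the attacker would additionally need the truncated plaintext to end in valid padding, which is what choosing the truncation point correctly amounts to.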

Whether or not this is a serious security vulnerability will depend upon the context in which it is used, such as the format of the message M. But in at least some applications, this kind of attack could pose a serious danger. Therefore, cryptographers do not consider this a good general-purpose scheme.

Given that there are good schemes out there which have been carefully vetted and proven to be secure for general-purpose use, cryptographers would recommend that you use one of those general-purpose schemes (such as authenticated encryption, or the encrypt-then-authenticate construction) -- and in particular, don't use the one you mentioned.

score 2 · Answer 2 · answered Apr 16 '12 at 05:08

You may want to look at this cryptography course from Stanford University. It includes a lecture on Authenticated Encryption that shows why you should use Authenticated Encryption and how to use it. It also shows how other approaches can be flawed (e.g. those used in SSH or SSL).

You should use Authenticated Encryption, and you should use it correctly, which also means that you shouldn't invent it or implement it yourself.

Using Authenticated Encryption is as easy as using a proper encryption mode (e.g. CCM, GCM, EAX). Unfortunately, it is quite common for frameworks not to include those modes of operation, so you may need to use additional libraries.
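For instance, here is a minimal sketch of GCM in Python, assuming the third-party `cryptography` package is installed (the answer does not name a specific library). The helper names `gcm_seal` and `gcm_open` and the `nonce || ciphertext` layout are choices made for this example.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aead = AESGCM(key)

def gcm_seal(message: bytes) -> bytes:
    # A nonce must never repeat under the same key; 12 random bytes is the usual choice.
    nonce = os.urandom(12)
    # encrypt() returns the ciphertext with the authentication tag appended.
    return nonce + aead.encrypt(nonce, message, None)

def gcm_open(blob: bytes) -> bytes:
    nonce, ciphertext = blob[:12], blob[12:]
    # decrypt() raises InvalidTag if the data was modified or truncated.
    return aead.decrypt(nonce, ciphertext, None)

blob = gcm_seal(b"pay the attacker one hundred USD")
try:
    gcm_open(blob[:-16])  # a truncated message fails authentication
    accepted = True
except InvalidTag:
    accepted = False
```

Here encryption and authentication happen in a single call, so there is no separate hash or MAC step for the developer to get wrong.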
