108

After reading the selected answer to "Diffie-Hellman Key Exchange in plain English" five times I can't, for the life of me, understand how it protects me from a MitM attack.

Given the following excerpt (from tylerl's answer):

  1. I come up with two prime numbers g and p and tell you what they are.
  2. You then pick a secret number (a), but you don't tell anyone. Instead you compute g^a mod p and send that result back to me. (We'll call that A since it came from a).
  3. I do the same thing, but we'll call my secret number b and the computed number B. So I compute g^b mod p and send you the result (called "B").
  4. Now, you take the number I sent you and do the exact same operation with it. So that's B^a mod p.
  5. I do the same operation with the result you sent me, so: A^b mod p.
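The five steps above can be sketched in a few lines of Python. This is only an illustration: the values of `P` and `G` and the helper name `dh_keypair` are assumptions made for the example, and the modulus is far too small for real use.

```python
import secrets

# Toy public parameters (assumption: illustration only, not a safe group).
P = 2**127 - 1   # a Mersenne prime
G = 3

def dh_keypair():
    """Return (secret, public) where public = G^secret mod P."""
    secret = secrets.randbelow(P - 3) + 2   # random value in [2, P-2]
    return secret, pow(G, secret, P)

a, A = dh_keypair()        # step 2: you pick a, send A = g^a mod p
b, B = dh_keypair()        # step 3: I pick b, send B = g^b mod p

your_key = pow(B, a, P)    # step 4: B^a mod p
my_key = pow(A, b, P)      # step 5: A^b mod p

# Both sides arrive at g^(ab) mod p without ever sending a or b.
assert your_key == my_key
```

Only `A` and `B` cross the wire; an eavesdropper who sees them would still have to recover `a` or `b` to compute the shared value.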

Here are the same 5 steps with Alpha controlling the network:

  1. You attempt to send me g and p, but Alpha intercepts and learns g and p
  2. You come up with a and attempt to send me the result of g^a mod p (A), but Alpha intercepts and learns A
  3. Alpha comes up with b and sends you the result of g^b mod p (B)
  4. You run B^a mod p
  5. Alpha runs A^b mod p

During this whole process Alpha pretends to be you and creates a shared secret with me using the same method.

Now you and Alpha share one secret, and Alpha and I share another.

You now think it's safe to talk to me in secret. But when you send me messages encrypted with your secret, Alpha decrypts them using the secret created by you and Alpha, encrypts them using the secret created by Alpha and me, then sends them to me. When I reply to you, Alpha does the same thing in reverse.
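The attack described above can be simulated directly. The sketch below is an assumption-laden toy: `xor_cipher` is a made-up stand-in for a real symmetric cipher, and `x` and `y` are names chosen here for Alpha's two secrets.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy modulus (assumption: illustration only)
G = 3

def keypair():
    secret = secrets.randbelow(P - 3) + 2
    return secret, pow(G, secret, P)

def xor_cipher(key_int, data):
    """Toy cipher: XOR with a SHA-256-derived keystream (encrypt == decrypt)."""
    seed = key_int.to_bytes(16, "big")
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

a, A = keypair()   # you
b, B = keypair()   # me
x, X = keypair()   # Alpha's secret for the exchange with you
y, Y = keypair()   # Alpha's secret for the exchange with me

# Alpha swaps the public values in transit: you receive X, I receive Y.
key_you = pow(X, a, P)
key_me = pow(Y, b, P)
alpha_key_you = pow(A, x, P)   # matches key_you
alpha_key_me = pow(B, y, P)    # matches key_me

message = b"meet at noon"
from_you = xor_cipher(key_you, message)           # you encrypt
seen_by_alpha = xor_cipher(alpha_key_you, from_you)   # Alpha reads it
to_me = xor_cipher(alpha_key_me, seen_by_alpha)       # Alpha re-encrypts
received = xor_cipher(key_me, to_me)                  # I decrypt normally
```

Neither endpoint sees anything unusual: each decrypts successfully with the key it negotiated, while Alpha holds the plaintext in `seen_by_alpha`.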

Am I missing something here?

orokusaki
  • 27
    You are completely right that Diffie-Hellman key exchanges are vulnerable to a MITM attack, just as you described. – zinfandel Jun 15 '15 at 20:48
  • 2
    Yes but if you describe it another way, then it will not be vulnerable to mitm. – munchkin Jun 17 '15 at 10:02
  • Also related: [How is it possible that people observing an HTTPS connection being established wouldn't know how to decrypt it?](http://security.stackexchange.com/q/6290/29865) – Ajedi32 Jun 18 '15 at 14:46

6 Answers

166

Diffie-Hellman is a key exchange protocol but does nothing about authentication.

There is a high-level, conceptual way to see that. In the world of computer networks and cryptography, all you can see, really, are zeros and ones sent over some wires. Entities can be distinguished from each other only by the zeros and ones that they can or cannot send. Thus, user "Bob" is really defined only by his ability to compute things that non-Bobs cannot compute. Since everybody can buy the same computers, Bob can be Bob only by his knowledge of some value that only Bob knows.

In the raw Diffie-Hellman exchange that you present, you talk to some entity that is supposed to generate a random secret value on-the-fly, and use that. Everybody can do such random generation. At no place in the protocol is there any operation that only a specific Bob can do. Thus, the protocol cannot achieve any kind of authentication -- you don't know who you are talking to. Without authentication, impersonation is feasible, and that includes simultaneous double impersonation, better known as Man-in-the-Middle. At best, raw Diffie-Hellman provides a weaker feature: though you do not know who you are talking to, you still know that you are talking to the same entity throughout the session.


A single cryptographic algorithm won't get you far; any significant communication protocol will assemble several algorithms so that some definite security characteristics are achieved. A prime example is SSL/TLS; another is SSH. In SSH, a Diffie-Hellman key exchange is used, but the server's public part (its g^b mod p) is signed by the server. The client knows that it talks to the right server because the client remembers (from a previous initialization step) the server's public key (usually of type RSA or DSA); in the model explained above, the rightful server is defined and distinguished from imitators by its knowledge of the signature private key corresponding to the public key remembered by the client. That signature provides the authentication; the Diffie-Hellman then produces a shared secret that will be used to encrypt and protect all the data exchanges for that connection (using some symmetric encryption and MAC algorithms).
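To make the role of the signature concrete, here is a minimal sketch of a client checking a signed Diffie-Hellman value against a remembered public key. Everything in it is an assumption for illustration: it uses unpadded textbook RSA with a tiny modulus (insecure), it signs only the server's DH value for brevity, and the helper names are invented. It is not how SSH encodes or signs its messages.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy DH modulus (assumption: illustration only)
G = 3

# Toy textbook-RSA key for the server (assumption: unpadded, far too small).
RSA_P, RSA_Q = 2**61 - 1, 2**89 - 1     # two Mersenne primes
N = RSA_P * RSA_Q
E = 65537                               # public exponent, remembered by the client
D = pow(E, -1, (RSA_P - 1) * (RSA_Q - 1))   # private exponent, server only

def digest_int(data):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data):
    """Server side: needs the private exponent D."""
    return pow(digest_int(data), D, N)

def verify(data, signature):
    """Client side: needs only the remembered public key (E, N)."""
    return pow(signature, E, N) == digest_int(data)

def encode(value):
    return value.to_bytes(16, "big")

# Genuine server: picks b, sends B = g^b mod p together with a signature.
b = secrets.randbelow(P - 3) + 2
B = pow(G, b, P)
signature = sign(encode(B))
genuine_ok = verify(encode(B), signature)

# Attacker: substitutes its own DH value but cannot produce a new signature,
# so the best it can do is forward the server's old one.
m = secrets.randbelow(P - 3) + 2
B_attacker = pow(G, m, P)
forged_ok = verify(encode(B_attacker), signature)
```

The genuine value verifies and the substituted one does not, because only the holder of `D` can sign a value the attacker actually knows the exponent for.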

Thus, while Diffie-Hellman does not do everything you need by itself, it still provides a useful feature, namely a key exchange, that you would not obtain from digital signatures, and that provides the temporary shared secret needed to encrypt the actually exchanged data.

Tom Leek
  • 53
    I absolutely loved this description: "Thus, user "Bob" is really defined only by his ability to compute things that non-Bobs cannot compute." This is a great, concise way of explaining identity that's not always intuitive. – John Feminella Jun 16 '15 at 02:33
  • 9
    Good answer. One detail I would have included is that DH without authentication can still provide some security. DH makes your communication safe against passive snooping, so an adversary would have to switch to active MITM. If even a small percentage of connections are authenticated any large scale active MITM would be noticed, which provides some security even for the unauthenticated connections. This relies on authenticated and unauthenticated connections being indistinguishable to a passive adversary. – kasperd Jun 16 '15 at 05:04
58

Tom has provided a good explanation as to why Diffie-Hellman cannot be safe against man-in-the-middling. Now this answers the OP's original question but probably leaves some readers with the (reasonable) follow-up question: Why don't we just use public-key (asymmetric) cryptography to ensure the confidentiality of our messages, and drop D-H altogether? There are some reasons not to do this:

  • There are algorithms that support only signing, but not encrypting messages (ECDSA, for example)
  • Symmetric encryption and decryption is a lot faster than doing it asymmetrically
  • Probably most important is that we want to ensure forward secrecy. After all, it's not impossible that the private key of one of your communication partners is compromised at some point. Now if you only relied on asymmetric encryption, all messages you ever sent to that partner could be decrypted by the attacker in retrospect. In contrast, if we use Diffie-Hellman (to be precise, ephemeral Diffie-Hellman), we generate a new D-H key pair for each communication session and throw it away (do not store it) afterwards, meaning that a key compromised later cannot be used to decrypt our earlier messages.
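The forward-secrecy point can be illustrated with a small sketch, assuming toy parameters and a hypothetical helper `ephemeral_session_key`: each session draws fresh secrets, derives a key, and keeps nothing that would let the key be recomputed later.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy modulus (assumption: illustration only)
G = 3

def ephemeral_session_key():
    """Run one DH exchange with fresh secrets and return only the session key."""
    a = secrets.randbelow(P - 3) + 2
    b = secrets.randbelow(P - 3) + 2
    A, B = pow(G, a, P), pow(G, b, P)
    shared = pow(B, a, P)
    assert shared == pow(A, b, P)
    # a and b are local variables: once this function returns they are gone,
    # so no stored long-term key can recreate this session's secret.
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

session_1 = ephemeral_session_key()
session_2 = ephemeral_session_key()
```

The two session keys are independent of each other, so learning one (or a long-term signing key) reveals nothing about the other.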
zinfandel
3

After a DH key exchange, both parties know what key they've computed. If no man-in-the-middle has infiltrated the connection, both parties will have the same key. If the connection has been breached, they will have different keys. If there is a means by which one party can ask the other what key it is using, the man in the middle will only be able to remain undetected if it is able to respond in the same fashion as the legitimate party would have done. While the question is often answered using a digital signature, to make impersonation difficult, the question may also be asked and answered via things like voice communication.

Suppose a voice application shows the participants the current encryption key, and a participant arbitrarily selects a range and a popular movie star (e.g. Marilyn Monroe), then asks the other to read the fifteenth through twenty-fifth digits in their best Marilyn Monroe voice. A real participant who has the numbers in front of him would be able to do so quickly and fluently and, in the absence of an MITM attack, the digits would match those seen by the first party. A man-in-the-middle attacker would have no problem detecting the question and, given time, might be able to forge a voice file of the legitimate communicant doing a bad imitation of Marilyn Monroe saying the appropriate digits, but would have a hard time doing that as quickly as the real one.
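The digit-reading check can be sketched as follows. This is a hypothetical construction for illustration: `key_digits` and the choice of SHA-256 to turn the key into decimal digits are assumptions, not a description of any particular voice application.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy modulus (assumption: illustration only)
G = 3

def keypair():
    secret = secrets.randbelow(P - 3) + 2
    return secret, pow(G, secret, P)

def key_digits(shared, first=15, last=25):
    """Decimal digits number `first` through `last` of a hash of the key."""
    h = hashlib.sha256(shared.to_bytes(16, "big")).digest()
    digits = str(int.from_bytes(h, "big")).zfill(78)
    return digits[first - 1:last]

# Honest exchange: both sides hold the same key, so the digits agree.
a, A = keypair()
b, B = keypair()
honest_you = key_digits(pow(B, a, P))
honest_me = key_digits(pow(A, b, P))

# Intercepted exchange: each side unknowingly shares a key with the attacker.
x, X = keypair()
y, Y = keypair()
mitm_you = key_digits(pow(X, a, P))
mitm_me = key_digits(pow(Y, b, P))
mitm_detected = mitm_you != mitm_me   # true except for a vanishingly rare collision
```

The comparison only helps if the digits travel over a channel the attacker cannot convincingly forge in real time, which is the role the voice imitation plays in the answer above.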

In short, DH by itself can be robust against MITM attacks if each participant knows something the other participant will be able to do with a number more efficiently than an attacker. Other protocols are generally used in conjunction with DH, however, because it is useful for the authentication process to be automated, and most forms of authentication that wouldn't rely upon encryption (things like voice, phrasing, etc.) require human validation. Further, it's often necessary for entities to solicit communication from strangers. If I want to talk to an Acme Bank representative, a man-in-the-middle impostor could set up a fake "Acme Bank" office and take my call, and have someone else in a phony living room relay everything I say to the real Acme Bank, and nobody would be the wiser. If I have no idea how well or poorly a real Acme Bank employee would be able to imitate Marilyn Monroe, I'd have no way of knowing that an impostor's imitation wasn't the same.

supercat
2

I have to make one point in Tom Leek's answer more precise: "In SSH, a Diffie-Hellman key exchange is used, but the server's public part (its g^b mod p) is signed by the server."

Actually, the whole DH key exchange is signed. Signing only g^b mod p is not sufficient: one could spoof an SSH server by just connecting to it and replaying the [SSH-TRANS] packet later. That does not prove knowledge of the current session's data; it omits g^a mod p, the SSH ID strings, and protocol negotiation.

Authentication is done after the two parties exchange SSH ID strings and protocol negotiation messages, and the client sends a Diffie-Hellman key exchange message containing g^a mod p.

The server computes g^b mod p and hashes all the important information: H = hash(context || sig_pub || g^a mod p || g^b mod p || g^(ab) mod p) with context = SSH ID strings || protocol negotiation messages.

Then it signs that handshake hash: sig = signature(sig_priv, H) and sends back (sig_pub, gb mod p, sig) to the client.

That way, nobody can spoof anything except if they have broken the signature algorithm.
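A rough sketch of why hashing the whole exchange defeats replay, under stated assumptions: the length-prefixed encoding in `field`, the placeholder `context` and `sig_pub` byte strings, and the toy group are all invented for this example and do not reproduce SSH's actual wire format.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy modulus (assumption: illustration only)
G = 3

def field(data):
    """Length-prefix a field so the concatenation is unambiguous."""
    return len(data).to_bytes(4, "big") + data

def num(value):
    return field(value.to_bytes(16, "big"))

def handshake_hash(context, sig_pub, A, B, shared):
    """H = hash(context || sig_pub || g^a || g^b || g^(ab))."""
    return hashlib.sha256(
        field(context) + field(sig_pub) + num(A) + num(B) + num(shared)
    ).digest()

# Placeholder values (hypothetical, not real SSH data).
context = b"client-id|server-id|negotiation-messages"
sig_pub = b"hypothetical-server-public-key"

# The server's DH value, as captured by an attacker in an earlier session.
b = secrets.randbelow(P - 3) + 2
B = pow(G, b, P)

def client_view(a):
    """The hash a client with secret `a` expects the server to have signed."""
    A = pow(G, a, P)
    return A, handshake_hash(context, sig_pub, A, B, pow(B, a, P))

A_old, H_old = client_view(secrets.randbelow(P - 3) + 2)   # recorded session
A_new, H_new = client_view(secrets.randbelow(P - 3) + 2)   # victim's new session

# A replayed signature covers H_old, but the new client checks it against H_new.
replay_rejected = H_old != H_new
```

Because the client's fresh g^a mod p feeds into H, a signature recorded from one session does not validate in another.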

dave_thompson_085
2

DH is not generally resistant to Man in the Middle attacks.

Alice and Bob (A<->B) can set up a shared secret. But then Frank can set up a shared secret with Alice (A<->F) and, at the same time, a second (different) shared secret with Bob (F<->B). Frank can then decrypt A->F messages, re-encrypt them, and send them on to Bob (F->B), and vice versa.

*So you need some way of ensuring that the message actually came from (was signed by) Alice: either a previously shared secret (delivered through some other channel), a Certificate Authority (to proxy trust), or some other method.

If you only trust a CA a little, then Alice can set up a DH shared secret with Bob, signing the messages with the key for the cert issued by the CA. Bob checks that the messages were signed with a key the CA has certified. Frank can't fake the messages, since they don't have the required cert.

Now Alice and Bob have a shared secret, and Frank could not fake their way into the middle. However, the CA had no part in creating the shared secret (it only signed the parts sent along the way), so the CA can't play the bad actor either, even if Frank threatens them with a $5 wrench.

*Slightly simplistic but that is the general idea.

DarcyThomas
0

This is where the descriptive abilities of the English language fail.

Diffie-Hellman is resistant to MITM if an out-of-band third-party entity is able to help in distributing keys and/or verifying the identity.

In the literature, this mostly takes the form of a connected web of trust surrounding the identity of the recipient or, in most cases, the person attached to the key or certificate.

merlindru
munchkin