The analogy of the sending of the box is not quite accurate in any public/private key algorithm. It is only as good as it is something that would work in the physical world. With that out of the way, lets get to Diffie-Hellman key exchange.
Diffie Hellman does not actually exchange keys, it exchanges some information to generate the keys. However it only exchanges part of this information, the rest is kept by each user. Lets go through it, we'll call our users Alice and Bob ;) :
- Alice and Bob agree to use a prime number as a modulo p, and a base which is a primitive modulo root of p which is g. These mathematical properties must hold for the discrete logarithm problem the mathematics is built on to hold.
Alice and Bob each choose an integer they keep to themselves, lets call them a and b, then each of them runs the same calculation on that integer, first they calculate g to the power of the integer modulo p. And they send this to each other.
ga mod p
gb mod p
Now here is the clever bit, neither now can find out what the others secret integer is, but if they take what they got from the other one and raise it to their own integer and calculate modulo p of that, they both get the same number S:
(ga mod p)b mod p = gab mod p
(gb mod p)a mod p = gba mod p
4) Shared number is now used to derive some encryption key for further information.
Without knowing one of those secret integers an attacker can not generate the same number and therefore the same key. Schors algorithm on a quantum computer can solve the discrete logarithm problem to get back those secret integers from the data that was exchanged, but past that or getting Alice or Bob's machine to send them (in which case you can get them to send the derived key, or the unencrypted messages sent/recieved instead).
However if that information did not actually come from the other party but the attacker themselves then without other checks (this is what SSL/TLS certs are for) the attacker can decrypt, re-encrypt and relay the data on.