I'm sure there are plenty of obvious reasons for this, but why can't an SSL session be started with one round-trip?
It seems like this would be enough:
- Client sends their public key
- Server responds with:
  - Its certificate (public key + whatever is needed to verify it)
  - A symmetric key to use (encrypted with the client's public key / signed with the server's private key)
The problems I can think of are:
- You may need to know the version of the protocol beforehand.
  - It seems like the protocol could be designed so that the client first sends a version, then their public key, and if the server doesn't understand it, it sends a "try this version" message. That would be faster in the best case (client and server both understand the same protocol), and just as fast as SSL in the worst case ("version 2...my public key", "I don't understand, use version 1", "version 1...my public key", "my certificate...encrypted key").
  - To make the protocol more backwards compatible, you could have two versions: one for the public key exchange and one for everything else. Only a change to the public key version would break compatibility and require the second round-trip.
- Obviously this wouldn't be backwards compatible with SSL now, but why wasn't it there in the first place?
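To make the round-trip counting in the version fallback idea concrete, here is a rough sketch. It only models the message flow, not any cryptography, and every name in it (`Server`, `handle_hello`, `handshake_round_trips`, the message labels) is made up for illustration rather than taken from SSL/TLS.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical server for the proposed protocol (not real SSL/TLS)."""
    supported_versions: set

    def handle_hello(self, version, client_public_key):
        # Unknown version: answer with a "try this version" message.
        if version not in self.supported_versions:
            return ("try_version", max(self.supported_versions))
        # Known version: answer with certificate + encrypted symmetric key.
        return ("certificate_and_encrypted_key", version)


def handshake_round_trips(client_versions, server):
    """Count round trips until the server sends its certificate and key."""
    version = max(client_versions)  # client optimistically tries its newest version
    round_trips = 0
    while True:
        round_trips += 1
        kind, value = server.handle_hello(version, client_public_key=b"placeholder")
        if kind == "certificate_and_encrypted_key":
            return round_trips
        if value not in client_versions:
            raise ValueError("no common protocol version")
        version = value  # retry with the version the server suggested


best = handshake_round_trips({1, 2}, Server({1, 2}))  # both sides speak version 2
worst = handshake_round_trips({1, 2}, Server({1}))    # server only speaks version 1
print(best, worst)  # 1 2
```

Under these assumptions the best case costs one round trip and the fallback case costs two, which is the comparison the bullet above is making.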
Maybe it's more work for the server, since it needs to (1) generate a cryptographically secure symmetric key, (2) encrypt that key with the client's public key, and (3) sign the result with its own private key. Is the problem that it would be too easy to overload the server?
Something I'm missing?
This has just been bothering me all day, since it seems so obvious, but there has to be some glaring flaw that I'm missing (and that's why I'm not a security researcher).
EDIT:
A link posted by @lour has this quote:
In a TLS handshake, the "Finished" messages serve to validate the entire handshake. These messages are based on a hash of the handshake so far processed by a PRF keyed with the new master secret (serving as a MAC), and are also sent under the new Cipher Spec with its keyed MAC, where the MAC key again is derived from the master secret. The protocol design relies on the assumption that any server and/or client authentication done during the handshake carries over to this. While an attacker could, for example, have changed the cipher suite list sent by the client to the server and thus influenced cipher suite selection (presumably towards a less secure choice) or could have made other modifications to handshake messages in transmission, the attacker would not be able to round off the modified handshake with a valid "Finished" message: every TLS cipher suite is presumed to key the PRF appropriately to ensure unforgeability. Once the handshake has been validated by verifying the "Finished" messages, this confirms that the handshake has not been tampered with, thus bootstrapping secure encryption (using algorithms as negotiated) from secure authentication.
So why not just have the server and client both sign their cipher-negotiation messages with their private keys? I know, more work, but still faster than an RTT, right?
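For reference, the check the quote describes can be sketched roughly as follows. This is a simplification under loud assumptions: HMAC-SHA256 stands in for the TLS PRF (real TLS uses its own PRF construction and labels), the handshake messages are made-up byte strings, and `finished_value` is a hypothetical helper, not a real TLS API.

```python
import hashlib
import hmac


def finished_value(master_secret: bytes, handshake_messages: list) -> bytes:
    """Keyed MAC over a hash of the handshake so far (stand-in for the TLS PRF)."""
    transcript = hashlib.sha256()
    for message in handshake_messages:
        # Length-prefix each message so message boundaries are unambiguous.
        transcript.update(len(message).to_bytes(4, "big"))
        transcript.update(message)
    return hmac.new(master_secret, b"finished" + transcript.digest(), hashlib.sha256).digest()


master_secret = b"\x01" * 48  # placeholder; both sides derive this during the handshake

# What the client actually sent and received.
client_view = [b"ClientHello suites=STRONG,WEAK", b"ServerHello suite=WEAK"]
# What the server saw after an attacker removed STRONG from the list in transit.
server_view = [b"ClientHello suites=WEAK", b"ServerHello suite=WEAK"]

client_finished = finished_value(master_secret, client_view)
server_expected = finished_value(master_secret, server_view)

# The two sides hashed different transcripts, so the Finished check fails.
tampering_detected = not hmac.compare_digest(client_finished, server_expected)
# With identical transcripts the values match.
untampered_ok = hmac.compare_digest(client_finished, finished_value(master_secret, client_view))
print(tampering_detected, untampered_ok)  # True True
```

The attacker can't fix up the Finished value because they don't know the master secret, which is what lets this one check cover all of the earlier unauthenticated negotiation messages.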