9

Based on Diffie-Hellman's negotiation of key exchange, why would it turn out that only a few primes are commonly used? What and who determines the prime?

Would it not be safer to use a safe prime? (example: p = 2q + 1)

*The question is based on this article:

http://arstechnica.com/security/2015/10/how-the-nsa-can-break-trillions-of-encrypted-web-and-vpn-connections/?utm_source=pocket&utm_medium=email&utm_campaign=pockethits

rhymsy
  • It would be safer to use a 2048-bit DH group. See [What Diffie-Hellman parameters should I use?](http://crypto.stackexchange.com/a/29927/4123). – Xander Oct 19 '15 at 23:13
  • @Xander I agree, but eventually, it will be the same issue with 2048, unless it gets rectified, no? – rhymsy Oct 20 '15 at 03:56
  • @rhymsy keep in mind that as hard as it was for even NSA to theoretically calculate all 1024-bit DH key exchange events with certain values given, to do so with 2048-bit keys would be 2^1024 times as hard. Unless the NSA has tech most people would not even write in a piece of sci-fi on account of its seeming unrealistic, they cannot do it. They may well have other ways to break these key exchanges, now or in the future, but precomputation on 2048 bits would take unbelievable amounts of processing and storage. – WDS Oct 20 '15 at 04:17
  • Shor's algorithm on a quantum computer can make short work in factoring the primes. Depending on how much quantum computing power they have access to or can eventually get access to, this is going to be a reality at some point that they can break even 2048 bit encryption. – David- Oct 20 '15 at 15:12
  • @David- Which of the primes involved in Diffie-Hellman would it be useful for them to factor? – kasperd Oct 20 '15 at 16:04
  • @kasperd The most common are the most useful and it appears they already have done so with those groups. Seems to me that for the purpose of eavesdropping they would go from most common to least common. – David- Oct 20 '15 at 16:44
  • @rhymsy 2048 bit keys are not 2^1024 times harder. They are 10^9 harder. – David- Oct 20 '15 at 16:44
  • @David thank you for the correction, but WDS posted those values. – rhymsy Oct 20 '15 at 16:50
  • @rhymsy oh, haha, I see that. Sorry you had to witness my blonde moment. :D – David- Oct 20 '15 at 16:51
  • @David- Where did you get 10⁹ from? – kasperd Oct 20 '15 at 16:56
  • @kasperd From this paper. Its a very good read: https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf – David- Oct 20 '15 at 17:00

2 Answers

6

Edit #2 (11/09/15) - I read through a few articles and a paper. See the bottom for more comments.

I read the article, although I have no direct data on the following, so this is guesswork on my part.

There are a combination of factors going on here.

  1. Nature of DH exchange.
  2. Pre-provisioned Keys.
  3. Unchangeability of Keys.
  4. Known attacks on DH algorithm.

DH Exchange

The exchange requires that the two parties have the Base Prime. It ought to be a Safe Prime (or Sophie-Germain Prime). If the client doesn't have it, the server will send it over. Both parties should use secure random number generators to create their exponents. Computations should be structured such that successes and failures take the same amount of wall clock time.
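The exchange described above can be sketched in a few lines. This is a toy with illustrative parameters only (a real deployment should use a vetted library and a 2048-bit or larger group); 23 is a Safe Prime since 23 = 2*11 + 1 and 11 is prime:

```python
import secrets

# Toy Diffie-Hellman exchange. Illustrative parameters only; real code
# should use a vetted crypto library and a >= 2048-bit group.
p = 23   # Safe Prime: 23 = 2*11 + 1, and 11 is prime
g = 5    # generator (the "Base")

a = secrets.randbelow(p - 2) + 1   # client's secret exponent, from a CSPRNG
b = secrets.randbelow(p - 2) + 1   # server's secret exponent

A = pow(g, a, p)   # client sends A over the wire
B = pow(g, b, p)   # server sends B over the wire

# Each side raises the other's public value to its own secret exponent;
# both arrive at g^(a*b) mod p without ever transmitting a or b.
shared_client = pow(B, a, p)
shared_server = pow(A, b, p)
assert shared_client == shared_server
```

An observer sees p, g, A and B, but recovering a or b from them is the discrete log problem, which is what the key sizes discussed below are protecting.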

Pre-Provisioned Keys

Generating large Primes is not hard in principle, but it is daunting for the uninitiated. Having a piece of software that does it for you is best, and that software must use a secure random number generator. Most people will just use whatever keys (Primes) have been pre-provisioned, out of habit, ignorance or laziness.
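Generating a Safe Prime is mechanical: pick a random candidate Q, test Q and 2Q+1 for primality, repeat. A minimal sketch using Miller-Rabin (64 bits here purely so the demo runs quickly; real groups should be 2048 bits or more):

```python
import secrets

def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % small == 0:
            return n == small
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def gen_safe_prime(bits):
    """Search for prime Q with P = 2Q + 1 also prime; return P."""
    while True:
        # Force the top bit and low bit so Q is odd and the right size.
        q = secrets.randbits(bits - 1) | (1 << (bits - 2)) | 1
        p = 2 * q + 1
        if is_probable_prime(q) and is_probable_prime(p):
            return p

p = gen_safe_prime(64)   # tiny demo size; use >= 2048 bits in practice
```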

Unchangeable Keys (Primes)

It's entirely likely that much of the software in question wasn't engineered to allow a Prime number change. Engineers who author the software are often as ignorant of proper security methodology as the users. I don't mean that pejoratively; ignorant as in "lack knowledge".

Known DH Attacks

Where to start!

A shorter key (768 bits) was well within the ability of a couple of graduate students to compute partial discrete log matrices and attack Sun's network password authentication scheme a couple of decades ago. See the reference at the end of this Answer. I wouldn't be surprised if the NSA (or anyone) has the ability to compute partials for known 1024-bit keys today.

The protocol can be attacked by replacing the Prime in transit to the client if the client doesn't have the Prime and requests it from the server.

Even better, DH is subject to man-in-the-middle attacks, especially with a client that requests the Prime from the server, but even if it doesn't. The DH algorithm relies on secure exponents being computed. However, it doesn't rely on any other data - server and client need not prove their identities to each other to establish a secure session. If somehow the client's password (or a hash of it) were mixed into the DH key negotiations without any information leakage and retaining perfect forward secrecy, then the client could be assured the server was legitimate. Look up the SPEKE and DHEKE algorithms for a start on this. In any event, NSA as MITM creates two DH sessions, one with the client and the other with the server and records all data while piping between the two. Easy peasy.
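The two-session MITM is worth seeing concretely. In this sketch (toy parameters again), the attacker substitutes its own public values in both directions, ending up sharing one key with the client and a different key with the server, while each victim believes it negotiated with the other:

```python
import secrets

p, g = 23, 5   # toy DH parameters

def dh_keypair():
    """Return (secret exponent, public value) for one DH participant."""
    x = secrets.randbelow(p - 2) + 1
    return x, pow(g, x, p)

a, A = dh_keypair()     # client
b, B = dh_keypair()     # server

# The attacker intercepts A and B in transit and replaces them with its
# own public values, maintaining two independent DH sessions.
m1, M1 = dh_keypair()   # attacker's session facing the client
m2, M2 = dh_keypair()   # attacker's session facing the server

client_key = pow(M1, a, p)   # client thinks this is shared with the server
server_key = pow(M2, b, p)   # server thinks this is shared with the client
attacker_client_key = pow(A, m1, p)
attacker_server_key = pow(B, m2, p)

# The attacker holds both session keys: decrypt with one, read or modify
# the traffic, re-encrypt with the other, and pipe it through.
assert client_key == attacker_client_key
assert server_key == attacker_server_key
```

Nothing in plain DH lets either endpoint detect the substitution, which is exactly why authentication has to come from somewhere else (certificates, or schemes like SPEKE).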

The article states that NSA relied on well known Primes to attack the sessions. I refer back to the paper I mentioned about computing discrete log partial matrices. If possible now with h/w advances, 1024 bit keys wouldn't provide secrecy from observer attacks.

Who's to say the machines running the DH algorithms weren't compromised in some other way? Perhaps, they leaked key data or gave it up when queried in the right way? There has been so much information published about how every piece of computational h/w can be compromised by the NSA, perhaps they didn't bother breaking the algorithm unless they needed to, and instead broke into the machines and got the session encryption keys directly?

Edit

Other DH attacks:

  • Introduce a Prime that is not Safe (doesn't conform to Sophie-Germain Prime criteria of P=2Q+1, where Q & P are Prime). An unsafe Prime P (Q is not Prime) limits the range of the DH function, thereby simplifying observer attacks (discrete log calculations), sometimes trivially. In the context of the algorithm g^(A*B)%P should map 1:1 or 2:1 over the field of P; a bad P can map large swaths of numbers to a computationally feasible small set.
  • Rely upon poorly coded DH algorithms which allowed (or forced!) small exponents, making attacking the session easier.
  • Rely upon poorly coded secure random number generators and/or poorly seeded RNGs (ones that used current time or process ID).
  • Timing attack in which an observer watches how hard (measured in time) a participant works to compute a result and guesses based upon the work properties of the keys (uses that information to break the keys). This is a general attack, not specific to Diffie-Hellman.
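The unsafe-Prime bullet above can be demonstrated numerically. With P = 31, P - 1 = 30 = 2 * 3 * 5, so subgroups of order 3 and 5 exist; a generator that lands in one of them confines every public value to a handful of elements an observer can brute-force:

```python
# Unsafe Prime demo: p - 1 has small factors, so small subgroups exist.
p = 31            # NOT a Safe Prime: (31 - 1) / 2 = 15 is composite
g = 5             # 5 has order 3 mod 31: 5, 25, 125 % 31 = 1

# Compute the multiplicative order of g mod p.
order, x = 1, g
while x != 1:
    x = (x * g) % p
    order += 1

# Every public value g^a mod p falls into a set of only `order` elements,
# so the "secret" exponent is trivially brute-forced by an observer.
values = {pow(g, a, p) for a in range(1, p)}
assert order == 3
assert len(values) == 3
```

With a Safe Prime P = 2Q + 1, the only subgroup orders are 1, 2, Q and 2Q, so a properly chosen generator can't collapse the exchange into a small set this way.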

Edit #2

I read an article linked through and the twelve-author paper posted.

As suspected above, here's the gist of it. At the end I'll add a little bit more on TLS+DH+RSA cipher suite, just to show another vulnerability.

First off, it appears likely that the NSA has broken a number of the commonly used DH Groups (Safe Prime P + Generator g) used for Diffie-Hellman key exchange. As I indicated above, at least a decade or so ago, this attack was carried out by a couple of graduate students (see the reference in Edit #3 below) on Sun's 768-bit DH key used for network password authentication. The attack is an observer attack, so no man-in-the-middle is needed. The work required to break any particular Group is quite large. However, once broken, all exchanges using that Group can be compromised.

Theoretically, DH key exchange provides Perfect Forward Secrecy (vs. RSA, which does not). Once a DH group is broken, any exchange using that group is not only compromised going forward, it's also compromised going backward (i.e. recorded exchanges can be broken), so no more perfect forward secrecy for that group. It also means that a dedicated attacker could break uncommon DH keys. This is assumed to be possible for 1024-bit keys, so up your key lengths to 2048 for DH. You should already be using larger key lengths for RSA.

Next, the researchers make a number of notable points. The paper covers Logjam, which I won't explain in detail except to say it's a MITM, cipher suite downgrade attack. Clever, but not algorithmically complex.

I strongly urge anyone interested to read the paper. It's fantastic. Here's the second paragraph of the Abstract.

We go on to consider Diffie-Hellman with 768- and 1024-bit groups. We estimate that even in the 1024-bit case, the computations are plausible given nation-state resources. A small number of fixed or standardized groups are used by millions of servers; performing precomputation for a single 1024-bit group would allow passive eavesdropping on 18% of popular HTTPS sites, and a second group would allow decryption of traffic to 66% of IPsec VPNs and 26% of SSH servers. A close reading of published NSA leaks shows that the agency’s attacks on VPNs are consistent with having achieved such a break. We conclude that moving to stronger key exchange methods should be a priority for the Internet community.

That pretty much says it all. Fixed keys, broken by the NSA allows observer (passive) breaking of many of the secure networking protocols. So, don't use pre-provisioned or common keys (if possible) and/or up your key lengths.

As for the TLS+DH+RSA cipher suite, is this safe at higher key lengths?

I would hazard that it is safe from an observer attack; however, it's still subject to man-in-the-middle attacks, and here is how.

In this mode, the Server signs its DH ephemeral with its RSA key (Sign((g^B)%P,RSA_Key_Server)). That means, the server's portion of the DH exchange is authenticated. Assuming the signature is valid when it reaches the client and the Server certificate has not been revoked, the client can proceed safely with the DH exchange knowing only the server can complete it.
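The sign-then-verify step can be sketched with textbook RSA. This is a toy (no padding, a tiny hand-picked key; real TLS uses proper padded RSA signatures from a crypto library), but it shows what the client actually checks before completing the exchange:

```python
import hashlib

# Toy textbook-RSA key (n = 61 * 53); values chosen purely for illustration.
rsa_n, rsa_e, rsa_d = 3233, 17, 2753

# Server's DH ephemeral, as in the exchange above (toy parameters).
p_dh, g_dh = 23, 5
b = 6
B = pow(g_dh, b, p_dh)

# Server: hash the ephemeral and sign the digest with its private key.
digest = int.from_bytes(hashlib.sha256(str(B).encode()).digest(), "big") % rsa_n
signature = pow(digest, rsa_d, rsa_n)

# Client: verify with the server's public key before completing the DH
# exchange. If this fails, someone substituted the ephemeral in transit.
assert pow(signature, rsa_e, rsa_n) == digest
```

The MITM scenarios below are precisely the cases where this check is performed against the wrong public key (a cert from an untrustworthy root CA) or against a key that should have been rejected (revoked, but revocation wasn't checked).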

How, then, is a MITM possible? Two ways, one more likely than the other. Both rely on the client failing to do a proper signature check.

First, the client has too many root certificate authorities in its key store (many browsers have over 100). Some of these root CAs are not trustworthy. MITM gets a server cert for the remote site in question, sits in the middle and signs with its own cert (Sign((g^B2)%P,RSA_Key_Middle_Server)). The client will validate the signature, assume the server is authentic and complete the DH exchange with the server. Meanwhile, the middle server can run its own client side of the exchange with the real server and watch all data in between.

Second, and I think this is less likely: the client fails to check for a revoked certificate and the MITM has the revoked private key, so it can sign as if it were the original, real server. The client won't know because it didn't check revocation. The middle server again runs another DH exchange with the remote, real server and sits in the middle.

Moral of the story is watch your client configurations and up your key lengths.


Edit #3: Found the reference to the DH attack from a couple of decades back:

B.A. LaMacchia and A.M. Odlyzko. Computation of discrete logarithms in prime fields. Designs, Codes, and Cryptography, 1:46-62, 1991.

Andrew Philips
  • About MITM attacks, doesn't TLS authenticate connections with RSA or ECC? – Hey Nov 05 '15 at 17:42
  • TLS w/ RSA and ECC are still subject to MITM under the correct circumstances; circumstances that are far more common than you might imagine. See this other [answer](http://security.stackexchange.com/questions/104630/can-man-in-the-middle-attacks-affect-anyone-not-connected-to-the-attackers-netwo/104680#104680). – Andrew Philips Nov 05 '15 at 19:28
  • Also, the OP asked about DH, which doesn't use server authentication (certificates) and, therefore, is always subject to MITM attacks. There is a body of research on *Zero Knowledge Password Proofs*, two of which I mentioned above (SPEKE and DHEKE). ZKPPs address the problem of server authentication. However, due to h/w accelerators at the network edge and other security issues (feeding pwd verifiers to the same net edge), I don't believe ZKPPs will ever be used in the field. Which is too bad, as they would finesse a lot of these problems. – Andrew Philips Nov 05 '15 at 19:49
  • How much harder is a 1024 bit key than a 768 bit key to compute partial discrete log matrices? – Steve Sether Nov 05 '15 at 22:00
  • From this [paper](https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf), it appears to be about 10^5 times harder. 2048 is 10^9 harder yet again. Read the paper and skim the complex math. Great stuff in there. Section 4.1 (read all of page 7) has what you need. The essence of the attack is the burned in nature of the Primes. Precomputations break all sessions using that prime. Article estimates NSA can passively break 100,000 VPN sessions per hour! – Andrew Philips Nov 05 '15 at 23:51
1

One method of breaking them is by finding ways around it. There are almost always ways around all types of security. I subscribe to the mentality of, "If man can build it, man can destroy it."

You can install backdoors in common software such as Adobe Flash, PDF readers and, of course, Java. Let's not forget Windows services. Then there's hardware that comes pre-installed with backdoors.

All you have to do is implement something that allows you to run arbitrary code when properly executed. There are many ways to do this on many platforms and in pretty much any programming language.

Purposely implementing security flaws is one of the things that certain agencies really like doing.

Then encryption isn't even needed. Who cares about your quad-layer Dr. Dalek encrypted VPN connection when there are many different ways around it?

Mark Buffalo
  • +1 For the idea that one doesn't need to break the encryption on communications between endpoints A & B to see what they are saying if one can simply get malware on A, B, or both. That said, I tend to think that the backdoor issue tends to be one where a kernel of substantive threat has been built up into a gigantic menace by interested parties. For a huge gov agency that can basically spend whatever it wants on developing & acquiring zero-day flaws/exploits, straightforward phishing & wateringhole attacks will work with a very high success rate in breaching non-high security targets. – mostlyinformed Nov 20 '15 at 20:09