12

We're deploying ipsec on embedded devices and getting catastrophic performance from the diffie hellman 2048 group in ike.. afterwards the shared securet is used for 3des, sha1. ipsec negiation is well over 20s for a single tunnel.. the network stack is using openssl to the negotiation

Am I correct to say that 1024 is sufficiently secure for use in DH, in this context ? My understanding is that dh 1024 is ok to negotiate ipsec sa with lifetime of 24 hours until such a time as dh 1024 can be forced in < 24 hours. That is very unlikely to be before 2030

There's no confidentiality requirement so we dont care if someone can decrypt the traffic after 24 hours, as long as they cant re-inject it or tamper with current packets that's fine.

The reason given to use dh 2048 is because "1024 is no longer secure".. I've nothing against dh 2048 but given the performance problem i'm trying to push the 1024 group. I dont want to do that if I've missed some problem though.

The advice that 1024 is insecure relates to long lived (1-2 year) RSA keys. As DH negotiates short lived (24 hour) session keys, DH in 1024 group is ok for us.

In the use of dh in 2048 group, then the resulting shared secret has 160 bits of security which cant be forced for 10 - 15 years. 10-15 years because rsa 2048 is considered safe until 2030 and I'm assuming dh 2048 is similar

If dh in 1024 group is used, then the resulting shared secret is probably safe for a few years. The RFC 2409 which defines dh 1024 for ike says it generates more than 160 bits of security. It should however be twice the number of security bits needed, we need 224 as 224 = 2 * 112 and 3des needs 112. So what is meant by "more than" in the rfc.

By default, openssl generates the dh secret as a random number of length equal to the dh shared prime, so i'm thinking it will be at the large end of what the "more than" means.. I'm not sure how to verify that though

RSA 1024 hasnt been forced yet but is expected to be in near future, probably with months or years of computation. However, our product will still be in use in 10 years time. Maybe RSA 1024 will be easily forceable in 2025, as long as it takes more than a day dh 1024 is still ok. (I'm not saying dh 1024 will be a good choice in 2025, i'm saying if we choose it today then we wont have to phase it out before 2025)

The shared secret resulting from our dh is only used for 24 hours. SA lifetime is 24 hours so we're going to renegotiate our tunnels every 24 hours. That means a new DH secret every 24 hours. So as long as dh 1024 cannot be forced in a 24 hour period it's ok for us to use dh 1024. As long as we think it will still take more than a day to DH 1024 the resulting shared secret of a dh 1024 can be forced in maybe a few weeks.

Follow up question, if we use 3des tat needs 112 bits of security, DH should generate twice as many bits as needed, but DH 1024 only generates 160 bits.. How big a problem is that in practice ?

We also have the option to modify the negotiation, we could say with dh 2048, but generate the secret in a smaller value space. In openssl we can set the q parameter on the DH structure to determine the number of bytes to generate in DH_generate_key.. that seems less attractive an option than using dh 1024 however

If this doesnt hold water could somebody correct me ? thanks

dancl
  • 223
  • 1
  • 2
  • 6

2 Answers2

12

Let's define a few things clearly...

When you use an asymmetric algorithm like DH, it has a "strength" that relates to the difficulty of breaking through it with the best known attacks. The best known attacks on DH try to solve discrete logarithm with an index calculus variant that has a lot of common elements with the General Number Field Sieve, an integer factorization algorithm. They have the same asymptotic complexity, which is why it is often said that RSA-1024 and DH-1024 have "the same strength".

In practice, though, DH tends to be somewhat stronger than RSA of the same size. The last phase of GNFS involves doing linear algebra in a really big matrix; for RSA, the matrix elements are bits, while for DH the elements will be numbers with roughly the same size of the modulus -- so the matrix for DH-1024 will be about a thousand times larger than the matrix for RSA-1024. For large keys, this linear algebra is the bottleneck, precisely due to the sheer size of the matrix. Therefore, it is expected that DH-1024 is "more resistant" in some way than RSA-1024. We may note that the current record for RSA modulus factorization is 768 bits, while the discrete logarithm record is at 596 bits.

For 1024-bit factorization, we know that the problem exceeds the capacity of existing computers. We have some theoretical design concepts for a special-purpose machine that could factor a 1024-bit integer; it would take years to do a single computation, and building the machine would cost a substantial number of millions of dollars (but not billions -- at least "on the paper"). For DH-1024, I am not aware of a similar design or cost estimate.

Note that none of this is about the length of the key that you generate. What cryptographers like to work with are equivalent symmetric strengths. They want to say that the strength of an algorithm/key is, say, "100 bits", by which they mean: "the effort to break it would be equivalent to the effort involved in breaking through exhaustive search the 100-bit key for an ideal symmetric encryption algorithm". However, this is very reductive. Brute-forcing a symmetric key is about trying out all possible keys, which can be done in parallel and requires negligible RAM contents, only CPU. Brute-forcing scales up well, with negligible structural overhead. This does not hold at all for GNFS; in particular, the main difficulty in GNFS for large RSA or DH keys is that the final step with the matrix is very hard to compute in parallel and requires tons of very fast RAM.

This has not prevented various people and bodies to make estimates. You can browse these estimates there. To sum up, depending on who you are talking to, DH-1024 will be deemed to have about "80-bit strength", while DH-2048 rates closer to 112-bit strength.

Now 80-bit strength is still considerable. The biggest key-crunching effort reported so far is 64-bit. 80-bit is 65536 times harder. Specialized circuitry can probably lower costs a lot. But we are still talking millions of dollars, and not just a few millions; closer to a billion, really. So maybe this is sufficient for the value of the secrets that you want to hide in your embedded systems.


I note that "20 seconds" is a lot. Some years ago, I implemented a basic DH-2048 on a 33 MHz ARM7TDMI, i.e. not the fastest embedded system ever; and it could do a complete DH in less than 7 seconds. My code was pure C, and C is known to be suboptimal at leveraging the CPU abilities at multiplications for big-integer arithmetics. My guess is that going assembly would have allowed DH-2048 to run within 4 seconds or so, on the same platform.

One thing to take into account is the size of the exponent. When doing DH, you are computing modulo a big prime p, and raising two values to exponent x (the DH private key on your side). For 80-bit security, you need p to be a 1024-bit integer (at least) and you also need x to be chosen randomly of length 160 bits (because attacks are either GNFS over p, or a "generic DL" with cost sqrt(x)). For a 2048-bit prime, you get 112-bit security on the GNFS side, so you need the DH private key to be 224-bit. A bigger DH private key won't bring extra security, but it will make computations longer. Possibly, your current implementation uses an oversized DH private key (e.g. a 512-bit exponent), which may explain your issues.

For better performance, you might want to switch to elliptic curves (ECDH). Depending on what is supported on the other side, of course. ECDH provides the same level of security as DH with more arithmetic operations but with much smaller values, leading to a substantial performance boost (typically 5 to 20 times faster). A 224-bit curve provides "112-bit strength", i.e. similar to 2048-bit DH.

Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
  • that's awesome thanks. re 20s, that's the entire negotiation, when i run the dh unit test i get ~6s. I guess other parts of ipsec cause the rest, i think windows forces pfs which makes performance worse re x, openssl DH_generate_key seems to generate a secret as long as the public_key.. // dh_key.c l = dh->length ? dh->length : BN_num_bits(dh->p)-1; BN_rand(priv_key, l, 0, 0)); – dancl Oct 31 '14 at 17:51
  • @dancl you can indeed reduce the size of X (and thus the cost of Y) by setting dh->q, or more directly by setting dh->length. There's actually an official if weird way of doing this: generate *DSA* parameters and openssl chooses a subgroup size (Q) allowed by FIPS186-3 (160bit for 1024bit P, 256 for 2048 or 3072), then convert to *DH* parameters (can be done at commandline with `dhparam -dsaparam`, I kid you not) and it drops Q but keeps the length of Q and uses that for choosing X. – dave_thompson_085 Nov 01 '14 at 14:08
  • Note that Snowden docs indicate that DH 1024-bit is vulnerable to NSA. – Paul Draper Dec 30 '20 at 23:06
2

DH 1024 is not secure anymore against NSA, according to Snowden documents.

NSA precompute discrete logs for DH 512, DH 768 and DH 2014 and can break it in real time.

https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf

Kilo
  • 21
  • 1