Most of these operations are "trivial": they replace combinations of two or three existing opcodes. For instance, the BLSR type of instruction is, as specified in the page you link to, equivalent to a subtraction followed by a bitwise AND. This could already be done. Extra operations don't hurt, compilers will benefit from them, and, undoubtedly, some cryptographic functions will gain a few cycles through the use of some of these opcodes, but there is no groundbreaking result to expect.
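As a minimal illustration of that equivalence (the function name is mine, for illustration only, not any standard API):

```c
#include <stdint.h>

/* BLSR ("reset lowest set bit") computes x & (x - 1): a subtraction
   followed by a bitwise AND, i.e. two classic opcodes fused into one. */
static uint64_t blsr_emulated(uint64_t x)
{
    return x & (x - 1);
}
```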
Among the instructions, the most interesting are the counting ones (LZCNT, TZCNT), because counting the number of leading or trailing zeros in a register of N bits has cost O(log N) when using "classic" opcodes. These operations are useful in some corner cases of computations on big integers. In particular, I think this will help with binary GCD, which is used to compute divisions in finite fields -- an important step in computations over elliptic curves. Right now, division in finite fields is so expensive (compared with multiplication) that it is worthwhile to use projective coordinates when computing things on elliptic curves: this implies about ten times as many operations, but avoids a lot of divisions, so it adds up to a net gain. A fast implementation of binary GCD might change that picture a bit. For more uses of counting operations, see this Wikipedia page.
What may really boost things is not the new instructions, but the larger registers. AVX2 offers 256-bit registers; that's room enough to compute eight SHA-256 instances in parallel, just as SSE2 allows for four parallel SHA-256. Password cracking software will benefit from this... (although GPUs are arguably better). There is ongoing work on defining tree hashing modes which turn a given hash function into a system that benefits from parallelism; e.g. see this article (from the Keccak, aka SHA-3, designers). NIST has expressed its intention to define a standard in that respect.
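To illustrate what "eight SHA-256 in parallel" means at the register level, here is a hedged AVX2 fragment (the function name is mine) computing just the Ch step of the SHA-256 round across eight independent hash states, one 32-bit working variable per lane; a full implementation would apply the same lane-wise pattern to every step of the compression function:

```c
#include <immintrin.h>

/* Eight SHA-256 states side by side: each 256-bit register holds the
   same 32-bit working variable (here e, f, g) of eight independent
   hashes. Ch(e,f,g) = (e AND f) XOR ((NOT e) AND g), computed on all
   eight lanes at once. Compile with -mavx2. */
static __m256i ch_8way(__m256i e, __m256i f, __m256i g)
{
    /* _mm256_andnot_si256(a, b) yields (NOT a) AND b */
    return _mm256_xor_si256(_mm256_and_si256(e, f),
                            _mm256_andnot_si256(e, g));
}
```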
Of course, it still takes a fairly specific scenario to benefit from such CPU gains. In most uses of hashing (or encryption, or whatever), the cryptography is not the bottleneck; I/O is. When you hash files, you cannot hash them faster than you can read them from the disk.