15

It's common knowledge that asymmetric encryption is in general much more expensive to compute than symmetric encryption, thus common practice is to use asymmetric encryption to establish a symmetric key for bulk data exchange. I'm not finding any information on exactly how much slower, though.

So, assuming I have some small piece of data X (say a 256 bit hash value), and I choose reasonable modern cryptographic algorithms, what is the rough relative cost to compute

  • X encrypted under some public key
  • X encrypted under some symmetric key
  • H(X), H being a cryptographic hash function
forest
  • 64,616
  • 20
  • 206
  • 257
Izz
  • 153
  • 1
  • 4
  • Have you tried using `openssl` with its benchmarking option? – forest May 09 '18 at 22:48
  • I haven't tried a benchmark myself, because I fear that what I do may not sufficiently resemble what a "production system" would do, and because I'm only looking for an order-of-magnitude answer. – Izz May 09 '18 at 22:49
  • While definitely interesting, I am not too sure if this question is on topic. – Tom K. May 10 '18 at 00:01
  • @TomK. Performance at a required security level seems to be on-topic. This question isn't so math-heavy that it belongs on Crypto.SE. – forest May 10 '18 at 00:03

1 Answers1

16

This depends on the algorithm. Especially with asymmetric cryptography, the speeds vary wildly. You may want to check out eBACS for more detailed and machine-independent benchmarking of various crypto primitives. As always, you need to perform your own benchmark on your own system to know exactly what to expect on a production system under the chosen conditions.

You should never use asymmetric ciphers to encrypt anything directly. If you encrypt anything other than a key, you can run into the risk of making cryptanalysis possible. RSA for example does not like encrypting non-random data. This is fine when you are encrypting a key (which is random, after all), but not if you are encrypting ASCII. Quoted from another answer (emphasis mine):

RSA has some operational constraints. With the most used variant (the one known as PKCS#1 v1.5), if the size of the RSA key is "1024 bits" (meaning that the central mathematical component of the key pair is a 1024-bit integer), then RSA can encrypt a message up to 117 bytes in length, and yields an encrypted message of length 128 bytes. That limited size, and the size increase when encrypting, are unavoidable consequences of the mathematical structure of the RSA encryption process. Due to these constraints, we do not usually encrypt data directly with RSA; instead, we select a small sequence of random bytes, which we call session key. We encrypt the session key with RSA; and then we use the session key with a symmetric encryption algorithm to process the whole message. This is called hybrid encryption.

Encrypting a small amount of data with a symmetric cipher will not let you benefit as much from its higher speed. Most symmetric ciphers have a certain one-time setup cost called the key schedule, where the key is processed and split up into separate subkeys for each component of the cipher, or spread out among the cipher's internal state. These subkeys are usually cached so they can be used directly each time the cipher is invoked. Each time a new key is used, the key schedule is run and must complete before anything can be encrypted. Some ciphers have very expensive key schedules, such as Blowfish, whose key schedule is equivalent to 521 Blowfish encryptions requiring 4 KiB of RAM and is the basis of the slow bcrypt function. Others have very simple key schedules, like TEA which simply splits the key up and mixes it with constants. When you are encrypting a large amount of data with a single key, you are able to benefit more from the speed of a fast cipher than if you are encrypting small chunks data independently.


A quick benchmark on a low-end system (Pentium III, 1 GHz) at medium load is below. Note that these benchmarks only show how fast these algorithms are on this specific machine. It may be accurate to within an order of magnitude, but a faster machine with different capabilities may not only be faster, the relative speed of the individual algorithms might be different! But to get an idea...

RSA, with different sizes:

OpenSSL> speed rsa
...
                  sign    verify    sign/s verify/s*
rsa  512 bits 0.001305s 0.000113s    766.2   8881.1
rsa 1024 bits 0.007685s 0.000372s    130.1   2687.5
rsa 2048 bits 0.050615s 0.001355s     19.8    738.1
rsa 4096 bits 0.353636s 0.005201s      2.8    192.3

AES, with different key sizes and input lengths:

OpenSSL> speed aes
...
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes     256 bytes    1024 bytes   8192 bytes
aes-128 cbc      21174.71k    23656.67k    24212.96k    39798.65k    40800.k
aes-192 cbc      18706.23k    20713.58k    21222.25k    34669.23k    35059.k
aes-256 cbc      16153.38k    17569.97k    18004.69k    30170.76k    30020.k

SHA-256, with various input lengths:

OpenSSL> speed sha256
...
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes    64 bytes     256 bytes    1024 bytes   8192 bytes
sha256            6537.11k    14781.14k    26256.93k    32630.49k    34562.k

* In textbook RSA, signature verification is identical to encryption, and signing is identical to decryption. Verification involves determining if se ≡ m (mod n), whereas encryption is running c ≡ me (mod n). In the real world, there are important differences involving special padding operations necessary to prevent RSA from being trivially broken.


I suppose you're wondering why there is this difference in performance. The reason has to do with how these cryptographic primitives work. A basic explanation of asymmetric ciphers, symmetric ciphers, and hash functions, along with what influences their performance, is below.

Asymmetric ciphers involve so-called hardness problems. These are open problems in mathematics which exploit the fact that an operation is easy to perform in one direction, but difficult to operate in reverse without a secret value. RSA uses the fact that it is trivial to multiply two huge prime numbers together, but prohibitively difficult to factor the resulting composite number back into the original primes without at least one of the original prime numbers. For textbook RSA, encrypting message m requires you compute c ≡ me (mod n). To decrypt c, compute m ≡ cd (mod n). Here, n is the product of the two secret primes, p and q, and d is the modular inverse of e (a public exponent which is often chosen to be a Fermat prime), found by calculating d ≡ e-1 mod (p - 1)(q - 1).

Asymmetric ciphers can vary wildly in speed because they all involve different mathematical concepts. Even distinct ciphers that use the same concepts (such as RSA and the Rabin cryptosystem) can vary in speed. These algorithms are often so slow because they require operating on very large numbers all at once, which is difficult for our modern CPUs that can only efficiently process 64-bit numbers (or a few times more, up to 512 bits for the most recent CPUs, when special optimizations are used). Modular arithmetic in particular is trivial when involving 64-bit integers on a 64-bit processor, but is very inefficient when massive 2048-bit numbers are involved.

Symmetric ciphers come in two types: stream ciphers and block ciphers*. Stream ciphers use the key to generate an endless deterministic stream of pseudorandom bytes (the keystream). This stream is combined with plaintext to form ciphertext. Decryption involves the same operation, combining the keystream with the ciphertext to obtain the plaintext. A block cipher on the other hand takes a key and uses it to permute an input of a fixed block, outputting a ciphertext block of the same size. A block cipher can be run in reverse with the same key to decrypt the ciphertext. Block ciphers must be run in a mode of operation to securely encrypt more than a single block of data with a key. Some modes allow for parallel implementations. Others must be serialized, whereas others still may be parallelized for encryption but not decryption, or vice versa.

Symmetric ciphers are very fast because they involve only the concepts of confusion (the output gives no hints about the input or key) and diffusion (a small change in either the input or the key causes a drastic change in the output). Unlike asymmetric ciphers which typically act on massive numbers, symmetric ciphers involve a large number of very simple key-dependent operations being repeated many times. Each operation is small enough to be computed efficiently by a CPU. In particular, ciphers using the ARX scheme (such as Salsa20) require only additions, rotations, and XOR operations on very small word-sized chunks of data, which can be implemented extremely efficiently on processors. There are no arithmetic operations on unwieldy numbers.

Hash functions are very similar to symmetric ciphers and they involve the same concepts of confusion and diffusion. In fact, hash functions are sometimes used to create ciphers, and vice versa. This can be seen with Salsa20, a stream cipher that uses a hash at its core, and Whirlpool, a hash function which uses a modified version of AES. A hash function takes an input of effectively unlimited size and outputs a digest of fixed size. Like a cipher, confusion prevents an attacker from learning anything about the input from the output, and diffusion amplifies small changes to the input.

Hashes are often slower than symmetric ciphers because they have more strict requirements. In particular, hashes must not only prevent an attacker from reversing the operation (a preimage attack) and modifying an input without the hash changing (a second preimage attack), it also must make it hard to generate a pair of inputs that create the same hash (a collision attack). Being collision-resistant is hard and tends to require a rather slow and complex hash. To put it into perspective, an attacker can collide the horribly broken MD4 hash with less effort than it takes to invoke the function twice. Even still, the best preimage attack on it is purely theoretical.

* Stream cipher encryption is C = P ⊕ EK and decryption is P = C ⊕ EK where P is plaintext, C is ciphertext, and EK is the keystream generated with key K. For a block cipher, encryption is C = EK(P) and decryption is P = DK(C) for a single block of P and C. A mode of operation is required to process more than a single P or C given a single K.

When using a hash function to construct a cipher or vice versa, the security properties demanded from the hash are different. For example, Salsa20 is a secure stream cipher despite the fact that it is built off of a hash function which lacks collision resistance. Likewise, a hash built out of a cipher requires the cipher be secure against related-key attacks, which is not necessarily required for a secure cipher on its own.

Glorfindel
  • 2,235
  • 6
  • 18
  • 30
forest
  • 64,616
  • 20
  • 206
  • 257
  • This answer muddles the distinction between key exchange and asymmetric encryption. – user2357112 May 10 '18 at 03:34
  • @user2357112 Key exchange is only one goal of asymmetric encryption, so I tried to avoid discussing it. Do you think it would be better for me to discuss how a hybrid cryptosystem works (how asymmetric crypto is used to exchange a session key)? – forest May 10 '18 at 03:38
  • Mostly, I find the second paragraph's treatment of DH and ECDH confusing. DH and ECDH aren't asymmetric encryption at all, but they're brought up without really explaining why they're brought up, in a context where it sounds like the answer is going to keep talking about weaknesses of asymmetric encryption systems. That makes it sound like DH and ECDH somehow qualify as asymmetric encryption. – user2357112 May 10 '18 at 04:06
  • @user2357112 Since OP mentioned encrypting using a public key, I included DH despite them technically not encrypting anything. But you're right, it is a bit out of place. I have removed it. – forest May 10 '18 at 04:12