
How can I optimize HAProxy with SSL Termination to Nginx backends on Ubuntu?

The setup works fine and routes properly. However, when I perform SSL termination with HAProxy, there's a huge performance hit (tests below). The key is 4096-bit RSA. HAProxy redirects HTTP to HTTPS, terminates SSL, and talks plain HTTP to the backend Nginx servers. The Nginx servers are identical and serve multiple static pages, e.g. 192.168.1.xx/page1.html, 192.168.1.xx/page2.html, etc. (I included NodeJS for completeness of my system, but it only adds <1ms of delay. NodeJS can be ignored.)

Here are the setup, configs, and current tests. Each virtual machine (VM) runs Ubuntu 14.04 and can be assigned a variable number of CPUs and amount of RAM.

  • HAProxy (1.5.14): 192.168.1.10
  • Nginx-1: 192.168.1.20
  • Nginx-2: 192.168.1.21
  • Nginx-3: 192.168.1.22
  • NodeJS-1: 192.168.1.30

Here is the HAProxy config:

    global
            maxconn 40000
            tune.ssl.default-dh-param 2048
            log /dev/log    local0
            log /dev/log    local1 notice
            chroot /var/lib/haproxy
            stats socket /run/haproxy/admin.sock mode 660 level admin
            stats timeout 30s
            user haproxy
            group haproxy

            # Default SSL material locations
            ca-base /etc/ssl/certs
            crt-base /etc/ssl/private

            # Default ciphers to use on SSL-enabled listening sockets.
            # For more information, see ciphers(1SSL). This list is from:
            #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
            ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
            ssl-default-bind-options no-sslv3


    defaults
            option forwardfor
            option http-server-close
            stats enable
            stats uri /stats
            stats realm Haproxy\ Statistics
            stats auth user:password
            log     global
            mode    http
            option  httplog
            option  dontlognull
            timeout connect 5000
            timeout client 50000
            timeout server 50000
            errorfile 400 /etc/haproxy/errors/400.http
            errorfile 403 /etc/haproxy/errors/403.http
            errorfile 408 /etc/haproxy/errors/408.http
            errorfile 500 /etc/haproxy/errors/500.http
            errorfile 502 /etc/haproxy/errors/502.http
            errorfile 503 /etc/haproxy/errors/503.http
            errorfile 504 /etc/haproxy/errors/504.http


    frontend www-http
            bind 192.168.1.10:80
            reqadd X-Forwarded-Proto:\ http
            default_backend www-backend

    frontend www-https
            bind 192.168.1.10:443 ssl crt /etc/ssl/private/company.pem
            reqadd X-Forwarded-Proto:\ https
            use_backend node-backend if { path_beg /socket.io }
            default_backend www-backend

    backend www-backend
            redirect scheme https if !{ ssl_fc }
            server www-1 192.168.1.20:80 check
            server www-2 192.168.1.21:80 check
            server www-3 192.168.1.22:80 check

    backend node-backend
            server node-1 192.168.1.30:8888 check

Here is an ApacheBench (ab) test against one of the Nginx servers directly:

    $ ab -c 200 -n 10000 http://192.168.1.20/

    Server Software:        nginx/1.4.6
    Server Hostname:        192.168.1.20
    Server Port:            80

    Document Path:          /
    Document Length:        3130 bytes

    Concurrency Level:      200
    Time taken for tests:   2.257 seconds
    Complete requests:      10000
    Failed requests:        0
    Total transferred:      33720000 bytes
    HTML transferred:       31300000 bytes
    Requests per second:    4430.21 [#/sec] (mean)
    Time per request:       45.145 [ms] (mean)
    Time per request:       0.226 [ms] (mean, across all concurrent requests)
    Transfer rate:          14588.55 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        4   27 104.3     16    1187
    Processing:     4   18   8.2     16     358
    Waiting:        3   18   7.9     16     334
    Total:          9   45 105.8     32    1225

    Percentage of the requests served within a certain time (ms)
      50%     32
      66%     41
      75%     43
      80%     44
      90%     49
      95%     52
      98%     55
      99%     57
     100%   1225 (longest request)

Here is an ApacheBench (ab) test against HAProxy over HTTP:

    $ ab -c 200 -n 10000 http://192.168.1.10/

    Server Software:        nginx/1.4.6
    Server Hostname:        192.168.1.10
    Server Port:            80

    Document Path:          /
    Document Length:        3130 bytes

    Concurrency Level:      200
    Time taken for tests:   1.918 seconds
    Complete requests:      10000
    Failed requests:        0
    Total transferred:      33720000 bytes
    HTML transferred:       31300000 bytes
    Requests per second:    5215.09 [#/sec] (mean)
    Time per request:       38.350 [ms] (mean)
    Time per request:       0.192 [ms] (mean, across all concurrent requests)
    Transfer rate:          17173.14 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        3   18   3.5     18      32
    Processing:     7   20   3.5     19      36
    Waiting:        7   20   3.4     19      36
    Total:         15   38   4.2     37      57

    Percentage of the requests served within a certain time (ms)
      50%     37
      66%     38
      75%     39
      80%     40
      90%     44
      95%     46
      98%     50
      99%     51
     100%     57 (longest request)

Here is an ApacheBench (ab) test against HAProxy over HTTPS:

    $ ab -c 200 -n 10000 https://192.168.1.10/

    Server Software:        nginx/1.4.6
    Server Hostname:        192.168.1.10
    Server Port:            443
    SSL/TLS Protocol:       TLSv1,DHE-RSA-AES256-SHA,2048,256

    Document Path:          /
    Document Length:        3130 bytes

    Concurrency Level:      200
    Time taken for tests:   566.303 seconds
    Complete requests:      10000
    Failed requests:        0
    Total transferred:      33720000 bytes
    HTML transferred:       31300000 bytes
    Requests per second:    17.66 [#/sec] (mean)
    Time per request:       11326.069 [ms] (mean)
    Time per request:       56.630 [ms] (mean, across all concurrent requests)
    Transfer rate:          58.15 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:      483 8982 3326.6  11090   14031
    Processing:    16 2255 3313.0     43   11399
    Waiting:       14 2186 3253.3     35   11252
    Total:       5648 11237 879.1  11212   22732

    Percentage of the requests served within a certain time (ms)
      50%  11212
      66%  11274
      75%  11308
      80%  11321
      90%  11395
      95%  11641
      98%  11847
      99%  14063
     100%  22732 (longest request)

Here is the openssl speed test on the HAProxy VM:

    $ openssl speed rsa

                      sign    verify    sign/s verify/s
    rsa  512 bits 0.000081s 0.000006s  12314.6 179042.8
    rsa 1024 bits 0.000277s 0.000017s   3603.7  60563.8
    rsa 2048 bits 0.001852s 0.000058s    539.8  17231.3
    rsa 4096 bits 0.013793s 0.000221s     72.5   4517.4

So, the way I'm looking at it, HAProxy cannot outperform the openssl speed results of 72.5 sign/s and 4517.4 verify/s. However, HAProxy with SSL termination is handling around 17 requests/s. Of course we could use a smaller key to boost overall performance, but that doesn't explain (if a problem even exists) the ~4x gap between the openssl speed test and HAProxy.
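To make that comparison concrete, here is the arithmetic spelled out (numbers taken from the benchmarks above):

```python
# Ceiling implied by the openssl speed numbers vs. the observed ab results.
sign_per_sec = 72.5      # RSA-4096 signs/s from `openssl speed rsa`
observed_rps = 17.66     # requests/s from the ab HTTPS run

per_sign_ms = 1000.0 / sign_per_sec   # cost of one RSA-4096 sign
gap = sign_per_sec / observed_rps     # how far below the signing ceiling we are

print(round(per_sign_ms, 1))  # 13.8 ms per sign
print(round(gap, 1))          # 4.1x below the raw single-core signing rate
```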

So, given this information, is there an optimal HAProxy configuration that will increase performance? Or am I missing something in general? For instance: when a user visits a page for the first time, does the server only have to 'sign' once, with subsequent requests from that user only needing a 'verify'? If that's the case, then the ab test isn't measuring that appropriately (correct me if I'm wrong). And, for this to occur, does the user have to hit the same Nginx server? If so, does that require sticky sessions?
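One way to check the resumption question directly is openssl s_client -reconnect, which performs one full handshake against the frontend and then reconnects five times with the same session; the reconnects should report Reused if HAProxy's session cache is being hit (a sketch against the frontend address from the question):

```
# Lines starting with "New" are full handshakes; "Reused" means the
# session was resumed from HAProxy's session cache.
echo | openssl s_client -connect 192.168.1.10:443 -reconnect 2>/dev/null \
    | grep -E '^(New|Reused)'
```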

In an attempt to answer my own questions, I tried adding sticky sessions from this post: HAProxy with SSL and sticky sessions, and used Siege to test with multiple URLs. However, there still wasn't a performance increase.

    $ siege -c 100 -r 10 -b -f urls.txt

    Transactions:               1000 hits
    Availability:             100.00 %
    Elapsed time:              22.56 secs
    Data transferred:           2.84 MB
    Response time:              2.17 secs
    Transaction rate:          44.33 trans/sec
    Throughput:             0.13 MB/sec
    Concurrency:               96.06
    Successful transactions:        1000
    Failed transactions:               0
    Longest transaction:            8.96
    Shortest transaction:           0.16

Where urls.txt is

    URL=https://192.168.1.10/
    $(URL)
    $(URL)page1.html
    $(URL)page2.html

So, am I stuck with this performance? Some places mention a similar cost of ~75 ms/request for 4096-bit keys: https://certsimple.com/blog/measuring-ssl-rsa-keys

Or is my HAProxy poorly configured and processing SSL twice somewhere? ServerFault: /questions/574582/nginx-ssl-termination-slow

  • FWIW, I worked around this once by running one HAproxy for SSL only, using `nbproc` to scale across multiple CPU cores. The requests are handed to the proper HAproxy (which has ACLs etc.) through a unix socket. Would love to have faster SSL instead myself :-) – Felix Frank Aug 05 '15 at 12:43
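For reference, a rough sketch of that two-tier layout in HAProxy 1.5 syntax (untested here; the socket path, process count, and the send-proxy/accept-proxy pairing are assumptions):

```
# haproxy-ssl.cfg -- terminates TLS only, spread over 4 processes
global
    nbproc 4

frontend tls-in
    mode tcp
    bind 192.168.1.10:443 ssl crt /etc/ssl/private/company.pem
    default_backend to-main

backend to-main
    mode tcp
    server main unix@/run/haproxy-main.sock send-proxy

# haproxy-main.cfg -- single process, holds the ACLs and backends
frontend www-https
    bind unix@/run/haproxy-main.sock accept-proxy
    use_backend node-backend if { path_beg /socket.io }
    default_backend www-backend
```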

2 Answers


One thing to consider is that many HTTP clients (including browsers) try to amortize the cost of the SSL handshake out over several HTTP requests. That is, they make one TCP connection to the server, perform the SSL handshake, and then reuse that TCP connection (with its SSL session) for multiple requests/responses, rather than performing an SSL handshake per request.

To enable this kind of scenario/test in your setup, you might include the -k command-line option for ab, for your frontend HTTP requests.
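For example, re-running the HTTPS benchmark from the question with keep-alive enabled:

```shell
# Same HTTPS benchmark, but with keep-alive so each SSL handshake
# is amortized over many requests on the same connection.
ab -k -c 200 -n 10000 https://192.168.1.10/
```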

Another consideration is your use of the option http-server-close HAProxy setting. This tells HAProxy to open a new backend TCP connection per frontend HTTP request; that can add its own 40-100ms (if not more), depending on the backend network. If you allowed HAProxy to keep those backend TCP connections open, that too might reduce the per-request latency reported by ab.
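In HAProxy 1.5 that would mean swapping the close option in the defaults section from the question, e.g. (a sketch; connection-reuse behavior varies between HAProxy versions):

```
defaults
        # keep both frontend and backend connections open between requests
        option http-keep-alive
        # option http-server-close   <-- remove or comment this out
```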

Depending on the number of SSL sessions you anticipate, it may also help to increase the memory allocated to SSL session caching (HAProxy's tune.ssl.cachesize setting, perhaps together with tune.ssl.lifetime to lengthen the cache timeout and raise the likelihood of cache hits). That allows more SSL sessions to be resumed, making handshakes faster and less computationally intensive.
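Concretely, in the global section, something like the following (the values are illustrative assumptions, not recommendations):

```
global
        tune.ssl.cachesize 100000   # number of cached sessions (default 20000)
        tune.ssl.lifetime 600       # seconds a session stays valid (default 300)
```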

But I think the numbers reported by ab when using keep-alive (-k) will better demonstrate the effectiveness of reusing the same SSL session (via the same TCP connection) across many HTTP requests.

Hope this helps!

Castaglia

Compare apples to apples.

Your openssl speed rsa benchmark probably doesn't measure the DHE key exchange, because it only times the raw RSA operations and doesn't use ephemeral keys. In other words, you benchmarked a cheaper, less secure exchange while HAProxy is negotiating a DHE cipher; DHE is slower, but on the other hand it provides perfect forward secrecy (PFS).

But DHE is quite old and inefficient; modern browsers usually negotiate the faster ECDHE instead (or even ECDSA).

I think you should set up your ab benchmark to enforce ECDHE-RSA-AES128-SHA256, and rework your openssl benchmark into a loop of openssl s_client -cipher ECDHE-RSA-AES128-SHA256 connections (instead of the simplistic openssl speed).
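A rough version of that loop, plus openssl s_time as a ready-made alternative (both assume the HAProxy frontend from the question):

```shell
# Time 100 full ECDHE-RSA handshakes against the frontend.
time for i in $(seq 1 100); do
    echo | openssl s_client -connect 192.168.1.10:443 \
        -cipher ECDHE-RSA-AES128-SHA256 >/dev/null 2>&1
done

# Or let openssl do the measuring: new handshakes only, for 10 seconds.
openssl s_time -connect 192.168.1.10:443 \
    -cipher ECDHE-RSA-AES128-SHA256 -new -time 10
```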

kubanczyk