
I set up HAProxy (1.6.3) on Ubuntu 16.04 to load-balance two web servers. From my earlier tests, each web server can handle over 20k requests/s. The web servers were tested with wrk2, and I verified the number of requests in the logs. However, with HAProxy in front of the web servers, throughput seems to be limited to about 6k requests/s. Is there anything wrong with my HAProxy config?

haproxy.cfg

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    maxconn     102400
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3

defaults
    log    global
    mode    http
    option    httplog
    option    dontlognull
    # https://serverfault.com/questions/504308/by-what-criteria-do-you-tune-timeouts-in-ha-proxy-config
    timeout connect 5000
    timeout check 5000
    timeout client  30000
    timeout server  30000
    timeout tunnel  3600s
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

listen web-test
    maxconn 40000  # the default is 2000
    mode http
    bind *:80
    balance roundrobin
    option forwardfor
    option http-keep-alive  # connections will no longer be closed after each request
    server test1 SERVER1:80 check maxconn 20000
    server test2 SERVER2:80 check maxconn 20000

When running 3 wrk instances in parallel, each gets approximately the same result:

./wrk -t4 -c100 -d30s -R4000 http://HAPROXY/
Running 30s test @ http://HAPROXY/
  4 threads and 100 connections
  Thread calibration: mean lat.: 1577.987ms, rate sampling interval: 7139ms
  Thread calibration: mean lat.: 1583.182ms, rate sampling interval: 7180ms
  Thread calibration: mean lat.: 1587.795ms, rate sampling interval: 7167ms
  Thread calibration: mean lat.: 1583.128ms, rate sampling interval: 7147ms
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     8.98s     2.67s   13.93s    58.43%
    Req/Sec   516.75     11.28   529.00     87.50%
  64916 requests in 30.00s, 51.69MB read
Requests/sec:   2163.75    # Requests/sec decreases slightly
Transfer/sec:      1.72MB

Stats from haproxy: (screenshot of the HAProxy stats page)

When running a single wrk instance directly against one of the web servers, without HAProxy:

./wrk -t4 -c100 -d30s -R4000 http://SERVER1
Running 30s test @ http://SERVER1
  4 threads and 100 connections
  Thread calibration: mean lat.: 1.282ms, rate sampling interval: 10ms
  Thread calibration: mean lat.: 1.363ms, rate sampling interval: 10ms
  Thread calibration: mean lat.: 1.380ms, rate sampling interval: 10ms
  Thread calibration: mean lat.: 1.351ms, rate sampling interval: 10ms
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.41ms    0.97ms  22.42ms   96.48%
    Req/Sec     1.05k   174.27     2.89k    86.01%
  119809 requests in 30.00s, 98.15MB read
Requests/sec:   3993.36     # Requests/sec is about 4k
Transfer/sec:      3.27MB

haproxy -vv

HA-Proxy version 1.6.3 2015/12/25
Copyright 2000-2015 Willy Tarreau <willy@haproxy.org>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2
  OPTIONS = USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.2g-fips  1 Mar 2016
Running on OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.38 2015-11-23
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

I know that ab is not a very precise benchmarking tool, but I expected HAProxy to give a better result than a single node. However, the results show the opposite.

ab test HAPROXY

ab -n 10000 -c 10 http://HAPROXY/
Requests per second:    4276.18 [#/sec] (mean)

ab test SERVER1

ab -n 10000 -c 10 http://SERVER1/
Requests per second:    9392.66 [#/sec] (mean)

ab test SERVER2

ab -n 10000 -c 10 http://SERVER2/
Requests per second:    8513.28 [#/sec] (mean)

The VM has a single core, so there is no need to use nbproc. I also monitored CPU and memory usage: all VMs stay below 30% CPU and 20% memory. There must be something wrong with my HAProxy config or my system config.
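
Since CPU and memory are not saturated, it may be worth ruling out ordinary Linux limits on the HAProxy VM first. These are generic checks using standard commands; nothing here is specific to this setup:

ulimit -n                                          # file-descriptor limit of the current shell
grep 'open files' /proc/$(pidof haproxy)/limits    # FD limit applied to the running haproxy process
sysctl net.core.somaxconn                          # cap on listen backlogs
sysctl net.ipv4.ip_local_port_range                # ephemeral ports available for haproxy -> backend connections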

I now get about the same performance from HAProxy as from a single server; the issue was the default maxconn of 2000 in the listen section, which I had missed. However, I expected performance to improve with more backend servers, and I still cannot achieve that.
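
A quick way to confirm which connection limit is actually in effect (assuming socat is installed; the socket path matches the stats socket line in the global section above) is to query the stats socket:

echo "show info" | socat stdio /run/haproxy/admin.sock | grep -i maxconn    # global Maxconn of the process
echo "show stat" | socat stdio /run/haproxy/admin.sock | cut -d, -f1,2,7    # pxname, svname, slim (per-proxy session limit) - the 2000 default shows up here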

With the same config, I have since upgraded to HAProxy 1.8.3, but it does not make much difference.

cwhsu
  • Adding the output from `haproxy -vv` *might* be a useful edit to the question. You should verify this against the latest HAProxy 1.6.x, which is currently 1.6.13. Version 1.6.3 is about two years behind that, according to the [release notes](https://www.haproxy.org/download/1.6/src/CHANGELOG). – Michael - sqlbot Jan 02 '18 at 12:53
  • I am aware that it is not the latest version of haproxy, and I plan to switch to the latest version later. Still, I wonder if there is something wrong with my config, or if it is something else that affects the result? – cwhsu Jan 02 '18 at 13:00
  • I see nothing obvious. Your output shows that `epoll` is being used, so that's good. – Michael - sqlbot Jan 02 '18 at 14:57
  • I am very confused now because I really thought that using haproxy would give a better result, but so far it shows the opposite... – cwhsu Jan 02 '18 at 14:59

1 Answer


HAProxy is single-threaded by default. To spawn a process for each core, use the nbproc option in the global configuration (the manual discourages this as 'hard to troubleshoot').

  1. Run haproxy in daemon mode.
  2. Set nbproc in the global section, e.g. for 4 processors spawn 4 daemons:

global
    nbproc 4
    cpu-map 1 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3

So daemon process 1 runs on CPU 0, and so on.

You can then explicitly bind those processes to endpoints, e.g.:

frontend http
   bind 0.0.0.0:80
   bind-process 1
frontend https
   bind 0.0.0.0:443 ssl crt /etc/yourdomain.pem
   bind-process 2 3 4
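
To verify that the processes end up pinned as intended (a generic check, not part of the configuration above), list the haproxy processes together with the CPU each one is currently running on:

ps -C haproxy -o pid,psr,args    # psr = processor the process is currently assigned to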

https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#daemon https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#3.1-cpu-map

Sum1sAdmin
  • But I only have one processor here, and I am sure it is not busy when handling the requests. – cwhsu Jan 03 '18 at 01:34
  • There are lots of posts on here about HAProxy-introduced latency; most of the advice is to upgrade/patch first. – Sum1sAdmin Jan 03 '18 at 11:09
  • I now have about the same performance after modifying the config, but I will definitely change to the latest version later. thanks! – cwhsu Jan 03 '18 at 13:03
  • I actually changed to the latest haproxy 1.8.3, but it seems that it does not have much effect on the performance. Do you have any suggestions about benchmarking haproxy? – cwhsu Jan 04 '18 at 05:56
  • this seems good https://medium.freecodecamp.org/how-we-fine-tuned-haproxy-to-achieve-2-000-000-concurrent-ssl-connections-d017e61a4d27 – Sum1sAdmin Jan 04 '18 at 08:46
  • I have read it a few times already – cwhsu Jan 04 '18 at 10:08