
I recently switched my php7.3-fpm configuration to use UNIX sockets instead of listening on localhost:9000. This solved a lag problem (every now and then a request took over a second for no apparent reason). But now, using UNIX sockets, I get "503 Service Unavailable" errors under high load, and only for PHP scripts. Static files are still delivered without error, so it must be something related to PHP.

I've spike-tested the site using k6. At ~400 req/s hammering a simple script, about 90% of requests fail with 503 errors, whereas over TCP the site just slows down more and more and keeps (sort of) working even at 800 req/s, with only ~0.5% of requests failing. I suspect there is some limit on connections to the UNIX socket. To get TCP working at those high loads I had to increase the backlog size in /etc/php/7.3/fpm/pool.d/www.conf (to 10240, which is 10x the default), but I didn't find anything similar for UNIX sockets in the config. I also tried increasing the number of file descriptors lighttpd may use (server.max-fds), without success.
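One way to inspect the relevant limits is the following diagnostic sketch (assuming a Linux host with iproute2's `ss` available; the socket path and port are taken from the configs below):

```shell
# Kernel-wide cap on listen() backlogs; a larger backlog requested by an
# application is silently truncated to this value
cat /proc/sys/net/core/somaxconn

# For a listening TCP socket, Send-Q shows the effective accept backlog
ss -ltn 'sport = :9000'

# List the listening UNIX socket used by php-fpm
ss -lx | grep php7.3-fpm
```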

Since UNIX sockets are more efficient, I'd like to keep using them, especially as they more or less solved the lag issue, and they don't have to pass through the firewall (which, since some newer kernel version, for some reason filters localhost connections). There might be a connection limit on UNIX sockets, but I don't know how to set it.

Questions

Do you know WHY this happens, or do you even have a solution for the problem?

Versions / Operating System / Config

lighttpd: 1.4.53 (cannot update to a more recent version!)
php-fpm: 7.3
OS: Debian 9.13
Kernel: Linux worldtalk.de 4.9.0-15-amd64 #1 SMP Debian 4.9.258-1 (2021-03-08) x86_64 GNU/Linux

/etc/lighttpd/lighttpd.conf

.
.
.
server.port                 = 80
server.stream-request-body  = 2
server.stream-response-body = 2
server.listen-backlog = 2000
server.max-keep-alive-idle = 2
server.max-keep-alive-requests = 4
server.max-read-idle = 25
server.max-write-idle = 25
server.max-fds = 10240
.
.
.

/etc/lighttpd/conf-enabled/10-fastcgi.conf

For UNIX sockets:

server.modules += ( "mod_fastcgi" )

index-file.names += ( "index.php" )

fastcgi.server = (
    ".php" => (
      "localhost" => (
        "socket"                => "/run/php/php7.3-fpm.sock",
        "broken-scriptfilename" => "enable"
      ))
)

and for TCP:

server.modules += ( "mod_fastcgi" )

index-file.names += ( "index.php" )

fastcgi.server = ( ".php" => ((
                        "host" => "127.0.0.1",
                        "port" => "9000",
                        "broken-scriptfilename" => "enable"
                )))

/etc/php/7.3/fpm/pool.d/www.conf

user = www-data
group = www-data

listen = 127.0.0.1:9000
;listen = /run/php/php7.3-fpm.sock  <--- THIS IS USED FOR UNIX SOCKETS
listen.backlog = 10240
listen.owner = www-data
listen.group = www-data
process.priority = -18

pm = dynamic
pm.max_children = 2000
pm.start_servers = 15
pm.min_spare_servers = 10
pm.max_spare_servers = 15
pm.max_requests = 0

.
.
.

--

SDwarfs

1 Answer


After some more research I found the solution here: How to set unix socket backlog with systemd?

Which essentially says...

From listen(2):

If the backlog argument is greater than the value in /proc/sys/net/core/somaxconn, then it is silently truncated to that value; the default value in this file is 128. In kernels before 2.4.25, this limit was a hard coded value, SOMAXCONN, with the value 128.

So I've set "somaxconn" to 10240 so it no longer limits the backlog for UNIX sockets. This did increase the backlog size of the UNIX socket and resolved the issue:

echo "10240" > /proc/sys/net/core/somaxconn

for a quick fix and testing. To make it permanent, add the following line to /etc/sysctl.conf:

net.core.somaxconn = 10240
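For completeness, a sketch of applying the change without a reboot (the php-fpm restart is an assumption on my part: the backlog is fixed at the moment php-fpm calls listen(), so the socket has to be re-created; the unit name is Debian's default):

```shell
# Load the new value from /etc/sysctl.conf
sysctl -p

# Verify the new cap is in effect
sysctl -n net.core.somaxconn

# Re-create the listening socket so php-fpm picks up the larger backlog
systemctl restart php7.3-fpm
```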

SDwarfs