
I have a FreeBSD server (16 cores with HT, SSD, 32 GB RAM) that receives about 40M HTTP requests daily. All requests are served by nginx + php-fpm.

[Two graphs: nginx connection states over time]

These graphs show that we have problems during the traffic peak. I'm not a sysadmin, so please explain what "Active connections", "Writing", "Waiting" and "Reading" mean, and why "Writing" increases when the server is unable to serve requests quickly.
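For reference, the four names match the counters reported by nginx's stub_status module. Assuming that is where the graphs come from (the question does not say), a minimal sketch of reading those counters could look like the following. The sample text is the example output from the nginx documentation, not data from this server, and `parse_stub_status` is a hypothetical helper name.

```python
import re

# Example stub_status output as shown in the nginx documentation.
# These numbers are NOT from the server in the question.
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Extract the connection counters from stub_status output.

    Reading: nginx is reading the request header.
    Writing: nginx is writing the response back to the client; the
             request has been read, so this also covers time spent
             waiting for a backend such as php-fpm.
    Waiting: idle keep-alive connections waiting for a request.
    Active:  all current client connections, including Waiting ones.
    """
    active = re.search(r"Active connections:\s*(\d+)", text)
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if active is None or states is None:
        raise ValueError("text does not look like stub_status output")
    reading, writing, waiting = (int(v) for v in states.groups())
    return {
        "active": int(active.group(1)),
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


status = parse_stub_status(SAMPLE)
print(status)
```

If this reading is right, a growing "Writing" count during the peak would be consistent with requests piling up while they wait for php-fpm, but that is an interpretation, not something the graphs alone prove.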

Here are some more graphs with CPU, memory and Load Average.

[Three graphs: CPU, memory, Load Average]

As you can see, nothing strange happens with CPU and memory, but Load Average also has a peak.

During this Load Average peak, I noticed an unserved queue on php-fpm.sock:

    netstat -Lan | grep php-fpm
    unix 2525/0/32246 /tmp/php-fpm.sock

The number of entries in the queue varies from 0 to 12000. When the value is 0, everything is fine and I get an HTTP response in 60-100 ms. When the value is 5000-12000, a response can take 3-10 seconds.
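To make the queue numbers easier to track over time, here is a minimal sketch that parses one row of `netstat -Lan` output. On FreeBSD the three slash-separated numbers are the count of unaccepted connections, the count of unaccepted incomplete connections, and the maximum queue length. The names `ListenQueue`, `parse_listen_queue` and `implied_throughput` are hypothetical. The throughput estimate applies Little's law and assumes that the 12000-entry queue goes with the 10-second response time and that the delay is dominated by queueing, which the question does not confirm.

```python
import re
from typing import NamedTuple


class ListenQueue(NamedTuple):
    proto: str
    qlen: int      # connections waiting to be accepted
    incqlen: int   # incomplete connections
    maxqlen: int   # configured queue limit
    address: str


# Matches one data row of `netstat -Lan`, for example:
# "unix 2525/0/32246 /tmp/php-fpm.sock"
_ROW = re.compile(r"^(\S+)\s+(\d+)/(\d+)/(\d+)\s+(\S+)$")


def parse_listen_queue(line: str) -> ListenQueue:
    match = _ROW.match(line.strip())
    if match is None:
        raise ValueError(f"not a listen-queue row: {line!r}")
    proto, qlen, incqlen, maxqlen, address = match.groups()
    return ListenQueue(proto, int(qlen), int(incqlen), int(maxqlen), address)


def implied_throughput(qlen: int, wait_seconds: float) -> float:
    """Little's law (L = rate * W) rearranged to rate = L / W.

    Assumption: steady state, with wait time dominated by the queue.
    """
    return qlen / wait_seconds


row = parse_listen_queue("unix 2525/0/32246 /tmp/php-fpm.sock")
print(f"{row.address}: {row.qlen} of {row.maxqlen} slots used "
      f"({row.qlen / row.maxqlen:.1%})")

# Hypothetical pairing of the worst-case figures from the question.
print(f"implied drain rate: {implied_throughput(12000, 10.0):.0f} req/s")
```

Comparing the implied drain rate with the arrival rate during the peak would give a rough idea of how far php-fpm falls behind, though only under the assumptions stated above.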

I've also checked whether there are any unusual processes in top, but was unable to find anything.

Here is a top screenshot taken a few minutes ago (right now everything is OK, no traffic peak): [screenshot of top output]

My conclusion: judging by the CPU and memory graphs, this server could serve more requests, but php-fpm is not working optimally, which makes that impossible during traffic peaks.

Any suggestions on how this problem can be solved?

Will
Kirzilla
  • Your CPU usage graph isn't tracking I/O wait, which is what you need to be looking at. You should fix this graph as soon as possible. – Michael Hampton Mar 24 '14 at 11:58
  • From this information, the problem seems to be around PHP. It can be useful to run top from cron every minute and then look at the states of php-fpm at the time the problem was observed. Sample script to run from cron: https://gist.github.com/citrin/9742337#file-top-sh – citrin Mar 24 '14 at 15:25
  • @citrin, thank you! I'll add it now and we'll see in the morning. – Kirzilla Mar 24 '14 at 15:29
  • Yesterday I moved memcached from TCP to a Unix socket. Memcached totally disappeared from top. Today I haven't seen any failures. Magic? I have to investigate more... – Kirzilla Mar 25 '14 at 15:26
