0

I'm running a Debian system with lighttpd, PHP 5, XCache and FastCGI. It has 2 GB of RAM and 2 cores, with less than 10% CPU load in 5-minute averages at peak time and less than 1 GB of RAM in use.

The system runs a custom-built web app that scrapes flight search sites. Caching of results is not allowed, so every search is made in real time. The code that does it uses libcurl and can probably run for quite a few seconds per search. There's also an OpenX ad system.

Recently the site seems to time out intermittently, and I have created a simple test script that just prints a word, to make sure the problem is not related to the MySQL database.
From what I understand, since we run an opcode cache we should not run many FastCGI "max-procs" (because each process would use its own cache, I assume), but instead increase the number of children.
The number of children was increased from 20 (with 2 max-procs) to 32, with no noticeable difference. From what I understand, the number of scripts that can run simultaneously is max-procs * children. Looking at the output of status.statistics-url while scripts take ages to run doesn't seem to indicate that all children are busy.

Is the correct approach to keep increasing the number of FastCGI children, or what else is there to do? Is it possible to see which scripts are currently running, how long they have been running, and so on?

fastcgi.active-requests: 39
fastcgi.backend.0.0.connected: 2259
fastcgi.backend.0.0.died: 0
fastcgi.backend.0.0.disabled: 0
fastcgi.backend.0.0.load: 19
fastcgi.backend.0.0.overloaded: 0
fastcgi.backend.0.1.connected: 4646
fastcgi.backend.0.1.died: 0
fastcgi.backend.0.1.disabled: 0
fastcgi.backend.0.1.load: 20
fastcgi.backend.0.1.overloaded: 0
fastcgi.backend.0.load: 39
fastcgi.requests: 6905
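
As a rough sketch of the max-procs * children arithmetic described above, the counters can be compared against the configured capacity. The helper names `parse_counters` and `backend_utilisation` are hypothetical, the sample is a subset of the counters shown above, and it is an assumption that those counters were captured with the 2 max-procs and 32 children configuration. The sketch is in Python for brevity; it is not part of the PHP application.

```python
def parse_counters(text):
    """Parse 'key: value' counter lines as printed by status.statistics-url."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value.strip())
    return counters


def backend_utilisation(counters, max_procs, children):
    """Return (busy, capacity, fraction), assuming capacity = max_procs * children."""
    capacity = max_procs * children
    busy = counters.get("fastcgi.active-requests", 0)
    return busy, capacity, busy / capacity


# Subset of the counters quoted in the question.
SAMPLE = """\
fastcgi.active-requests: 39
fastcgi.backend.0.0.load: 19
fastcgi.backend.0.1.load: 20
fastcgi.backend.0.load: 39
fastcgi.requests: 6905
"""

if __name__ == "__main__":
    busy, capacity, fraction = backend_utilisation(
        parse_counters(SAMPLE), max_procs=2, children=32
    )
    print(f"{busy}/{capacity} workers busy ({fraction:.0%})")
```

Under that assumption, 39 active requests against 64 workers is consistent with the observation that not all children are busy. With the earlier setting of 20 children the same load would have been 39 of 40.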


10-fastcgi.conf:
"max-procs" => 2,
"idle-timeout" => 20,
"bin-environment" => (
"PHP_FCGI_CHILDREN" => "32",
"PHP_FCGI_MAX_REQUESTS" => "500"


lighttpd error log, loads of these:
2011-05-30 09:45:48: (server.c.1258) NOTE: a request for /index.php?//search/poll timed out after writing 15180 bytes. We waited 360 seconds. If this a problem increase server.max-write-idle
2011-05-30 09:49:08: (server.c.1258) NOTE: a request for /index.php?// timed out after writing 12420 bytes. We waited 360 seconds. If this a problem increase server.max-write-idle

3molo

3 Answers

1

Change your PHP binary to FPM instead of old fastcgi.

FPM (FastCGI Process Manager) is an alternative PHP FastCGI implementation with some additional features (mostly) useful for heavy-loaded sites.

It is much more stable, and you shouldn't have timeout problems with it.

vartec
  • Even though I agree php-fpm is supposed to be better, it does not solve a problem like mine. See my answer for an explanation. – 3molo Jun 03 '11 at 12:41
1

Like vartec said, PHP-FPM is probably a good idea here. Note that the PHP 5.2 version does not support dynamic process spawning (despite it being a configurable option), so you have to make sure you have enough workers to handle all your traffic spikes.

If you switch over to PHP-FPM, one benefit would be the opcode cache being shared between all your PHP processes (something that's possible to achieve with the lighttpd method, but a bit more annoying).

What kind of requests/sec are you seeing? I generally try to run one PHP process for every request/sec the server is seeing. This may not be the best idea on a relatively low memory system, but I haven't run into any issues yet.

Are you using a Unix socket or TCP/IP to connect lighttpd and PHP? You should definitely switch over to Unix sockets if you are using TCP/IP. I've seen all sorts of intermittent, tough-to-diagnose issues when using TCP/IP. You may be hitting firewall limits or connection limits with TCP/IP.

Are you monitoring with something like Munin? It would probably be handy to have graphs of traffic load, server load, MySQL load, etc. Having the graphs won't fix your issue by itself, but they will make it much easier to diagnose.

devicenull
0

One of the suppliers had problems responding to queries, so all searches on the site queued up a lot of requests waiting to be executed by FastCGI. It seems there's no real fix on the FastCGI side. The fix is rather to implement a proper timeout in the code, and possibly to detect when suppliers are unresponsive and stop sending more queries to them.
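
A minimal sketch of that idea, assuming a per-request timeout plus a simple failure counter that skips a supplier for a while once it stops answering. It is written in Python for brevity rather than in the site's PHP, and `SupplierBreaker`, `query_supplier` and all thresholds are hypothetical names and values, not the code actually deployed. In PHP with libcurl, the timeout part would correspond to setting `CURLOPT_CONNECTTIMEOUT` and `CURLOPT_TIMEOUT`.

```python
import time
import urllib.request


class SupplierBreaker:
    """Skip a supplier for `cooldown` seconds after `max_failures` consecutive failures."""

    def __init__(self, max_failures=3, cooldown=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # time at which the supplier was marked unresponsive

    def allow(self):
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown:
            # Cooldown elapsed: resume sending queries and start counting again.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record(self, ok):
        if ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def query_supplier(url, breaker, timeout=5.0, fetch=None):
    """Return the response body, or None if the supplier is skipped or fails."""
    if not breaker.allow():
        return None  # supplier currently considered unresponsive
    if fetch is None:
        # Default fetcher: the timeout bounds how long a worker can be tied up.
        fetch = lambda u, t: urllib.request.urlopen(u, timeout=t).read()
    try:
        body = fetch(url, timeout)
    except OSError:  # URLError and socket timeouts are OSError subclasses
        breaker.record(False)
        return None
    breaker.record(True)
    return body
```

The point of the design is that a slow supplier costs at most `timeout` seconds per request, and after a few failures it costs nothing at all until the cooldown expires, so workers are not tied up waiting on it.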

Also, I switched to php-fpm and am now monitoring "/status" continuously in order to detect problems like these early.
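
As an illustration of what such monitoring could check, here is a small sketch that parses the plain-text php-fpm status output and flags saturation. The field names ("active processes", "total processes", "listen queue") are taken from the php-fpm status page, though which fields are present depends on the php-fpm version. The sample values, the 0.8 threshold and the function names are made up for the example, and fetching the page itself is left out.

```python
def parse_fpm_status(text):
    """Parse 'key: value' lines from the plain-text php-fpm status page."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            value = value.strip()
            status[key.strip()] = int(value) if value.isdigit() else value
    return status


def saturation_warnings(status, busy_ratio=0.8):
    """Return human-readable warnings; busy_ratio is an arbitrary example threshold."""
    warnings = []
    active = status.get("active processes", 0)
    total = status.get("total processes", 0)
    if total and active / total >= busy_ratio:
        warnings.append(f"{active}/{total} workers busy")
    queued = status.get("listen queue", 0)
    if queued > 0:
        warnings.append(f"{queued} requests waiting in listen queue")
    return warnings


# Illustrative sample only, not real output from the site.
SAMPLE_STATUS = """\
pool:                 www
process manager:      static
listen queue:         4
idle processes:       2
active processes:     30
total processes:      32
"""

if __name__ == "__main__":
    for warning in saturation_warnings(parse_fpm_status(SAMPLE_STATUS)):
        print("WARNING:", warning)
```

A check like this, run from cron or a monitoring agent, would have shown the workers filling up while one supplier was slow.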

3molo