A few days ago my Apache server started intermittently hogging the CPU. During these episodes the site becomes sluggish and unresponsive.

Here is what mod_status shows during an incident:

  • Nearly all connections are in the waiting state
  • A few connections are in the Keep-Alive state. All of them have Req column values (Milliseconds required to process most recent request) between 15 and 40 seconds, which is 10-20x longer than usual
  • During this time, top shows 100% usage on all 4 CPUs, with 5-7 Apache child processes at the top of the list, each using 10-40% CPU

Each incident lasts 15-30 minutes and then the situation returns to normal for a while, until another incident occurs.

Server configuration and stats:

  • DigitalOcean droplet with 4 CPUs, 8 GB RAM
  • LAMP stack running a WordPress site, about 40K hits per day (XMLRPC protected, wp-login protected)
  • MySQL performance appears normal (slow query logging on, nothing unusual)

Any advice on debugging this would be appreciated as I don't even know where to start.


Server Version: Apache/2.4.7 (Ubuntu) PHP/5.5.9-1ubuntu4.11
Server MPM: prefork
Server Built: Jul 24 2015 17:25:11
Current Time: Sunday, 02-Oct-2016 08:41:09 EDT
Restart Time: Sunday, 02-Oct-2016 07:55:53 EDT
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 45 minutes 15 seconds
Server load: 105.26 38.36 22.87
Total accesses: 32705 - Total Traffic: 367.2 MB
CPU Usage: u455.08 s51.66 cu0 cs0 - 18.7% CPU load
12 requests/sec - 138.5 kB/second - 11.5 kB/request
144 requests currently being processed, 39 idle workers
_WWWWWWWWW_WKWWWWW__WWWW._WW.WWWW___W.WWWW__WWW_W_WWWW_WW.W_WWWW
_W.WCWW_WW.WKWWWWWWWWW_WWWKW.W_K_.KW__K.W._WWWWW__WWWWWWWWW._WKW
WWWK.WW_WW__WWWWWWWW__WWW_WWWW_WWWWWW_._WWWKK___WWW_WWWWWWWWWWWW
_.W..WW.
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process

1 Answer

W stands for "Sending Reply" (see the scoreboard key in the server-status output), not some form of waiting state. A worker is in this state from the moment it has received the full request until it has finished sending the response, so it includes the time spent generating the page.
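To see at a glance how the workers are distributed, the scoreboard string can be tallied against its key. This is only a rough sketch: the function name `tally_scoreboard` is made up for illustration, and the scoreboard text has to be copied by hand from the server-status page.

```python
from collections import Counter

# State letters exactly as listed in the "Scoreboard Key" of the server-status output.
KEY = {
    "_": "Waiting for Connection",
    "S": "Starting up",
    "R": "Reading Request",
    "W": "Sending Reply",
    "K": "Keepalive (read)",
    "D": "DNS Lookup",
    "C": "Closing connection",
    "L": "Logging",
    "G": "Gracefully finishing",
    "I": "Idle cleanup of worker",
    ".": "Open slot with no current process",
}


def tally_scoreboard(scoreboard):
    """Count workers per state; line breaks and unknown characters are ignored."""
    counts = Counter(ch for ch in scoreboard if ch in KEY)
    # most_common() orders the states from most to least frequent.
    return {KEY[ch]: n for ch, n in counts.most_common()}
```

Calling `tally_scoreboard()` on the pasted scoreboard lines makes it obvious that the dominant state is "Sending Reply", i.e. workers that are busy producing or sending a response, not idle ones.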

ExtendedStatus, which should be on by default in Apache 2.4 (turn it back on if you have disabled it), shows which URLs have been running for a long time and so may give you some indication of what is happening. That per-request table may well already be present below the output you posted.

You could also add %D (the time taken to serve the request, in microseconds) to your access log format so that you can analyse afterwards which requests are taking a long time.
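As a rough illustration of that post-analysis, here is a minimal Python sketch. It assumes the access log uses the common log format with %D appended as the last field, for example `LogFormat "%h %l %u %t \"%r\" %>s %b %D" timed`; the format nickname `timed` and the function name `slowest_urls` are made up for this example, and the regular expression would need adjusting for any other format.

```python
import re
from collections import defaultdict

# Assumed line layout: %h %l %u %t "%r" %>s %b %D
# (%D is the request duration in microseconds and must be the last field).
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ (?P<usec>\d+)$'
)


def slowest_urls(lines, top=10):
    """Return (path, hits, mean seconds, max seconds) tuples, slowest first."""
    durations = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not fit the assumed format
        path = m.group("url").split("?", 1)[0]  # group hits by path, ignoring the query string
        durations[path].append(int(m.group("usec")) / 1_000_000)
    rows = [(p, len(t), sum(t) / len(t), max(t)) for p, t in durations.items()]
    rows.sort(key=lambda r: r[3], reverse=True)  # order by the slowest single request
    return rows[:top]
```

Passing an open log file to `slowest_urls()` (any iterable of lines works) and printing the returned rows gives a short list of the paths worth investigating first.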

Once you know which particular URLs are taking a long time, you may be able to work out why, for example by adding debugging code to the pages involved.

Unbeliever