
I currently have an Apache2 server running with mpm-prefork and mod_php on an OpenVZ VPS with 512 MB guaranteed / 1024 MB burstable RAM (no swap). After running some tests, I found that the maximum size an Apache process reaches is about 23 MB, so I've set MaxClients to 25 (23 MB x 25 = 575 MB, which is fine for me). I then decided to run some load tests on my server, and the results left me puzzled.
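For context, the relevant prefork settings look roughly like this (MaxClients and Timeout are my actual values; the others are illustrative):

# mpm-prefork sizing sketch -- values besides MaxClients/Timeout are illustrative
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    # ~23 MB per process x 25 processes = ~575 MB
    MaxClients           25
    # recycle children periodically so per-process growth stays bounded
    MaxRequestsPerChild 500
</IfModule>
Timeout 45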

I'm using ab on my desktop machine, requesting the main page of a WordPress blog.
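The invocation is something like this (the URL is a placeholder; -n is the total request count, -c the concurrency):

ab -n 500 -c 24 http://myblog.example.com/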

When I run ab with 24 concurrent connections, everything seems fine. Sure, CPU usage goes up and free RAM goes down, but responses come back in about 2-3 s per request.

But if I run ab with 25 concurrent connections (my server's limit), Apache just hangs after a couple of seconds. It starts processing the requests, then stops responding; CPU goes back to 100% idle and ab times out. The Apache log says it has reached MaxClients.

When this happens, Apache keeps all 25 processes busy (they're all in the "W" state if I check server-status), and only after the TimeOut value elapses (45 seconds in my case) do the processes start to die and the server begin responding again.

My question: is this expected behaviour? Why does Apache just die when it reaches MaxClients? If it works with 24 connections, shouldn't it work with 25, just taking perhaps a bit longer to respond to each request and queueing up the rest?

It seems strange to me that any kid running ab can single-handedly take down a web server just by setting the concurrent connections to the server's MaxClients.

Rodrigo Sieiro

2 Answers


HA! I finally found the problem myself. It's more related to programming than to server administration, but I decided to post the answer here anyway, because by searching Google I found I'm not the only one with this kind of problem (and since Apache hangs, the first guess is that something is wrong with the server).

The issue is not with Apache, but with my WordPress theme. I'm using a theme called Lightworld, and it supports adding an image to the blog header. To do that, it checks the image's dimensions using PHP's getimagesize() function. Since this function was being given a URL, it opened another HTTP connection to the same server to fetch the image, so each request from ab caused PHP to issue an additional request internally. As all of the server's available slots were already in use, these internal PHP requests were queued, but Apache could never get to them because every one of its processes was tied up by an original request waiting for a free slot to complete its internal PHP request.

Essentially, PHP was putting my server into a deadlock, and Apache would only start working normally again after these connections timed out waiting for their "child" requests.

After removing that call from my theme, I can now hit my server with ab at as many concurrent connections as I want, and Apache queues them as expected.
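For anyone else who runs into this, the pattern and the fix look roughly like this (paths and URLs are made up; the point is URL vs. local path):

<?php
// BAD: passing a URL makes PHP open a second HTTP request back to
// the same Apache server, tying up another worker slot per request.
$size = getimagesize('http://myblog.example.com/wp-content/header.jpg');

// BETTER: read the file from the local filesystem -- no extra request.
$size = getimagesize('/var/www/blog/wp-content/header.jpg');

if ($size !== false) {
    // $size[3] is a ready-made 'width="..." height="..."' attribute string
    echo '<img src="/wp-content/header.jpg" ' . $size[3] . '>';
}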

Rodrigo Sieiro
  • Thanks for posting this up here. I've been trying to figure out a problem with exactly the same symptoms for a few days now; I think we have a case of deadlock too! – James Yale Oct 11 '10 at 14:38
  • How did you determine this? I'm primarily interested in the logs and tools you used to identify the secondary outbound request. – Anirudh Goel Mar 14 '17 at 02:46

What is happening here is that you have 25 processes able to accept connections. Once you send 25 concurrent requests, any further request sits in the socket queue, and how many fit there depends on the size of your listen backlog.
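In Apache that queue length is governed by the ListenBacklog directive (the default is 511, though the kernel may clamp it lower):

ListenBacklog 511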

The second problem is that whatever you're running that takes 2-3 seconds is slow enough that 25 concurrent connections bog it down. A page that just calls sleep(1) might hold up fine, but with anything doing file locking or MySQL table locking, each parallel request may wait on the previous one to complete until they hit the 45-second timeout.
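One way to separate Apache's queueing behaviour from your application is a test page that does nothing but sleep (the filename is arbitrary):

<?php
// slow.php -- simulates a slow request with no locks or database I/O
sleep(1);
echo 'done';

If ab against a page like this queues cleanly at full concurrency, the stall is in the application rather than in Apache.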

23 MB sounds small for an Apache process with mod_php and any modules loaded, so I suspect those Apache processes grow a bit larger once your application is actually running. You can't really do exact math with MaxClients and memory like that; it will get you somewhere close, but you never know.

www-data  1495  0.1  0.9  56288 19996 ?        S    15:48   0:01 /usr/sbin/apache2 -k start
www-data  1500  0.0  0.5  49684 12436 ?        D    15:48   0:00 /usr/sbin/apache2 -k start

That's one machine, with 56 MB and 49 MB processes (virtual size).

another machine:

www-data  7767  0.1  0.1 213732 14840 ?        S    14:55   0:08 /usr/sbin/apache2 -k start
www-data  8020  0.2  0.1 212424 13660 ?        S    14:57   0:08 /usr/sbin/apache2 -k start

another machine:

www-data 28509  0.8  0.1 161720 10068 ?        S    14:39   0:43 /usr/sbin/apache2 -k start
www-data 28511  0.8  0.1 161932 10344 ?        S    14:39   0:43 /usr/sbin/apache2 -k start

So memory use depends heavily on the task, which modules are loaded, and so on. On the last two machines, I believe we've disabled pdo and pdo_mysql because that application doesn't use them.

The real question is: what are you doing that takes 3 seconds? In today's world that is an eternity, and such a page is considered a 'blocking' application. Apache won't normally die; it will leave those requests in the backlog queue until it can service them or until the waiting clients time out. I believe your application is what's causing Apache to time out. Try the same test on a page containing just phpinfo(); and see whether the results are the same.
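That is, a minimal test page containing nothing but:

<?php phpinfo();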

  • Thanks for all the tips! I'm aware I still need to optimize a lot of things (I just started configuring the server a couple of days ago and it's my first experience with a VPS), but the problem was deeper than that... I posted an answer to the question explaining what the problem was in my specific case. – Rodrigo Sieiro Apr 18 '10 at 20:56