1

Here is the problem:

I have a couple of sites hosted on a server with 3 GB of RAM at MediaTemple and never had a problem. Suddenly, every 2 to 3 days I get a kmemsize error and Apache crashes. Neither MySQL nor Apache seems to be working at that point. I have never had this problem before, and there has been no increase in traffic.

The httpd log file says MaxClients was reached.

If somebody can help me, I am ready to pay something for solving this. I added 2 screenshots of the Apache log file. If there is any other log file that can help, let me know.

http://www.travolto.com/screen/screen1.jpg

http://www.travolto.com/screen/screen2.jpg

  • It's probably not the maxclients error, I *think* that's normal. I've definitely seen that before on sites which are running apparently healthily. – Dominic Rodger Oct 13 '09 at 09:00
  • This sounds like it might be related to my issue: http://serverfault.com/questions/43752/apache-gets-clogged-with-certain-requests – Josh Nov 04 '09 at 13:45

9 Answers

1

You're almost certainly running out of memory. Apache often hits MaxClients once the machine starts to swap: the existing children are tied up waiting for disk I/O, so Apache spawns new children to handle new requests, which then hang waiting for disk I/O as well. Rinse and repeat until MaxClients is reached.

Most likely you need to go through your MySQL and Apache configs and trim them to use less memory. You'll also need to take a look at whatever code you're running in mod_python and see if it's being a hog. As Yves mentioned, eyeballing the RSS column in "top" while browsing your site will show you approximately how much memory each Apache child is taking up. I've seen 20 to 50 MB per child in mod_python scenarios before, so if I didn't want Apache to use more than 1.5 GB I'd limit MaxClients to around 40. MySQL can similarly be configured to use only a certain amount of RAM, but that gets too complicated to explain here.
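
The sizing arithmetic in the paragraph above can be sketched roughly as follows. This is only an estimate under stated assumptions: the function name `max_clients_for` is made up for illustration, the per-child figures are sample values, and RSS tends to overstate real usage because children share some pages. Measure your own Apache children before settling on a number.

```python
def max_clients_for(budget_mb: float, rss_per_child_mb: float) -> int:
    """Return how many Apache children fit into a memory budget.

    Both arguments are in megabytes. The name and signature are
    hypothetical; this is plain arithmetic, not an Apache API.
    """
    if rss_per_child_mb <= 0:
        raise ValueError("rss_per_child_mb must be positive")
    # Round down: a child that only partly fits still needs its full RSS.
    return int(budget_mb // rss_per_child_mb)


if __name__ == "__main__":
    budget_mb = 1.5 * 1024  # 1.5 GB set aside for Apache, as in the answer
    for rss_mb in (20, 40, 50):  # assumed per-child RSS values in MB
        limit = max_clients_for(budget_mb, rss_mb)
        print(f"{rss_mb} MB per child -> MaxClients of at most {limit}")
```

At 40 MB per child this gives 38, which is where the "around 40" figure comes from.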

One thing I tend to disagree with is expanding your swap. Swap is sooo much slower than regular ram that it almost always triggers a runaway or pile-up scenario that takes you down just as badly as OOM would. Might as well get the machine back to a log-in-able state quickly so you can work on the real solution.

Just a note: I'm not sure what you're doing with mod_python, but if it's Django (or even if it isn't) you should take a look at mod_wsgi. It tends to use memory more efficiently.

cagenut
1
  1. What is your current MaxClients setting? You could try increasing that - just to see if that helps. Don't forget to restart Apache after editing httpd.conf :)

  2. Also, just in case: check if KeepAlive is On.

  3. Also, check the value of MaxRequestsPerChild (or a similarly named option) - if it is 0, try setting it to some large value (e.g. somewhere between 1000 and 15000).

  4. Even without the notable increase in traffic you could have been spidered by a bad bot, opening multiple connections to your server. Also, if you were referring to traffic monitored by e.g. Google Analytics and similar tools, then bots/spiders won't be included in those statistics at all. So also check if your apache request logs now have more requests per time period than before this error started appearing (I believe you might have a month-long history of gzipped apache logs in /var/log/apache2).

  5. If you find that some IPs are too aggressively requesting pages from your server - you could try mod_evasive, which limits the number of pageviews per time period (configurable). However, this will not help at all, unless you do find some offending spiders/bots.

  6. If none of that helps, you could try tracing the Apache process to find out what happens just before it dies (use strace for that). Several runs with strace that give the same result may help you find the problem. I would only use this approach as a last resort.

  7. It is very strange that MySQL dies as well. Could you please check if that is really the case, and provide more details? I do not recollect ever seeing both Apache and MySQL dying simultaneously.
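
The log check from point 4 could be sketched like this. It assumes the access log is in Apache's Common or Combined Log Format; the function name and the sample lines (which use documentation-only IP addresses) are invented for illustration and are not taken from the asker's logs.

```python
import re
from collections import Counter

# Start of a Common/Combined Log Format line:
#   client_ip ident user [day/Mon/year:hour:min:sec zone] ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):(\d{2}):')


def requests_per_ip_hour(lines):
    """Count requests per (client IP, day, hour); skip lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.groups()] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        '192.0.2.1 - - [12/Oct/2009:13:05:01 +0000] "GET / HTTP/1.1" 200 512',
        '192.0.2.1 - - [12/Oct/2009:13:05:02 +0000] "GET /a HTTP/1.1" 200 128',
        '198.51.100.7 - - [12/Oct/2009:14:00:00 +0000] "GET / HTTP/1.1" 200 512',
    ]
    for (ip, day, hour), n in requests_per_ip_hour(sample).most_common(10):
        print(f"{ip} {day} {hour}:00 -> {n} requests")
```

For rotated, gzipped logs, `gzip.open(path, "rt")` from the standard library yields text lines that can be passed straight in. An IP that dominates the top of the list shortly before each crash would be a candidate for mod_evasive.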

chronos
  • This one http://serverfault.com/questions/43752/apache-gets-clogged-with-certain-requests/43785#43785 is also worth trying, if you see many apache child processes in the W state. – chronos Nov 12 '09 at 11:54
1

MaxClients is the symptom here.

The reason you're hitting it is that a large fraction of requests are either slow or crashing. See if you can figure out which request is causing this issue.

Another possibility is hardware level memory corruption or bus issues.

LapTop006
0

Doing a bit of googling for that second error message, I find this post:

I think that you are running out of semaphore undo structures. Try increasing the kernel parameter semmnu.

memnoch_proxy
0

This is just a wild guess based on your logs... are you using Python scripts in your web application? If so, check to see if you have a runaway script causing a connection leak or memory leak of some sort.

0

Check your MySQL logs to see whether it ran out of connections. Bugzilla had an issue where it would eat up connections until the whole site crashed.

Monitor the memory usage (try top(1) or vmstat(8)). When you see a surge, you need to look at the process and determine why it doesn't free the memory. Maybe there is a script running or something. Or you have a huge query.
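
If watching top by hand is too coarse, one option is to total the resident memory per command from time to time. The sketch below parses output in the shape produced by `ps -eo rss,comm` (RSS in KiB); the function name and the sample figures are invented for illustration.

```python
from collections import defaultdict


def rss_by_command(ps_output: str) -> dict:
    """Sum the RSS column (KiB) per command name from `ps -eo rss,comm` text."""
    totals = defaultdict(int)
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header line and anything else that is not "<number> <name>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        rss_kib, command = parts
        totals[command.strip()] += int(rss_kib)
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample output, for illustration only.
    sample = (
        "  RSS COMMAND\n"
        "41200 httpd\n"
        "39800 httpd\n"
        "30500 mysqld\n"
        " 1200 sshd\n"
    )
    totals = rss_by_command(sample)
    for command, kib in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{command}: {kib / 1024:.1f} MiB")
```

On a live system the input could come from `subprocess.run(["ps", "-eo", "rss,comm"], capture_output=True, text=True).stdout`. Logging the totals every minute or so makes it easier to see which process was growing before a crash.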

Aaron Digulla
  • I tried monitoring top and vmstat for some time. Everything seemed normal: mysqld was only using about 1% of memory, and the few httpd processes even less. –  Oct 13 '09 at 09:28
0

Maybe this means you're running out of memory. Can you try this?

dmesg | grep -i oom

If it returns any output, your system ran out of memory and Linux had to kill a few processes. If that's the case, review your processes' RSS memory usage and try to stop unnecessary jobs. You could also add more swap space, although that is not a good idea on a virtualized host (VPS).
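
A slightly broader version of that check, as a sketch: a plain lowercase search for "oom" finds "oom-killer" lines but not lines that only say "Out of memory", so the filter below looks for both. The function name and the sample lines are invented for illustration; they are not real kernel output from the asker's server.

```python
import re

# Matches "Out of memory" as well as "oom-killer" / "oom_kill" style messages.
OOM_RE = re.compile(r"\bout of memory\b|\boom[-_ ]?kill", re.IGNORECASE)


def oom_lines(kernel_log: str) -> list:
    """Return the lines of a kernel log that look like OOM-killer activity."""
    return [line for line in kernel_log.splitlines() if OOM_RE.search(line)]


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = (
        "eth0: link up\n"
        "httpd invoked oom-killer\n"
        "Out of memory: Killed process 1234 (httpd)\n"
    )
    for line in oom_lines(sample):
        print(line)
```

The text to scan can be the saved output of dmesg or a kernel log file, depending on what the host exposes.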

Yves Junqueira
0

Since you are running mod_python on the site, ensure you read:

http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html

Graham Dumpleton
0

The line in the 2nd log at 13:16:57 indicates you've run out of memory. top will likely show that all swap is consumed. You could try adding additional swap (man 8 mkswap, man 8 swapon). But there's a fair chance you'll run through that, too, if you've got a script that leaks memory. You might also check the httpd access file for any bursts of activity preceding the out-of-memory event.
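
To check swap consumption without watching top, the `SwapTotal` and `SwapFree` fields of `/proc/meminfo` can be read directly. The sketch below assumes the usual "Name: value kB" layout of that file; the function name and sample numbers are invented, and on some virtualized hosts these fields may not reflect the limits that actually apply to the container.

```python
def swap_usage(meminfo_text: str):
    """Return (used_kb, total_kb) of swap from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        values = rest.split()
        # Keep only well-formed "Name: <integer> [unit]" lines.
        if sep and values and values[0].isdigit():
            fields[name.strip()] = int(values[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total - free, total


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = "MemTotal: 3145728 kB\nSwapTotal: 2097148 kB\nSwapFree: 524288 kB\n"
    used, total = swap_usage(sample)
    if total:
        print(f"swap used: {used} kB of {total} kB ({100 * used / total:.0f}%)")
    else:
        print("no swap configured")
```

On Linux the input would be `open("/proc/meminfo").read()`. Swap usage that climbs steadily between crashes points toward a leak rather than a traffic burst.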

  • Note that touching swap *at all* is so slow that you'll immediately kill performance when that happens. – LapTop006 Jan 02 '10 at 13:32