8

I have an apache webserver running many VirtualHosts.

Recently it has been bogging down and becoming unresponsive, and I'm wondering how I can determine which VirtualHosts are causing most of the issue. We have had occasions in the past where a bug in the code of an individual site has taken down the whole server. My goal is to be able to diagnose these instances quickly.

I am monitoring the server with munin and notice that the number of apache processes, memory usage, and load tend to be very high during the periods in question. Problem is, these statistics are for the whole webserver, not for individual VirtualHosts.

I have written a script to parse the weblogs for traffic per VirtualHost, but that does not appear to be enough. I probably need to determine how many apache processes each VirtualHost is responsible for, or how long it holds each process open, or perhaps how much memory each is responsible for.

Where can I find this information? I don't mind writing a script to track this data, but I don't know exactly where to extract it from in the first place.

Brent

2 Answers

4

I appreciate that it doesn't always suit to leave mod_status enabled all of the time, but it and apachetop are the best ways to diagnose these problems. However, there are many ways to skin a cat.

This trick is useful in a number of circumstances and isn't just Apache-specific. It does depend on a number of factors, however, and you need to know what it's doing to know its limitations.

for pid in `pgrep -u www-data`; do find /proc/${pid}/cwd -printf "%l\n" ; done

Let's break it down:

  • pgrep -u www-data gives you the list of PIDs running under user www-data. That's the default on Debian / Ubuntu; change it to suit your own system (RedHat-based systems tend to use apache, for example, as the user). For systems without pgrep, you can use ps axuwww | grep user | awk '{print $2}'
  • the for ...; do ...; done loop means we loop over every entry, running the command(s) within the do part of the loop.
  • find /proc/${pid}/cwd -printf "%l\n" looks up each of those PIDs under /proc and prints the current working directory of that process. Apache will chdir() to the VirtualHost by default when serving files from that VirtualHost. /proc/PID/cwd is a symbolic link to the directory that Apache process is running in, and -printf "%l\n" prints the target of that link. See find(1) for more info on that.
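To turn that one-liner into a per-directory tally, something like the following might work. It is only a sketch: the function name tally_cwds and its optional proc-root argument are made up for illustration, and it uses readlink in place of find -printf "%l\n" to read the same symlink.

```shell
#!/bin/sh
# Sketch only: tally_cwds is a hypothetical helper, not part of any package.
# Assumes a Linux-style /proc and the readlink utility.

# tally_cwds [PROC_ROOT]
# Reads PIDs on stdin (one per line) and prints "count directory",
# most common working directory first.
tally_cwds() {
    proc_root=${1:-/proc}
    while read -r pid; do
        # A process may exit between listing and lookup; skip it quietly.
        readlink "$proc_root/$pid/cwd" 2>/dev/null || :
    done | sort | uniq -c | sort -rn
}

# Example (www-data is the Debian / Ubuntu default user, adjust to suit):
#   pgrep -u www-data | tally_cwds
```

Directories that show up with a high count during a slowdown are the first places to look, subject to the caveats that apply to reading cwd at all.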

There are two major caveats to that trick:

1) If something running under the same context as the Apache process does a chdir() outside of the VirtualHost directory, you'd be hard pushed to find that out.

e.g. a PHP script running under mod_php (a CGI will be different as Apache forks a separate process, but I'm presuming CGIs aren't a problem or you'd be able to track them more easily).

2) If you have Apache instances which are serving pages very quickly (e.g. a small static HTML page). This normally isn't a problem, but it is possible. If you're getting a lot of "No such file or directory" errors, this is basically a manifestation of it. I would expect some, but not the majority unless they fit this particular case. Basically this is because the Apache processes you listed with pgrep have already exited by the time you check /proc, which means they are serving pages very quickly.

Regarding memory-bound Apache processes, I use ps_mem.py to calculate memory usage on my webservers. If your Apache processes are large (in terms of resident memory size) and they are exiting quickly, that is roughly the equivalent of asking a big fat guy to keep running 100m sprints. If your webserver isn't a shared one, the sites behind those "No such file or directory" errors are normally good candidates for moving some content onto a smaller lightweight webserver (e.g. nginx / lighttpd) or for heavy caching (e.g. varnish / squid).
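For a rough per-directory memory figure, one option is to combine the cwd lookup with the VmRSS line in /proc/PID/status. This is a sketch, not what ps_mem.py does: the function name rss_by_cwd is hypothetical, and summing resident set sizes counts pages shared between Apache children once per process, so the totals overstate real usage and are best read as a relative ranking.

```shell
#!/bin/sh
# Sketch only: rss_by_cwd is a hypothetical helper.
# Assumes a Linux-style /proc whose status files carry a "VmRSS:" line in kB.

# rss_by_cwd [PROC_ROOT]
# Reads PIDs on stdin and prints "total_kB directory", largest total first.
rss_by_cwd() {
    proc_root=${1:-/proc}
    while read -r pid; do
        # Skip processes that have exited since they were listed.
        dir=$(readlink "$proc_root/$pid/cwd" 2>/dev/null) || continue
        rss=$(awk '/^VmRSS:/ {print $2}' "$proc_root/$pid/status" 2>/dev/null) || continue
        if [ -n "$rss" ]; then
            printf '%s %s\n' "$rss" "$dir"
        fi
    done | awk '{ kb = $1; $1 = ""; sub(/^ /, ""); total[$0] += kb }
                END { for (d in total) print total[d], d }' | sort -rn
}

# Example (adjust the user to suit your system):
#   pgrep -u www-data | rss_by_cwd
```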

Philip Reynolds
  • This is EXACTLY the kind of thing I was looking for! Thank you. – Brent Nov 27 '09 at 16:02
  • I see a lot of "/" results in this list. Could these be a result of mod_php, and if so, do you know of any way to track these as well? – Brent Nov 27 '09 at 16:54
2

I think you want apachetop, or else mod_status (with ExtendedStatus On). I have yet to see a performance problem in Apache that wasn't lit up by mod_status, and apachetop looks like a neat tool (though it has some annoying limitations in log layout).

womble
  • Thank you - I don't really want to keep mod_status running, but will look into scraping apachetop for the info, if it is there. – Brent Nov 26 '09 at 18:49
  • So turn it on when you need it, and turn it back off again when you're done. Simple. – womble Nov 26 '09 at 19:01
  • That doesn't quite work for what I'm trying to do, since in order to turn it off/on you need to break all the current apache connections. I would like to SEE the current connections when the problem occurs, which means leaving ExtendedStatus on. (It provides the info I'm looking for though, +1) – Brent Nov 26 '09 at 20:55