5

Some details:

  • Webserver: Apache/2.2.13 (FreeBSD) mod_ssl/2.2.13 OpenSSL/0.9.8e
  • OS: FreeBSD 7.2-RELEASE
  • This is a FreeBSD Jail.
  • I believe I use the Apache 'prefork' MPM (I run the default for FreeBSD).
  • I use the default values for MaxClients (256)

I have enabled mod_status with "ExtendedStatus On". When I view /server-status, I see a handful of regular requests. I also see over 240 requests from localhost, like these:

37-0    -   0/0/1   .   0.00    1510    0   0.0 0.00    0.00    127.0.0.2   www.example.gov OPTIONS * HTTP/1.0
38-0    -   0/0/1   .   0.00    1509    0   0.0 0.00    0.00    127.0.0.2   www.example.gov OPTIONS * HTTP/1.0
39-0    -   0/0/3   .   0.00    1482    0   0.0 0.00    0.00    127.0.0.2   www.example.gov OPTIONS * HTTP/1.0
40-0    -   0/0/6   .   0.00    1445    0   0.0 0.00    0.00    127.0.0.2   www.example.gov OPTIONS * HTTP/1.0

I also see about 2417 requests from localhost in yesterday's logs, like these:

Apr 14 11:16:40 192.168.16.127 httpd[431]: www.example.gov 127.0.0.2 - - [15/Apr/2010:11:16:40 -0700] "OPTIONS * HTTP/1.0" 200 - "-" "Apache (internal dummy connection)"
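A count like the one above can be reproduced by filtering the access log on the user agent string that Apache attaches to these requests. This is only a minimal sketch: the function name and the idea of passing log lines as a list are illustrative assumptions, not part of the original setup.

```python
# Marker that Apache puts in the User-Agent field of its own dummy requests.
DUMMY_AGENT = "(internal dummy connection)"


def count_dummy_requests(lines):
    """Count log lines that were produced by internal dummy connections."""
    return sum(1 for line in lines if DUMMY_AGENT in line)


# Sample taken from the log excerpt above, plus one made-up ordinary request.
sample_log = [
    'Apr 14 11:16:40 192.168.16.127 httpd[431]: www.example.gov 127.0.0.2 - - '
    '[15/Apr/2010:11:16:40 -0700] "OPTIONS * HTTP/1.0" 200 - "-" '
    '"Apache (internal dummy connection)"',
    'www.example.gov 10.0.0.5 - - [15/Apr/2010:11:16:41 -0700] '
    '"GET / HTTP/1.1" 200 512 "-" "ExampleBrowser/1.0"',  # hypothetical line
]

print(count_dummy_requests(sample_log))
```

In practice the list would come from reading the real log file line by line.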

The page at http://wiki.apache.org/httpd/InternalDummyConnection says "These requests are perfectly normal and you do not, in general, need to worry about them", but I'm not so sure.

Why are there over 230 of these? Are these active connections? If I have "MaxClients 256" and over 230 of these connections, it seems that my webserver is dangerously close to running out of available connections. It also seems like Apache should only need a handful of these "internal dummy connections".

We actually had two unexplained outages last night, and I am wondering if these "internal dummy connections" caused us to run out of available connections.

UPDATE 2010/04/16

It is 8 hours later. The /server-status page still shows 243 lines which say "www.example.gov OPTIONS *". I believe these connections are not active. The server is mostly idle (1 requests currently being processed, 9 idle workers). There are only 18 active httpd processes on the Unix host.

If these connections are not active, why do they show up under /server-status? I would have expected them to expire a few minutes after they were initialized.

Stefan Lasiewski

5 Answers

5

Apache handles a thundering herd a little differently than you might imagine. When you get a burst of inbound traffic, it spawns a number of child processes. If it determines it needs more, it spawns twice as many in the next interval, and keeps doubling until it either has enough processes to serve the requests or hits MaxClients.
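The doubling described above can be sketched as a small simulation. It is a simplified model, not Apache's actual code: the one-second interval and the cap of 32 forks per interval are assumptions based on the prefork MPM documentation, the function and parameter names are made up for illustration, and StartServers and the spare-server checks are ignored.

```python
def spawn_schedule(needed, max_clients=256, max_rate=32):
    """Return how many children are forked in each interval.

    needed      -- children required to serve the burst (hypothetical input)
    max_clients -- upper bound on children, as with MaxClients
    max_rate    -- assumed cap on forks per interval
    """
    target = min(needed, max_clients)
    running = 0
    rate = 1
    schedule = []
    while running < target:
        batch = min(rate, target - running)  # never overshoot the target
        schedule.append(batch)
        running += batch
        rate = min(rate * 2, max_rate)  # double the rate, up to the cap
    return schedule


print(spawn_schedule(256))
```

Under these assumptions, going from zero to MaxClients 256 takes 13 intervals, so a burst can fill every slot within seconds, and the slots stay visible in /server-status afterwards.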

If you see these, it means that Apache is just checking on its children, and whatever caused Apache to fork that many processes is probably gone. Yes, they do take up client connections, but the event that caused things to spool up has probably passed.

First thing I would check in your logs would be a bunch of 302s prior to the event.

If you had something like

<?php include("http://www.oursite.com/header.php");?>   

where header.php was missing and

ErrorDocument 404 /404.php 

where 404.php included header.php, you would get a recursive loop, and a single hit on that page would immediately cause Apache to use all available connections.

  • A thundering herd may explain our problems. Unfortunately, I'm not seeing a spike in requests around that time. However, maybe someone is slamming a poorly coded page. Why would you expect to see a bunch of 302s prior to the event? – Stefan Lasiewski Apr 15 '10 at 23:18
  • 302s would be a sign of a recursive 404 (and I see that the editor blanked out my code, I'll re-edit) which would cause apache to ramp up with no real increase in connections from the outside world. –  Apr 16 '10 at 01:29
  • I edited my post to include another important note. We had another outage last night. It's 8 hours later, and /server-status still shows these lines. – Stefan Lasiewski Apr 16 '10 at 17:02
2

My understanding here is that, given that these are connections from the parent to the child process, they’re merely Apache keeping track of what the children are doing. Bear in mind that:

  • children can hang around for quite a while after they’ve processed a request
  • the internal dummy connections occur regularly
  • if the child hasn’t done anything else (because the server’s mostly idle), the dummy connection will be the most recent thing it’s processed

It’s not the case, as far as I know, that the dummy connections “use up” children. Apache is checking the status of its children rather than exercising them to test whether they work.
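One way to check this against the /server-status rows quoted in the question is to look at the fourth column, M (mode of operation). In mod_status's key, "." means "open slot with no current process", and the "-" in the PID column points the same way. The sketch below tallies rows by that column; the function name is illustrative, and it assumes the rows are whitespace-separated as in the pasted output.

```python
from collections import Counter


def tally_modes(rows):
    """Count scoreboard rows by the M (mode) column, the 4th field."""
    counts = Counter()
    for row in rows:
        fields = row.split()
        if len(fields) < 4:
            continue  # skip blank or truncated rows
        counts[fields[3]] += 1
    return counts


# Rows copied from the question's /server-status excerpt.
sample_rows = [
    "37-0    -   0/0/1   .   0.00    1510    0   0.0 0.00    0.00    127.0.0.2   www.example.gov OPTIONS * HTTP/1.0",
    "38-0    -   0/0/1   .   0.00    1509    0   0.0 0.00    0.00    127.0.0.2   www.example.gov OPTIONS * HTTP/1.0",
    "39-0    -   0/0/3   .   0.00    1482    0   0.0 0.00    0.00    127.0.0.2   www.example.gov OPTIONS * HTTP/1.0",
    "40-0    -   0/0/6   .   0.00    1445    0   0.0 0.00    0.00    127.0.0.2   www.example.gov OPTIONS * HTTP/1.0",
]

print(tally_modes(sample_rows))
```

All four sample rows fall under ".", which suggests they are leftover entries for slots whose process has exited, with the dummy request simply being the last thing each slot handled.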

Mo.
1

If memory serves, these are test connections generated by lightweight proxies (such as Lighttpd) which sit in front of heavier servers such as Apache.

Given that you’re in a jail, is the host server perhaps proxying requests to the (private) jail IP via lighttpd?

Mo.
  • There is no proxy or Lighttpd server on this host. There is a hardware load balancer, but that is a separate piece of hardware, with its own IP. These connections come from '127.0.0.2', which is the localhost address for this jail. – Stefan Lasiewski Apr 20 '10 at 17:41
1

You need to find which processes are connected to your Apache port (I'll assume it's 80).

I don't have a FreeBSD system so I can't confirm the commands, but at least on a Mac this should give you a hint:

$ lsof -i

It will show something like:

COMMAND     PID  USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
BadGuy    26655 yvesj   24u  IPv4 0x3f32270      0t0  TCP localhost:56696->localhost:56695 (ESTABLISHED)
GoodGuy 26656 yvesj   15u  IPv4 0x5b7666c      0t0  TCP localhost:56695 (LISTEN)
GoodGuy 26656 yvesj   16u  IPv4 0x72a9e64      0t0  TCP localhost:56695->localhost:56696 (ESTABLISHED)

From this you can see that the process with PID 26656 is listening on port 56695 and process 26655 is connecting to that port. This way you can identify who the bad guy is. Just don't get confused by the third line, which shows the other side of the same connection (GoodGuy => BadGuy).

When you apply this to your case, you'll find which other processes on your system are holding those connections to your Apache instance.
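With a few hundred lines of output, reading them by eye gets tedious. The following sketch picks out the client side of each connection automatically. It assumes the column layout shown in the sample above (NAME as the ninth field) and numeric ports, as you would get from lsof's -P option; the function name is made up for illustration.

```python
def clients_of_port(lsof_lines, port):
    """Return (command, pid) pairs whose connection targets `port`."""
    clients = set()
    for line in lsof_lines:
        fields = line.split()
        # The NAME column is the 9th field; only connected sockets have "->".
        if len(fields) < 9 or "->" not in fields[8]:
            continue
        _local, remote = fields[8].split("->", 1)
        if remote.rsplit(":", 1)[-1] == str(port):
            clients.add((fields[0], int(fields[1])))
    return clients


# The sample lsof output from above.
sample_lsof = [
    "COMMAND     PID  USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME",
    "BadGuy    26655 yvesj   24u  IPv4 0x3f32270      0t0  TCP localhost:56696->localhost:56695 (ESTABLISHED)",
    "GoodGuy 26656 yvesj   15u  IPv4 0x5b7666c      0t0  TCP localhost:56695 (LISTEN)",
    "GoodGuy 26656 yvesj   16u  IPv4 0x72a9e64      0t0  TCP localhost:56695->localhost:56696 (ESTABLISHED)",
]

print(clients_of_port(sample_lsof, 56695))
```

For the Apache case you would pass the port Apache listens on (assumed to be 80 above) instead of 56695.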

Good luck!

Yves

Yves Junqueira
  • Also, you may not be so lucky to see the live connections if they are short-lived. In this case, there are other options: 1) run a tcpdump to see when the connections are open, then find the PID of the client process using the process above. Again, you may not be able to do this fast enough. 2) Add a firewall rule that logs access to that specific port on localhost and prints the PID of the calling process (dunno if this is easy to do on a FreeBSD system) – Yves Junqueira Apr 19 '10 at 23:42
1

Well, this had a surprise answer. This was caused by a filesystem problem when we took UFS filesystem snapshots at midnight.

This seems to be caused by a FreeBSD UFS bug. We use FreeBSD Jails on a FreeBSD Host, with the default UFS filesystem. The UFS filesystem is large -- 1.8TB.

Once per night, we run a backup using 'dump(8)'. dump(8) was creating a snapshot of the filesystem before backing it up, and this froze the filesystem. Dump is supposed to work with filesystems smaller than 2TB, but it failed in our case. This guy had the same problem.

(I moved my answer from the question section down here to the answer section. stefan, 20100608)

Stefan Lasiewski