Why does Apache write its own process ID in httpd.pid?

6

I am new to Apache and debugging issues caused by it. I have come to understand that when Apache starts up, it writes its own process ID in httpd.pid in a human-readable format as explained here.

I don't quite understand why the process ID is needed in the first place. Even if it is needed, I don't understand the reason for this approach.

On Linux, a process can find the process ID of Apache by using ps -ef and similar tools. In general, I have not heard of any other process writing its process ID to a file.

What is so special about this Apache process?

p2pnode

Posted 2012-05-10T12:14:08.360

Reputation: 1 257

2

See my answer here: http://unix.stackexchange.com/questions/12815/what-are-pid-and-lock-files-for/12818#12818

– LawrenceC – 2012-05-10T13:26:59.580

Answers

9

On most Unix systems, services are started and stopped by the init system. Many Linux distributions use the legacy sysvinit, which almost completely lacks service management functions, so scripts in /etc/init.d or /etc/rc.d perform the actual job of starting or killing Apache. These initscripts are written in plain sh and have no means of tracking the processes they launch other than reading each PID from a preset location. (The initscript can know the PIDs only of processes it launched directly, but not of children of those processes, nor of processes launched the last time the same script was called. This means that the initscript cannot track processes that were programmed to "daemonize" themselves.)

(Yes, a process can be found using ps -ef, or by examining /proc directly. However, this is a somewhat unreliable method – there can be several Apache processes running at the same time: for example, mpm-prefork, or multiple independent Apache configurations. Because of this, almost every daemon on Linux will create a "pidfile" in /run or /var/run, in order for it to be easily stoppable by an initscript. You'll likely have crond.pid, ntpd.pid, rsyslogd.pid, sshd.pid, and so on.)
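To make the pidfile idea concrete, here is a minimal sketch in Python of both sides: the daemon writing its PID as human-readable text, and a control script reading it back. The function names and the file path are made up for illustration; real daemons and initscripts do the equivalent in C or sh.

```python
import os
import tempfile


def write_pidfile(path: str) -> int:
    """Daemon side: write our own PID as decimal text plus a newline."""
    pid = os.getpid()
    with open(path, "w") as f:
        f.write(f"{pid}\n")
    return pid


def read_pidfile(path: str) -> int:
    """Control-script side: read the PID back from the preset location."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    # A temporary directory stands in for /run or /var/run (hypothetical path).
    with tempfile.TemporaryDirectory() as run_dir:
        pidfile = os.path.join(run_dir, "example.pid")
        write_pidfile(pidfile)
        print("daemon PID according to pidfile:", read_pidfile(pidfile))
```

The point is that the reader needs no knowledge of how the daemon forked or daemonized; it only needs to agree on the file location.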

Only very recent Linux init systems bother tracking processes: Upstart in Ubuntu has to be told exactly how many forks to expect, while systemd in Fedora uses the kernel cgroups to track processes belonging to a service.

user1686

Posted 2012-05-10T12:14:08.360

Reputation: 283 655

@grawity, But using such "file hack" is weak and provides zero guarantee. Wouldn't things start screwing up if a user or a rogue process decides to delete the file? Or perhaps create one when nothing is running? – Pacerier – 2016-04-15T15:30:44.420

@Pacerier: Normal users can't delete files from /run. And if root decides to do so, well, it's not Linux's job to stop them from shooting themselves in the foot if they so wish. But either way, that's one of the several reasons modern init systems don't use pidfiles. (Though the alternatives tend to be rather Linux-specific, so there's still the "omg it's not posix" crowd around them.) – user1686 – 2016-04-15T15:35:23.767

2+1. This is the way an answer is meant to be posted. I like that you provided some info about the legacy and all. I learned some new things from this. – None – 2012-05-10T13:06:33.723

2

The reason is that third parties can use signals, by means of kill(1), to control the running Apache instance. For example to direct it to gracefully reload the configuration.
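As a rough illustration of what kill(1) does with the pidfile, here is a small Python sketch. The helper name signal_from_pidfile is hypothetical, and the pidfile path is passed in as an argument because the actual location of httpd.pid depends on the configuration. Apache httpd documents TERM (stop now), HUP (restart now) and USR1 (graceful restart) as control signals for the parent process.

```python
import os
import signal


def signal_from_pidfile(pidfile: str, sig: int) -> int:
    """Read a PID from `pidfile` and send `sig` to that process.

    Roughly what `kill -SIG $(cat pidfile)` does in an initscript.
    Returns the PID that was signalled.
    """
    with open(pidfile) as f:
        pid = int(f.read().strip())
    if pid <= 0:
        # PIDs <= 0 have special meanings for kill(2) (process groups etc.),
        # so refuse to signal them.
        raise ValueError(f"refusing to signal suspicious PID {pid}")
    os.kill(pid, sig)
    return pid


# Hypothetical usage; the path below is an assumption, not a fixed location:
#   signal_from_pidfile("/var/run/httpd.pid", signal.SIGUSR1)  # graceful restart
#   signal_from_pidfile("/var/run/httpd.pid", signal.SIGTERM)  # stop now
```

Sending a signal to another user's process needs the right privileges, which is why this is normally done by root or by the user that owns the daemon.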

The other answers still apply.

Also keep in mind that you can run multiple instances of Apache on various ports and with various configurations on the same machine. It gets even more interesting when you use the (default, IIRC) forking mode. You need to be able to figure out the controlling instance of Apache for some horde of child processes, so this is the most pragmatic way to do it.

0xC0000022L

Posted 2012-05-10T12:14:08.360

Reputation: 5 091

1

Apache wants to know if another instance of Apache is already running. Writing the process ID to a file is more reliable than using ps, and it makes it possible to test whether a process shut down unexpectedly (without deleting the .pid file). There is also no guarantee that ps will be available on any particular system (due to permissions, for example).
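The "did it die without cleaning up" check can be sketched as follows. This is only an illustration in Python, not Apache's actual code, and the function names are invented. It relies on the standard trick of sending signal 0, which performs the existence and permission checks without delivering anything.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not a PID
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pidfile_status(path: str) -> str:
    """Classify a pidfile as 'missing', 'invalid', 'stale' or 'running'."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return "missing"
    except ValueError:
        return "invalid"
    # Caveat: PIDs get reused, so 'running' only means that *some* process
    # has this PID now, not necessarily the daemon that wrote the file.
    return "running" if pid_is_running(pid) else "stale"
```

A "stale" result is the unexpected-shutdown case: the file is still there, but the process it names is gone.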

A number of other programs also use .pid files. Check your /var/run directory. Mine has several in it.

SigueSigueBen

Posted 2012-05-10T12:14:08.360

Reputation: 216

1

As the other answers said, you want to have the process ID to send signals to it. Since it's a daemon process, signals are one of the primary methods of communicating with it. You can stop Apache, tell it to reload its config, etc., with signals.

It's more important with Apache and its process model, because (typically) Apache has a parent/child relationship, with the parent being a controller that forks and reaps children. If you want to take the server down, you need to send a signal to the parent, not to the children. If you just used ps, you'd have to wade through this parent/child relationship to find the parent; only then could you send it a signal.

Rich Homolka

Posted 2012-05-10T12:14:08.360

Reputation: 27 121