Is there any way to monitor processes with nagios? I found the check_procs command, but I can't use it because it doesn't let me specify a file to read the PID from. I also can't find anything about this on Google, so perhaps I have a misconception about what nagios is actually supposed to do?

My scenario is that I have a webserver which has a few VirtualHosts. I can monitor those with check_http just fine.

However, one of the sites depends on a background process, which I also want to monitor with nagios.


4 Answers


Nagios monitors services through dedicated checks, not by PID. Each service (HTTP, MySQL, DNS, ...) has its own separately configurable check.

For instance, I have a webserver and a DNS server running. I use the check_http plugin to see if the webserver is still running, and check_dns to make nagios do a DNS lookup for one of the domains I host the DNS for.

If the service is not working properly, nagios will raise an alarm. The same goes for the webserver check and the others. The check_procs command is used to see whether your server is running too many processes at the same time (overload).

All checks in nagios can be configured with different parameters.

  • Thanks for your answer. I updated the question to include my scenario. – moritz Aug 07 '11 at 14:10
  • what is the background process? Is there a nagios check for it? (perhaps with NRPE)? – Goez Aug 07 '11 at 14:16
  • The process is a custom worker we wrote ourselves. There is certainly not a check for it in nagios exchange. NRPE seems to be used for remote execution of checks which doesn't seem to be relevant? – moritz Aug 07 '11 at 14:23
  • Does it listen on specified port? If so, you can check it via `check_tcp`. – quanta Aug 08 '11 at 05:18

The solution to this, really, is to write a check that monitors your background process for proper functionality. If you really just want to make sure something's running at a given PID, a script that just runs ps -p $(cat /path/to/pidfile) would work, but that's chock full of false positive potential -- if your process has died, and something else then runs and gets the same PID, your process check will succeed when it shouldn't.
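As a rough illustration of the PID-file approach and its PID-reuse problem, here is a minimal sketch of a Nagios-style check in POSIX shell. The function name `check_pidfile` and the second argument (the expected command name) are hypothetical, not part of any stock plugin. It assumes a Linux procps-style `ps`, where `-o comm=` prints the bare command name (truncated to 15 characters), and it uses the usual plugin exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN). Comparing the name narrows the false-positive window but does not prove the process is working.

```shell
#!/bin/sh
# check_pidfile: hypothetical helper, not a stock Nagios plugin.
# Usage: check_pidfile /path/to/pidfile expected_command_name

check_pidfile() {
    pidfile=$1
    expected=$2

    # A missing or unreadable PID file means nothing can be confirmed.
    if [ ! -r "$pidfile" ]; then
        echo "UNKNOWN - cannot read $pidfile"
        return 3
    fi

    # Strip whitespace and make sure only digits remain.
    pid=$(tr -d '[:space:]' < "$pidfile")
    case $pid in
        ''|*[!0-9]*)
            echo "UNKNOWN - $pidfile does not contain a PID"
            return 3
            ;;
    esac

    # ps -p selects by PID; -o comm= prints only the command name, no header.
    comm=$(ps -p "$pid" -o comm= 2>/dev/null)
    if [ -z "$comm" ]; then
        echo "CRITICAL - no process with PID $pid"
        return 2
    fi

    # Guard against PID reuse: the name must be the one we expect.
    if [ "$comm" != "$expected" ]; then
        echo "CRITICAL - PID $pid is '$comm', expected '$expected'"
        return 2
    fi

    echo "OK - $expected running with PID $pid"
    return 0
}

# Run as a plugin only when arguments are given.
if [ "$#" -gt 0 ]; then
    check_pidfile "$@"
    exit $?
fi
```

Saved as an executable script on the monitored host, it could be called as `check_pidfile /path/to/pidfile myworker`, where `myworker` stands in for whatever name your worker shows in `ps`.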

The proper way to do this is to bugger off the daemonisation code in your service and run it under something like daemontools -- then when it bombs it'll get automatically restarted. You then also need to monitor the functionality, to catch times when the process doesn't die, but somehow fails to run properly.


You can do it by pulling hrSWRunName info from HOST-RESOURCES-MIB.

  1. Install net-snmp on the remote host
  2. Edit the snmpd.conf file like below:

    rocommunity s3cret
    view    systemview    included   .
    view    systemview    included   .
  3. On the monitoring host, define a check_snmp command with something like this:

    define command{
        command_name    check_snmp
        command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -P $ARG1$ -o $ARG2$ -C $ARG3$ -r $ARG4$
    }
  4. And finally, create a service definition:

        define service{
            use                     generic-service
            host_name               remote_host
            service_description     <your_service_name>
            check_command           check_snmp!2c!HOST-RESOURCES-MIB::hrSWRunName.<your_service_pid>!s3cret!<service_name>
            contact_groups          admin
        }

Testing from the command line:

$ /usr/local/nagios/libexec/check_snmp -o HOST-RESOURCES-MIB::hrSWRunName.2910 -C s3cret -H <ip_address> -P 2c -r nrpe
SNMP OK - "nrpe" | 
$ /usr/local/nagios/libexec/check_snmp -o HOST-RESOURCES-MIB::hrSWRunName.2910 -C s3cret -H <ip_address> -P 2c -r gmond
SNMP CRITICAL - *"nrpe"* | 
  • How does the PID get from the PID file, which would presumably be stored on the client machine, into the `check_snmp` command line (which would run on the monitoring server)? – womble Aug 07 '11 at 20:54
  • If I understand correctly, he has permission on the client. Doesn't `cat /path/to/pid_file` work? – quanta Aug 08 '11 at 02:42
  • If you're going to run `ssh client cat /path/to/pid_file` to get the PID, why would you then run `check_snmp`? Why not just run (in effect) `ssh client pgrep $(cat /path/to/pid_file)`? – womble Aug 08 '11 at 04:35
  • OK. That is another way. But the correct command should be `pgrep ` – quanta Aug 08 '11 at 05:01
  • Bah, it's been too long since I've used pgrep, clearly... – womble Aug 08 '11 at 05:08

If the background process always has the same name, then the check_procs command will work.

Here are the local command definitions for two background processes I check on my server. The first (OpenDKIM) has to have 2 processes running in order to pass the test. The second (dk-filter) has to have 1 process running to pass.

# 'check_local_opendkim' command definition
define command{
        command_name    check_local_opendkim
        command_line    $USER1$/check_procs -c 2:2 -C opendkim
        }

# 'check_local_dk-milter' command definition
define command{
        command_name    check_local_dk-milter
        command_line    $USER1$/check_procs -c 1:1 -C dk-filter
        }

Here's how to set up the same checks in an nrpe.cfg file:

command[check_opendkim]=/usr/local/nagios/libexec/check_procs -c 2:2 -C opendkim
command[check_dk-milter]=/usr/local/nagios/libexec/check_procs -c 1:1 -C dk-filter
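On the monitoring server, the NRPE commands above still need a service definition that calls them. A sketch of what that might look like follows. It assumes the `check_nrpe` plugin is installed under `$USER1$`; the `remote_host` name, the `generic-service` template and the service description are placeholders rather than anything from my own setup.

```
# Assumed command definition for the check_nrpe plugin
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

# Placeholder service that runs the check_opendkim NRPE command
define service{
        use                     generic-service
        host_name               remote_host
        service_description     OpenDKIM
        check_command           check_nrpe!check_opendkim
        }
```

A second service with `check_nrpe!check_dk-milter` would cover the other process in the same way.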

Or is there some reason that you can't rely on the process name for verification that it's running?
