I'm trying to check whether a program on a Linux server is running and start it if it is not, but I get strange errors:

#!/bin/sh
SERVICE=nrpe

ps -ef | grep -v grep | grep $SERVICE | wc -l

if [ $? -gt 1 ]
then
    echo "$?"
    echo "$SERVICE service running, everything is fine"
else
    echo "$?"
    echo "$SERVICE is not running"
    service $SERVICE start
fi

The output is:

[root@mail ~]# check_nrpe.sh
2
1
nrpe is not running
Starting Nagios NRPE daemon (nrpe):

It is the same message whether nrpe is running or not. If I test the command `ps -ef | grep -v grep | grep $SERVICE | wc -l` in the shell, it works.

Kjellson
  • This is off topic here, but a hint: [`$?` does not contain what you think](https://stackoverflow.com/questions/6834487/what-is-the-dollar-question-mark-variable-in-shell-scripting). – Gerald Schneider Mar 06 '18 at 09:24
  • @GeraldSchneider: Scripting for system administration purposes like in this case is not OT. – Sven Mar 06 '18 at 10:24

1 Answer

First of all, `$?` holds the exit status of the last command, not its output, so for your script to work as intended, change your "if statement" to:

if [ $? -eq 0 ]
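
Applied to the original script, this test only helps if the pipeline ends in a command whose exit status reflects the match, since `wc -l` exits 0 whether or not anything was counted. Below is a minimal sketch, assuming the check stays process-based. `listing_has_service` and `check_service` are hypothetical helper names, and `grep -w` is used so that the script's own name (check_nrpe.sh) does not count as a match:

```shell
#!/bin/sh
SERVICE=nrpe

# listing_has_service NAME (hypothetical helper): read a process listing on
# stdin and succeed if a line other than a grep invocation contains NAME as
# a whole word.
listing_has_service() {
    grep -v grep | grep -qw -- "$1"
}

# check_service NAME (hypothetical helper): report on NAME using `ps -ef`
# and return non-zero when no matching process is listed.
check_service() {
    if ps -ef | listing_has_service "$1"; then
        echo "$1 service running, everything is fine"
    else
        echo "$1 is not running"
        return 1
    fi
}
```

Ending the script with `check_service "$SERVICE" || service "$SERVICE" start` would then start the daemon only when the check fails.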

In addition, it is possible that xinetd is managing your nrpe (this is common), in which case you won't see an nrpe process running but an xinetd one instead.

So a better check would be to see whether the port is open. I suggest you change the ps -ef... command to the following (assuming nrpe is configured in the default way and listens on TCP port 5666):

netstat -plunt | grep -w 5666

The exit status will act as you expect: 0 if the port is open and non-zero if it is closed.

Also, you can make your whole script a one-liner:
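
The same exit-status idea can be made a little stricter. The sketch below is one possible variant, assuming the usual `netstat -plunt` layout where the fourth column is the local address. `port_is_listening` is a hypothetical helper name. Matching on that column alone avoids hits on a PID or another field that happens to contain 5666:

```shell
#!/bin/sh
PORT=5666   # default NRPE port, as assumed in the answer

# port_is_listening PORT (hypothetical helper): read `netstat -plunt` output
# on stdin and succeed if some local address (4th column) ends in ":PORT".
port_is_listening() {
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}
```

In a script it could be used as the condition of the "if statement", for example `if netstat -plunt | port_is_listening "$PORT"; then ...`.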

netstat -plunt | grep -qw 5666 && echo "NRPE is running" || echo "NRPE is not running"

Example:

[root@centolel tmp]# netstat -plunt | grep -qw 5666 && echo "NRPE is running" || echo "NRPE is not running"
NRPE is running
[root@centolel tmp]# service xinetd stop
Stopping xinetd:                                           [  OK  ]
[root@centolel tmp]# netstat -plunt | grep -qw 5666 && echo "NRPE is running" || echo "NRPE is not running"
NRPE is not running
Itai Ganot
  • Changing the if statement gains nothing. `$?` contains the return code of `wc`, not the output. `$?` will only contain something other than 0 when an error occurs. – Gerald Schneider Mar 06 '18 at 09:39
  • Other points not considered here: not every service opens a port, and "grep 5666" will also match other ports containing that number, e.g. 25666. There is really no need to reinvent the wheel: every Linux distribution comes with tested scripts to determine whether a service is running, so just look at how other services do it. – Gerald Schneider Mar 06 '18 at 09:42
  • In Linux, there are many ways to achieve the same goal. Of course you can check whether a socket is open, whether a PID exists, and many other things, but there are positive and negative sides to every solution, so you should choose the one that suits you. Regarding port 5666, I assume the OP has used the default configuration, and for that my answer is good. – Itai Ganot Mar 06 '18 at 09:44
  • You are also assuming that there are no services running on ports 15666, 25666, 35666, 45666 and 55666. That kind of assumption should not be left in systems, because it will cause unexpected results when things change in the future. – Tero Kilkanen Mar 06 '18 at 18:35
  • @TeroKilkanen, good point, edited my answer. – Itai Ganot Mar 06 '18 at 23:06
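
The suggestion in the comments to reuse the distribution's own scripts could be sketched as follows. This assumes the nrpe init script implements the conventional `status` action and exits 0 when the service is running, which the thread does not confirm; `ensure_running` is a hypothetical helper name. It would not help when xinetd manages nrpe, since there is then no standalone daemon to report on:

```shell
#!/bin/sh
# ensure_running NAME (hypothetical helper): ask the init script for NAME's
# status and start it only when the status action reports failure.
# Assumes `service NAME status` exits 0 when the service is running.
ensure_running() {
    if service "$1" status >/dev/null 2>&1; then
        echo "$1 service running, everything is fine"
    else
        echo "$1 is not running"
        service "$1" start
    fi
}
```

Calling `ensure_running nrpe` from cron would then replace the ps-based check entirely.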