How to monitor a service and restart if stopped in Linux

24

7

I'm not sure whether I should use shell scripts or whether a ready-made way already exists. Whatever the approach, I would like to keep a service running all the time.

Take iptables as an example:

  • Whenever the iptables service is stopped (in other words, not running), I want it to be started (or restarted) automatically.
  • Put more simply, I want to keep a service up and running all the time.

(If real-time checking is a problem, I could settle for a reasonable check interval, say every 5 minutes.)

The only way I can think of is to use shell scripts with crontab.

  • Is there a smarter solution?

Thanks!

夏期劇場

Posted 2013-12-03T08:53:02.170

Reputation: 539

You should not do that. Suppose a service is misconfigured: what would your strategy achieve? An endless series of retries. You should instead write a crontab script that alerts you when something is not working. – MariusMatutiae – 2013-12-03T09:12:47.217

I'm just curious about a direct solution to the original question. Also, I have a service that simply needs to be restarted whenever it stops, for any reason. There is no problem with restarting it. – 夏期劇場 – 2013-12-03T09:43:59.837

1

Your own suggested solution is smart enough. If you use it correctly (exit immediately if the service is already running, alert yourself that the service has stopped so you can fix it, and so on) it is the simplest way. A service that stops on its own is a problematic service, so eventually you should fix it, but as a temporary patch, a cron script or another super-simple daemon that sleeps most of the time will do the job just fine. There are some tools like http://mmonit.com/monit/ but I think that in the end they all use a similar approach.

– None – 2013-12-03T10:40:13.267

@MariusMatutiae, I agree with your point but it depends on the nature of the service, and most process managers will back off after a number of failed restarts. It's perfectly reasonable for a process to naturally end, and for us to want to restart it automatically, e.g. a worker that picks up a job from a queue and ends after each run. It's also a handy tool for sysadmins that suffer from bespoke memory-leaking code - limit the lifetime of a process and restart it automatically before it can get out of hand... – Alex Forbes – 2013-12-11T23:50:29.343
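The concern about endless retries and the back-off idea from the comments above can be sketched in shell. This is only an illustration under assumptions: the function names, the state-file path, and the limit of 3 attempts are all hypothetical, and real process managers implement this differently.

```shell
#!/bin/bash
# Sketch: cap the number of consecutive restart attempts using a small
# state file. All names here are hypothetical, not from any existing tool.

# Succeeds while the recorded failure count is below the limit.
may_retry() {
    local state_file="$1" max="$2" count=0
    [ -r "$state_file" ] && read -r count < "$state_file"
    [ "${count:-0}" -lt "$max" ]
}

# Increments the consecutive-failure counter in the state file.
record_failure() {
    local state_file="$1" count=0
    [ -r "$state_file" ] && read -r count < "$state_file"
    echo $(( ${count:-0} + 1 )) > "$state_file"
}

# Clears the counter once the service is seen running again.
record_success() {
    rm -f "$1"
}

# Example use from a periodic check (paths and commands are placeholders):
#   state=/var/run/myservice.restarts
#   if service_is_running; then
#       record_success "$state"
#   elif may_retry "$state" 3; then
#       record_failure "$state"
#       start_service
#   else
#       echo "myservice keeps failing" | mail -s "giving up" root@localhost
#   fi
```

With a cap like this, a misconfigured service produces a few restart attempts and then an alert, instead of being restarted forever.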

Answers

25

Update March 2018

This answer is now quite old, and since it was written systemd has won the pid1 war on Linux. Thus, you should probably create a systemd unit, if systemd is built in to your distribution (which is most of them).
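For a service that already has a systemd unit, one way to get automatic restarts is a drop-in file that sets `Restart=always`. The sketch below only prints the drop-in text. The function name `restart_dropin`, the unit name `myservice`, and the 5-second delay are placeholders chosen for illustration.

```shell
#!/bin/bash
# Sketch: print a systemd drop-in asking systemd to restart a unit
# whenever its main process exits. The delay (seconds) is an assumption.
restart_dropin() {
    local delay="${1:-5}"
    printf '[Service]\nRestart=always\nRestartSec=%s\n' "$delay"
}

# Example use as root ("myservice" is a placeholder unit name):
#   mkdir -p /etc/systemd/system/myservice.service.d
#   restart_dropin 5 > /etc/systemd/system/myservice.service.d/restart.conf
#   systemctl daemon-reload
#   systemctl restart myservice.service
```

`Restart=on-failure` is the milder choice if a clean exit should not trigger a restart. By default systemd also rate-limits restarts, so a unit that keeps failing immediately is eventually left in a failed state rather than retried forever.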

Answer below is preserved for posterity.


The monit answer is valid, but I thought I'd mention some alternatives:

It's worth bearing in mind that your operating system has already solved the process management problem. Traditionally, Linux has used sysvinit, which is basically the collection of scripts you see in init.d. However, it's pretty dumb and cannot monitor processes, and init.d scripts are complicated, so it's being replaced for good reason.

More modern operating systems are starting to replace sysvinit, and the frontrunners are Upstart and systemd. Debian is leaning towards systemd, Ubuntu developed Upstart and has pretty much already transitioned to it, and Red Hat/CentOS/Fedora, like Debian, are moving towards systemd. Thus, if you use an OS that has already replaced sysvinit, I would recommend using what's built in. The scripts are much easier to write than init scripts.

I have used runit and quite like it, but the easiest to use is supervisor. It's also very well documented, works almost anywhere and is packaged in all the major distributions.

But whatever you do, please, please, PLEASE do not use a shell script. There are so many things wrong with that approach!

Alex Forbes

Posted 2013-12-03T08:53:02.170

Reputation: 978

how to do it with sysvinit? – horseyguy – 2019-04-26T08:47:05.757

12

iptables is a poor example as it's not really a service or daemon that is running, but part of the kernel. You can't really "stop" iptables, you can only give it a configuration and "stopping" it involves giving it a blank configuration. Indeed I have had Linux systems crash, but the port forwarding setup using iptables continues to work.

Anyway, a utility called monit will do what you want. If you are using Debian it's an apt-get install monit away. It's a bit involved to learn about but very flexible.

LawrenceC

Posted 2013-12-03T08:53:02.170

Reputation: 63 487

3

We use this simple script to send an alert and start the service if it is not running. You can add more services too.

 file name: uptime.sh

 #!/bin/bash
 # Service monitoring: start httpd/named if nothing listens on their ports.

 # Port 80 (HTTP). $NF takes the last colon-separated field, so both
 # IPv4 (0.0.0.0:80) and IPv6 (:::80) listen addresses are matched.
 if ! /bin/netstat -tulpn 2>/dev/null | awk '{print $4}' | awk -F: '{print $NF}' | grep -q '^80$'
 then
     echo "http service down" | mail -s "HTTP Service DOWN and restarted now" root@localhost
     /etc/init.d/httpd start > /dev/null 2>&1
 fi

 # Port 53 (DNS): same check for named.
 if ! /bin/netstat -tulpn 2>/dev/null | awk '{print $4}' | awk -F: '{print $NF}' | grep -q '^53$'
 then
     echo "named service down" | mail -s "DNS Service DOWN and restarted now" root@localhost
     /etc/init.d/named start > /dev/null 2>&1
 fi

 Cron setup:
 */5 * * * * /root/uptime.sh > /dev/null 2>/dev/null
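Since the script repeats the same check for each service, adding more services is easier if the check is a function. The following is a sketch, not the author's script: the function names `port_in_listing` and `ensure_listening` are made up for illustration, and it assumes `netstat`, `awk`, and `mail` are available as in the script above.

```shell
#!/bin/bash
# Sketch: reusable port check. Function names are hypothetical.

# Reads netstat-style lines on stdin and succeeds if any local address
# (4th column) ends in ":PORT". Works for 0.0.0.0:80 and :::80 alike.
port_in_listing() {
    local port="$1"
    awk -v port="$port" '
        { n = split($4, parts, ":") }
        n > 1 && parts[n] == port { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Sends an alert and starts the service via its init script if nothing
# is listening on the given port.
ensure_listening() {
    local port="$1" name="$2" initscript="$3"
    if ! netstat -tuln | port_in_listing "$port"; then
        echo "$name service down" | mail -s "$name service DOWN, starting it now" root@localhost
        "$initscript" start > /dev/null 2>&1
    fi
}

# Example calls, one line per service:
#   ensure_listening 80 HTTP /etc/init.d/httpd
#   ensure_listening 53 DNS  /etc/init.d/named
```

Keeping the parsing in `port_in_listing` separate from the restart logic means the matching can be tried on sample input without touching any real service.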

Ranjithkumar T

Posted 2013-12-03T08:53:02.170

Reputation: 319

MariusMatutiae's point is correct, but we wrote a simple script to monitor the HTTPD and DNS services on our server, and it's running fine. Whenever a service is down, the script restarts it and sends us an alert. If we get plenty of alert mails about the service being down, then we can investigate. – Ranjithkumar T – 2013-12-08T05:31:22.837

1

Alternative Solution For Desktop (KDE) :

We can watch a service with the Server Status applet/widget. After installing it, just add a command to the widget to monitor your service.

Example : systemctl status httpd.service

KDE 4 Version : https://store.kde.org/content/show.php?content=101336

KDE 5 Version : https://store.kde.org/p/1190292/

intika

Posted 2013-12-03T08:53:02.170

Reputation: 839

0

I know it's been several years since the question was asked, but with systemd (available on most CentOS and RHEL releases) you can run this bash script from cron to check a service and start it if it is down.

#!/bin/bash
# Start the given systemd service if it is not active.
# Usage: monitor <service-name>

service="$1"
if /bin/systemctl -q is-active "$service.service"; then
    echo "OK"
else
    /bin/systemctl start "$service.service"
fi

Save it in your bin directory under a name such as monitor, make it executable, and then run it like this:

sudo monitor redis

This checks the redis service and starts it if required.

Finally, add this command to your cron jobs.
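One way to add the cron entry without opening an editor is to pipe the existing crontab through a small filter. This is a sketch under assumptions: the helper name `add_cron_entry`, the install path `/usr/local/bin/monitor`, and the 5-minute interval are illustrative choices.

```shell
#!/bin/bash
# Sketch: read a crontab on stdin and print it with one extra entry,
# leaving it unchanged if the exact entry is already present.
# The function name is hypothetical.
add_cron_entry() {
    local entry="$1" current
    current="$(cat)"
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    if ! printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Example use as root (path and interval are assumptions):
#   crontab -l 2>/dev/null \
#     | add_cron_entry '*/5 * * * * /usr/local/bin/monitor redis' \
#     | crontab -
```

Using the full path to the script in the entry avoids depending on cron's limited `PATH`.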

Hope this helps.

Ahmad Sajid

Posted 2013-12-03T08:53:02.170

Reputation: 1

0

To add to the long list of init/service supervision options: built around s6 there is a new kid on the block, 66, which handles s6 service management and logging in a fast, light, user-friendly manner. The official documentation from Obarun Linux is at https://web.obarun.org/software

An FAQ on how to use 66 and make sense of s6: http://sysdfree.wordpress.com/266

Since its stable release only one bug has been found, relating to kernel changes from 4.20 to 5.0; all other reported problems had to do with people learning something new. If service management had to become any simpler than this, it might be better to switch to MS Windows (Linus forbid). To see in real life how this can work, one only has to download an Obarun live ISO and play with it. Install services and their 66 scripts, enable them, kill them, see their logs, stop them and start them (while enabled), bunch services up into a tree and have trees of services start and stop all together, and keep user-level services separate from system ones. It does what s6 does well and makes it simpler for the user to exploit the bulletproof system under s6.

Image downloads can be found here: https://web.obarun.org/index.php?id=74 (MD5 checksum files: https://repo.obarun.org/iso/)

Apart from init and service management, s6/66 have no dependencies on anything else on the system. It is a layer of the base system that leaves the rest of the software to work on its own, unaware of the init and service manager. All of s6 and 66 is written in C, and neither is Linux-specific or glibc-specific. Skarnet's (the s6 author's) servers have been running for nearly a decade without many pauses on a custom-built musl system. Alpine, Void, and Adelie currently also have s6 software in their repositories, and Adelie uses it for service supervision by default. Void now carries 66 as well. I don't know whether, or to what extent, anyone has ported s6 to xxBSD or other xxIX systems.

Gus Fun

Posted 2013-12-03T08:53:02.170

Reputation: 21