I would like Monit to monitor my service and reboot the computer as soon as the service stops. The process doesn't listen on a port that I could monitor. This is what I did:

check process chat with pidfile /var/run/chat.pid
start program = "/etc/init.d/chat start"
stop program = "/etc/init.d/chat stop"
if changed ppid then exec /sbin/reboot

I tried all sorts of things, but it only restarts my service.

Any suggestions?

Khaled
edotan
  • Why reboot? There aren't many good reasons to reboot a linux server, most of the time you can simply restart services to achieve the same effect. Linux is designed to run for long periods of time without reboot. – Kyle Mar 22 '12 at 16:43
  • Did you change `ppid` to `pid`? Checking the parent pid of a service is always going to be wrong. If so, you should update your question. – Tom Mar 23 '12 at 17:20
  • I am seeing a similar problem on my CentOS 6.2 instances, in that I cannot get it to exec simple commands, hence I have submitted a question to the monit users mailing list, but as yet got no response. http://lists.nongnu.org/archive/html/monit-general/2012-03/msg00053.html – Tom Mar 23 '12 at 17:20

2 Answers

I think `ppid` refers to the parent process ID, which will always be 1 for a service, so use

check process chat with pidfile /var/run/chat.pid
start program = "/etc/init.d/chat start"
stop program = "/etc/init.d/chat stop"
if changed pid then exec /sbin/reboot

instead. I tested this with a local service and it works for me: restarting the service causes the server to reboot. (Whether this is a good idea in general is another matter... ;-)

From the man page:

PID TESTING

Monit can test the process identification number (pid) of a process for changes. This test is implicit and Monit will send an alert in the case of failure by default.

The syntax for the pid statement is:

IF CHANGED PID [[<X>] <Y> CYCLES] THEN action

action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC", "MONITOR" or "UNMONITOR".

This test is useful to detect possible process restarts which has occurred in the timeframe between two Monit testing cycles. In the case that the restart was fast and the process provides expected service (i.e. all tests succeeded) you will be notified that the process was replaced.

For example sshd daemon can restart very quickly, thus if someone changes its configuration and do sshd restart outside of Monit's control you will be notified that the process was replaced by a new instance (or you can optionally do some other action such as preventively stop sshd).

Another example is a MySQL Cluster which has its own watchdog with process restart ability. You can use Monit for redundant monitoring.

Example:

check process sshd with pidfile /var/run/sshd.pid
if changed pid then exec "/my/script"

PPID TESTING

Monit can test the process parent process identification number (ppid) of a process for changes. This test is implicit and Monit will send alert in the case of failure by default.

The syntax for the ppid statement is:

IF CHANGED PPID [[<X>] <Y> CYCLES] THEN action

action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC", "MONITOR" or "UNMONITOR".

This test is useful for detecting changes of a process parent.

Example:

check process myproc with pidfile /var/run/myproc.pid
if changed ppid then exec "/my/script"
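
Since the goal in the question is to reboot when the process is gone, rather than when its pid changes, the existence test described in the same manual may be a closer fit. The following is only a minimal sketch, assuming the installed Monit release accepts `if does not exist` with an `exec` action; whether older releases such as monit-5.1.1 accept it is not confirmed here. The pidfile path is the one from the question, and the 2-cycle wait is an arbitrary illustrative value.

```
check process chat with pidfile /var/run/chat.pid
  # Replaces Monit's default reaction to a missing process (a restart)
  # with a reboot. Assumption: 2 cycles is an arbitrary grace period
  # so that a brief, deliberate restart does not trigger a reboot.
  if does not exist for 2 cycles then exec "/sbin/reboot"
```

Running `monit -t` after editing the control file checks its syntax before the stanza is put into service.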

Tom
  • That's what I did, but it won't reboot; it only restarts the service. I need the server to reboot once the process is dead. – edotan Mar 22 '12 at 16:26
  • I also tried :check process chat with pidfile /var/run/chat.pid start program = "/etc/init.d/chat start" stop program = "/etc/init.d/chat stop" if 1 restarts within 1 cycles then exec /sbin/reboot – edotan Mar 22 '12 at 16:28
  • The `stop program = "/etc/init.d/chat stop"` and start directives allow monit to control the service, like `monit chat start`. I tested the example above and it definitely works for me. If you restart chat and it gets another pid, then the server reboots. – Tom Mar 22 '12 at 16:39
  • Basically, if the above stanzas are not working for you, and `service chat restart` and `monit chat restart` do not cause the server to reboot, then it's probably that the chat init script is not working properly. If you provide the output of the following commands in a [pastebin](http://pastebin.com/) you might see the issue. `ps -ef | grep chat; monit chat restart; ps -ef | grep chat` – Tom Mar 22 '12 at 16:46
  • There is a typo in that last comment; try this instead: ``ps -ef | grep chat; service chat restart; ps -ef | grep chat`` – Tom Mar 22 '12 at 16:54
  • Interestingly, this works on my local Fedora box, which is running monit-5.2.5-3.fc16.x86_64, but not on my CentOS server with monit-5.1.1-2.el6.x86_64. – Tom Mar 22 '12 at 17:37
It is very difficult to monitor arbitrary processes from the outside; conversely, it is trivial to control processes started by you.

I would suggest you investigate runit, an alternative init daemon for servers.

adaptr