I have an Nginx web proxy, a Gunicorn web server, and a Python/Flask web app. The Gunicorn process apparently died, and I want to guard against that in the future by using a utility that can monitor the Gunicorn process and restart it if it crashes again.

I've found several process supervision utilities that can do the job:

  • daemontools
  • launchd
  • runit
  • s6
  • supervisor
  • systemd
  • upstart
  • ...

Is there a comprehensive article that compares and contrasts the various utilities used to monitor and restart a process?

https://en.wikipedia.org/wiki/Process_supervision

Rob Bednark
Matthew Moisen
  • You should really be monitoring these services from outside the box, as situations *will* arise where you get into a restart loop and consume all of the available resources on your server. – EEAA Oct 06 '15 at 19:17
  • Write your own code to do it exactly the way you want. It's an easy task. – Ryan Babchishin Oct 06 '15 at 23:26

2 Answers

  • runit is a successor to daemontools (both are written in C).

  • supervisor (supervisord) is written in Python.

I've been using runit, together with socklog by the same author, inside Alpine Linux LXC containers for around 10 months to manage web, database, and various other services. It is light and easy to manage, and I have had no service failures. The logging daemon also runs as its own user rather than root, which is nice.

Void Linux uses runit as its init system and also for service supervision (search the package tree for run files to find examples of runit scripts).

Stuart Cardall

If your distro uses Upstart, go with it. It has very basic support for restarting jobs, but it includes limits that can prevent a restart loop, as mentioned by @EEAA.
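To illustrate the idea of a restart limit, here is a minimal Python sketch of a supervisor loop that gives up when a process respawns too quickly. It is not Upstart's implementation; the function name `supervise` and its parameters are hypothetical, and the defaults of 10 starts within 5 seconds are only illustrative.

```python
import subprocess
import time
from collections import deque


def supervise(cmd, max_restarts=10, window=5.0, clock=time.monotonic):
    """Run cmd and start it again whenever it exits, with a rate limit.

    Stops once cmd has already been started max_restarts times within
    the last `window` seconds. Returns the total number of starts.
    All names and defaults here are illustrative assumptions.
    """
    starts = deque()  # timestamps of recent starts
    total = 0
    while True:
        now = clock()
        # Forget starts that have fallen out of the time window.
        while starts and now - starts[0] > window:
            starts.popleft()
        if len(starts) >= max_restarts:
            # Too many starts in a short time: treat it as a restart
            # loop and stop instead of consuming the whole machine.
            return total
        starts.append(now)
        total += 1
        subprocess.call(cmd)  # blocks until the child process exits
```

A command that exits immediately is started `max_restarts` times and then abandoned, while a long-running service that crashes only occasionally keeps being restarted.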

If your OS uses another init program, don't change it. I can't really help you with the other tools you mentioned, as I generally use Ubuntu, where Upstart is still present (as of the latest LTS), so I have had little to do with them. But it's not hard to write a simple script, run from cron once a minute (or more frequently, e.g. in a loop), that checks whether a PID exists and issues a restart on failure.
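A minimal sketch of such a check in Python, assuming the service writes its PID to a file. The script name, the PID file path, and the restart command in the usage note are hypothetical placeholders, not something your system necessarily provides.

```python
import os
import subprocess
import sys


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if missing or malformed."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_alive(pid):
    """Return True if a process with this PID exists.

    Signal 0 performs the existence and permission checks only;
    nothing is actually delivered to the process.
    """
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def check_and_restart(pid_file, restart_cmd):
    """Run restart_cmd if the PID in pid_file is not alive.

    Returns 0 when the process is alive, otherwise the exit status
    of restart_cmd.
    """
    if pid_is_alive(read_pid(pid_file)):
        return 0
    return subprocess.call(restart_cmd)


# Usage: watchdog.py PID_FILE RESTART_COMMAND [ARGS...]
if __name__ == "__main__" and len(sys.argv) > 2:
    sys.exit(check_and_restart(sys.argv[1], sys.argv[2:]))
```

A crontab entry along the lines of `* * * * * /usr/bin/python3 /usr/local/bin/watchdog.py /var/run/gunicorn.pid service gunicorn restart` (all paths assumed) would run the check once a minute. Note that a stale PID file can point at an unrelated process that has reused the PID, so this check is only a rough heuristic.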

sam_pan_mariusz