
I have a problem with monit where occasionally Varnish will crash and refuse to start. So Varnish is dead and my webserver is inaccessible. Here's the message from the monit log:

info     : 'varnish' stop: /etc/init.d/varnish
info     : 'varnish' start: /etc/init.d/varnish
error    : monit: Error reading pid from file '/var/run/varnish.pid'

Within the Varnish monitor, I thought of setting an option to restart nginx so it can listen for external requests on port 80 again if something like this happens:

if 3 restarts within 3 cycles
    then exec "/etc/init.d/nginx restart"
    and timeout

Except when I call that, sometimes nginx stops successfully... but never starts again.

The solutions I've thought of are kind of a hack (kill -9 nginx && /etc/init.d/nginx start) and (killall -9 varnishd && rm -f /var/run/varnish.pid).

I was hoping anyone could offer suggestions to either of the two above problems. Thanks!

Aliaksandr Belik
Lin

3 Answers


Never use -9 unless you have already tried -3 and -15. It leaves the sockets open, and the application has no chance to clean up after itself.
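
As a rough sketch of that escalation: send SIGTERM first, give the process a grace period, and fall back to SIGKILL only if it is still alive. The helper name `stop_gracefully` and the 10-second default are assumptions for illustration, not part of any init script; a `kill -QUIT` (-3) step could be added between the two in the same way.

```shell
#!/bin/sh
# Sketch: ask a process to exit politely, and only fall back to SIGKILL
# if it is still alive after a grace period.
# stop_gracefully is a hypothetical helper name.
stop_gracefully() {
    pid=$1
    grace=${2:-10}    # seconds to wait after SIGTERM (assumed default)

    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    i=0
    while [ "$i" -lt "$grace" ]; do
        # kill -0 sends no signal; it only checks that the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done

    # Last resort: the process gets no chance to clean up.
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

It would be called with the daemon's PID, for example `stop_gracefully "$(cat /var/run/varnish.pid)" 10`, assuming the PID file is readable at that point.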

Istvan
  • What OS doesn't clean up sockets from a killed process? – womble Sep 04 '09 at 07:26
  • I guess you might have a problem understanding what I wrote. The idea here is that properly written programs will respond to a -15 by cleaning up anything they need to before dying. A "kill -9" just causes the process to die immediately; it gets no chance to do any cleanup. Therefore, if you don't know how a program was written, you should try the -15 first, in case it does need to clean up files, flush logs or whatever. http://en.wikipedia.org/wiki/Functional_illiteracy – Istvan Sep 04 '09 at 23:50
  • OK, to put it another way: what OS will leave sockets open from a process that is killed with -9? – womble Sep 05 '09 at 02:36

You'll be fighting monit forever; I don't recommend anyone use it for anything. A much more robust architecture is something like daemontools.
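
To illustrate the general idea behind that style of supervision (this is a toy sketch, not daemontools itself): the supervisor starts the service as its own child and learns about its death from `wait`, so no PID file is involved. The name `supervise_n` and the restart limit are assumptions made so the example terminates; a real supervisor loops indefinitely.

```shell
#!/bin/sh
# Sketch of parent-child supervision: the supervisor starts the service as
# its own child, so it learns about the exit from wait, not from a PID file.
# supervise_n is a hypothetical helper; it restarts the command a fixed
# number of times so that the example terminates.
supervise_n() {
    max=$1
    shift
    runs=0
    while [ "$runs" -lt "$max" ]; do
        "$@" &                 # the service must stay in the foreground
        child=$!
        status=0
        wait "$child" || status=$?   # blocks until this exact child exits
        runs=$((runs + 1))
        echo "run $runs exited with status $status"
        sleep 1                # brief pause so a crashing service does not spin
    done
}
```

For this to work the service has to run in the foreground rather than daemonize itself, which is why such supervisors are configured with a run script instead of an init script.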

womble
  • I don't think he needs anything additional. Just the basic Unix commands would do the job. – Istvan Sep 04 '09 at 23:47
  • Truly spoken like someone who's never spent hours trying to convince monit to agree that a service is in the same state as reality. – womble Sep 05 '09 at 02:37
  • Can you please elaborate on that answer. Why shouldn't we use monit? – Nathan Lee May 01 '13 at 21:09
  • Because monit uses PID files, which are an inherently race-prone way of saving daemon state. `wait`(2) was created for a reason. – womble May 06 '13 at 10:13

I have a similar problem when restarting nginx. I use something like this:

/etc/init.d/nginx stop
sleep 2
/etc/init.d/nginx start

And it works
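
A variation on the same idea, as a sketch: instead of a fixed `sleep 2`, poll until the old process has really exited before starting the new one. The helper name `wait_for_exit`, the timeout, and the PID file path in the usage comment are assumptions; it also assumes the script runs as root, since `kill -0` fails for processes owned by another user.

```shell
#!/bin/sh
# wait_for_exit is a hypothetical helper: poll until no process with the
# given PID remains, or give up after a timeout in seconds.
wait_for_exit() {
    pid=$1
    timeout=${2:-10}   # assumed default
    i=0
    # kill -0 sends no signal; it only checks that the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$i" -ge "$timeout" ]; then
            return 1   # still running after the timeout
        fi
        sleep 1
        i=$((i + 1))
    done
    return 0
}

# Usage sketch (the PID file path is an assumption; check your nginx config):
#   pid=$(cat /var/run/nginx.pid)
#   /etc/init.d/nginx stop
#   wait_for_exit "$pid" 10 && /etc/init.d/nginx start
```

This only starts nginx once the old master process is gone, so the new instance does not race the old one for port 80.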

hdanniel