I have a problem with Apache 2.2 on my Ubuntu 6.06 LTS server: some old Rails sites are producing segfaults and all sorts of madness, which seems to eventually drag Apache down. I am migrating them to an 8.04 installation with nginx and Passenger, where the bug has been squashed, but that takes time. Until then, I have tried to set up monit to rescue Apache whenever it stops responding:

if failed host www.site.com port 80 protocol http
    and request "/" with timeout 5 seconds for 2 cycles
      then restart

About half the time, that restarts Apache successfully and saves the day. The other half of the time, Apache dies and monit does nothing. When I check monit status, it shows -1 for the response time here:

port response time                0.061s to www.site.com:80/ [HTTP via TCP]

The -1 appears where 0.061s is shown above. I can't find any documentation explaining the -1, or why a -1 seems to slip past the "if failed" statement.

Is there anything I can do to make sure monit catches 100% of failures? Or can anyone shed light on the -1 and how to deal with it?

Aliaksandr Belik
Matthew

1 Answer

What happens if you reduce the number of cycles required for a failure? Possibly your site is flapping, so you never get two consecutive failed checks.
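
As a rough sketch of that change, the check could look something like the block below once "for 2 cycles" is dropped, so that a single failed request triggers the restart. The surrounding "check process" block, the pidfile path, and the init script paths are assumptions for illustration (they are not given in the question) and would need to match the actual server.

```
# Sketch only: the pidfile and init script paths below are assumed, not taken from the question.
check process apache with pidfile /var/run/apache2.pid
  start program = "/etc/init.d/apache2 start"
  stop program  = "/etc/init.d/apache2 stop"
  # Same test as in the question, minus "for 2 cycles":
  # one failed request is enough to restart.
  if failed host www.site.com port 80 protocol http
     and request "/" with timeout 5 seconds
     then restart
```

The trade-off is that a single slow response will also cause a restart, so the timeout may need to be more generous than it was with the two-cycle rule.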

Dave Cheney
  • Am I right in thinking that you're saying that if Apache dies (-1), monit won't run the test a second time? That would certainly explain it. I'll take out the "for 2 cycles" and see what happens. It might take a while to prove the theory, but thanks for the suggestion. – Matthew Jun 19 '09 at 12:07