Is there a way to tell monit to alert me if there are more than X errors (e.g. lines matching "ERROR") in a log file in a certain time?

My use case: errors sometimes appear in my log file (e.g. network errors, remote server hiccups) and they are not critical. But I'd like to be notified if there is a spike, because that would require a quick investigation (e.g. a botched deploy or a newly introduced bug).

Ideally I'm thinking of something like

check file myapplog with path /var/myapp.log every 2 cycles
   if lines matching "ERR" > 10% then alert

I think I can get this by writing an external script and then doing something like

check program cer with path /usr/local/bin/checkerrorrate.sh 
   if status != 0 then alert

but I'm wondering if there is a better option.

riffraff

1 Answer

I don't think Monit is the best choice for assessing the frequency of these error messages. The limitations of the File Content testing routines may make this tricky without going to an external solution. See: http://mmonit.com/monit/documentation/monit.html#file_content_testing

Specifically:

  • The content is only being checked every cycle. If content is being added and removed between two checks they are unnoticed.

  • On startup the read position is set to the end of the file and Monit continues to scan to the end of file on each cycle. But if the file size should decrease or the inode change, the read position is set to the start of the file.

  • Only lines ending with a newline character are inspected. Thus, lines are being ignored until they have been completed with this character. Also note that only the first 511 characters of a line are inspected.
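If you do go the external-script route from the question, here is a minimal sketch of what such a check could look like. It is not anything Monit ships; it only mirrors the read-position behaviour described above (start at end of file, reset on shrink or inode change, skip incomplete lines). The function name, the state file, the "ERROR" marker and the threshold of 10 are all assumptions for illustration, and the count is per run, so the time window is whatever interval the script is invoked at.

```python
#!/usr/bin/env python3
"""Hypothetical error-spike check; exits non-zero when a threshold is exceeded."""
import json
import os
import sys

PATTERN = b"ERROR"   # assumed marker for an error line
MAX_ERRORS = 10      # assumed threshold of new error lines per run


def count_new_errors(log_path, state_path, pattern=PATTERN):
    """Count complete new lines containing `pattern` since the previous run."""
    try:
        with open(state_path) as fh:
            state = json.load(fh)
    except (OSError, ValueError):
        state = None  # no usable state yet

    st = os.stat(log_path)
    if state is None:
        offset = st.st_size  # first run: start at the end of the file
    elif state["inode"] != st.st_ino or st.st_size < state["offset"]:
        offset = 0           # file rotated or truncated: start over
    else:
        offset = state["offset"]

    errors = 0
    with open(log_path, "rb") as log:
        log.seek(offset)
        for line in log:
            if not line.endswith(b"\n"):
                break        # incomplete line: pick it up on the next run
            offset += len(line)
            if pattern in line:
                errors += 1

    # Remember where we stopped so the next run only sees new lines.
    with open(state_path, "w") as fh:
        json.dump({"inode": st.st_ino, "offset": offset}, fh)
    return errors


# Usage (hypothetical): checkerrorrate.py <log file> <state file>
if __name__ == "__main__" and len(sys.argv) == 3:
    found = count_new_errors(sys.argv[1], sys.argv[2])
    print(f"{found} new error lines")
    sys.exit(1 if found > MAX_ERRORS else 0)
```

The non-zero exit status is what the `if status != 0 then alert` rule in the question would react to. Counting an absolute number per run is a design simplification; a percentage like the "10%" in the question would also need the total number of new lines, which the same loop could track.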

I would instead poll the remote server(s) or the related services to check health.

ewwhite