First, a brief explanation:

I use Zabbix for system monitoring, and I'm trying to understand if/how it can be used to run important scheduled tasks for which I need an OK/PROBLEM value reported (e.g. via email).

I already use a custom-written script, called by cron, to report errors on program execution. However, such an approach is open to being "flooded" by a fast-repeating, yet failing, scheduled task. What I really want is to be notified on an "edge change" - i.e. on a transition from normal (OK) to failed (PROBLEM) executions, and vice versa.
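To make the "edge change" idea concrete, here is a minimal sketch of the behaviour I am after. The function name and the OK/PROBLEM labels are only illustrative, and the initial state is assumed to be OK.

```python
def notifications(exit_codes):
    """Yield (run_index, state) only for runs whose state differs from the previous run."""
    previous = "OK"  # assumption: the task is considered healthy before the first run
    for index, code in enumerate(exit_codes):
        current = "OK" if code == 0 else "PROBLEM"
        if current != previous:
            # State flipped: this is the only moment a notification is wanted.
            yield index, current
        previous = current


# Six runs, three consecutive failures: only two notifications are produced.
print(list(notifications([0, 0, 1, 1, 1, 0])))  # [(2, 'PROBLEM'), (5, 'OK')]
```

A task that fails many times in a row therefore produces a single PROBLEM notification rather than one per run.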

From here, I had the idea of trying Monit - and it works very well. However, since I already have Zabbix deployed, I would like to avoid adding another tool if I can reasonably accomplish my goal using the existing setup.

OK, back to main problem:

From my research and tests, the basic approach is to treat the to-be-executed task as a recurring check/data query. Two possibilities exist:

The first approach requires a login for each command execution, which tends to "pollute" the logs with unnecessary entries, so I lean toward the second approach. That said, both methods have a significant problem: they capture only the command's output, not its exit value.

So, my questions are:

  • does anyone know how to capture the command's exit value? Note: I would like to avoid wrapper scripts.
  • is someone using a similar approach? If so, do you have any feedback?
  • should I simply "resign" to use Monit?
shodanshok

1 Answer

In general, Zabbix is not a task scheduler - Rundeck, Ansible/AWX or another solution might be a better fit. Having said that, it is still possible to use Zabbix for this, especially if it is a one-off task.

The solution to your concern about the exit code is to use a wrapper script. Make that script capture output, exit code and whatever else you need (maybe the time it took to run your command). This script then can send all of these values to Zabbix trapper items that you can alert upon.
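A minimal sketch of such a wrapper, written in Python purely as an illustration. The item keys (`task.exitcode`, `task.duration`, `task.output`) are hypothetical and would need matching trapper items on the Zabbix side; the server and host names are passed on the command line. Only the standard `zabbix_sender` options `-z`, `-s`, `-k` and `-o` are used.

```python
import subprocess
import sys
import time


def run_task(cmd):
    """Run cmd (a list of arguments) and collect the values the wrapper reports."""
    started = time.monotonic()
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
    return {
        "exitcode": proc.returncode,
        "duration": round(time.monotonic() - started, 3),
        "output": proc.stdout.strip(),
    }


def sender_commands(result, server, host, prefix="task"):
    """Build one zabbix_sender invocation per value (the item keys are hypothetical)."""
    return [
        ["zabbix_sender", "-z", server, "-s", host, "-k", f"{prefix}.{name}", "-o", str(value)]
        for name, value in result.items()
    ]


# Usage: wrapper.py <zabbix-server> <monitored-host> <command> [args...]
if __name__ == "__main__" and len(sys.argv) > 3:
    server, host, *task = sys.argv[1:]
    result = run_task(task)
    for command in sender_commands(result, server, host):
        subprocess.run(command, check=False)
    # Propagate the task's exit code so the caller sees it as well.
    sys.exit(result["exitcode"])
```

With the exit code stored in its own numeric item, a trigger on that item changes state only when the value goes from zero to non-zero or back, which should give the OK/PROBLEM notifications asked for.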

Keep in mind that long-running tasks should not be executed as Zabbix userparameters directly. If your command could run for longer than a couple of seconds, execute it with atd or a similar approach instead.

The default timeouts are:

  • 3 seconds on the agent
  • 4 seconds in the default server config file since Zabbix 3.0, 3 seconds before that
  • 3 seconds in the server if not specified in the config file

Max is 30 seconds, but you really, really should not increase the defaults.
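The hand-off to atd mentioned above could look roughly like the following sketch. It assumes the `at` command is installed and that a wrapper exists at a path of your choosing (the path in the usage note is hypothetical). The check itself returns almost immediately, so it stays well within the timeouts, while the real work runs under atd.

```python
import shlex
import subprocess


def at_job_text(wrapper, task):
    """Return the shell line atd should run: the wrapper followed by the task."""
    return " ".join(shlex.quote(part) for part in [wrapper, *task]) + "\n"


def schedule_now(wrapper, task):
    """Queue the job with `at now`, which reads the commands from standard input."""
    job = at_job_text(wrapper, task)
    queued = subprocess.run(
        ["at", "now"],
        input=job,
        text=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Print 1 if the job was queued and 0 otherwise, so the item gets a quick numeric value.
    print(1 if queued.returncode == 0 else 0)
    return queued.returncode
```

For example, `schedule_now("/usr/local/bin/task_wrapper.py", ["backup.sh"])` would queue the wrapper and return as soon as `at` has accepted the job.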

Richlv
  • I would avoid wrapper scripts, if possible. Anyway, +1 for a well-written answer. Do you know the maximum allowed time for a `userparameter` check? – shodanshok Nov 03 '17 at 20:37
  • If you _really_ want to avoid wrapper scripts, append `; echo $?` or something similar to the command. I'd really suggest a wrapper script, though. Added timeout info in the answer. – Richlv Nov 04 '17 at 00:22
  • Ok, thanks to your answer and based on what I read [here](https://www.zabbix.com/forum/showthread.php?t=52938), it is evident that using Zabbix to launch a potentially long-running script is not a good idea. Rather, I should schedule it via cron, capture its exit value in another file, and let Zabbix check that file for 0/1 exit codes. That said, using Monit probably is the preferred approach. – shodanshok Nov 04 '17 at 10:30
  • I'd suggest avoiding both cron and writing to a separate file. Using cron makes things harder to maintain (config split in various systems) and using a temporary file in most cases is a bit excessive. An `atd` job that is scheduled by a userparameter or an external check, and that sends the results using `zabbix_sender`, is likely to be the best approach. – Richlv Nov 04 '17 at 17:38