I am developing a script that checks monit status so I can send OK status messages to a Nagios NSCA server (passive checks). The problem is that my bash script still sends the message even when the script's grep output contains nothing that should trigger sending.

Script:

Variables

rsysl='rsyslog'
log='messages'

Commands in variables

host=$(hostname)
monstat=$(monit status|grep -C 1 '$rsysl')
nsca_status=$(echo -e "$host\t$rsysl\t0\tOK" | /usr/sbin/send_nsca -H mon.lv.lan -c /etc/send_nsca.cfg)

Monit status command

# Postfix check
$monstat

Message-sending logic. As you can see, it should only send the message when the status is equal to not running and not accessible:

if [ "status" == "not running" ] && [ "status" == "not accessible" ]; then
   $nsca_status
else
   :
fi

Grep output (in the real situation, the message-sending command has to match running and accessible):

# monit status|grep -C 1 'rsyslog'

Process 'rsyslog'
  status                            Running
--

File 'rsyslog-messages-log'
  status                            Accessible
cr0c

1 Answer

There are actually a number of problems in the excerpts you posted. The one that's making it always send messages is that the "Commands in variables" section isn't doing what you think it's doing. Specifically, what var=$(command) does is execute the command immediately, then put its output in the variable. Since the nsca_status=$( ... | /usr/sbin/send_nsca ... ) command is always executed, the message is always sent -- and sent before the if statement that's supposed to decide whether to send it or not.
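To see that behaviour in isolation, here is a minimal sketch. `fake_send` is a made-up stand-in for `send_nsca` that just appends a line to a temporary file, so the number of "sends" can be counted; nothing here talks to a real NSCA server.

```shell
# Hypothetical stand-in for send_nsca: records each "send" in a counter file.
counter_file=$(mktemp)
fake_send() { echo sent >> "$counter_file"; }

# Command substitution runs fake_send right here, at assignment time.
result=$(fake_send)
sends_after_assignment=$(grep -c sent "$counter_file")

# The variable only holds the command's (empty) output, so expanding it runs nothing.
$result
sends_after_expansion=$(grep -c sent "$counter_file")

echo "sends after assignment: $sends_after_assignment"
echo "sends after expansion:  $sends_after_expansion"
rm -f "$counter_file"
```

Both counts come out the same: the one and only send happens on the assignment line, and the later `$result` line adds nothing.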

In general, storing a command in a variable is tricky (see BashFAQ #50: "I'm trying to put a command in a variable, but the complex cases always fail!"), and generally a bad idea. In a case like this, either just use the command directly (without trying to store and retrieve it), or use a function:

nsca_status() {
    echo -e "$host\t$rsysl\t0\tOK" | /usr/sbin/send_nsca -H mon.lv.lan -c /etc/send_nsca.cfg
}

(and then execute it with just nsca_status -- no $.)
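A small sketch of why the function form behaves better: the body does not run when the function is defined, and the variables inside it are expanded only when it is called. The version below is a stand-in that prints the message instead of piping it to `send_nsca`, and the host name `web01` is made up for illustration.

```shell
# Hypothetical stand-in for the real sender: prints the message
# instead of piping it to send_nsca.
nsca_status() {
    printf '%s\t%s\t0\tOK\n' "$host" "$rsysl"
}

# Assigned after the function is defined; still picked up,
# because expansion happens at call time.
host='web01'      # made-up host name
rsysl='rsyslog'

# Nothing has been printed so far. The call below is what runs the body.
message=$(nsca_status)
echo "$message"
```

This is also why it is fine for a script to define the function before `host` has been set, as long as `host` is set before the function is called.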

In the case of the other two commands in that section, you probably actually do want to execute them immediately and store the results, so they're mostly OK. Well, actually, there is a problem with monstat=$(monit status|grep -C 1 '$rsysl') -- the single-quotes around $rsysl will prevent it from being expanded as a variable reference, so grep will be searching for $rsysl, instead of rsyslog. To fix this, use double-quotes instead. Variable references should almost always be wrapped in double-quotes. But note that you should not then try to execute $monstat as a command -- that'll try to execute grep's output (Process 'rsyslog' status Running ...) as if it were a command, which makes no sense.
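The quoting difference can be checked without monit installed. The sketch below feeds grep a sample copied from the grep output in the question and counts matching lines for each quoting style; `sample`, `single`, and `double` are names chosen for this illustration.

```shell
rsysl='rsyslog'

# Sample text standing in for `monit status` output.
sample="Process 'rsyslog'
  status                            Running"

# Single quotes: grep receives the literal pattern $rsysl.
single=$(printf '%s\n' "$sample" | grep -c '$rsysl')

# Double quotes: the shell expands the variable, so grep receives rsyslog.
double=$(printf '%s\n' "$sample" | grep -c "$rsysl")

echo "single quotes: $single matching line(s)"
echo "double quotes: $double matching line(s)"
```

With single quotes nothing matches, so `monstat` would end up empty; with double quotes the `Process 'rsyslog'` line is found.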

The other problems I see are in the if statement:

if [ "status" == "not running" ] && [ "status" == "not accessible" ]; then

...there are actually 3 fatal problems here (and one minor quibble): first, it's comparing the string "status" with "not running" and "not accessible", but you want to be comparing the output of the monit status | grep ... command. This is simple to fix: use "$monstat" instead of "status".

Second, the && part means it'll trigger only if both matches occur; that is, if something's not running and something's not accessible. I would expect you'd want to trigger the report if either something's not running or something's not accessible, so use || instead.

Third, you're doing string equality tests; that is, you're checking to see if the entire report consists of "not running", and nothing else. I'm pretty sure you want to see if it contains "not running" or "not accessible". You can do this with bash's more advanced conditional expression ([[ ]] instead of [ ]), which allows wildcard matches:

if [[ "$monstat" = *"not running"* ]] || [[ "$monstat" = *"not accessible"* ]]; then

... where the wildcards (*) match whatever's before & after the string in question. BTW, note that I also used = instead of == -- it's actually more standard in shell scripts. Another option would be to use grep to do the matching:

if echo "$monstat" | grep -E -q "not running|not accessible"; then

note that there are no [ ] or [[ ]] here; the if statement looks at whether the command succeeded or failed, and grep succeeds only if it finds a match. The -q part tells grep not to print whatever match it finds -- we don't want to see the match, just to know whether there was one.
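Here is a rough side-by-side of the equality test and the contains test, using made-up sample text in place of real monit output (the lower-case `not running` wording is an assumption for the example, not something taken from monit).

```shell
# Sample text standing in for the captured `monit status | grep ...` output.
monstat="Process 'rsyslog'
  status                            not running"

# Whole-string equality: false, because the report has more text around the phrase.
if [ "$monstat" = "not running" ]; then equals=yes; else equals=no; fi

# Substring test: `if` looks only at grep's exit status.
if echo "$monstat" | grep -E -q "not running|not accessible"; then
    contains=yes
else
    contains=no
fi

echo "equals=$equals contains=$contains"
```

The equality test fails even though the phrase is present, while the grep test succeeds.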

Actually, it occurs to me that there might be a fourth serious problem: does monit status capitalize its status messages? This is important because "Not running" (or "Not Running") will not match "not running". If it's capitalized, either capitalize the search string the same way, or do a case-insensitive search with either [[ "$monstat" = *[nN]"ot "[rR]"unning"* ]] or grep's -i option.
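A sketch of the grep -i variant wrapped in a small helper, so it can be tried against differently capitalized samples. The helper name `has_problem` and the sample strings are made up for this example; check what your monit version actually prints before relying on the exact wording.

```shell
# has_problem TEXT: succeeds if TEXT mentions a failed state, in any letter case.
has_problem() {
    printf '%s\n' "$1" | grep -E -i -q "not running|not accessible"
}

# Try it on a few sample status lines.
for line in "  status    Not running" "  status    NOT ACCESSIBLE" "  status    Running"; do
    if has_problem "$line"; then
        echo "problem:  $line"
    else
        echo "healthy:  $line"
    fi
done
```

The first two samples are reported as problems regardless of capitalization, and the plain `Running` line is not.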

Oh, and a final note: if you don't need an else clause, just leave it out. No need to have an empty one with the : pseudo-command.

Anyway, with all these changes here's what I get for the entire script:

#!/bin/bash

# Variables
rsysl='rsyslog'
log='messages'

# Function to send a status message
nsca_status() {
    echo -e "$host\t$rsysl\t0\tOK" | /usr/sbin/send_nsca -H mon.lv.lan -c /etc/send_nsca.cfg
}

# Store output of commands
host=$(hostname)
monstat=$(monit status | grep -C 1 "$rsysl")

# Send message if there's anything wrong
if [[ "$monstat" = *[nN]"ot "[rR]"unning"* ]] || [[ "$monstat" = *[nN]"ot "[aA]"ccessible"* ]]; then
    nsca_status
fi

EDIT: I think I may've misunderstood the sense of the test; is it supposed to send the data if everything is OK? I was assuming it was sending an error status, and hence should send only if there was a problem. If that's the case, use appropriate !'s to invert the sense of the matches. In the [[ ]] version, use != to see if the string is not found:

if [[ "$monstat" != *[nN]"ot "[rR]"unning"* ]] && [[ "$monstat" != *[nN]"ot "[aA]"ccessible"* ]]; then

In the grep version, a single ! inverts the entire if test:

if ! echo "$monstat" | grep -E -i -q "not running|not accessible"; then
Gordon Davisson
  • Thank you for an awesome answer. I tried your fixes, but it still does not work. Could the problem be in the monit status command? I have heard that a command with spaces in it won't run from a variable. Maybe I should change the monit status command to a function? – cr0c Jul 09 '14 at 07:13
  • And I do need both Not running and Not accessible to match. If either one of them is OK, I do not need to send the OK status. – cr0c Jul 09 '14 at 07:28
  • The monit status is working fine, but the if statement does not seem to send the data to NSCA. I am frustrated over this :S – cr0c Jul 09 '14 at 08:11
    @mYzk: I think I misunderstood when it should send the data -- see edit. – Gordon Davisson Jul 09 '14 at 16:00
  • I got it fixed already, but thanks :) It helped me a lot – cr0c Jul 10 '14 at 10:11