
This is in a distributed Icinga 1 environment.

About 100 hosts on an Icinga 1 client/satellite are stuck in the UNREACHABLE state. All four checks for each host return OK, but the host itself is still reported as UNREACHABLE.

I may have caused the problem by leaving Icinga 1 running while /usr/lib64/nagios/plugins/check_icmp had the wrong permissions (check_icmp did not have the suid bit set).
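
For anyone checking the same thing, here is a minimal sketch of a suid test. The helper name report_suid is made up for illustration; the only real piece is the shell's -u file test. check_icmp opens raw sockets, so it normally needs to be owned by root with the suid bit set.

```shell
# Hypothetical helper: report whether a file has the set-user-ID bit.
# Uses the POSIX "-u" file test; it changes nothing on disk.
report_suid() {
    if [ -u "$1" ]; then
        echo "suid set: $1"
    else
        echo "suid missing: $1"
    fi
}

# Example call (path taken from the question):
#   report_suid /usr/lib64/nagios/plugins/check_icmp
# If the bit is missing, "chmod u+s" on the root-owned plugin restores it.
```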

So I stopped Icinga and emptied the state retention file on the satellite (state_retention_file=/var/spool/icinga/retention.dat), but that didn't help. Would emptying the same file on the master help?
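
Roughly what emptying the file looks like, as a sketch only. reset_retention is a hypothetical helper, and it assumes Icinga is already stopped, since a running daemon with state retention enabled writes the file out again on shutdown.

```shell
# Hypothetical helper: keep a backup copy, then truncate the retention file.
# Assumption: Icinga has been stopped before this runs.
reset_retention() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    # Copy first so the old state can be put back if needed.
    cp -p "$file" "$file.bak" && : > "$file"
}

# Example call (path taken from the question):
#   reset_retention /var/spool/icinga/retention.dat
```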

ps shows my submit_check_result.sh and submit_host_check.sh scripts as zombies, but they don't live very long.
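
A sketch of how the zombies can be picked out of the ps output. filter_zombies is a hypothetical helper, and it assumes the Linux procps column order stat, pid, ppid, comm. Short-lived zombies are usually just exited children waiting for the parent to reap them.

```shell
# Hypothetical helper: read "stat pid ppid comm" lines on stdin and print
# pid, ppid and command for zombie (state Z) submit_* processes only.
filter_zombies() {
    awk '$1 ~ /^Z/ && $4 ~ /^submit_/ { print $2, $3, $4 }'
}

# Example call (the trailing "=" suppresses the ps header line):
#   ps -eo stat=,pid=,ppid=,comm= | filter_zombies
```

Note that comm is truncated to 15 characters on Linux, so the script names show up shortened.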

mr.zog

2 Answers


I had to restore my check forwarding scripts on the client.

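Here are the broken bits. The service script lists CRITICAL twice and has no UNKNOWN case; the host script matches service states instead of host states and never calls send_nsca.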

# BEGIN submit_check_result.sh
##############################

return_code=-1

case "$3" in
    OK)
        return_code=0
        ;;
    WARNING)
        return_code=1
        ;;
    CRITICAL)
        return_code=2
        ;;
    CRITICAL)
        return_code=2
        ;;
esac
/usr/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" | /usr/sbin/send_nsca -H 111.14.219.31 -c /etc/nagios/send_nsca.cfg &
# END Check_result

##############################

# BEGIN submit_host_result.sh

##############################

return_code=2

case "$3" in
    OK)
        return_code=0
        ;;
    WARNING)
        return_code=1
        ;;
    CRITICAL)
        return_code=2
        ;;
    UNKNOWN)
        return_code=2
        ;;
esac

# END Check_host
##############################
mr.zog

And here is what seems to have fixed the problem.

cat /etc/icinga/scripts/submit_check_result.sh

return_code=-1

case "$3" in
    OK)
        return_code=0
        ;;
    WARNING)
        return_code=1
        ;;
    CRITICAL)
        return_code=2
        ;;
    UNKNOWN)
        return_code=-1
        ;;
esac

# pipe the service check info into the send_nsca program, which
# in turn transmits the data to the nsca daemon on the central
# monitoring server
# submit to master Icinga den-mon-prod

/usr/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" | /usr/sbin/send_nsca -H 111.14.219.31 -c /etc/nagios/send_nsca.cfg &
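
To see exactly what a script like this would hand to send_nsca without sending anything, a dry-run sketch can help. format_service_result is a hypothetical helper; it uses the same state mapping as the script above and the same four tab-separated fields (host, service, return code, output).

```shell
# Hypothetical helper: build the tab-separated service result line that
# would be piped to send_nsca, but only print it.
# Arguments: $1 = host, $2 = service, $3 = state string, $4 = plugin output
format_service_result() {
    case "$3" in
        OK)       rc=0 ;;
        WARNING)  rc=1 ;;
        CRITICAL) rc=2 ;;
        *)        rc=-1 ;;   # UNKNOWN and anything unrecognised, as above
    esac
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$rc" "$4"
}

# Example call, made visible with cat -A (tabs show as ^I):
#   format_service_result web01 PING WARNING "PING WARNING" | cat -A
```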

cat /etc/icinga/scripts/submit_host_check.sh

return_code=-1

case "$2" in
    UP)
        return_code=0
        ;;
    DOWN)
        return_code=1
        ;;
    UNREACHABLE)
        # Icinga 1 host states: 0=UP, 1=DOWN, 2=UNREACHABLE
        return_code=2
        ;;
esac

/usr/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" | /usr/sbin/send_nsca -H 111.14.219.31 -c /etc/nagios/send_nsca.cfg &
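
One thing worth double-checking here: as far as I understand the send_nsca input format, a host check result has three tab-separated fields (host, return code, output), while four fields are read as a service result. A dry-run sketch under that assumption follows. format_host_result is a hypothetical helper, and it takes the plugin output as its third argument, which is not how the script above numbers its arguments.

```shell
# Hypothetical helper: build a three-field host result line
# (host, return code, plugin output) and only print it.
# Arguments: $1 = host, $2 = state string, $3 = plugin output
format_host_result() {
    case "$2" in
        UP)          rc=0 ;;
        DOWN)        rc=1 ;;
        UNREACHABLE) rc=2 ;;
        *)           rc=-1 ;;   # unrecognised state string
    esac
    printf '%s\t%s\t%s\n' "$1" "$rc" "$3"
}

# Example call:
#   format_host_result web01 DOWN "CRITICAL - host unreachable" | cat -A
```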
mr.zog