
Setup

I currently have a working 2-node HA cluster using Pacemaker+Corosync. My nodes run Debian 8 (Jessie). Now I would like to be notified when changes occur in the cluster (resources stop/start, promote/demote, move...).

Since email reporting is so-2008, I'd like to use Slack. To do that, I created a script that uses curl to post a message to my team's Slack channel through Slack's webhooks. My script uses the environment variables documented here: Pacemaker - 7.3. Configuring Notifications via External-Agent.

The script works fine when executed manually in the shell and is able to post to the specified channel. It also logs to /var/log/ocf-notifier.log.
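For illustration, a notifier of this kind might look like the sketch below. It is not the script from the question, only a minimal example built on the `CRM_notify_*` variables that the Pacemaker documentation describes. `SLACK_WEBHOOK_URL`, `LOG_FILE`, and the function names are assumptions made for the example, and the `channel` override is assumed to be honoured by the webhook (newer Slack webhooks may ignore it).

```shell
#!/bin/sh
# Hypothetical notifier sketch; NOT the script from the question.
# SLACK_WEBHOOK_URL and LOG_FILE are names assumed for illustration.
LOG_FILE="${LOG_FILE:-/var/log/ocf-notifier.log}"

# Build a one-line summary from the CRM_notify_* variables that crm_mon
# exports to the external agent. Unset variables fall back to "?".
build_message() {
    printf '%s of %s on %s: %s (rc=%s)' \
        "${CRM_notify_task:-?}" \
        "${CRM_notify_rsc:-?}" \
        "${CRM_notify_node:-?}" \
        "${CRM_notify_desc:-?}" \
        "${CRM_notify_rc:-?}"
}

# Escape backslashes and double quotes so the text is safe inside JSON.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Assemble the JSON body. CRM_notify_recipient carries the value given
# to crm_mon with -e; using it as the channel is an assumption.
build_payload() {
    printf '{"channel": "%s", "text": "%s"}' \
        "$(json_escape "${CRM_notify_recipient:-}")" \
        "$(json_escape "$(build_message)")"
}

# Post only when a webhook URL is configured, and log every attempt.
if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
    payload="$(build_payload)"
    echo "$(date) $payload" >> "$LOG_FILE"
    curl -sS -X POST -H 'Content-Type: application/json' \
        --data "$payload" "$SLACK_WEBHOOK_URL"
fi
```

Keeping the message construction in functions makes it possible to check the output by hand (set a few `CRM_notify_*` variables, call `build_payload`) without contacting Slack.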

Problem

Based on this answer, I created a new resource in the cluster using Pacemaker's ocf:pacemaker:ClusterMon resource agent, which calls my custom script (/usr/local/bin/ocf-notifier).

However, I noticed that my script is simply not called at all when changes occur in the node (I tried stopping a resource as well as completely shutting down a node).

Therefore, I tried launching crm_mon by hand as follows:

$ crm_mon -Arf --interval=2 -E /usr/local/bin/ocf-notifier -e '@jordan'

I wanted to see whether this would trigger the script while I played around with the cluster from another shell. As it turns out, crm_mon was able to see the changes happening in the cluster (node going offline, resource being stopped/started...), but my custom script never blinked an eye. My custom log file remains empty and nothing appears in Slack, so I believe the script is simply not called.
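One way to separate "the agent is never run" from "the agent runs but fails" would be to point `-E` at a trivial trace agent that only records its invocations. The sketch below is an assumption-laden example, not something from the setup described here: the `TRACE_FILE` path and the function name are made up for illustration.

```shell
#!/bin/sh
# Hypothetical trace agent: records every invocation so you can tell
# whether crm_mon runs the external agent at all.
TRACE_FILE="${TRACE_FILE:-/tmp/crm-mon-agent-trace.log}"

trace_invocation() {
    {
        printf '%s invoked with %d argument(s)\n' \
            "$(date '+%Y-%m-%dT%H:%M:%S')" "$#"
        # Dump only the exported CRM_notify_* variables, if any.
        env | grep '^CRM_notify_' | sort
        echo '---'
    } >> "$TRACE_FILE"
}

trace_invocation "$@"
```

Saved under a path of your choice and made executable, it can be passed to crm_mon in place of the real script with the same `-E` and `-e` options. If the trace file stays empty after a resource is stopped, the agent is not being executed at all; if entries appear, the problem lies in the real script or in the environment it runs in.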

TL;DR

crm_mon does not call the external agent on cluster events, as it should with the -E option. What am I doing wrong?

Habovh
  • Could you show your script, omitting sensitive parts if needed? – gxx Feb 15 '16 at 08:42
  • Here's a pastebin to the script: http://pastebin.com/RGuY9htQ. The thing is, the script is working fine if launched manually. It is just not called by `crm_mon`... – Habovh Feb 15 '16 at 08:47
  • Two more questions: 1.) Which software versions are you using? How did you install / build these? 2.) Could you show your configured `ClusterMon` resource? – gxx Feb 15 '16 at 08:57
  • 1) `crm` 2.1.3, `Pacemaker` 1.1.12, `corosync` 2.3.4, they have been installed using `apt-get install pacemaker`. 2) `cluster_mon_p` primary resource: http://pastebin.com/3pCsMdSP – Habovh Feb 15 '16 at 09:05
  • From which repo did you install `pacemaker`? – gxx Feb 15 '16 at 09:07
  • `deb https://ppa.mmogp.com/apt/debian jessie main` – Habovh Feb 15 '16 at 09:09
  • Turns out the repo `ppa.mmogp.com` is down for some reason today. I initially added this repo when I was following the instructions [over here](https://wiki.debian.org/Debian-HA/ClustersFromScratch) on how to set up an HA cluster. – Habovh Feb 15 '16 at 09:29
  • Did you solve this in the end? – gxx Apr 05 '16 at 13:47
  • Unfortunately no, I did not try again in the end because I needed to focus on other things, and my setup is now pretty stable and fail-safe. Although I'm not actively looking for a solution, any new clues are more than welcome! – Habovh Apr 05 '16 at 14:02

0 Answers