I have built a system which does various types of time-series analysis and now I would like to feed it data from a monitoring tool. Since I have Nagios set up already in my test environment, I prefer to get it from there. But as a second choice I could get access to a test Zenoss instance, and would appreciate answers for Zenoss as well.

What I want

I want time-series for multiple KPIs on multiple devices.

Ideally I would be able to specify the data format, but as long as it contains the information I need I am happy to transform it upon receipt. The information I need is just

  • The device identifier e.g. 10.2.42.2 or Ubuntu-42A
  • The component being monitored e.g. CPU or Memory
  • The KPI e.g. %Usage, KBytes Available
  • The value of the KPI
  • The timestamp
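
To make the five fields above concrete, here is a minimal sketch of one such record serialized as JSON. The class name, the field names, the choice of JSON and the sample values are all assumptions made for illustration; any format carrying the same information would do.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Sample:
    """One time-series point (hypothetical field names)."""
    device: str      # e.g. "10.2.42.2" or "Ubuntu-42A"
    component: str   # e.g. "CPU" or "Memory"
    kpi: str         # e.g. "%Usage" or "KBytes Available"
    value: float     # the measured value of the KPI
    timestamp: int   # Unix epoch seconds (an assumed convention)


def to_json(sample: Sample) -> str:
    """Serialize a sample to a JSON object with stable key order."""
    return json.dumps(asdict(sample), sort_keys=True)


# Illustrative values only, not real monitoring data.
sample = Sample("Ubuntu-42A", "Memory", "%Usage", 32.4, 1375360560)
print(to_json(sample))
```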

Finally, I would like to send the data via HTTP (for now, later via HTTPS).

I can already do this in the case of an alert - for example when a threshold is breached I know how to configure Nagios to call a simple script of mine with the device IP etc. as parameters - and my script executes the HTTP request. But I haven't seen how this can be set up to fire on every poll.

What I don't want

I don't want alert data, I want the raw time-series.

I don't want to poll Nagios to get this data - the polling intervals would vary and I would like to avoid unnecessary network traffic.

I checked this question but that seemed to send data from slave Nagios nodes to a master Nagios node.

  • The protocol is documented, the source code is freely available and can be modified, the application is tested. Why don't you want to implement a gateway from NSCA to your app? – symcbean Aug 01 '13 at 12:36
  • Thanks for your comments. I was looking for a solution which did not involve touching the Nagios code: (1) fear of getting overwritten by an upgrade, and (2) a customer on whose infra this would eventually get deployed may simply not allow it. For now that is exactly what I intend to do, but I'd still like another long-term solution if one exists. – Rohit Chatterjee Aug 01 '13 at 15:44
  • You don't need to change either the Nagios code or the code of the system which is sinking the data - you build a gateway. – symcbean Aug 01 '13 at 15:48

2 Answers

You can do this with the pieces that are intended for distributed monitoring.

For example, use an OCSP (obsessive compulsive service processor) command to send every service check result elsewhere. The command definition can point to a script that pushes the performance data via curl or similar.

Keith

In nagios.cfg we added

  • obsess_over_services=1
  • ocsp_command=OUR_COMMAND_NAME

Then we defined the new command in commands.cfg:

  • command_name OUR_COMMAND_NAME
  • command_line /path/to/our/script

The script receives the following parameters:

  1. Host name
  2. Service Description
  3. Service State
  4. Message from the relevant plugin
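
A script along these lines could look roughly like the sketch below. It assumes the command_line passes the four values as positional arguments in the order listed (for example through the standard Nagios macros $HOSTNAME$, $SERVICEDESC$, $SERVICESTATE$ and $SERVICEOUTPUT$), and that the receiving system accepts a JSON POST. The endpoint URL, the JSON key names and the use of the script's own clock for the timestamp are assumptions, not what our script necessarily does; passing $LASTSERVICECHECK$ as a fifth argument would give the actual check time instead.

```python
import json
import sys
import time
import urllib.request

# Hypothetical collector URL; replace with the real receiving endpoint.
ENDPOINT = "http://collector.example/ingest"


def build_request(args, url=ENDPOINT, now=None):
    """Build a JSON POST request from the four arguments Nagios passes."""
    host, service, state, output = args
    payload = {
        "host": host,
        "service": service,
        "state": state,
        "output": output,
        # Assumption: time of receipt stands in for the check time.
        "timestamp": int(time.time() if now is None else now),
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Only runs when invoked with exactly the four expected arguments.
if __name__ == "__main__" and len(sys.argv) == 5:
    send(build_request(sys.argv[1:5]))
```

Since Nagios runs the command after every service check, a short timeout keeps a slow or unreachable collector from holding up the script for long.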

Referring to my question: the device I wanted is this host name, and the component & KPI can be extracted from the service description and plugin message.

I do, however, need to do a little parsing to get these values, since the plugin message is written more for humans than machines, e.g.

OK - 1.05 GB used (1.05 GB RAM + 0.00 GB SWAP, this is 32.4% of 3.24 GB RAM)

but at least the format is consistent so I'm not complaining.
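
As an illustration of that parsing step, here is a minimal sketch using a regular expression written against the one sample message shown above. The pattern, the function name and the output key names are assumptions; other plugins (or other versions of this one) format their messages differently and would need their own patterns.

```python
import re

# Pattern tailored to the sample memory message only.
PATTERN = re.compile(
    r"^(?P<state>\w+) - (?P<used_gb>[\d.]+) GB used "
    r"\((?P<ram_gb>[\d.]+) GB RAM \+ (?P<swap_gb>[\d.]+) GB SWAP, "
    r"this is (?P<percent>[\d.]+)% of (?P<total_gb>[\d.]+) GB RAM\)$"
)


def parse_memory_output(message):
    """Return the state and numeric fields, or None if the format differs."""
    match = PATTERN.match(message.strip())
    if match is None:
        return None
    fields = match.groupdict()
    state = fields.pop("state")
    # Every remaining group is a decimal number in the sample format.
    return {"state": state, **{key: float(val) for key, val in fields.items()}}


message = "OK - 1.05 GB used (1.05 GB RAM + 0.00 GB SWAP, this is 32.4% of 3.24 GB RAM)"
print(parse_memory_output(message))
```

Returning None for unrecognized messages lets the caller skip or log them rather than forwarding wrong values.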