33

We're expanding our Nagios 3 setup, and are frequently coming across new NRPE or general Nagios plugins to test our existing infrastructure. This is in dribs and drabs though - it would be useful to get a summary of plugins that the Nagios users out there most value.

Please list a single plugin per post, preferably with a short description of why you love it and a link to the MonitoringExchange or plugin developer site. This way folk can vote for plugins already listed and we can see them in preferential order.

It would be better not to list plugins that others have already mentioned, for the same reason. If you have more to add regarding a plugin someone else has listed, please leave a comment on their answer.

Thanks!

Mike Pountney

24 Answers

8

In terms of flexibility, you can't beat the snmp plugin. It's behind nearly every check I run, and where it isn't, the TCP connect check is.

Matt Simmons
  • snmp check is looking pretty good for the favourite at this stage Matt (and others :) - do you have any special techniques that you utilise with it? My main beef with net-snmpd at the moment is the complexity of the 'new' extend-rather-than-exec functionality - does check_snmpd handle this well? – Mike Pountney Jun 06 '09 at 01:38
7

Our most useful plugins are ones which test our higher application functionality. For example, we have tests that try to log into the website, and tests that try to send an email and then check a pop3 mailbox to make sure it arrived. If any of those things break, then we can use lower level checks to see what is wrong. Is the pop3 dead? The MTA, the MDA? The database server? The datastore?
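To give a rough idea of what such an end-to-end mail test can look like, here is a minimal Python sketch. It is not the plugin described above. The `send` and `fetch_subjects` callables are hypothetical hooks that you would implement yourself (for example with `smtplib` and `poplib`), and the timeout values are assumptions.

```python
import time
import uuid

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def roundtrip(send, fetch_subjects, timeout=60.0, poll=5.0,
              sleep=time.sleep, clock=time.monotonic):
    """Send a uniquely tagged probe, then poll the mailbox until it shows up.

    send(token) and fetch_subjects() are hypothetical hooks supplied by the
    caller: the first submits a message whose subject contains the token,
    the second returns the subjects currently in the mailbox.
    """
    token = f"nagios-probe-{uuid.uuid4().hex}"
    send(token)
    deadline = clock() + timeout
    while True:
        if any(token in subject for subject in fetch_subjects()):
            return OK, f"MAIL OK - probe {token} delivered"
        if clock() >= deadline:
            return CRITICAL, (
                f"MAIL CRITICAL - probe {token} not delivered "
                f"within {timeout:.0f}s"
            )
        sleep(poll)
```

A unique token per run keeps an old probe message from satisfying a new check. When this one goes critical, the lower level checks tell you which hop lost the message.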

David Pashley
  • Wotcha Dave ;) Do you have any tips on how you connect the higher and lower level checks? For example, how do you create a dependency between sending an email and checking the pop3 box to see it's there? Is it possible to do this purely via Nagios, or are you using your own logic? – Mike Pountney Jun 06 '09 at 01:49
  • Are you really up at 4am? :) Nagios has service dependencies, so you can make your email check depend on the smtp, MDA and POP3 checks, but this just means that the email check doesn't alert if any of the lower level ones do. This gets a lot harder if you have a cluster of servers providing a service, as there's no way in Nagios to say "don't alert us for this service if all these services are dead". Usually we just rely on knowing how the system fits together to know where problems lie. – David Pashley Jun 06 '09 at 04:49
7

Honestly, the one that does the most for me is plain old check_disk. Nothing makes me feel quite so special, in that "stop eating the paste" way, as having a server that was running fine yesterday blow up, running around like mad and then finding out it's because I let the disks fill up. Never having to do that again in my life is worth a lot to me.

(And don't forget to check the inodes, too, kids watching at home.)
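For anyone curious what a space-plus-inodes check boils down to, here is a minimal Python sketch. It is not how check_disk itself is implemented. The function names and the 80/90 percent thresholds are assumptions picked for the example, and `os.statvfs` is Unix-only.

```python
import os

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def percent_used(total, available):
    """Percentage of a resource not available to us, or 0.0 if total is zero."""
    if total == 0:
        return 0.0  # some filesystems report no inode count at all
    return 100.0 * (total - available) / total


def classify(pct, warn, crit):
    """Map a usage percentage onto a Nagios state."""
    if pct >= crit:
        return CRITICAL
    if pct >= warn:
        return WARNING
    return OK


def check_path(path, warn=80.0, crit=90.0):
    """Check both block and inode usage for the filesystem holding path."""
    st = os.statvfs(path)
    # f_bavail / f_favail are what an unprivileged user can still allocate,
    # so root-reserved blocks count as used here.
    space = percent_used(st.f_blocks, st.f_bavail)
    inodes = percent_used(st.f_files, st.f_favail)
    state = max(classify(space, warn, crit), classify(inodes, warn, crit))
    label = ("OK", "WARNING", "CRITICAL")[state]
    return state, f"DISK {label} - {path} space {space:.1f}% inodes {inodes:.1f}%"


def main(argv):
    """Plugin entry point sketch: returns the exit code for the given path."""
    state, message = check_path(argv[1] if len(argv) > 1 else "/")
    print(message)
    return state
```

To run it as a plugin, import `sys` and finish the script with `sys.exit(main(sys.argv))`. The worse of the two states wins, so a filesystem with plenty of free space but no inodes left still goes critical.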

chaos
7

WebInject is very useful for monitoring Web sites if you want to go beyond the check_http functionality; it can handle login pages and perform multiple steps in one Nagios check.

gareth_bowles
4

I find check_nfsmount is useful on many of my servers.

Edit: I would also vote up check_snmp if I had the rep to do that. It is in use on all of my servers, plus the logic behind check_hpjd which I have running on all of my HP Printers.

steve.lippert
4

My most useful one is one that I wrote myself that checks the SSL certificates on our webservers so I can keep an eye on expiry.
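A certificate expiry check along these lines can be fairly small. The following Python sketch is only an illustration, not the plugin mentioned above. The 30 and 7 day thresholds and all function names are assumptions.

```python
import socket
import ssl
import time

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def days_left(not_after, now=None):
    """Days until a notAfter timestamp such as 'Jan  5 09:34:43 2018 GMT'."""
    now = time.time() if now is None else now
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def classify(days, warn_days=30, crit_days=7):
    """Map the remaining lifetime onto a Nagios state."""
    if days <= crit_days:
        return CRITICAL
    if days <= warn_days:
        return WARNING
    return OK


def fetch_not_after(host, port=443, timeout=10):
    """Connect with TLS and return the peer certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check_host(host, port=443):
    """Return (state, message) for the certificate served by host:port."""
    try:
        days = days_left(fetch_not_after(host, port))
    except OSError as exc:
        # Covers connection failures and ssl.SSLError, which is raised when
        # verification fails (for example on an already expired certificate).
        return CRITICAL, f"SSL CRITICAL - {host}: {exc}"
    state = classify(days)
    label = ("OK", "WARNING", "CRITICAL")[state]
    return state, f"SSL {label} - {host} certificate expires in {days:.0f} days"
```

Because the default context verifies the certificate, an expired or mismatched one fails the handshake and is reported as critical rather than being measured.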

TCampbell
4

PNP (pnp4nagios.org) - generates RRD-style graphs for any Nagios check that outputs perf. data. Awesomely useful, especially when trying to convince the devs that that newly-installed service really is the cause of all those CPU spikes...
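Graphing only works for checks that emit performance data after a `|` in their output, in the `label=value[UOM];warn;crit;min;max` form from the Nagios plugin guidelines. As a small illustration, this Python sketch formats such a line for a homegrown plugin. The helper names are made up for the example.

```python
def perfdata(label, value, uom="", warn="", crit="", minimum="", maximum=""):
    """Format one item as label=value[UOM];warn;crit;min;max."""
    # Labels containing spaces have to be wrapped in single quotes.
    quoted = f"'{label}'" if " " in label else label
    fields = [f"{value}{uom}", str(warn), str(crit), str(minimum), str(maximum)]
    # Trailing empty fields can be dropped.
    while fields and fields[-1] == "":
        fields.pop()
    return f"{quoted}=" + ";".join(fields)


def plugin_line(text, items):
    """Join the human-readable text and the perfdata items with ' | '."""
    if not items:
        return text
    return f"{text} | " + " ".join(items)
```

For example, `plugin_line("OK - load 0.42", [perfdata("load1", 0.42, warn=5, crit=10, minimum=0)])` gives `OK - load 0.42 | load1=0.42;5;10;0`.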

RainyRat
3

check_nt (talking to NSClient or something similar on the subject) lets you interrogate WMI on a Windows box - if there's a performance counter for it, you can now monitor it with Nagios.

RainyRat
2

The most useful for me is one I wrote for my own needs: nagios-check-webpage

It downloads an entire page with its js/css/images, using multiple threads and gzip (which saves a lot of bandwidth), like real browsers do.

Vincent
2

This is a bit of a shameless plug, but if you're monitoring Windows machines using NRPE, NagiosPluginsNT seems to work pretty well. ;-)

Mike Conigliaro
2

I would agree that check_snmp is an extremely valuable plugin; it can be used for almost any purpose and everything shows up in SNMP generally speaking. SNMP is available on systems as diverse as HP-UX, Tru64, and OpenVMS with no additional installations.

Another (not quite a) plugin that is very useful is NagiosGrapher; I wrote up my experience in an article that explains more, so that others can use it without the difficulties I ran into.

One last: NSCA. You can write a Perl or Ruby or ksh script and feed the output into NSCA.

Between the flexibility of NSCA and SNMP, combined with the reporting of NagiosGrapher, this should expand your monitoring very well.
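To illustrate the NSCA point: the `send_nsca` client reads passive service results from stdin as tab-separated host, service description, return code and output. The Python sketch below builds such a line and pipes it to the client. The config path and the function names are assumptions, so adjust them for your install.

```python
import subprocess


def nsca_line(host, service, return_code, output):
    """Build one passive service result in send_nsca's tab-delimited format."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0 (OK), 1, 2 or 3 (UNKNOWN)")
    # Tabs and newlines inside the output would break the framing.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean}\n"


def submit(line, nsca_host, config="/etc/send_nsca.cfg", binary="send_nsca"):
    """Pipe a result line to send_nsca and return its exit status.

    The default config path is an assumption; it differs between packages.
    """
    result = subprocess.run(
        [binary, "-H", nsca_host, "-c", config],
        input=line,
        text=True,
        check=False,
    )
    return result.returncode
```

The host and service names have to match an existing service definition on the Nagios server that accepts passive checks, otherwise the result is dropped.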

Mei
2

I like check_http for checking that my websites are still working. I have extended it to check that certain text can be found on the page, after one time my hosting company decided to serve blank pages and all my Nagios checks passed because the server was still running.
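The blank-page case is worth spelling out, since a status code alone would not catch it. Here is a minimal Python sketch of the idea using only the standard library. It is not the modified check_http described above, and the function names and the 5 second warning threshold are assumptions.

```python
import time
import urllib.error
import urllib.request

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def evaluate(status, body, expected, elapsed, warn_seconds=5.0):
    """Decide the state from the status code, page content and response time."""
    if status != 200:
        return CRITICAL, f"HTTP CRITICAL - status {status}"
    if expected not in body:
        # Catches a server that answers 200 but serves a blank or wrong page.
        return CRITICAL, f"HTTP CRITICAL - '{expected}' not found in page"
    if elapsed > warn_seconds:
        return WARNING, f"HTTP WARNING - slow response ({elapsed:.2f}s)"
    return OK, f"HTTP OK - '{expected}' found in {elapsed:.2f}s"


def check_url(url, expected, timeout=10):
    """Fetch url and evaluate it; network errors are reported as CRITICAL."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        return CRITICAL, f"HTTP CRITICAL - status {exc.code}"
    except OSError as exc:  # includes URLError and socket timeouts
        return CRITICAL, f"HTTP CRITICAL - {exc}"
    return evaluate(status, body, expected, time.monotonic() - start)
```

Pick an expected string that only appears when the application really rendered the page, such as a footer line or a hidden status marker.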

Simon Foster
1

Number one is NagiosWSC. It lets you do agentless monitoring of Windows hosts over WMI.

Zypher
1

One that checks the actual latency to pull up websites, and scans the page for a hidden 'status:ok' tag. It caught a problem with our squid cache and a language set problem that only happened once every few nights at 3am, when someone hit the site with a browser that requested a Turkish-language internationalized version of the page.

Seriously, set up every type of monitoring that you possibly can. The weird bugs and errors that you can catch in a complex environment with good monitoring are just amazing. Also, log your performance data to an RRD database and display it in Cacti.

Karl Katzke
1

One of the most important plugins is the one I've written myself: check_rdiff_backup. I do backups overseas, and Nagios tells me if and when something happens to them.

If you're looking for an rdiff-backup plugin, there's one that you can find on Google.

zenek
1

check_curl has been a godsend for me. It really made a difference in flexibility for website checks, and I also found it a lot easier than WebInject, with almost all the same functionality that I needed.

breadly
1

Not strictly a plugin, but getting twurl ( https://github.com/marcel/twurl ) to work as a means for setting off alerts was an absolute godsend. No need for SMS alerts and just satisfying all round.

Details on how it was done: Nagios alerts using twitter (with twurl) not firing - apologies for the shameless self promotion ;)

Other things...

Check_diskio ( https://trac.id.ethz.ch/projects/nagios_plugins/wiki/check_diskio ) has been incredibly useful in conjunction with the standard CPU load and process number checks in determining when/if IO is bound, and in what way. Using nagiosgraph ( http://exchange.nagios.org/directory/Addons/Graphing-and-Trending/nagiosgraph/details) makes it even easier.

jhackett
1

Well, the simple, plain default ones (check_disk, check_load, check_http) are good enough for most cases. Mostly we want to know whether servers and websites are up and running, right?

Other than check_disk and check_swap, there is also check_memory ( http://exchange.nagios.org/directory/Plugins/Uncategorized/Operating-Systems/Linux/check_memory/details ), which reports memory usage.

Invent Sekar
0

Centreon, definitely, for graphs and all the Nagios features!

Antoine Benkemoun
0

check_multiprocs used with check_nrpe

Nicolas Marengo
0

This is kind of cheating, because I have done a lot of development on it, and it actually checks lots of different things at once, but the most useful nagios 'plugin' for me is Resmon. It is an agent you run on the server itself, and nagios connects over http to perform the checks. I guess it's similar to what nrpe does, but with a number of different design decisions.

Mark
0

I have used two plugins: one to provide an XML feed of the alerts, and another to send alerts via Twitter. Both were useful (apart from the regular plugins). If you count NRPE as a plugin, then add that too.

Ram Prasad
0

check_apt is really cool and reminds me about updating my Debian servers.

zenek
0

Recently I started using the check_multi plugin, using a patched / enhanced NRPE for large output (because of HTML and performance information).
It's almost "one plugin to rule them all"! ;-)

Henk