What's your suggested tool to monitor multiple Unix (Linux and OSX specifically) based systems at the same time? I need to monitor the utilization of the CPU, memory, and disks in real time and would prefer a single tool to do so.
PROTIP: The recommended monitoring applications Nagios, Zenoss, Munin, and MRTG all use the same tool to store data and generate graphs: RRDtool. With some work you can migrate from one to another. http://www.arcanadev.com/adtempus/features/ – Joseph Kern Jun 29 '09 at 11:31
This question is somewhat related to, if not a duplicate of, http://serverfault.com/questions/44/what-tool-do-you-use-to-monitor-your-servers – Aron Rotteveel Jun 30 '09 at 07:20
8 Answers
Nagios! I've never used it with OSX, but a quick search shows that there are NRPE plugins for it.
I use Nagios in an environment with ~80 Linux and Windows servers, but there are deployments covering thousands of servers as well.
For trend tracking, Munin (also mentioned here) is a very nice tool. You can feed critical readings from Munin back to Nagios.
PS: The choice of tool depends heavily on your definition of real time. Nagios fits if a 2-minute lag between an event occurring and a notification going out by SMS or e-mail is OK; at least older versions of Nagios did not allow checks more frequent than once per minute.
I've used it with a couple of different OSX servers, and the NRPE/plugin stuff works just the same. – RainyRat Jun 28 '09 at 22:01
Zenoss does everything you want out of the box and can work over either SSH or SNMP. I've also used Zabbix as a full-on monitoring system, and before that Cacti for trends and Nagios for alerting.
All of these are free, and some are more integrated than others. Zenoss has the benefit of tying together a lot of things out of the box, and the option of enterprise level support. Being based on Zope, it's a little more resource intensive to run than the others, but very easy to hack on if you know Python.
You should give all of these a trial run and see which one fits your use case the best.
I use Munin, myself. It's like Nagios/MRTG, but I liked it better back when I evaluated them both. I forget why.
Munin is a graphing system. Nagios is a monitoring system. As such, they're as alike as Apache and Exim. – David Pashley Jun 28 '09 at 23:23
Nagios is free, popular and open source. There are a lot of monitoring plugins available (for different devices and services). Unless you use a separate GUI, configuration is by text file. It sends alert notifications by e-mail, which is how my organisation traps and responds to system problems (alerts go into a ticket system).
There are various methods for collecting information from machines being monitored. Whichever monitoring solution you choose, I recommend collecting information from each system directly by SNMP. If you're unfamiliar, it'll take a small bit of learning. In the end however, it's the standard solution that Just Works.
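For anyone who hasn't touched SNMP before, here is a rough sketch of what polling a host directly could look like. It assumes the net-snmp `snmpget` command is installed on the monitoring machine and that each monitored host runs an agent exposing the UCD-SNMP-MIB load and memory OIDs; the `public` community string, the helper names and the `OIDS` table are placeholders of mine, not something any particular monitoring tool requires.

```python
import subprocess

# OIDs from UCD-SNMP-MIB (assumption: the agents on the monitored hosts
# expose this MIB, as net-snmp's snmpd normally does).
OIDS = {
    "load_1min": "1.3.6.1.4.1.2021.10.1.3.1",   # laLoad.1
    "mem_avail_kb": "1.3.6.1.4.1.2021.4.6.0",   # memAvailReal.0
}


def snmpget_command(host, oid, community="public"):
    """Build an snmpget call; -Oqv asks for the bare value only."""
    return ["snmpget", "-v2c", "-c", community, "-Oqv", host, oid]


def parse_value(raw):
    """Turn output such as '"0.15"' or '12345 kB' into a float."""
    text = raw.strip().strip('"')
    return float(text.split()[0])


def poll(host, community="public", timeout=5):
    """Fetch every OID in OIDS from one host and return a dict of readings."""
    readings = {}
    for name, oid in OIDS.items():
        result = subprocess.run(
            snmpget_command(host, oid, community),
            capture_output=True, text=True, timeout=timeout, check=True,
        )
        readings[name] = parse_value(result.stdout)
    return readings
```

Calling `poll("myhost")` for each machine in a loop gives a crude multi-host view; in practice the monitoring system you pick does this polling for you.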
I've had experience with both OpenNMS and Nagios, and for the task you describe I'd be inclined to use Nagios. It's quite easy to set up and if you need to provide custom probes for anything, a little bit of script-fu is all that's needed.
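To give an idea of how little script-fu a custom probe needs, here is a minimal sketch of a Nagios-style plugin in Python. The only real contract is the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of output, optionally followed by performance data after a pipe. The 80/90 percent thresholds and the function names are illustrative assumptions, not defaults taken from Nagios.

```python
#!/usr/bin/env python3
"""Hypothetical Nagios-style probe: report how full a filesystem is."""
import shutil
import sys

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(used_pct, warn=80.0, crit=90.0):
    """Map a usage percentage to an exit code (thresholds are assumed)."""
    if used_pct >= crit:
        return CRITICAL
    if used_pct >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero size"
    used_pct = 100.0 * usage.used / usage.total
    code = classify(used_pct, warn, crit)
    # Text after the pipe is performance data: label=value;warn;crit
    perfdata = f"used_pct={used_pct:.1f}%;{warn};{crit}"
    return code, f"DISK {LABELS[code]} - {path} {used_pct:.1f}% used | {perfdata}"


if __name__ == "__main__":
    code, line = check_disk(sys.argv[1] if len(sys.argv) > 1 else "/")
    print(line)
    sys.exit(code)
```

Dropped into the plugin directory and wired up through NRPE, a script like this behaves like any stock check.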
If you're willing to set up SNMP, MRTG is quite good. Unfortunately it's quite complicated, so it requires a certain amount of work to set up. If you're not looking for trend graphing specifically, Nagios is reasonably good, as recommended earlier. It also has several plugins that allow it to do trend graphing (Nagiosgraph is what we use). Munin is OK, but in my experience its graphs are more than a little overcomplicated. Bigbrother can do monitoring, but I'd avoid it if at all possible. It's more than a little bit of a train wreck.
The key to monitoring multiple Linux/Unix hosts with Nagios is creating a tarball that can sit on all of them. Spend a little time on the front end, and your life will be a breeze later on. Just unpack it and you are good to go.
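As a sketch of the tarball step, something like the following could build the bundle from a directory holding your plugins and client config. The directory name `nagios-client` and both helper names are made up for illustration; plain `tar czf` does the same job if you prefer the shell.

```python
import tarfile
from pathlib import Path


def build_bundle(source_dir, bundle_path):
    """Pack source_dir into a gzipped tarball rooted at the directory's name."""
    source = Path(source_dir)
    with tarfile.open(bundle_path, "w:gz") as tar:
        # arcname keeps paths relative, so the bundle unpacks the same anywhere.
        tar.add(source, arcname=source.name)
    return Path(bundle_path)


def list_bundle(bundle_path):
    """Return the member names in the bundle, handy as a sanity check."""
    with tarfile.open(bundle_path, "r:gz") as tar:
        return tar.getnames()
```

For example, `build_bundle("nagios-client", "nagios-client.tar.gz")` produces a file you can copy to each host and unpack with `tar xzf`.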