3

I'm looking for suggestions for a good monitoring tool, or tools, to handle a mixed Linux (Red Hat 4-5) and HP-UX environment.

Currently we are using Hobbit, which is working reasonably well, but it is becoming harder to keep track of which alerts are sent out for which servers.

Features I'd like to see:

  1. Easy configuration of servers.
  2. The ability to monitor CPU, network, memory, and specific processes.

I've looked into Nagios, but from what I have seen it won't be easy to set up the configuration for all of our servers (~200), and without installing a plugin into each agent I won't be able to monitor processes.

nbartolomeo

6 Answers

4

Nagios may have a bit of a learning curve, but you can define templates in its configuration files that other objects inherit from, which saves you time. It's a great monitoring system. You typically don't need a client installed on each host it is monitoring, so long as the hosts have SNMP running.
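
As a rough illustration of the template approach, the sketch below generates Nagios host definitions that all inherit from one template through the `use` directive. The template name `linux-server`, the host names, and the addresses are placeholders (assumptions), not values from the question; in practice the inventory would come from wherever your server list lives.

```python
# Sketch: emit one Nagios host object per server, all inheriting shared
# settings from a single template through the "use" directive.

# Hypothetical inventory; replace with your real host names and addresses.
SERVERS = [
    ("web01", "192.0.2.11"),
    ("web02", "192.0.2.12"),
    ("db01", "192.0.2.21"),
]


def host_definition(name, address, template="linux-server"):
    """Return a Nagios host object definition as text.

    "linux-server" is an assumed template name; it must be defined
    elsewhere in the Nagios configuration.
    """
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {name}\n"
        f"    alias      {name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def render_hosts(servers, template="linux-server"):
    """Join the definitions for every (name, address) pair."""
    return "\n".join(host_definition(n, a, template) for n, a in servers)


if __name__ == "__main__":
    print(render_hosts(SERVERS))
```

Writing the output to a file in the Nagios object configuration directory would let ~200 hosts share one set of check intervals and contacts, so a change to the template applies to all of them.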

Monitoring Windows systems with it can be a little different. For them NSClient++ works very well and is easy to install, even via a script, SMS, etc. http://nsclient.org/nscp/

sinping
  • I could be misunderstanding, but I believe that you are required to have an agent+plugin on the server if you wish to monitor specific processes. I'm not against using an agent; I just don't want to have to configure an agent and plugin on each server that I need to monitor processes on. – nbartolomeo Apr 15 '10 at 14:56
  • True, but do you really need to monitor the process, or the service it provides? For example, it would be better to check that the server is responding to requests on port 80 than to check whether the web server process is running. – sinping Apr 15 '10 at 15:37
  • @nbartolomeo: that is why you use a configuration manager like CFengine or Puppet. – Dan Andreatta Apr 15 '10 at 15:59
  • @Dan Andreatta: Yes, that is another thing we are looking into, and I'm hoping to implement it alongside the new monitoring software. @sinping: We do actually need to monitor specific processes. Some of the servers run WebLogic, and we need to know whether specific modules for it are running, not just whether it is responding. – nbartolomeo Apr 16 '10 at 12:44
  • 1
    You can monitor processes via SNMP as well. See section '2c' here: http://agiletesting.blogspot.com/2005/10/mini-howto-2-system-monitoring-via.html – sinping Apr 16 '10 at 13:33
  • +1 for Nagios, I've found the SSH-based Nagios daemons much more consistent than any SNMP-based solution. No matter what, you're going to have to install something on all of those servers anyway, whether it's SNMP or the Nagios nrpe daemon. – gareth_bowles Apr 17 '10 at 04:11
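
The service-level check discussed in these comments (is the server answering on its port, rather than is the process alive) can be sketched in a few lines of Python. This is only an illustration, not how Nagios or any of the tools here implement their checks; the host and port in the usage line are placeholders. As noted in the comments, a check like this would not reveal whether individual WebLogic modules are running.

```python
import socket


def service_is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # "localhost" and port 80 are placeholder values for illustration.
    print(service_is_up("localhost", 80))
```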
3

Set up SNMP on your servers, preferably via some configuration management tool like Puppet.

Then, use a monitoring tool like Zenoss Core to monitor them. Zenoss can scan a subnet for hosts, which makes it easy to add 200 servers, and you can group/organize the servers in various ways, to determine what exactly is monitored.

We're only monitoring a dozen devices so far, but Zenoss is very powerful yet user friendly. It has a friendly GUI, history graphs, alerts, etc.

Martijn Heemels
  • I think this is the solution we will end up going with, though it is also possible that it will get stuck in management discussions and never happen. – nbartolomeo Apr 16 '10 at 15:33
2

My understanding is that Nagios is more suited for smaller installations. While I have not used it, it seems that OpenNMS is better suited for the scale of your installation.

Someone wrote a comparison between Nagios and OpenNMS

Dan Andreatta
  • I use Nagios with over 100 servers. Configuring the initial monitoring for 100+ servers will be time consuming regardless. – Warner Apr 15 '10 at 13:53
  • There are some very large Nagios installations that I know of, monitoring tens of thousands of hosts and services. Yahoo is a big Nagios user, for instance. http://video.google.com/videoplay?docid=-2694482537942655203# is a good talk on deploying Nagios at large scale. – 3dinfluence Apr 15 '10 at 16:49
1

The good news is that there are many solutions that handle your requirements; now you get to choose. I'd look into the following products:

Zenoss

Groundworks

Zabbix

Hyperic

Brennan
0

If you are allowed to use SNMP, take a look at Cacti. It's easier to add and remove hosts than in Nagios, and I like its interface more. Cacti can monitor CPU, network interfaces, memory usage, disk space usage, and services.

user40424
0

I would recommend Zabbix. It can monitor your hosts with SNMP or via an agent installed on the servers, and it is very flexible and scalable. Zabbix provides host discovery, but you can also create an XML file to import your devices into its database. They recently released an API, which makes it easy to integrate monitoring data into other applications (we've successfully built an iPhone app on top of this API).

Hope this helps.

Maxwell