What tool do you use to monitor your servers?

Question

For a more comprehensive list of monitoring tools and their features, check out this Wikipedia page.

As the question states, what are the most commonly used tools for this task and what are their strengths and weaknesses?

My servers are running Debian Lenny, but the question is not primarily focussed on UNIX-monitoring alone as many tools will probably have some form of cross-platform support. — Aron Rotteveel, Apr 30 '09 at 08:24
Maybe they use different tools but from an overall system point of view you end up doing the same thing over and over again on the different systems. It's just a bit of scripting to squeeze out the last bit of data you want. I'd consider "tools" in this context the recording instance (monitoring server) not the actual plugin/script that spits out the data — Martin M., Jun 10 '09 at 05:02
I like to also monitor the applications (performance, availability, etc). Monitoring tools seem to have a spectrum with their ability to monitor hardware on one end and their ability to monitor applications on the other. Hardware<-----+----->Application — Nathan Hartley, Oct 15 '10 at 15:13

score 136 · Accepted Answer · answered Apr 30 '09 at 08:25

I've used Nagios in the past with success. It's very extensible (over 200 add-ons), relatively easy to use, and offers lots of reports. A negative would be the initial setup.
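
Much of that extensibility comes from the plugin convention: a check is any executable that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). Below is a minimal sketch of a disk-usage check in Python. The thresholds, the function names and the message wording are illustrative assumptions, not taken from any shipped plugin.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def classify(used_pct, warn_pct=80.0, crit_pct=90.0):
    """Map a usage percentage to a status code (thresholds are assumptions)."""
    if used_pct >= crit_pct:
        return CRITICAL
    if used_pct >= warn_pct:
        return WARNING
    return OK

def check_disk_usage(path="/", warn_pct=80.0, crit_pct=90.0):
    """Return (exit_code, status_line) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero size"
    used_pct = 100.0 * usage.used / usage.total
    code = classify(used_pct, warn_pct, crit_pct)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[code]
    # Text after the pipe is optional performance data for graphing add-ons.
    return code, f"DISK {label} - {used_pct:.1f}% used on {path} | used_pct={used_pct:.1f}%"

# A real plugin script would finish with something like:
#   code, line = check_disk_usage("/")
#   print(line); sys.exit(code)
```

Keeping the decision logic in `classify` separate from the measurement makes the thresholds easy to test without touching a real disk.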

answered Apr 30 '09 at 08:25

jdiaz

Nagios works great to monitor all types of host (Windows, Linux, routers, switches, etc.). I recommend using a configuration tool like Fruity or Lilac to ease the configuration pain. NSClient++ on the Windows boxes and nagios-statd on the Linux stuff to monitor running processes, disk usage, etc. – TonyB May 01 '09 at 23:27
Unfortunately Nagios requires an agent on Windows boxes - in the past I've found the agent notoriously prone to randomly dying. – PowerApp101 May 12 '09 at 13:04
We looked at both Nagios and Zabbix for our monitoring. Zabbix won after a short evaluation, mainly due to ease of deployment and functionality (for example, Zabbix includes graphing as a core function while Nagios requires a plugin). I found configuring Nagios to be a pain. – May 27 '09 at 16:14
GroundWork OpenSource have a network monitoring appliance that uses Nagios at its core, and simplifies the setup/management – Rog Jun 01 '09 at 02:40
There is a new nagios fork called icinga. It is nowhere yet, but their goals look promising. http://www.icinga.org/ – cstamas Jun 01 '09 at 17:34
There is a plug-in to nagios called Nagios WSC - it allows you to do agent-less monitoring of windows hosts via WMI - it's an ASP.net web app so you'll need an IIS server to run it off of. http://nagios-wsc.sourceforge.net/ – Zypher Jun 01 '09 at 18:47
With regard to the windows nagios agent dying - we've found that it often has conflicts with other services on the default port. Move it up to an unused high port, and you should be fine. – Brent Jul 08 '09 at 23:04
OpsView is a nice nagios wrapper that moves all the configuration into the gui, and adds some nice functionality like service graphs, and a central command center for multiple instances. – Brent Jul 08 '09 at 23:05
If you follow the Nagios quick setup guide, you can be up and running within 15 minutes, add the PNP4Nagios package and you have trending and reporting. – Dan Oct 05 '09 at 10:18
Groundworks' WMI Monitoring plugins for NRPE work pretty well - http://www.groundworkopensource.com/community/downloads/plugins-download.html – dunxd Sep 14 '10 at 11:45
Love Nagios. Didn't find it hard to set up, but I did have to write some custom dashboards to support a complex privilege model though. – symcbean Oct 06 '10 at 12:32
I love the Groundwork Opensource fork of Nagios – cpgascho Dec 13 '10 at 15:25
The nagios fork called "icinga" really rocks. The new web interface is nice, and with the integrated pnp and lconf the admin life is a lot smoother. – shakalandy Apr 11 '11 at 08:54

score 70 · Answer 2 · edited Nov 14 '11 at 18:17

Cacti is a very good web-based frontend to RRDTool, providing very handy graphs and stats. RRDTool is the part that gathers data from multiple systems and monitors a wide range of technical data.

We're using that cacti/RRDTool solution to monitor Unix and Windows systems. We get a lot of useful metrics including load, CPU/RAM usage, HD space, users logged in, network traffic, running processes, and so on.

You will find more information on cacti on the What is Cacti? page.

edited Nov 14 '11 at 18:17

Skyhawk

answered Apr 30 '09 at 08:42

paulgreg

Cacti is a fun solution that looks great and comes at a great price (free). However, setup of network devices is a PITA and was poorly documented. It might be better now but I wouldn't commit to it until you've done your research. – Chris Porter May 05 '09 at 03:05

score 57 · Answer 3 · answered Apr 30 '09 at 08:26

Personally, I love Munin which is very easy to install and to write plugins for as it has a very straightforward architecture. There are quite many plugins already around for all the purposes you could imagine, so you probably won't even have to write plugins in the first place.

It also provides beautiful graphs and the option to configure (very basic) alerts.
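
To give an idea of how small a Munin plugin can be: a plugin is an executable that prints graph metadata when called with the argument `config`, and prints `<field>.value <number>` lines otherwise. The sketch below follows that contract in Python. The graph title, the field name `load` and the helper names are assumptions chosen for the example.

```python
import os

def render(args, load_1min):
    """Return what the plugin prints for the given command-line arguments."""
    if args and args[0] == "config":
        # Munin asks once how the graph should be drawn...
        return "\n".join([
            "graph_title Load average (1 min)",
            "graph_vlabel load",
            "graph_category system",
            "load.label load",
        ])
    # ...and then polls for the current value on every run.
    return f"load.value {load_1min:.2f}"

def current_load():
    """1-minute load average; os.getloadavg() is available on Unix only."""
    return os.getloadavg()[0]

# A real plugin file would end with something like:
#   import sys; print(render(sys.argv[1:], current_load()))
```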

answered Apr 30 '09 at 08:26

pilif

I'm a big fan of Munin too. It has support for integrating with Nagios (so you can run both), and support for all common flavours of unix. I don't think there's any support for monitoring a Windows node - however it's written in Perl, so while it may be non-trivial it should certainly be *possible*. – John Dalton May 01 '09 at 06:35
@John. Windows node are supported via either munin-node-win32 that is a native munin node, or via SNMP just like any host. – Steve Schnepp May 04 '09 at 14:09

score 34 · Answer 4 · answered Apr 30 '09 at 09:14

Zabbix. It's open-source, and reasonably simple to setup and customise. We have a lot of custom monitoring scripts that feed into the zabbix server, but it takes care of centralising that data, displaying it appropriately, notifications (email, IM, SMS, twitter, etc), and so forth.

answered Apr 30 '09 at 09:14

Tony Meyer

We're also using Zabbix and find it to be pretty powerful and configurable. We tested both Zabbix and Nagios and opted for Zabbix in the end because while Nagios seems to have a good reputation, it's a bit of a pain to install and a lot of functionality comes from plugins rather than featuring within the core application (graphing is a good example of this, you get it for free with Zabbix). – May 27 '09 at 16:12
I prefer Zabbix because of its flexibility in terms of graphing and mapping your infrastructure (in terms of availability) as well as its flexible way of monitoring. – Andrioid Jul 05 '09 at 10:02

score 29 · Answer 5 · edited Apr 30 '09 at 11:24

I have been doing roll outs of Spiceworks at our company and we are finding it to be a great tool not just for monitoring servers but everything else on the network.

It does things like automatic inventory and custom monitoring to send you emails when there is a problem (EG: Printer is down to 10% of ink or hard drive of this server has 20%).

Its downside would probably be the density of information per computer. Don't get me wrong, it has A LOT of data per machine, but for things like servers where you might want a lot of stats you might need to use another tool.

EDIT: Oh, did I mention its business model is based around it being free forever?

edited Apr 30 '09 at 11:24

answered Apr 30 '09 at 08:33

Shard

Spiceworks does a lot of awesome stuff - and FREE. – Apr 30 '09 at 10:19
SpiceWorks has a really large community that overlaps with ServerFault quite a bit as well. Going to be interesting to see the interplay between the communities. I use SpiceWorks as well. Awesome tool. – Scott Alan Miller Apr 30 '09 at 19:31
Am now using this based on your recommendation. Excellent tool. – Marko Carter May 29 '09 at 16:07
We use it at our work. It is quite impressive. The inventory alone of hardware, not to mention software, is worth a look on its own. – Terry May 29 '09 at 21:52
Last time I used Spiceworks (version 3 something), it didn't have any way to add or modify hardware components such as monitors, video cards, etc. It would detect them, but often incorrectly. Thus I'm still using GLPI + OCSNG which I *hate*. – Boden Jun 16 '09 at 21:33
This is windows only? – barfoon Aug 20 '09 at 19:46
I would not recommend Spiceworks, as it runs on Windows and its platform support and scalability are very limited. Spiceworks is very immature in its design and has 2-3% of what Nagios can do. – Farhan Nov 14 '11 at 20:16

score 18 · Answer 6 · answered Apr 30 '09 at 20:13

Smokeping not only checks the availability of various servers and services but also keeps track of their latency while providing easy to use, nice looking, and quick to display graphs.

Wide range of latency measurement plugins is available out of the box. If you know some Perl, it is easy to create your own ones for any exotic needs.

Large installations will benefit from Master/Slave System for distributed measurement.

Highly configurable alerting system will help you notice issues before they start affecting users or evolve into major outage.

Smokeping is free and OpenSource Software written in Perl by Tobi Oetiker, the creator of MRTG and RRDtool
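
As a rough illustration of what a latency tracker records per measurement round, the sketch below reduces one round of probe results to a median, a spread and a loss percentage. It is not Smokeping's code (Smokeping is written in Perl); the function name `summarize_round` and the sample values are assumptions made for the example.

```python
import statistics

def summarize_round(rtts_ms):
    """Summarize one probe round; None entries are probes that got no reply."""
    replies = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(replies)
    loss_pct = 100.0 * lost / len(rtts_ms) if rtts_ms else 0.0
    if not replies:
        return {"median_ms": None, "min_ms": None, "max_ms": None, "loss_pct": loss_pct}
    return {
        "median_ms": statistics.median(replies),  # robust against single outliers
        "min_ms": min(replies),
        "max_ms": max(replies),
        "loss_pct": loss_pct,
    }

# Example round: four probes, one of which timed out.
summary = summarize_round([10.0, 12.0, None, 11.0])
```

Plotting the median together with the min/max band over time is what makes jitter and packet loss visible at a glance.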

Smokeping is good to see what your network is like – Amandasaurus Aug 18 '09 at 09:49
Smokeping is amazing for visualizing latency. – James Sep 24 '09 at 11:06

score 15 · Answer 7 · answered Apr 30 '09 at 15:32

OpenNMS is used where I work to monitor more than a thousand Linux machines. We monitor the hardware of each machine and the applications running on them.

answered Apr 30 '09 at 15:32

jassuncao

+1 for OpenNMS, we also use this at work to monitor thousands of machines and interfaces. We have many different operating system, and we are able to monitor all of them using OpenNMS. – Steve K May 02 '09 at 19:48
not my first choice but very useful – May 20 '09 at 09:03
how is it with adding MIBs for new hardware? – slovon Jun 16 '09 at 09:47
OpenNMS has a lot of snmp stats already in its default config so it can auto-discover and start graphing out of the box. New SNMP stats are pretty easy to add, just give a name for the RRD, the OID and data type and put it in a group for the type of device the stat applies to. – mtinberg Aug 03 '11 at 20:07

score 15 · Answer 8 · answered Apr 30 '09 at 16:40

Zenoss Core is of some use. We have been using it (for about a year) for lightweight monitoring of servers, network switches and UPSs.

Zenoss Core is an award-winning open source IT monitoring product that effectively manages the configuration, health and performance of networks, servers and applications through a single, integrated software package.

answered Apr 30 '09 at 16:40

gimel

If you use the free version of Zenoss Core, be ready to do a lot of SNMP MIB tweaking. I also found that it steadfastly refused to gather operating system data on some of my servers, and is surprisingly difficult to set up for simple tasks like checking the contents of a Web page. – gareth_bowles May 04 '09 at 19:58
Can sympathize with MIB problems, but web page checking can be done with Nagios plugins on Zenoss. – gimel May 05 '09 at 05:15

score 12 · Answer 9 · answered Apr 30 '09 at 08:34

Nagios is great since it's free and there are plenty of plugins for it. However, the UI and config are very difficult.

Its exact opposite in pros/cons, which is also great, is Microsoft System Center Operations Manager (SCOM), which is not free and has fewer plugins, but setup and config are brilliant and easy.

I must admit if I was in a primarily Microsoft company, had very high reliance requirements (i.e. can't afford for monitoring to break) or had to think about getting developers to work with it then SCOM would be my recommendation over Nagios.

score 12 · Answer 10 · answered May 01 '09 at 22:22

I've used:

Nagios - requires some old-timey command line setup, not pretty, but sturdy and functional. It has been superseded by:
Zenoss - requires much less footwork to set up, has a commercial variant. Once running, the rest is controlled through a browser. Very powerful, but requires some MIB work if you use the free version.
Intermapper - commercial program, spendy if you have lots of nodes to monitor. Appears to be written in Java (for better or worse).
Spiceworks - haven't tried the latest version. Older versions needed a little more umph under the hood to get it to respond, but otherwise, it works nicely. Free version comes with nag ads.

I use InterMapper as well. The console client is written in Java. The server is written in Python. Postgres is used as the backend database for data aggregation and reporting. — lsiu, Feb 06 '12 at 15:14

score 11 · Answer 11 · answered Jul 05 '09 at 10:01

We have been using AlertFox for a few weeks and are very happy with it. It not only checks our uptime and performance, but also monitors the shopping cart, user login and other critical parts of the website via transaction scripts (iMacros based).

For our internal monitoring (disk space etc) we use Nagios.

score 10 · Answer 12 · answered Apr 30 '09 at 15:27

PRTG Network Monitor - can't say enough great things about it. Awesome web front end and especially great for monitoring routers (bandwidth etc) and other devices through SNMP and measuring uptime for SLA's, etc.

www.paessler.com

answered Apr 30 '09 at 15:27

Brandon

score 8 · Answer 13 · answered Apr 30 '09 at 09:44

As a Windows person, MOM. We're looking to upgrade to System Center Operations Manager (SCOM) but won't need to until we start deploying Windows 2008.

answered Apr 30 '09 at 09:44

Richard Gadsden

I use MOM also. I love it and hate it at the same time. – spoulson Apr 30 '09 at 11:44
SCOM is great monitoring platform for Windows based Enterprise environments. The real genius here is the Management Packs released by the Microsoft product groups themselves (this is part of the MS Common Engineering Criteria that every product have a SCOM MP within 90 days of RTM). Getting advice and knowledge from the product teams themselves can greatly improve the ability of an operations department to keep things running and healthy without bothering the more senior admins for every little thing. – Kevin Colby Aug 17 '11 at 19:57

score 8 · Answer 14 · answered May 01 '09 at 20:52

I'm surprised nobody has mentioned logwatch or logcheck for linux servers - saves a tonne of time reading logs!!

answered May 01 '09 at 20:52

Brent

Those tools won't really give you metrics and long-term readability of your infrastructure trends. They are a nice addition but I wouldn't solely rely on them. Afaik "logwatch" is somewhat evil as it will only report about errors you tell it about as opposed to "logcheck" where you tell the tool known good stuff and it will report everything else. – Martin M. Jun 10 '09 at 05:09
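
The "tell it the known good stuff and report everything else" approach can be sketched in a few lines of Python. This is not logcheck's implementation; the ignore patterns and the function name are hypothetical, and a real ignore list would be site-specific.

```python
import re

# Hypothetical "known good" patterns for routine log noise.
IGNORE_PATTERNS = [
    re.compile(r"sshd\[\d+\]: Accepted publickey for \w+"),
    re.compile(r"CRON\[\d+\]: \(\w+\) CMD "),
]

def unexpected_lines(lines, ignore=IGNORE_PATTERNS):
    """Return every log line that matches none of the known-good patterns."""
    return [ln for ln in lines if not any(p.search(ln) for p in ignore)]

sample = [
    "May  1 10:00:01 web1 CRON[123]: (root) CMD (run-parts /etc/cron.hourly)",
    "May  1 10:02:13 web1 sshd[456]: Accepted publickey for deploy from 10.0.0.5",
    "May  1 10:03:40 web1 kernel: Out of memory: Kill process 789",
]
report = unexpected_lines(sample)  # only the kernel line survives the filter
```

The advantage of filtering this way round is that a message nobody anticipated still ends up in the report.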

score 8 · Answer 15 · answered Apr 30 '09 at 12:49

For monitoring statistics (memory usage, load, MySQL activity, Apache activity, etc.) I use Munin. Out of the box it already tracks a lot of things and plots graphs for different time intervals (last 24 hours, last 7 days, last month, last year). Through plugins even more things can be monitored. Its output is HTML pages with pretty graphs.

Munin has a master/node architecture: nodes gather statistics on a server and the master stores the data and produces HTML and graphs.

I use Monit to keep track of running processes and to restart or alert me when certain configurable conditions arise (high CPU load, high memory usage, no HTTP response, etc.). Monit can also monitor more general things about a server, such as CPU load, memory usage, hard disk status or disk usage.

Monit needs to be configured for every service or hardware you want to monitor and how to respond when something goes wrong. The most used options are to do nothing, send an alert email or restart the service.
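
The "condition, then action" idea can be illustrated with a small Python sketch. It is not Monit's configuration language or code; the rule of firing only after several consecutive bad polls, the class name `Rule` and the sample readings are assumptions for the example.

```python
from collections import deque

class Rule:
    """Fire `action` once `threshold` is exceeded for `cycles` consecutive polls."""

    def __init__(self, threshold, cycles, action):
        self.threshold = threshold
        self.action = action
        self.recent = deque(maxlen=cycles)  # sliding window of the latest polls

    def observe(self, value):
        """Record one poll; return the action name if the rule fires, else None."""
        self.recent.append(value > self.threshold)
        if len(self.recent) == self.recent.maxlen and all(self.recent):
            self.recent.clear()  # start counting afresh after acting
            return self.action
        return None

# Hypothetical CPU readings in percent, one per polling cycle.
cpu_rule = Rule(threshold=80.0, cycles=3, action="restart")
decisions = [cpu_rule.observe(v) for v in [85, 90, 70, 95, 96, 97, 99]]
```

Requiring several consecutive bad polls avoids restarting a service because of a single short spike.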

Monit is great when it works, but sometimes it fails to start, stop or restart a service and there is not a lot of diagnostic information available to tell you what went wrong. This means you don't know if the problem was with your service or with the Monit configuration, which runs with a cron-like minimal environment.

Both tools are available by default on most Linux distributions.

score 7 · Answer 16 · answered Apr 30 '09 at 20:46

Our project uses Ganglia for our 100+ node clusters. One reason we use it is because it's the monitoring tool that comes with Rocks.

It's important for us to have very low overhead on each node so that as many resources as possible are available for computation. Ganglia gives us a good overview of the cluster and allows us to drill down to individual nodes if needed. Besides knowing what's going on right now, we can get a pretty good look at what's happened over the last hour, day, week, month, and year. The graphs of various statistics are basic and functional.

score 7 · Answer 17 · answered Apr 30 '09 at 11:48

I'm part of an operational monitoring upgrade project. We've had various vendors come onsite to present a few big-dollar systems and mixed in some cheaper alternatives to compare.

One of which is Hyperic, which is also available as a free open source solution. I was impressed with its delivered capabilities and extensibility for custom agents.

answered Apr 30 '09 at 11:48

spoulson

While it is not easy on resources, it surely is a great monitoring tool! – Vincent De Baere May 04 '09 at 14:02

score 7 · Answer 18 · answered Apr 30 '09 at 12:56

I use Pingdom for monitoring my server. It sends me an SMS message when the server is unreachable.

answered Apr 30 '09 at 12:56

Jon Tackabury

score 6 · Answer 19 · answered Jun 02 '09 at 03:30

It all depends what you mean by "monitor"!

Is it (system or service) available? We use nagios.
What is it doing? We use munin for linux servers, and cacti for just about everything else, even though it is a pain to configure sometimes...
What has it done? We use syslog-ng to concentrate syslogs in one place and then run a customized logcheck script daily to send reports via email. We are looking for something similar for Windows servers.

score 5 · Answer 20 · answered Nov 16 '09 at 04:52

A new entrant on the scene to check out for competing with Cacti and the RRDTool based solutions is Graphite (http://graphite.wikidot.com/)

RRDTool is replaced with a backing store called Whisper. The docs give a pretty good overview of why it differs and I really like the CLI for ad hoc graphing when investigating something.
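
Feeding data in is simple as well. The sketch below assumes a Carbon listener accepting Graphite's plaintext protocol (one `path value timestamp` line per data point, by default on TCP port 2003); the metric name, host and helper names are made up for the example.

```python
import socket
import time

def carbon_line(path, value, timestamp=None):
    """Format one data point for the plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"

def send_metric(path, value, host="localhost", port=2003, timeout=5.0):
    """Send a single data point; host and port are assumptions about your setup."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(carbon_line(path, value).encode("ascii"))

# Example line as it would go over the wire (nothing is sent here).
line = carbon_line("servers.web1.load", 0.42, 1258347120)
```

Because the format is a single line of text, custom scripts in almost any language can push metrics without a client library.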

score 4 · Answer 21 · edited May 12 '09 at 21:46

Hobbit - it's a faster better version of Big Brother (which seems to be alarmingly commercial these days).

http://hobbitmon.sourceforge.net/

edited May 12 '09 at 21:46

answered Apr 30 '09 at 15:56

dr-jan

We also use Hobbit, it's awesome, it handles 600+ servers with 10+ monitors each, many of them updating every minute – MarkR May 12 '09 at 21:37
Hobbit is now called Xymon. http://www.hswn.dk/hobbiton/2008/11/msg00123.html – Clinton Blackmore Jun 08 '09 at 16:49

tomjedrz · Answer 22 · 2009-04-30T16:23:51.253

We use (and like) WhatsUp from Ipswitch for our relatively small Windows network. It is easy to set up, relatively easy to manage, and knows how to deal with Windows servers as well as standard stuff.

For larger networks, non-Windows-oriented networks, or networks with lots of varied stuff, I heartily recommend OpenNMS. The OpenNMS software is free and the company is more than happy to sell support and implementation services. It also happens to be run by a very sharp friend of mine from college!

score 4 · Answer 23 · answered Jun 16 '09 at 09:02

If you're in a hurry and want a quick tool to monitor your MS server, use Performance Monitor for Windows: set up a counter log with a custom monitoring template and a custom schedule (e.g. collect data for 5 minutes every hour). Then download Microsoft's LogParser and Codeplex's Performance Analysis of Logs (PAL) tool (http://pal.codeplex.com/) to crunch your counter log. PAL will generate a well-documented report with links to documents and tools for solving possible issues.

score 4 · Answer 24 · answered May 05 '09 at 00:16

For those who don't like the Nagios web interface there is NPC, a plugin for Cacti that makes the Nagios UI available from within Cacti, but with better looks (ajax etc.).

It reads from a database provided by NDO2DB, which is a great way to have your infrastructure available from within a database for use in scripts and other tools.

score 4 · Answer 25 · answered May 12 '09 at 13:10

Currently we use PRTG from Paessler. It's excellent. No agents required, excellent Ajax web interface, historical logging, graphing, WMI, etc etc. There's a 10 sensor version available for free but we plonked down a couple of grand for the enterprise version. Money well spent.

score 3 · Answer 26 · edited May 22 '10 at 10:38

Zabbix (http://www.zabbix.com) is good too and easier to set up than Nagios.

edited May 22 '10 at 10:38

Aron Rotteveel

answered May 22 '10 at 07:27

Vivek Varghese Cherian

score 3 · Answer 27 · answered Apr 30 '09 at 19:26

I use a combination of Solarwinds, VMware server performance tabs, and custom scripts.

Solarwinds Orion Network Performance Monitor is what I use with our Windows sys. admins on my web servers. Still getting some useful app metrics running on it, but it has good information on basic box level stuff (disk, network, CPU).

For my VMware guests, I love the performance tabs.

For my Sun servers, when I need something that isn't available in Solarwinds (because our admin hasn't added it or what), I write custom scripts (usually in Perl) to monitor things like mirror health, swap usage, etc.

I'd like to get more onto Solarwinds, but there's only like 26 hours in a day (or so my boss believes) so I find that can be a tad limiting...

score 3 · Answer 28 · answered Jun 10 '09 at 17:07

We use OpsView, which runs on top of Nagios. The webUI helps us deploy new host monitor definitions without having to allow SSH access, provides public views, and records historical values. This is handy for provisioning and determining suitable baselines.

score 2 · Answer 29 · answered Apr 30 '09 at 08:44

Sorry to say but I've ended up using lots of custom scripts. While far from ideal I doubt there's a more common solution.

answered Apr 30 '09 at 08:44

Matt Lacey

There will always be a need for custom scripts! – Techboy Apr 30 '09 at 11:58

score 2 · Answer 30 · answered Apr 30 '09 at 09:07

We've written our own monitoring software. Our code isn't nearly as sophisticated as a commercial package, but we didn't need much functionality. It was easier to write our own than to investigate other packages and learn how to use them. The code does just what we want and it's easy to extend.

answered Apr 30 '09 at 09:07

John D. Cook

I think it's important to think through the implications of a decision like this. To write something from scratch may not be that much of effort - but maintenance down the road is a bear. – Adam Apr 30 '09 at 18:29
I could imagine maintenance being a problem, but it hasn't been for us, even though we've run this system for years. Since the code base is small and familiar, it's been easy for us to add new functionality as needed. Maintaining a commercial solution could also be a problem over time, grafting on pieces from new vendors when the original product doesn't do everything you need, etc. – John D. Cook May 01 '09 at 02:32

score 2 · Answer 31 · answered May 02 '09 at 23:21

I'm using PA Server Monitor. It's primarily Windows focused (event logs, performance counters, services, etc.) although getting better with other systems now that some limited SNMP support has been added. The thing I like best is that it's easy to configure compared to a lot of apps (no config files, no command lines, etc.). I wouldn't recommend it for a heavy *nix environment though.

Oh, it's not free, but less expensive than some competitors.

answered May 02 '09 at 23:21

DougN

They also have a free edition which I use to monitor my private server. http://www.poweradmin.com/ServerMonitor/Free.aspx?show=monitors – Flo May 05 '09 at 11:12

score 1 · Answer 32 · edited Oct 05 '09 at 07:51

I use Polymon and love it.

http://www.codeplex.com/polymon

It's fantastic for monitoring anything that can be communicated by TCP Port, SNMP, Powershell, WMI, SQL, HTTP, Perfmon, or Ping.

I don't monitor anything *nix, so I can't speak to that. But for the Windows world it's very simple to set up, extremely intuitive, and extremely flexible. It has a very nice built-in dashboard display, SMS or email notification, etc.
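
For readers unfamiliar with what a TCP port monitor actually does, here is a minimal sketch in Python. It only illustrates the technique and has nothing to do with how Polymon itself is implemented; the function name and timeout are assumptions.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A successful connect only shows that something is listening on the port; checking that the service behind it answers sensibly needs a protocol-level check such as an HTTP request.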

edited Oct 05 '09 at 07:51

Anton Gogolev

answered Jun 10 '09 at 16:29

Bob

It was the Powershell support and price (free) that sold me. We use it to monitor our systems at the application level. – Nathan Hartley Jul 15 '09 at 19:49

score 1 · Answer 33 · answered Apr 06 '11 at 13:59

I've worked with Pandora FMS, and I like it mainly because it's very flexible and easy to configure for the average sysadmin. I also like the web interface with all the reports, and the extensive documentation. The geolocation interface that shows the position of the monitored agents is not very useful for a single datacenter, but very cool.

I've also tried Nagios, and I like all the plugins it has and that it's well known among sysadmins.

Note: I've been one of the developers of Pandora FMS for some time.

score 1 · Answer 34 · answered May 02 '09 at 12:18

For HP servers you can't beat their Systems Insight Manager (SIM), lots of lovely low-level counters and alerts etc., not a bad GUI either and the link to your support contract is worth the effort on its own.

answered May 02 '09 at 12:18

Chopper3

score 1 · Answer 35 · answered Nov 16 '09 at 10:27

We needed something customisable as we need to monitor some systems which are not online all the time, but can send mail or be dialled in.

We tried Nagios (maze of scripts), AppManager (nice, but nonadaptable), Zenoss (nice, but when you mention Oracle, the price gets hefty multipliers) and landed on Zabbix, which has an open protocol and an open database structure; heck, I can write a plugin on every level in an hour. It's nicely compartmentalised (server, client, database, ...). And its web frontend is quite nice and customisable.

YMMV, for us the monitoring of "offline" systems is important and it is usually not covered by such software.

score 0 · Answer 36 · answered May 29 '09 at 16:14

We use WhatsUp from Ipswitch. It's very easy to set up for small networks, it can autodiscover networks by port scan, and it can use Windows and SNMP credentials.

To monitor statistics like CPU, memory, and disk, we need to set up SNMP. WhatsUp supports SNMP v1, v2, and v3.

WhatsUp has passive monitors for syslog (Unix), the event viewer (Windows) and SNMP traps.

It has a nice Ajax web interface with custom users and custom workspaces.

P.S. Sorry for my bad English.

score 0 · Answer 37 · answered May 29 '09 at 21:43

I've used Hobbit, Big Brother and Nagios when working for poorer (read: cheaper) organizations. Of the three I prefer Hobbit because it's simple and bulletproof. I've always felt that Nagios is trying to be an open source version of OpenView or Tivoli, and frankly if I have the time to spend configuring a framework like OpenView or Tivoli then monitoring is probably my entire job and my organization can probably afford to buy OpenView, so why use Nagios?

score 0 · Answer 38 · answered May 30 '09 at 04:19

We've just started using "Servers Alive" which is very inexpensive, it isn't too pretty looking, but it supports a tonne of different checks and can alert in several ways, handles technician scheduling/rosters etc for any notifications. You can also make checks rely on others, i.e. "this" system requires "that" to be up/running.
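
The dependency idea ("this" system requires "that" to be up) is worth a small illustration, since it is what keeps one dead router from producing dozens of alerts. The Python sketch below is not how Servers Alive works internally; the check names and the function `root_causes` are hypothetical.

```python
def root_causes(status, depends_on):
    """Return the failed checks that are not explained by a failed dependency.

    status:     {check_name: True if up, False if down}
    depends_on: {check_name: [names of checks it requires]}
    """
    def has_failed_ancestor(name, seen):
        for parent in depends_on.get(name, ()):
            if parent in seen:
                continue  # ignore dependency cycles
            if not status.get(parent, True):
                return True
            if has_failed_ancestor(parent, seen | {parent}):
                return True
        return False

    return sorted(
        name for name, up in status.items()
        if not up and not has_failed_ancestor(name, {name})
    )

# Hypothetical network: web and db sit behind the router, mail does not.
status = {"router": False, "web": False, "db": True, "mail": False}
depends_on = {"web": ["router"], "db": ["router"], "mail": []}
alerts = root_causes(status, depends_on)
```

Here `web` is down too, but only `router` and `mail` are reported, because the web failure is already explained by the router.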

score 0 · Answer 39 · answered May 30 '09 at 04:41

For Windows: Admin Arsenal (but that's a given in that we own the product)

For Unix - IBM Tivoli

answered May 30 '09 at 04:41

Shawn Anderson

score 0 · Answer 40 · answered May 30 '09 at 06:37

We use Orca to monitor our systems. It's not super pretty, but it gives a ton of low level details other monitoring systems don't use.

answered May 30 '09 at 06:37

Blair Zajac

score 0 · Answer 41 · answered May 30 '09 at 07:38

I use a combination of Nagios, Cacti, custom scripts and one of my own projects -- System Health Monitor. I like having external service monitoring as well as graphs of system resources so you can do post-mortem analysis of system problems or quickly check the graphs to see if things look 'normal' compared to their historical values.

score 0 · Answer 42 · answered Jul 08 '10 at 20:40

WhatsUp Gold from Ipswitch

answered Jul 08 '10 at 20:40

colealtdelete

score 0 · Answer 43 · answered May 31 '09 at 22:47

Nagios combined with nagvis (graphics to show off monitoring)

linked to mail, google talk and twitter.. so you cant escape the monitoring

its even got a great firefox plugin

answered May 31 '09 at 22:47

hoberion

score 0 · Answer 44 · answered Jun 01 '09 at 17:41

I am using nagios and hobbit (an open-source Big Brother implementation) independently and have found that both have positive and negative qualities.

nagios:
pro: has a nice sub-minute scheduler for running tasks at regular intervals and has an embedded perl interpreter to boot.
con: config insists on having a 'server' for every test, when sometimes you just want to run a test that is based on an application 'feature' but not necessarily isolated to a single host. Revert to a meta-config that generates the actual nagios config to overcome this.

hobbit:
pro: open-source compiled server instead of the massive scripts used by the original Big Brother; easy integration with the bb client 'dboard' command to poll data.
con: also stuck in a 'server-oriented' mentality, which fits most folks, but not me.
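
The meta-config workaround mentioned for nagios can be sketched as a small generator: describe each application feature once, then expand it into one Nagios service definition per host. The feature list, the `generic-service` template name and the check command below are assumptions for the example, not the poster's actual setup.

```python
def service_block(host, description, check_command, template="generic-service"):
    """Render one Nagios service definition; the template name is an assumption."""
    fields = [
        ("use", template),
        ("host_name", host),
        ("service_description", description),
        ("check_command", check_command),
    ]
    body = "".join(f"    {key:<24}{value}\n" for key, value in fields)
    return "define service {\n" + body + "}\n"

def render_services(features):
    """Expand each application-level feature into one service block per host."""
    blocks = [
        service_block(host, feature["name"], feature["command"])
        for feature in features
        for host in feature["hosts"]
    ]
    return "\n".join(blocks)

# Hypothetical feature that spans two hosts.
features = [
    {"name": "Login page", "hosts": ["web1", "web2"], "command": "check_http!-u /login"},
]
config_text = render_services(features)
```

The generated text would be written to a file that the main Nagios configuration includes, so the hand-maintained part stays feature-oriented.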

score 0 · Answer 45 · answered Sep 14 '10 at 11:52

Currently using Groundworks Open Source Community Edition 5.3 - although support has fallen by the wayside on that version now. May upgrade to GWOS 6 or perhaps jump ship to Zabbix or similar Open Source system. I tend to favour those based on Nagios, but wouldn't go for vanilla Nagios due to the nightmare of managing all those interdependent config files.

Groundworks' WMI Monitoring plugins for NRPE work pretty well. Nagios triggers a WMI service check on a windows box using NRPE, which then does the WMI querying of your other windows boxes. This gets around the requirement to have NRPE agents on your windows boxes, and also the nightmare of trying to get Nagios running on *Nix to authenticate on Windows.

Another nice option is to set up SNMP on your windows boxes as part of your base build. There are some options out there to expose WMI checks via SNMP (SNMPTools) (although you need to install this on each Windows box, making it not agentless).

There are a number of Windows tools which can monitor Windows logs and send an SNMP trap when certain events occur.

Does GWOS support passive checks with NSCA? With NSCA the communication is reverse from active checks like NRPE, SNMP or check_ssh. Passive checks run on remote server, and then send that data to the Nagios server. — Stefan Lasiewski, Sep 14 '10 at 16:01
It supports anything that Nagios supports - including passive checks. I know it can handle SNMP traps (although I still haven't got that working) and I believe NSCA checks are also possible, but have never tried it. — dunxd, Sep 14 '10 at 21:04
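
To make the passive-check direction concrete: with NSCA the monitored machine runs the check itself and pushes a result record to the Nagios server. The `send_nsca` client reads tab-separated records of host, service, return code and output from standard input. The sketch below only builds such a record; the host and service names are hypothetical, and actually submitting it depends on your `send_nsca` installation and configuration.

```python
def nsca_line(host, service, code, output):
    """Build one tab-separated passive check result for send_nsca's stdin."""
    if code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0 (OK), 1, 2 or 3 (UNKNOWN)")
    # Tabs and newlines inside the output would break the record format.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{code}\t{clean}\n"

# Hypothetical result from a nightly backup job on host db1.
record = nsca_line("db1", "backup", 0, "OK - finished\tin 42s")
```

The host and service names in the record have to match definitions on the Nagios side that accept passive results; otherwise the submission is ignored.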

score 0 · Answer 46 · answered Oct 05 '10 at 15:54

We're using AlertGrid; it's ideal for web apps. Unlike the millions of typical dotcom monitors, it does not monitor performance (response time etc.) from outside, but lets you trace the execution of your code and all your custom metrics/statistics by sending events from inside your app. Once you start sending events from your app to AlertGrid, everything is configurable using a nice visual editor (100% web), and non-technical people can easily create their own alerting rules. Email, SMS, phone and webhook alerts are available.

It has a plugin for simple server monitoring (Windows), which installs as a service, runs in the background and emits events about CPU usage, % free RAM, and running processes. It takes half a minute to set up, and it works! The only caveat is that the machine must have an internet connection.

score 0 · Answer 47 · answered Oct 08 '10 at 11:32

We started using Bijk.com server monitoring (http://www.bijk.com) several weeks ago.

We are happy with the simple installation and the very easy GUI and maintenance. Free mail and SMS alerts are a plus for us.

score 0 · Answer 48 · answered Oct 15 '10 at 10:25

I use 10-Strike Network Monitor

It works as a service 24/7 and monitors all devices in the network by periodically polling each device within the LAN. I can also set up the program's response to particular events, for example a device or service going on or off. The program can display a message, play a sound, run external programs, write a record to a log, send SMS, restart or shut down a service or a computer, and so on.

score 0 · Answer 49 · answered Jun 04 '09 at 17:52

We use IP Check, which has been renamed PRTG. It allows for a wide range of sensors that can monitor all sorts of different activity.

Leigh Riffel

score 0 · Answer 50 · answered Jun 10 '09 at 16:46

Someone should mention Netgong for a simple on/off monitoring tool via ping intervals.

score 0 · Answer 51 · answered Jun 11 '09 at 16:04

I use NetGain Enterprise Manager from NetGain Systems. It takes just a few minutes to install and get it up and monitoring. Best of all, it's free. Check out http://www.netgain-systems.com

score 0 · Answer 52 · answered Mar 17 '11 at 22:03

the very VERY excellent multitail to keep an eye on logfiles. nagios to keep my eye on service uptime. rrdtool to keep my eye on bandwidth.

solid7

score 0 · Answer 53 · answered Apr 06 '11 at 14:04

OPManager (Ports, HTTP Get Requests, ICMP, SNMP (Disk/Memory/CPU)) (personal favourite!) http://www.manageengine.com/network-monitoring/

OpManager is an award winning network monitoring software that helps administrators discover, map, monitor and manage complete IT infrastructure.

Cacti (SNMP Graphing, Traffic, Disk Usage, CPU Utilisation etc) (http://www.cacti.net)

About Cacti. Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality.

PRTG (Paessler, no longer available unfortunately)

SmokePing: (packet loss & latency) http://oss.oetiker.ch/smokeping/

Pingdom: http://www.pingdom.com

score 0 · Answer 54 · answered Jun 15 '09 at 12:47

I've worked with a lot of monitoring systems at a lot of places. Most of them have already been mentioned. Here are a few that haven't been:

SMARTS - now owned by EMC. Really is the best thing ever for root cause. It's not cheap and support may not be good anymore as it's owned by EMC. We were lucky enough to work with the founders of the company to get it implemented.

Big Brother. Nice and simple, but a bad license. It's also the ugliest web gui I've ever seen, so I had to rewrite it. Never got Big Sister to work.

HP Openview, when engineered, installed and run by a competent engineer can be good. However I've only seen it done right once and wrong more often than I can remember. I would never choose to use it.

BMC Patrol. Just awful. Die, die!

And finally, for logs and tracking down problems you just have to use Splunk. If this had been around 10 years ago I would have saved myself a lot of wasted time.

score 0 · Answer 55 · answered May 06 '11 at 09:14

EventLog Analyzer is a web-based, real-time, agentless event log and application log monitoring and management software. It collects, analyzes, reports on, and archives event logs from distributed Windows hosts; syslog from distributed Unix hosts, routers, switches, and other syslog devices; and application logs from IIS web server, IIS FTP server, MS SQL Server, Oracle database server, and DHCP servers on Windows and Linux. It generates graphs and reports that help in analyzing system problems with minimal impact on network performance.

score 0 · Answer 56 · answered May 20 '11 at 04:01

Try GroundWork. It uses Nagios, so it has all the features of Nagios, and you can edit the monitoring configuration graphically through a web interface, which is not possible with Nagios alone. https://kb.groundworkopensource.com/display/SUPPORT/Home

Bijo

score 0 · Answer 57 · answered May 27 '11 at 21:22

Please check Verax NMS. Advantages:

Service-oriented approach
Monitoring servers as well as networks, network devices (e.g. switches, routers), data center infrastructure (e.g. power supply, air conditioning) and applications (e.g. www & application servers, databases)
Rich library of plug-ins and SDK for new ones
Virtualization support
Advanced event correlation rules
Advanced reporting (SLA compliance)

Artur Nowakowski

your post has attracted a lot of moderator flags, because it appears to be blatant advertising without disclosure. We don't mind a post like yours if it's relevant and accurate, but we prefer full disclosure. – Mark Henderson May 28 '11 at 04:38

score 0 · Answer 58 · answered May 01 '09 at 21:26

I've used ActiveXperts Network Monitor with great success (on a mostly Windows network, but it had some Unix and Linux hosts, printers of various brands and so forth that were also monitored with it).

It's really easy to setup and learn, rather cheap for what you get (was $500 for site/enterprise license) and supports vbscript and remote unix commands. If the network is small (a few hundred nodes at most) I think this is much more intuitive than System Center Operations Manager which feels more directed at huge windows networks only.

Network Monitor comes with a lot of predefined scripts for monitoring things like e-mail servers (including various Exchange versions and all their services), HTTP servers with expected responses, event logs, SQL queries with expected responses and so on. Dependencies are easy to configure ("all these depend on this router, so if it fails to respond to ping and SNMP, don't bother alarming us about all the stuff behind it that's not responding"). There is SMS support via a gateway or a local GSM modem, and all rules can of course have actions like service restart, server restart or a custom script to fix recurring problems for you (it's important I think, kinda like regression testing is for development).
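
The dependency rule quoted above ("if the router is down, don't alarm about everything behind it") is easy to picture in code. The following is a generic sketch, not how ActiveXperts implements it; the `PARENT` map and node names are invented, and the dependency graph is assumed to be free of cycles.

```python
# Hypothetical topology: each node maps to the node it depends on (None = no parent).
PARENT = {
    "mail1": "router-a",
    "web1": "router-a",
    "printer1": "switch-b",
    "switch-b": "router-a",
    "router-a": None,
}


def alerts_to_send(down, parent):
    """Return the down nodes that have no down node anywhere upstream of them."""
    roots = []
    for node in sorted(down):
        upstream = parent.get(node)
        suppressed = False
        # Walk up the dependency chain; assumes the map contains no cycles.
        while upstream is not None:
            if upstream in down:
                suppressed = True
                break
            upstream = parent.get(upstream)
        if not suppressed:
            roots.append(node)
    return roots
```

With the router and everything behind it unreachable, only the router itself is reported, which is the behaviour described in the quoted rule.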

...I've also tried to tame a Hobbit and didn't really enjoy it at all (nor the bloated Windows agent) - but it was set up for Windows server monitoring and it really blows at that - most likely more suited for a linux or unix-centric network.

score 0 · Answer 59 · answered Aug 03 '11 at 19:55

SolarWinds ipMonitor in combination with Dell OpenManage and MS SCOM.

Alan

score 0 · Answer 60 · answered May 02 '09 at 20:13

We use hyperic - it has both an open source version and a commercial one

It monitors the operating system (RHES 3, 4 and 5 + Ubuntu), Apache, MySql, JBoss, Tomcat, mail servers, memcached and it probably can monitor more applications. No special configuration is needed, all servers were found with the auto discovery, even if they were installed in an untraditional place. It is very easy to use and configure, you can control your services (start/stop etc.) and define alerts.

Minuses - You need to configure it to run on boot (we are using cron to do that).

score 0 · Answer 61 · answered Jul 05 '09 at 16:00

Nagios with groundwork on top of it.

I'm not sure if groundwork helps or hinders, but nagios is definitely good.

Jason Tan

score 0 · Answer 62 · answered Apr 30 '09 at 11:16

We use Level Platforms for this task. Provides a ton of useful information without overloading the sysadmins, and makes it extremely easy to handle all of the hardware in our server room (as well as many of our clients').

John Rudy

score 0 · Answer 63 · answered Jul 30 '09 at 19:09

Ipswitch's WhatsUp Gold

score 0 · Answer 64 · answered Jul 30 '09 at 19:41

We've tried Applications Manager. It runs on Java and MySQL. It's really powerful and easy to configure from the browser. It's not that expensive either.

Currently we use SCOM from MS. I wouldn't recommend it to anyone!

Tommy

score 0 · Answer 65 · answered Jul 30 '09 at 19:41

We use IBM's Director, Dell's OpenManage and "WhatsUp Gold".

Alan

score 0 · Answer 66 · answered Apr 30 '09 at 11:55

Also take a look at Argent Guardian. It's cross-platform, can function as a syslog server, they'll give you the database schema to do your own reporting, if you need that, and you can import your own images as "maps" to give visual alerts.

K. Brian Kelley

score 0 · Answer 67 · answered Jul 31 '09 at 15:28

We use Ipswitch WhatsUp Gold 12 for monitoring about 2000 devices, with both performance monitors and TCP/IP- or WMI-based monitors, on both Windows and Linux. The good thing about it is that it is easy to use and configure, has bulk change options and autodiscovery, and offers multiple notification methods. The bad side: it seems to have a limit of about 2000 devices, after which performance gets slow, plus it only runs on Windows. The distributed version doesn't really deserve the name or the price tag.

We evaluated Nagios (setup too complex for a dynamic environment) and Zenoss (no bulk change or autodiscovery, too limited for a dynamic environment), and are currently looking at Zabbix, which seems most promising, with all the nice features WhatsUp has and more, such as a fully distributed architecture with probes and server, relatively simple setup, open source backend (MySQL, Apache)...

score 0 · Answer 68 · answered Jul 31 '09 at 17:45

I've been using Sysmon for a number of years. There are a few modern services that it doesn't monitor, but it compiles easily on most *nix platforms, has almost no dependencies, is extremely light-weight, can monitor very large numbers of devices and services with ease, and can handle complex network layouts (incl. ring topologies) and failover monitoring. It's basically a config file deal, but the format is pretty easy (based on plist/css).

morgant

Its website has gone AWOL. – sendmoreinfo Sep 12 '11 at 23:43
It looks like the sysmon.org domain has been taken over, but it can still be found at [http://puck.nether.net/sysmon/](http://puck.nether.net/sysmon/). – morgant Sep 15 '11 at 15:39

score 0 · Answer 69 · answered Oct 04 '09 at 13:18

Nagios and HP OpenView are the two that I am familiar with and have experience in. Both are good choices, although for the latter I'll echo other posters that it needs someone who knows how to do it right. Then again, the only place I saw it running was when I was with HP, so that might have helped my perception.

score 0 · Answer 70 · answered Apr 30 '09 at 13:57

For the status of servers and services (whether they are up or down, and sending warnings if they go down) and for yes/no questions ("has a backup been done in the last 24 hours?") we use nagios. It is hard to set up, but it is immensely configurable. Custom scripts can be run on remote computers. Alerts can send emails, send text messages or even run custom scripts.
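
As an illustration of the yes/no style of check described above, here is a minimal sketch of a custom script for the "has a backup been done in the last 24 hours?" question. It follows the usual Nagios plugin convention of one status line plus an exit code (0 OK, 2 CRITICAL, 3 UNKNOWN). The way the backup is detected (modification time of a single file passed as an argument) is an assumption made for the example; a real setup may need something different.

```python
import os
import sys
import time

# Nagios plugin exit codes used by this check.
OK, CRITICAL, UNKNOWN = 0, 2, 3
MAX_AGE_SECONDS = 24 * 60 * 60


def check_backup(path, now=None, max_age=MAX_AGE_SECONDS):
    """Return (exit_code, status_line) based on the backup file's age."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError as exc:
        return UNKNOWN, f"BACKUP UNKNOWN - cannot stat {path}: {exc}"
    age_hours = (now - mtime) / 3600
    if now - mtime > max_age:
        return CRITICAL, f"BACKUP CRITICAL - last backup {age_hours:.1f}h ago"
    return OK, f"BACKUP OK - last backup {age_hours:.1f}h ago"


def main(argv):
    """Print the status line and return the exit code for the given file."""
    if len(argv) < 2:
        print("BACKUP UNKNOWN - usage: check_backup.py <backup-file>")
        return UNKNOWN
    code, message = check_backup(argv[1])
    print(message)
    return code
```

To use it as a plugin, end the script with `sys.exit(main(sys.argv))` so that Nagios sees the exit code.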

For the health of servers we use munin - it provides nice graphs of memory usage, cpu usage, network usage etc. Pretty easy to set up on linux at least (I have not tried with Windows).

score 0 · Answer 71 · answered May 17 '09 at 14:48

I notice no one has mentioned HP SiteScope yet

gharper

score 0 · Answer 72 · answered Dec 15 '09 at 19:22

ServersAlive is a relatively cheap, simple tool for all sorts of polling, including TCP services, Windows services, your own custom scripts, whatever. The response from the developer on his mailing list is rapid and personal.

I used it at a previous job for service monitoring and it was reliable, customisable and cheap.

nray

I really wanted to like ServersAlive. I thought it had an amazing feature list for the price. But there were just as many quirks as well, the biggest being the fact that it was designed to be run as a single-user interactive GUI that you would leave up and running all the time (as opposed to a service/client model). I even went round and round via email once with the author on this one issue. In the end, it was far too awkward for our environment. – Nathan Hartley Oct 15 '10 at 15:02

score 0 · Answer 73 · answered Dec 15 '09 at 19:29

MSP Center (the former OpManager) is really frustrating to use and I can't recommend it. The interface is entirely web-based which means zero feedback and an arbitrarily limited set of choices any time you want to do something. Their website seems full of tips and documentation, but it's a bit like Outlook - it promises a whole bunch of power but is hamstrung by some developer's limited imagination.

If you're looking for a zero-config solution for your helpdesk, well maybe, but it's not any sort of power tool. If you have time to tune your monitoring to meet your needs then there are other solutions that would reward your efforts more.
