Possible Duplicate:
What tool do you use to monitor your servers?

I'm a developer in a shop with dozens of Solaris, Linux and Windows servers. Since I'm not a sysadmin I'm posting here to tap that expertise. There are various monitoring needs here and I'm looking for a good solution. Examples of needs include:

  • regularly checking how many files are in a directory on a remote server via sftp, or scp, or NFS, or Windows share
  • regularly confirming that we can still login to a remote server that we don't control
  • regularly running some ad hoc query against a database, and alerting people via email if some criterion is met
  • alerting us if a particular string appears in an application log file, e.g. a Java Exception has been thrown
  • having robust/flexible scheduling to do all the above
  • supporting a variety of languages to write monitoring scripts
  • having a diverse set of plugins that allow us to tap new functionality created by the user community

My research so far leads me towards Nagios, but someone has also mentioned Symantec Altiris. Any suggestions or comments about these or other platforms are greatly appreciated.

Thanks!

bethesdaboys
  • Go for Nagios. It is by far the most flexible, extensible, and stable monitoring system out there. You'll be able to easily write monitoring plugins in any language of your choosing to satisfy your example requirements. – EEAA Apr 29 '11 at 17:18
  • Nagios seconded, but there are packaged versions which make config simpler and can track actual metrics (as opposed to SLA-type monitoring), like Centreon, Fruity and GroundWork Monarch – symcbean Apr 29 '11 at 22:31
  • Thanks ErikA for your link to another thread/post ([here][1]), which was helpful. In the meantime, since posting this, I have begun researching Zabbix, which looks quite good. [1]: http://serverfault.com/questions/44/what-tool-do-you-use-to-monitor-your-servers – bethesdaboys May 01 '11 at 11:05

3 Answers

In my opinion, Zabbix is hands down the best choice. It is one of the best open source performance monitoring tools on the market.

The best example of why I feel so strongly about the power of Zabbix was told to me by a community member. They work for a company that uses AIX, Linux and Windows, and they needed a service to monitor that mixed environment. If I recall correctly, they had a preference for open source software. They would take a monitoring solution, set it up in their lab environment, and let everyone on the team poke around with it. They would also invite the vendor in to discuss the product, then repeat the process with the next product. In the end they brought Zabbix into their environment with a support contract, even though Zabbix SIA had ZERO salespeople at the time. Last I heard they were running one of the largest Zabbix installs and, in a true testament to Zabbix, they are now expanding its use within the company.

In addition, I've been using Zabbix myself for about 6-7 years. I've even done some hacking on it: I wrote a Lua patch that allows Lua scripts to run inside Zabbix, as well as Zabcon, the Zabbix console.

Zabbix has native agents for many platforms, including Windows. In addition, if the Zabbix agent does not directly support what you are looking for, you can achieve your results with an external script. External scripts can be triggered by Zabbix itself, by the agent through a user parameter, or by a crontab entry that then sends the results to the Zabbix server using zabbix_sender.
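
As a rough illustration of the crontab route, here is a minimal sketch that counts the files in a directory and pushes the number to Zabbix with zabbix_sender. The server name, host name, and item key in the usage note are made-up placeholders, and it assumes a matching trapper item has already been defined on the Zabbix side.

```python
import os
import subprocess


def count_files(directory):
    """Return the number of regular files directly inside `directory`."""
    return sum(1 for entry in os.scandir(directory) if entry.is_file())


def build_sender_command(server, host, key, value):
    """Build the zabbix_sender argument list for a single value."""
    return [
        "zabbix_sender",
        "-z", server,      # address of the Zabbix server
        "-s", host,        # host name as it is configured in Zabbix
        "-k", key,         # item key (assumed to be a trapper item)
        "-o", str(value),  # the value to submit
    ]


def send_file_count(directory, server, host, key):
    """Count files and push the result; returns zabbix_sender's exit code."""
    command = build_sender_command(server, host, key, count_files(directory))
    return subprocess.run(command, check=False).returncode
```

A cron job could then call something like `send_file_count("/data/incoming", "zabbix.example.com", "app01", "incoming.filecount")` every few minutes; all four arguments here are hypothetical.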

Zabbix can also handle log file monitoring and there is a good utility for improved integration with syslog.

Also, any data that gets pushed into Zabbix can be used in a trigger. Triggers by themselves don't do anything; to make something happen you must tie a trigger to an action. Actions can be almost anything you want, from sending an email to running a script that reboots a host.

Zabbix also has a very vibrant community and recently had its 10th birthday. I've been using Zabbix for about 6-7 years and have been nothing but pleased with it over that time.

Red Tux

Nagios + plugins (you can write them yourself in various languages)
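
For a sense of what such a plugin involves, here is a minimal sketch in Python that covers the "how many files are in a directory" case from the question. It relies only on the standard plugin convention: print one status line (optionally followed by `|` and performance data) and exit with 0, 1, 2 or 3 for OK, WARNING, CRITICAL or UNKNOWN. The script name, the `FILES` label and the threshold semantics are illustrative choices, not anything prescribed by Nagios.

```python
import os
import sys

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(count, warn, crit):
    """Map a file count onto a state and a one-line status message."""
    if count >= crit:
        state = CRITICAL
    elif count >= warn:
        state = WARNING
    else:
        state = OK
    # Text after the "|" is performance data: label=value;warn;crit
    message = f"FILES {LABELS[state]} - {count} files | files={count};{warn};{crit}"
    return state, message


def check_directory(path, warn, crit):
    """Count regular files in `path`; report UNKNOWN if it cannot be read."""
    try:
        count = sum(1 for entry in os.scandir(path) if entry.is_file())
    except OSError as exc:
        return UNKNOWN, f"FILES UNKNOWN - cannot read {path}: {exc}"
    return evaluate(count, warn, crit)


def main(argv):
    """Entry point: argv is [program, directory, warn, crit]."""
    if len(argv) != 4:
        print("FILES UNKNOWN - usage: check_file_count.py DIR WARN CRIT")
        return UNKNOWN
    state, message = check_directory(argv[1], int(argv[2]), int(argv[3]))
    print(message)
    return state
```

To use it as an actual plugin, end the script with `sys.exit(main(sys.argv))` so that the state becomes the process exit code, then reference the script from a Nagios command definition.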

silviud

We use PRTG (http://www.paessler.com/prtg/), which is not often mentioned but which we've found to be great. Nagios is fantastic for flexibility (and the price is great), but if you want to easily add new servers, services and monitors, you'll need something a little more user friendly.

It doesn't have a big community though (if any), so it may fail in that respect for you.

Andre Lackmann
  • How difficult is adding a line or three to a config file and restarting the Nagios process? That's hardly rocket science. – EEAA Apr 30 '11 at 04:03
  • sure.. but if you have several people responsible for monitoring and you want to easily add new monitors etc without necessarily needing those users to have command line access to the box, then you're stuck. Additionally, we like to store all our configs in version control, so again, it would require those changes to be committed into the repo. It just expands the knowledge requirement on the user who is making the changes. Sometimes you want ops to just be ops, not necessarily full on sysadmins. I get it tho, for most, Nagios is plenty (or Munin, or ) – Andre Lackmann May 01 '11 at 02:42
  • Actually, we have a system set up where we use svn to completely manage our Nagios instance. The nutshell version is that when we want to make changes, we `svn update` to the latest rev, make our changes, then commit. There is a pre-commit hook that runs the new config through the Nagios syntax check. If that fails, the commit is rejected and the errors are echoed back to the user. If it succeeds, the commit is accepted and a post-commit hook does an `svn update` into the Nagios config dir, followed by the requisite restart. – EEAA May 01 '11 at 03:53
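
The syntax-check step of a hook like the one described in the last comment could look roughly like the sketch below. This is not EEAA's actual hook: it assumes the binary is called `nagios` and is on the PATH, and that an earlier step (not shown) has already exported the pending config tree so that `config_path` points at its main config file. It uses the fact that `nagios -v <file>` verifies a configuration and that svn rejects a commit when the pre-commit hook exits non-zero, relaying the hook's stderr to the committer.

```python
import subprocess
import sys


def verify_nagios_config(config_path, run=subprocess.run):
    """Run `nagios -v` on config_path; return (ok, combined output).

    `run` is injectable so the function can be exercised without Nagios.
    """
    result = run(
        ["nagios", "-v", config_path],  # binary name is an assumption
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0, result.stdout + result.stderr


def hook_exit_code(config_path, run=subprocess.run):
    """Return 0 to accept the commit, 1 to reject it."""
    ok, output = verify_nagios_config(config_path, run)
    if not ok:
        # svn passes the hook's stderr back to the user on rejection.
        sys.stderr.write(output)
        return 1
    return 0
```

A real pre-commit script would call `sys.exit(hook_exit_code(path_to_exported_config))` after materialising the transaction's files, for example with `svnlook`.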