I have done some research on Nagios, OpenNMS, and Zenoss, but I am not confident that I have found what I am looking for.
The main driving force for me right now is being able to monitor backups. This includes MySQL, MSSQL, and eventually some file system backups.
We have a tool that wraps the backup process for these different systems and collects statistics such as:
- number of databases backed up
- size of db backup file
- size of db backup file compressed
- time to make backup
- time to zip file
I want to be able to:
- A) get notifications if the jobs do not run according to schedule
- B) set thresholds on the statistics which would trigger notifications
- C) trend and graph the statistics
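To make requirements A and B concrete, here is a rough Python sketch of the checks I have in mind. Everything in it is made up for illustration: the `BackupStats` fields mirror the list above, while the `THRESHOLDS` values, the 26-hour `MAX_AGE` window, and the `check` function are assumptions, not part of any of the tools mentioned.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BackupStats:
    """One run of a backup job, as reported by the wrapper tool (hypothetical shape)."""
    job: str
    finished_at: datetime
    databases_backed_up: int
    backup_bytes: int
    compressed_bytes: int
    backup_seconds: float
    zip_seconds: float


# Assumed limits, purely illustrative.
THRESHOLDS = {"backup_seconds": 3600.0, "zip_seconds": 1800.0}
# A nightly job is considered missed if nothing finished in the last 26 hours.
MAX_AGE = timedelta(hours=26)


def check(stats: Optional[BackupStats], now: datetime) -> List[str]:
    """Return alert messages for the most recent run (empty list means OK)."""
    # Requirement A: no run at all, or the last run is too old.
    if stats is None or now - stats.finished_at > MAX_AGE:
        return ["backup has not run on schedule"]
    # Requirement B: compare each tracked statistic against its limit.
    alerts = []
    for field, limit in THRESHOLDS.items():
        value = getattr(stats, field)
        if value > limit:
            alerts.append(f"{field}={value} exceeds {limit}")
    return alerts
```

Requirement C (trending and graphing) would then just be a matter of storing each `BackupStats` record over time, which is the part I am hoping the monitoring tool handles.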
I am planning on sending this information to the monitoring application through an HTTP POST. Alternatively, the monitoring application could pull it from a log file.
However, we will have other processes with other "arbitrary" (from the monitoring system's perspective) statistics that we will want to monitor and trend, so flexibility is very important.
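As a sketch of the HTTP POST option, the wrapper could serialize its statistics as JSON and send them using only the Python standard library. The URL, the JSON layout, and the function names below are hypothetical; a real monitoring application will expect its own endpoint and format.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with whatever the monitoring tool exposes.
MONITOR_URL = "http://monitor.example.com/backup-stats"


def build_report(job, stats):
    """Build (but do not send) a POST request carrying the job statistics as JSON."""
    body = json.dumps({"job": job, "stats": stats}).encode("utf-8")
    return urllib.request.Request(
        MONITOR_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_report(job, stats, timeout=10):
    """Send the report and return the HTTP status code of the response."""
    with urllib.request.urlopen(build_report(job, stats), timeout=timeout) as resp:
        return resp.status
```

Because the payload is just a dictionary of names and numbers, other processes could report their own "arbitrary" statistics through the same call, for example `send_report("mysql-nightly", {"databases_backed_up": 12, "backup_seconds": 340.5})`.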
The tool or tools should also be able to do general monitoring and trending of network interfaces, server load, etc. Once we get the backup monitoring in place, we will want to include those items as well.
Thanks.
Follow-up:
I have decided to try the following in the given order:
- Zabbix: seemed more of a "one stop shop" than the others and was easy to install on Ubuntu Lucid RC
- Opsview
- Nagios w/ NagVis, PNP4Nagios, nagiosgraph
- Cacti w/ npc plugin
- Munin: a little scared of the simplicity, but this might prove to be a blessing in the long run
I will post back once I have made a decision, though it may be a while until that happens.