I would like to implement open-source monitoring in my server infrastructure. I want detailed monitoring for both the server and its RAID array. The monitoring client should notify me if any of the disks in that RAID array fail.

Wilshire
johngillow
  • possible duplicate of [What tool do you use to monitor your servers?](http://serverfault.com/questions/44/what-tool-do-you-use-to-monitor-your-servers) – Jason Berg Jun 25 '11 at 22:08
  • Product and service recommendations are off topic per the updated [FAQ](http://serverfault.com/faq). – sysadmin1138 Aug 09 '12 at 17:16

2 Answers

I'd suggest using Nagios and, if need be, writing your own plugin to monitor the RAID array.
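
As a rough illustration of what such a custom plugin could look like, here is a minimal sketch, assuming Linux software RAID (md) whose state is readable from /proc/mdstat. The function names (`parse_mdstat`, `check`, `main`) and the message wording are hypothetical. The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Hardware RAID controllers would need their vendor tool instead.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style check for Linux software RAID (md) arrays."""
import re

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# An array header line looks like "md0 : active raid1 sdb1[1] sda1[0]".
# Assumption: arrays are named md<number>; other naming schemes are ignored.
ARRAY_RE = re.compile(r"^(md\d+)\s*:")
# The member map ends the following line, e.g. "[UU]" (healthy) or "[U_]".
STATUS_RE = re.compile(r"\[([U_]+)\]\s*$")


def parse_mdstat(text):
    """Return {array_name: member_map}, e.g. {"md0": "UU"}."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if current and status:
            arrays[current] = status.group(1)
            current = None
    return arrays


def check(text):
    """Return (exit_code, message) for the given /proc/mdstat contents."""
    arrays = parse_mdstat(text)
    if not arrays:
        return UNKNOWN, "RAID UNKNOWN - no md arrays found"
    # An underscore in the member map marks a failed or missing disk.
    degraded = sorted(name for name, members in arrays.items() if "_" in members)
    if degraded:
        return CRITICAL, "RAID CRITICAL - degraded: " + ", ".join(degraded)
    return OK, "RAID OK - " + ", ".join(sorted(arrays))


def main(path="/proc/mdstat"):
    """Print a one-line status and return the plugin exit code."""
    try:
        with open(path) as handle:
            code, message = check(handle.read())
    except OSError as error:
        code, message = UNKNOWN, "RAID UNKNOWN - %s" % error
    print(message)
    return code
```

To run it as an actual plugin, the script would end with `import sys; sys.exit(main())` so that Nagios sees the exit code. Arrays without a member map (for example RAID 0) are skipped by this sketch.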

Wilshire

Just about any monitoring tool should be able to handle that without issue. I use Zabbix for my own monitoring, and setting something like this up with Zabbix is pretty straightforward.

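In Zabbix I would set up a "User Parameter" in the agent configuration to pull the array status. Probably something like this (note the `-E`, which sed needs in order to treat `+` and `( )` as regex operators):

`UserParameter=raid[*],grep -A 1 $1 /proc/mdstat | tail -1 | sed -E 's/.+(\[.+\])\s*$/\1/'`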

Then I would create an item to monitor it:

  • Item name: `$1 Raid status` (the `$1` is substituted with the value passed to the key, here `md0`)
  • Key: `raid[md0]`
  • Interval: 60 seconds
  • Type of information: Character

Then I would write a trigger that runs a regex against that value and fires when the status contains an underscore, which marks a failed or missing disk: `{host:raid[md0].regexp("_")}=1`
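
To make it clearer what value the item receives and when the trigger should fire, here is a small sketch that mimics the `grep | tail | sed` pipeline in Python, assuming the sed expression is meant as an extended regular expression. The sample mdstat text and the function names (`raid_status`, `trigger_fires`) are made up for illustration. Zabbix itself does not run any of this code.

```python
import re

# Illustrative /proc/mdstat contents (invented values, one degraded array).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1953511936 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def raid_status(mdstat_text, array):
    """Mimic the grep -A 1 | tail -1 | sed pipeline of the user parameter."""
    lines = mdstat_text.splitlines()
    # grep -A 1: every line mentioning the array, plus one line after each.
    matches = [i for i, line in enumerate(lines) if array in line]
    if not matches:
        return ""  # grep prints nothing when the array is not listed
    # tail -1: the last printed line is the one following the last match.
    last = min(matches[-1] + 1, len(lines) - 1)
    # sed: reduce the line to its final bracketed group, e.g. "[U_]".
    return re.sub(r".+(\[.+\])\s*$", r"\1", lines[last])


def trigger_fires(status):
    """True when the member map shows a failed or missing disk ("_")."""
    return "_" in status


status = raid_status(SAMPLE_MDSTAT, "md0")
print(status, trigger_fires(status))
```

A healthy two-disk mirror would report `[UU]` instead, and the underscore test would then be false.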

And then create an action to alert.

If you put the item and trigger into a template, you can apply them to all hosts at once. The User Parameter lives in each agent's configuration file rather than in the template, so you will need to make sure it gets deployed to all hosts as well.

In addition, you would gain something Nagios does not do: performance monitoring alongside availability monitoring, since you can also track disk and other system metrics over time.

Holger Just
Red Tux