
As a longtime mdadm user, I was just experiencing a disk error and remembered that I had configured automatic mail reporting for mdadm on disk errors.

For that, I just had to insert my mail address into /etc/mdadm/mdadm.conf:

MAILADDR someone@example.org

But I really missed that on FreeBSD. My ZFS RAID6 has now been running for over half a year, and I wondered what would happen if a disk failed.

I never configured any external mail address. Is there an easy way to set this up and test it?

Like on mdadm:

sudo mdadm --monitor --scan --test

And how would I do it on Linux with ZoL (ZFS on Linux)?

EDIT: Sorry, I meant AUTOMATED reporting, not scheduled.

I know I could have built a script that parses zpool status -x every minute, but I don't think that is a very elegant way of solving the reporting problem. It would be better to be notified instantly on a disk failure (like with mdadm).

EDIT[2]: Thank you for your advice, but now I'm stuck with some scripting issues. Could someone help me with my for loop problem in /bin/sh here -> PASTEBIN

EDIT[3]: Solved my for loop problem. :) (update in PASTEBIN)

Any more advice for my script?

Daywalker

2 Answers


Run a regular script (cron) that checks zpool status -x output. Longer-term, the ZFS on Linux project is working towards this in the form of an event daemon. The Solaris-derived systems had access to the Fault Management Architecture.

As far as automated reports go, even commercial solutions like NexentaStor use scheduled checks. There's nothing wrong with that.


Something like this:

[root@mdmarra ~]# zpool status -x
all pools are healthy

Versus something awful like:

[root@mdmarra ~]# zpool status -x
  pool: vol1
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-JQ
 scan: scrub repaired 0 in 1h15m with 0 errors on Sun Jul 28 21:15:10 2013
config:

        NAME          STATE     READ WRITE CKSUM
        vol1          UNAVAIL      0     0     0  insufficient replicas
          mirror-0    DEGRADED     0     0     0
            c1t0d0    UNAVAIL      0     0     0  cannot open
            c2t0d0    ONLINE       0     0     0
          mirror-1    DEGRADED     0     0     0
            c1t1d0    UNAVAIL      0     0     0  cannot open
            c2t1d0    ONLINE       0     0     0
          mirror-2    DEGRADED     0     0     0
            spare-0   UNAVAIL      0     0     0  insufficient replicas
              c1t2d0  UNAVAIL      0     0     0  cannot open
              c2t8d0  UNAVAIL      0     0     0  cannot open
            c2t2d0    ONLINE       0     0     0
          mirror-3    DEGRADED     0     0     0
            c1t3d0    UNAVAIL      0     0     0  cannot open
            c2t3d0    ONLINE       0     0     0
          mirror-4    DEGRADED     0     0     0
            c1t4d0    UNAVAIL      0     0     0  cannot open
            c2t4d0    ONLINE       0     0     0
          mirror-5    UNAVAIL      0     0     0  insufficient replicas
            c1t5d0    UNAVAIL      0     0     0  cannot open
            c2t5d0    FAULTED      0     0     0  too many errors
        cache
          c3t5d0      ONLINE       0     0     0
        spares
          c2t8d0      UNAVAIL   cannot open

errors: No known data errors
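
A minimal /bin/sh sketch of such a cron check, assuming a working local mail(1) command. The function names is_healthy and check_pools, the script name, and the recipient address are placeholders for illustration, not part of any ZFS tooling. It treats anything other than the exact line "all pools are healthy" as a problem and mails the full output.

```shell
#!/bin/sh
# Sketch: mail the output of `zpool status -x` when it reports a problem.
# Usage (hypothetical script name): zfs-check.sh someone@example.org

# Succeeds only when the status text is the single "healthy" line.
is_healthy() {
    [ "$1" = "all pools are healthy" ]
}

# Runs the check and mails the full status to the address in $1 on trouble.
# Errors from zpool itself (stderr) are captured too, so they also trigger mail.
check_pools() {
    status=$(zpool status -x 2>&1)
    if ! is_healthy "$status"; then
        printf '%s\n' "$status" | mail -s "ZFS problem on $(uname -n)" "$1"
    fi
}

# Only run when a recipient is given on the command line.
if [ "$#" -ge 1 ]; then
    check_pools "$1"
fi
```

Called from cron every few minutes with the recipient as its only argument, this stays silent while the pools are healthy. Note that it sends a mail on every run for as long as the problem persists.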
ewwhite

Try zfswatcher, it works really well for me.

ptman