
We have 20 2TB SATA drives to be used in a ZFS pool. I am after some advice on the best way to achieve good I/O performance, whilst being able to offer some redundancy (3 disk failures before data loss is what we are looking to achieve).

I am a bit confused as to whether I need to use mirroring or raidz.

The 20 drives will be plugged into two 16-port RAID controllers (10 on each). Should I create a hardware RAID volume for each set of 10 disks and then, in ZFS, mirror the two resulting volumes, creating one super volume?

Any advice would be great.

sysadmin1138

5 Answers


With 20 disks you have a lot of options. I'm assuming you already have drives for the OS, so the 20 disks would be dedicated data drives. In my Sun Fire x4540 (48 drives), I've allocated 20 drives in a mirrored setup and 24 in a striped raidz1 config (6 disks per raidz and 4 striped vdevs). Two disks are for the OS and the remainder are spares.

Which controller are you using? You may want to refer to: ZFS SAS/SATA controller recommendations

Don't use hardware RAID if you can avoid it. ZFS thrives when drives are presented to the OS as raw disks.

Raidz1 performance scales with the number of raidz1 vdevs striped together. With 20 disks, you could use 4 raidz1 groups of 5 disks each, or 5 groups of 4 disks; the latter will perform better. Fault tolerance in either setup is one failed disk per group (so, if the failures land on the right disks, 4 or 5 disks could fail without data loss).

A raidz1 or raidz2 group delivers roughly the random-I/O performance of a single disk. With the above setups, your theoretical maximum would therefore be roughly that of 4 or 5 disks (one per raidz1 vdev).

Going with the mirrored setup would maximize speed, and you could sustain one failed disk per mirrored pair (e.g. up to 10 disks could fail if they're the right ones). But you will run into your controllers' bandwidth limits at that point, and you may not need that kind of speed, so I'd suggest striped raidz1 instead.

Either way, you should consider a hot-spare arrangement no matter which solution you go with. Perhaps 18 disks in a mirrored arrangement with 2 hot-spares or a 3-stripe 6-disk raidz1 with 2 hot-spares...
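To see the space trade-offs behind these suggestions, here's a quick back-of-the-envelope calculation (a sketch; raw capacities only, ignoring ZFS metadata overhead and TB/TiB rounding):

```python
# Rough usable capacity for 20 x 2TB disks in various ZFS layouts.
DISK_TB = 2

def usable_tb(vdevs, disks_per_vdev, parity):
    """Usable space of a pool striped across identical redundant vdevs:
    each vdev contributes (disks - parity) data disks."""
    return vdevs * (disks_per_vdev - parity) * DISK_TB

layouts = {
    "10 x 2-way mirror":  usable_tb(10, 2, 1),   # 20 TB usable
    "4 x 5-disk raidz1":  usable_tb(4, 5, 1),    # 32 TB usable
    "2 x 10-disk raidz2": usable_tb(2, 10, 2),   # 32 TB usable
    "1 x 20-disk raidz3": usable_tb(1, 20, 3),   # 34 TB usable
}

for name, tb in layouts.items():
    print(f"{name}: {tb} TB usable")
```

Mirrors cost the most space but give the best random I/O; a single raidz3 vdev gives the most space and the guaranteed 3-disk tolerance the question asks for, at the cost of performance.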

When I built my first ZFS setup, I used this note from Sun to help understand RAID level performance...

http://blogs.oracle.com/relling/entry/zfs_raid_recommendations_space_performance

Examples with 20 disks:

20-disk mirrored pairs.

  pool: vol1
 state: ONLINE
 scrub: scrub completed after 3h16m with 0 errors on Fri Nov 26 09:45:54 2010
config:

        NAME        STATE     READ WRITE CKSUM
        vol1        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t1d0  ONLINE       0     0     0
            c9t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t2d0  ONLINE       0     0     0
            c9t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t3d0  ONLINE       0     0     0
            c9t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0

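For reference, a pool laid out like the one above could be created with a single command (device names taken from the status output; yours will differ):

```shell
# Stripe of 10 two-way mirrors; ZFS stripes writes across all vdevs.
zpool create vol1 \
  mirror c4t1d0 c5t1d0  mirror c6t1d0 c7t1d0 \
  mirror c8t1d0 c9t1d0  mirror c4t2d0 c5t2d0 \
  mirror c6t2d0 c7t2d0  mirror c8t2d0 c9t2d0 \
  mirror c4t3d0 c5t3d0  mirror c6t3d0 c7t3d0 \
  mirror c8t3d0 c9t3d0  mirror c4t4d0 c5t4d0
```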
20-disk striped raidz1 consisting of 4 stripes of 5-disk raidz1 vdevs.

  pool: vol1
 state: ONLINE
 scrub: scrub completed after 14h38m with 0 errors on Fri Nov 26 21:07:53 2010
config:

        NAME        STATE     READ WRITE CKSUM
        vol1        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c8t4d0  ONLINE       0     0     0
            c9t4d0  ONLINE       0     0     0
            c4t5d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t5d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0
            c8t5d0  ONLINE       0     0     0
            c9t5d0  ONLINE       0     0     0
            c4t6d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0
            c7t6d0  ONLINE       0     0     0
            c8t6d0  ONLINE       0     0     0
            c9t6d0  ONLINE       0     0     0
            c4t7d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
            c7t7d0  ONLINE       0     0     0
            c8t7d0  ONLINE       0     0     0
            c9t7d0  ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
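The striped-raidz1 layout above maps to a create command like this (again, device names are from the status output; substitute your own):

```shell
# Four 5-disk raidz1 vdevs striped into one pool.
zpool create vol1 \
  raidz1 c6t4d0 c7t4d0 c8t4d0 c9t4d0 c4t5d0 \
  raidz1 c6t5d0 c7t5d0 c8t5d0 c9t5d0 c4t6d0 \
  raidz1 c6t6d0 c7t6d0 c8t6d0 c9t6d0 c4t7d0 \
  raidz1 c6t7d0 c7t7d0 c8t7d0 c9t7d0 c6t0d0
```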

Edit: Or if you want two pools of storage, you could break your 20 disks into two groups:

  • 10 disks in mirrored pairs (5 per controller), plus
  • 3 stripes of 3-disk raidz1 groups, plus
  • 1 global hot spare.

That gives you both types of storage, good redundancy, a spare drive, and you can test the performance of each pool back-to-back.

ewwhite
  • Thanks for the in-depth comment! Just so I understand, this setup would give me two separate volumes, each with a different config arrangement? – Shannon Pace Jan 19 '11 at 03:37
  • Oh, these were two examples of what you could do with 20 disks. – ewwhite Jan 19 '11 at 03:39
  • @Shannon, the example here has 44 drives in two completely separate zpools. You could do a setup similar to one of them however, it really depends on what kind of hardware failure you want to be able to sustain and what kind of performance you need out of the system. Stripes over mirrors is quite fast, but you sacrifice a lot of space, and you can only sustain simultaneous failures on certain disks. – Chris S Jan 19 '11 at 03:40
  • I think I am leaning towards a single raidz-3 pool... – Shannon Pace Jan 19 '11 at 03:45
  • 1
    In the best practices guide, there's a note that: `The recommended number of disks per group is between 3 and 9. If you have more disks, use multiple groups.` You will probably want more than one raidz3 group in that case. – ewwhite Jan 19 '11 at 04:01

Take a look at the Best practices guide.

We have 20 2TB SATA drives to be used in a ZFS pool. I am after some advice on the best way to achieve good I/O performance, whilst being able to offer some redundancy (3 disk failures before data loss is what we are looking to achieve).

ZFS with RAIDZ-3 (triple-parity RAID) will give you the redundancy that you're looking for. The I/O performance -- as with any RAID-5-ish configuration -- will be better for reads than for writes, and whether it is "good enough" depends a lot on your hardware. Other folks may be able to provide better information in this area (the ZFS filesystems I work with were not designed with performance as a primary consideration).

The 20 drives will be plugged into two 16-port RAID controllers (10 on each controller). Maybe I create hardware RAID volumes for each lot of 10 disks and then, in ZFS, I mirror the two available RAID volumes, creating one super volume?

One of the big advantages to ZFS is that it combines RAID, volume management, and filesystem management in one place -- giving you a single point of management for your environment. You get a lot more flexibility if you configure your disks in a JBOD configuration.
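As a sketch of what the raidz-3 approach from the comments might look like, with disks in JBOD mode (all device names here are placeholders, not from the question's hardware):

```shell
# One 17-disk raidz3 vdev (survives any 3 simultaneous failures)
# plus 3 hot spares, using all 20 disks.
zpool create tank \
  raidz3 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0 c0t8d0 \
         c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
  spare  c1t8d0 c2t0d0 c2t1d0
```

Note the best-practices caveat elsewhere on this page: 17 disks is wider than the recommended 3-9 disks per group, so two or three smaller raidz groups may behave better.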

larsks
  • So, could I just set up a raidz-3 pool with three hot spare drives in ZFS? Seems too simple :) – Shannon Pace Jan 19 '11 at 03:36
  • [This thread](http://mail.opensolaris.org/pipermail/zfs-discuss/2009-April/028054.html) has some interesting discussions regarding configuring large numbers of disks. – larsks Jan 19 '11 at 03:58

Everyone telling you to use RAIDZ is wrong. RAIDZ is terrible for performance! Mirroring is best for performance! Only use RAIDZ when you need space more than performance.

You have 20 disks. Create 9 two-way mirror vdevs plus two hot spares. That survives your three disk failures (provided resilvers onto the spares complete between failures and no two failures hit the same mirror) and gives you 18TB of storage.

Don't use hardware RAID at all. Configure your RAID controller for JBOD (sometimes called "passthrough") if possible. If not, create 20 single-disk RAID 0 volumes (a terrible thing to do, but the least terrible when JBOD isn't available). Any other configuration defeats ZFS.

Spread the disks across controllers as much as you can (one disk per controller would be best, but I realize that's not practical in your situation). Buy more controllers if possible.
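A minimal sketch of that layout (d0 through d19 are placeholder device names, not the asker's actual disks):

```shell
# 9 two-way mirrors striped together, plus 2 hot spares: 18TB usable.
zpool create tank \
  mirror d0 d1    mirror d2 d3    mirror d4 d5 \
  mirror d6 d7    mirror d8 d9    mirror d10 d11 \
  mirror d12 d13  mirror d14 d15  mirror d16 d17 \
  spare  d18 d19
```

Keeping the two halves of each mirror on different controllers also lets the pool survive the loss of an entire controller.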

bahamat

Shannon, in a similar configuration I created 15-disk RAIDZ2 pools. Performance was fine, but the time to rebuild the RAID after a disk failure was significant: something like 30 hours, and that was with 500GB disks. I think I was limited by storage controller bandwidth (U160 SCSI) more than anything else, but I predict you will find it takes longer than you'd like.

Upsizing to 2TB disks would have meant something like 120-hour rebuilds, which seemed like too much. I ended up rebuilding with 9-disk RAIDZ2's.

It's easy enough to test this in your environment: build your array, fill it, then pull a disk and wait for the rebuild. Remember that ZFS only rebuilds the space actually in use ("resilvering" in ZFS parlance), so you have to fill the array for a meaningful test.
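A rebuild test along those lines might look like this (pool and device names are assumptions, and physically pulling the disk is the more realistic test):

```shell
# Simulate a failure and watch the resilver.
zpool offline tank c0t4d0           # take one disk out of service
zpool replace tank c0t4d0 c0t9d0    # resilver onto a replacement disk
zpool status tank                   # shows resilver progress and ETA
```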

If I were you I'd do two RAIDZ2's, one of 9 disks and one of 10, plus one hot spare. You'll have to use the -f flag to make ZFS let you put non-identically-sized raidz2 vdevs in the same pool.
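That suggestion, sketched out (d0 through d19 are placeholder device names):

```shell
# One pool: a 9-disk raidz2 vdev, a 10-disk raidz2 vdev, and a hot spare.
# -f is required because the two raidz2 vdevs are different widths.
zpool create -f tank \
  raidz2 d0 d1 d2 d3 d4 d5 d6 d7 d8 \
  raidz2 d9 d10 d11 d12 d13 d14 d15 d16 d17 d18 \
  spare  d19
```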

Note that with my suggested RAIDZ2 config, if 3 disks in the same raidz2 vdev fail, you are hosed. On the other hand, if 4 disks fail, 2 in each vdev, you are OK.

Dan Pritts
  • Oh yeah, make sure you set up a cron job to scrub the pool(s) periodically. zpool scrub, I think. This will help catch errors while they are still fixable. – Dan Pritts Jan 19 '11 at 05:22
  • When I said "performance was fine" above, I should clarify: it was fine in my use case, which was large data transfer. It would presumably suck at random I/O. – Dan Pritts Feb 09 '11 at 21:26
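The periodic scrub suggested in the comment above could be scheduled with a crontab entry like this (pool name "tank" is an assumption; adjust the zpool path for your OS):

```shell
# In root's crontab (crontab -e): scrub every Sunday at 02:00.
0 2 * * 0  /usr/sbin/zpool scrub tank
```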

Do not use hardware RAID in conjunction with ZFS. The filesystem will not see problems that the hardware controller handles or hides, and so it cannot react accordingly.

You could use RAIDZ3 to get 3-disk failure protection. Mirrors could also achieve this, but with limitations on which 3 disks may fail. It would make more sense to pick an acceptable probability of failure than to arbitrarily say you must survive 3 failed disks.

Performance in a situation like this will primarily be limited by network connectivity (I assume most of the array will be serving files across a network somehow) and by the computer's CPU (all that parity isn't going to compute itself, and ZFS does not use crypto accelerators yet).

Chris S