8

I understand that ZFS prefers all disks to be the same size. However, I have two disks of different sizes (1 TB and 1.5 TB), and I'd like some redundancy, but not mirroring. So I chopped the two disks into 5 partitions of roughly 500 GB each and created a "raidz" pool ... ZFS happily obliged. Does this setup actually increase reliability at all? The thought is that if a disk doesn't go totally bust, and only a portion of it fails, I can still access the data?
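(For concreteness, here is a minimal sketch of the layout described above. Everything in it is an assumption for illustration: Linux device names with /dev/sda as the 1 TB disk and /dev/sdb as the 1.5 TB disk, and an arbitrary pool name "tank".)

    # Split the 1 TB disk into two ~500 GB partitions
    parted -s /dev/sda mklabel gpt
    parted -s /dev/sda mkpart zfs1 1MiB 50%
    parted -s /dev/sda mkpart zfs2 50% 100%

    # Split the 1.5 TB disk into three ~500 GB partitions
    parted -s /dev/sdb mklabel gpt
    parted -s /dev/sdb mkpart zfs3 1MiB 33%
    parted -s /dev/sdb mkpart zfs4 33% 67%
    parted -s /dev/sdb mkpart zfs5 67% 100%

    # One raidz1 vdev across all five partitions
    zpool create tank raidz1 /dev/sda1 /dev/sda2 /dev/sdb1 /dev/sdb2 /dev/sdb3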

200_success
python152
  • This adds no reliability and degrades performance. – Michael Hampton Oct 04 '16 at 22:14
  • Most of the answers/comments I received didn't go deep enough to explain why. A simple "against best practice" or "equal disks just make common sense" doesn't actually mean much to me. Obviously, given these peculiar disks, I am *not* going after the best of the best in terms of reliability and performance; I am asking what can be done with what I have. Also, things change: in the book "FreeBSD Mastery: ZFS" ... the authors clearly say that in the old Solaris days whole-disk provisioning was regarded as crucial, but now it is not; partition-based provisioning is as good, or even encouraged. – python152 Oct 06 '16 at 16:22
  • Especially for parity-based redundancy, there is no absolute requirement that the sizes be equal (I vaguely remember the ZFS book also saying that same-capacity drives from different vendors end up with slightly different sizes). It is up to the ZFS data placement algorithm to handle that, balancing the spreading of the parity bits against performance. – python152 Oct 06 '16 at 16:30
  • @python152 It always depends on whether you can live with the downsides (ZFS is really quite flexible; best practices are recommendations, not rules). If you can accept downtime and are confident in partition management, replacing partially faulted disks is not as critical, for example. I would still be wary of the RAID5 write hole, which exists regardless of whether your backing hardware is disks, partitions or even files. – user121391 Oct 10 '16 at 07:24
  • @user121391 ZFS raidz doesn't suffer from the RAID5 write hole. See https://blogs.oracle.com/bonwick/entry/raid_z and https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ – nickcrabtree Oct 11 '16 at 20:17
  • @nickcrabtree Thank you for the correction; you are of course right. I must have been caught up between thinking about the backing SAN (where this problem may arise, but we don't know) and the overlying Z1 pool, where it does not happen, so I mixed things up and came to the wrong conclusion. – user121391 Oct 12 '16 at 06:39

3 Answers

5

What you're describing is a bit tacky.

ZFS wants full disks of equal size and capability. This is critical for a variety of reasons, but also just makes common sense.

All you'd be doing in the situation you've outlined is adding complexity to the environment and increasing your risk.

ewwhite
4

Let's look at the general case:

  • If you have 1 TB of data on disk one, you can replicate it to disk two, and you can afford to lose either disk.

  • If you have 1.5 TB of data on disk two, you can replicate only the first 1 TB of data to disk one. In this scenario, if disk two fails you WILL lose data.

ZFS is very capable, but as a general rule, per the two points above, mixed-disk setups are silly and not very useful. If you care about reliability and redundancy, pretend the second disk is also only 1 TB.
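For instance (a sketch only, with hypothetical device names: /dev/sda is the 1 TB disk, /dev/sdb the 1.5 TB disk), you could carve a 1 TB partition out of the larger disk and mirror the smaller disk against it:

    # Use only the first 1 TB of the larger disk
    parted -s /dev/sdb mklabel gpt
    parted -s /dev/sdb mkpart mirror1 1MiB 1TB

    # Mirror the whole 1 TB disk against that partition;
    # the mirror's capacity is the smaller of the two devices
    zpool create tank mirror /dev/sda /dev/sdb1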

  • I'm putting this in a comment because I don't feel it really fits in an "answer" as such, but if you really wanted to use all the space on your disks, I would set up a 1 TB mirror and then use the last 500 GB as a regular, un-replicated volume for scratch space and temp files I don't really care about (such as my browser downloads folder: nice to have old stuff cached there, but nothing I'd really miss if I lost it). – Benny Mackney Oct 04 '16 at 23:02
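A minimal sketch of the layout the comment above describes, continuing the mirror sketch with the same hypothetical device names (the leftover ~500 GB of /dev/sdb becomes a second, non-redundant pool):

    # Put the remaining ~500 GB into its own partition
    parted -s /dev/sdb mkpart scratch 1TB 100%

    # Single-device pool: no redundancy, acceptable for temp/scratch data
    zpool create scratch /dev/sdb2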
3

The thought is that if a disk doesn't go totally bust, and only a portion of it fails, I can still access the data?

In theory, this thought is correct. As long as errors occur on only a single device of your RAIDZ1 vdev, ZFS can and will detect and correct them, assuming the other devices are error-free.
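For example, a scrub forces ZFS to read and verify every block; anything that fails its checksum on one device is rebuilt from the remaining devices' data and parity. A quick sketch, assuming a pool named "tank":

    # Read and verify every block in the pool, repairing what it can
    zpool scrub tank

    # The CKSUM column shows per-device counts of detected (and repaired) errors
    zpool status -v tank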

In reality, several things may differ:

  • Errors may span partitions, so two or more "devices" of the vdev are affected at once: two of your five partitions share one physical disk, and three share the other, so a whole-disk failure takes out at least two devices, which is already more than RAIDZ1 tolerates. Depending on the location and number of errors, this can result in unrecoverable errors or even loss of the whole pool. You could use RAIDZ2 or Z3 to mitigate this somewhat, but the underlying problem remains.
  • While resilvering a partition, the same physical disk has to service reads (from the surviving partitions it carries, up to two of them) and writes (to the replacement partition) concurrently and randomly. Unless you use Solaris 11.3 with sequential resilvering, this will be very, very slow. Until the resilver finishes, you are vulnerable to errors on the other partitions; the longer it takes, the greater the chance of encountering an additional URE. It also places extra load on the drive, increasing the chance of complete drive failure.
  • Imagine your 3rd partition (the last one on the 1.5 TB disk) shows enough errors to degrade the pool and call for a replacement. If you cannot attach another disk, you cannot do the replacement without a shutdown/export, and even then it is more complicated than usual (the replacement command itself is sketched below).
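For illustration only (device names hypothetical): if a spare disk /dev/sdc could be attached, the replacement itself would be the usual one-liner, with ZFS resilvering onto the new device while the pool stays online:

    # Swap the failing partition for one on the newly attached disk
    zpool replace tank /dev/sdb3 /dev/sdc1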

Based on those points, I would advise against doing this if reliability is your main goal. Assuming the hardware is fixed, I would do one of the following:

  1. Use mirrors and lose 500GB, but gain a simple setup with easy expandability in the future
  2. Use two separate pools with copies=2 if you want some resiliency against smaller errors (a whole-disk failure would then kill only that disk's pool, i.e. 2/5 or 3/5 of your data, whereas in your setup it would kill the entire pool); see the sketch after this list
  3. Use other file systems than ZFS if you want to have your cake and eat it, too
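A sketch of option 2, again with hypothetical device names: two independent single-disk pools, each told to store two copies of every block so that small, localized errors remain recoverable. Note that copies=2 applies only to data written after it is set, and does not protect against whole-disk failure:

    # One pool per physical disk
    zpool create pool1 /dev/sda
    zpool create pool2 /dev/sdb

    # Keep two copies of every block on each pool
    zfs set copies=2 pool1
    zfs set copies=2 pool2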
user121391