Btrfs: RAID 1 on 3+ devices

10

2

I currently have a Btrfs partition with four devices: three 3 TB drives and a 4 TB drive. Data and metadata are RAID 10, so I have 6 TB of usable space, which is almost full. I'd anticipated that RAID 5 support in Btrfs would be mature by the time my storage filled up, but apparently it's not a priority.

My question is: is there a reason to prefer RAID 10 over RAID 1? I know real RAID 1 on my current hardware should give me 3 TB of usable space with 4 copies of each block, but Btrfs apparently does not behave this way. From the Btrfs FAQ:

btrfs combines all the devices into a storage pool first, and then duplicates the chunks as file data is created. RAID-1 is defined currently as "2 copies of all the data on different devices". This differs from MD-RAID and dmraid, in that those make exactly n copies for n devices. In a btrfs RAID-1 on three 1 TB devices we get 1.5 TB of usable data. Because each block is only copied to 2 devices, writing a given block only requires exactly 2 devices to be written to; reading can be made from only one.

And from Jens Erat on Stack Overflow:

Btrfs distributes the data (and its RAID 1 copies) block-wise, thus deals very well with hard disks of different size. You will receive the sum of all hard disks, divided by two – and do not need to think on how to put them together in similar sized pairs.

If more than one disk fails, you're always in danger of losing data: RAID 1 cannot deal with losing two disks at the same time. In your example given above, if the wrong two disks die, you always lose data.

Does this mean that rebalancing from RAID 10 to RAID 1 will give me an extra 500 GB of usable space (6.5 TB rather than 6 TB) due to the 4 TB drive? And is there any reason for me to stick with RAID 10?
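The arithmetic behind this question can be checked with a small sketch. It assumes only the FAQ's definition quoted above (two copies of every block, on different devices) and ignores metadata and filesystem overhead; the function name `raid1_usable` is made up for illustration and is not part of any Btrfs tool.

```python
def raid1_usable(sizes):
    """Estimate usable space for a two-copy profile on devices of the given sizes.

    Assumption: every block is stored on exactly two different devices, so
    capacity is limited either by half the pool or by how much of the
    largest device can be paired with space on the other devices.
    """
    total = sum(sizes)
    largest = max(sizes)
    return min(total / 2, total - largest)


# Sizes in GB (decimal), ignoring filesystem overhead.
print(raid1_usable([1000, 1000, 1000]))        # FAQ example: three 1 TB devices
print(raid1_usable([3000, 3000, 3000, 4000]))  # the array in the question
```

Under these assumptions the two calls print 1500.0 and 6500.0, which matches the FAQ's 1.5 TB figure and the 6.5 TB estimate in the question.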

Mikkel

Posted 2015-01-28T23:31:08.537

Reputation: 585

BTRFS RAID-5 is (although still experimental) actually pretty stable and even many typical data recovery scenarios work in the current version (without crashing). You should probably scrub from time to time to make sure your data is still correct, but it might be worth a shot. It's possible to convert from RAID-1(0) to RAID-5. – basic6 – 2015-11-12T17:31:48.373

@basic6 Good to know, thanks. I was researching that not long ago and saw that scrub and replace support had been added as of 3.19, but people were still complaining about the lack of device failure alerts. I do have crons set up for weekly scrubs and weekly/monthly SMART self tests, so I should be able to catch those issues. I'll give it a try. – Mikkel – 2015-11-12T21:36:19.927

Periodic (weekly or monthly) scrubs are important. If a scrub finds an error because it can't read from a drive, it will increase the error count. Check the error count using dev stats, which could be another cron job.

– basic6 – 2015-11-13T07:45:10.547
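As a rough sketch of what such a cron job could look like: the script below parses the output of `btrfs device stats` and reports any counter that is not zero. The line format in the regular expression (`[/dev/sdX].counter_name   value`) is what the command normally prints, but treat it as an assumption and check it against your own output; the sample text and the helper names `nonzero_counters` and `check` are illustrative only.

```python
import re
import subprocess

# Matches lines such as "[/dev/sdb].read_io_errs    0" (assumed output format).
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def nonzero_counters(stats_text):
    """Return (device, counter, value) for every non-zero counter in the text."""
    found = []
    for line in stats_text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match and int(match["value"]) > 0:
            found.append((match["dev"], match["counter"], int(match["value"])))
    return found


def check(mountpoint):
    """Run `btrfs device stats` on a mount point and return non-zero counters."""
    result = subprocess.run(["btrfs", "device", "stats", mountpoint],
                            capture_output=True, text=True, check=True)
    return nonzero_counters(result.stdout)


# Illustrative sample, not taken from a real array.
SAMPLE = """\
[/dev/sdb].write_io_errs    0
[/dev/sdb].read_io_errs     3
[/dev/sdc].corruption_errs  0
"""
for dev, counter, value in nonzero_counters(SAMPLE):
    print(f"{dev}: {counter} = {value}")
```

In a real cron job you would call `check()` with your own mount point and send mail (or exit non-zero) whenever the returned list is not empty.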

Answers

5

Yes, you get an extra 500 GB. Note that determining available space in btrfs remains elusive. Also: have a look at the btrfs disk usage calculator.
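To see where the extra 500 GB comes from, here is a deliberately simplified allocation model. It is not the real Btrfs allocator: it assumes RAID 1 chunks go to the two devices with the most free space, and that RAID 10 chunks are striped across every device that still has room (an even number, at least four in this model). The function name `simulate` and the 1 GB chunk size are made up for the sketch.

```python
def simulate(sizes, profile, chunk=1):
    """Greedy chunk-allocation model; returns usable space in the units of `sizes`.

    Assumptions (a simplification, not the real allocator):
    - raid1: each chunk is written to the two devices with the most free space.
    - raid10: each chunk is striped over all devices that still have free
      space (an even number, at least four); half of what is written is
      mirror copies.
    """
    free = list(sizes)
    usable = 0
    while True:
        # Devices with room left, most free space first.
        candidates = sorted((i for i, f in enumerate(free) if f >= chunk),
                            key=lambda i: free[i], reverse=True)
        if profile == "raid1":
            picked = candidates[:2]
            if len(picked) < 2:
                return usable
        elif profile == "raid10":
            count = len(candidates) - len(candidates) % 2
            if count < 4:
                return usable
            picked = candidates[:count]
        else:
            raise ValueError(f"unknown profile: {profile}")
        for i in picked:
            free[i] -= chunk
        # Two copies of everything, so half of the raw space is usable.
        usable += chunk * len(picked) // 2


drives = [3000, 3000, 3000, 4000]  # GB, the array from the question
print("raid10:", simulate(drives, "raid10"))
print("raid1: ", simulate(drives, "raid1"))
```

In this model RAID 10 stops at 6000 GB because the last 1000 GB of the 4 TB drive has no partners left to stripe with, while RAID 1 keeps pairing that drive with the others and reaches 6500 GB.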

On your second question: You may lose some performance on your array. Naturally, your data is equally safe on both RAID configurations. When considering performance you can perhaps have a look at these benchmarks: kernel.org, phoronix.com.

Have you perhaps already tried converting to RAID 1? If so: what are your findings?

Laura

Posted 2015-01-28T23:31:08.537

Reputation: 86

1

This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post - you can always comment on your own posts, and once you have sufficient reputation you will be able to comment on any post.

– DavidPostill – 2015-03-11T13:41:09.553

However, to answer your question, I did convert and the result seems to be consistent with my expectations. Btrfs does a poor job of indicating actual available disk space, but you can see in the output of btrfs fi sh that the extra space on the 4 TB drive is being taken advantage of, and (2.83+1.93+1.93+1.95)/2~=4.30 as expected. I recently sustained a drive failure and successfully rebuilt with no data loss, so I can confirm firsthand that redundancy is intact.

– Mikkel – 2015-03-11T16:13:56.580

I'm sorry my reply was unclear @david. I attempted to answer the two questions in the last two lines of the original post. Additionally I indicated a range of uncertainty (which seems intrinsic to btrfs) and supplied my sources for reference.

Indeed I also asked a question to the author, so others could perhaps learn from his findings. I did this within the answer for reasons you already explained. I will consider your comments next time I answer a question on stack exchange. – Laura – 2015-03-14T16:36:08.930

It may not be the most certain answer, but my results indicate that your assumptions are correct, so I'm accepting it. I'm not certain about your comment on performance differences, though. It's certainly true of hardware RAID 1 vs 10, but I'm not sure if Btrfs sees a performance boost from RAID 10. Furthermore, what it calls RAID 1 is still striped across multiple devices here, so it's still more akin to RAID 10 than real RAID 1. If I'd read your reply before converting, I would've done before and after benchmarks, but I'm not spending another week converting the array to RAID 10 and back. – Mikkel – 2015-03-24T01:56:01.360

Well, there is this page, which shows marginal benefits to RAID 10 over RAID 1, carrying the caveat at the top that the benchmarks are 7 years old.

– Mikkel – 2015-03-24T02:04:47.663

0

Don't use Btrfs RAID 5 or 6; the implementation is full of bugs. Run ZFS if you want that.

With RAID 10 you'll gain speed.

The optimal thing to do is to make a 3 TB partition on the 4 TB drive (make sure all the members are exactly the same size) and combine those into a RAID 10, which will have 6 TB of usable space.

Use the extra 1 TB on the 4 TB drive for /boot and swap. /boot has to be vfat anyway if you want to use EFISTUB and avoid GRUB or any other boot loader, which will make your boot a bit faster. The swap on this drive can be pretty large.

The optimal swap setup is tiered by priority (the higher the priority, the sooner the swap gets used). First, a small zram swap with the highest priority, using zstd and as many compression streams as you have cores/threads. 4 GB works fine with 32 GB of RAM and 2 GB should be fine with 16 GB; if you have less than 16 GB, I don't know if it's worth it, and you should probably upgrade anyway.

Second, an SSD for / (everything except large media files) with a slightly larger swap at a lower priority on it: 8 GB for a 500 GB SSD, 16 GB for 1 TB. If you have an SSD over 1 TB, more than 16 GB of swap is probably pointless, since if you are swapping to it that often you need more RAM. If you have an SSD smaller than 500 GB, consider getting a 500 GB or 1 TB model, since even the best SATA SSDs are extremely cheap now.

Third, an even larger swap on the HDD with the lowest priority (twice as big as the SSD one? It doesn't really matter, since if it's getting used often you need more RAM). This still leaves enough space on that drive for another large partition (maybe something for VMs or for dual boot with Windows or whatever).

Also encrypt your swaps (other than the zram one, since any data on that one will be unrecoverable within seconds unless someone pours liquid nitrogen on the RAM before pulling the power) and /. If you want TRIM to work on the SSD you need to enable it in crypttab; security is slightly lower, since an attacker can see which cells are empty, but performance is better.

orange_juice6000

Posted 2015-01-28T23:31:08.537

Reputation: 115