Online Drive Replacement: BTRFS with RAID 6


On a test machine I installed four HDDs and configured them as a BTRFS RAID 6 volume. As a test, I removed one of the drives (/dev/sdk) while the volume was mounted and data was being written to it. This worked well, as far as I can tell: some I/O errors were logged to /var/log/syslog, but the volume kept working. Unfortunately, "btrfs fi sh" did not show any missing drives, so I remounted the volume in degraded mode:

$ mount -t btrfs /dev/sdx1 -o remount,rw,degraded,noatime /mnt

Now the drive in question was reported as missing. Then I plugged the HDD back in (it is of course /dev/sdk again) and started a balance:

$ btrfs filesystem balance start /mnt

The volume now looks like this:

$ btrfs fi sh
Label: none  uuid: 28410e37-77c1-4c01-8075-0d5068d9ffc2
    Total devices 4 FS bytes used 257.05GiB
    devid    1 size 465.76GiB used 262.03GiB path /dev/sdi1
    devid    2 size 465.76GiB used 262.00GiB path /dev/sdj1
    devid    3 size 465.76GiB used 261.03GiB path /dev/sdh1
    devid    4 size 465.76GiB used 0.00 path /dev/sdk1

How do I reintegrate /dev/sdk1? Running "$ btrfs fi ba start /mnt" does not help. I tried to remove the HDD, but:

$ btrfs de de /dev/sdk1 /mnt/
ERROR: error removing the device '/dev/sdk1' - unable to go below four devices on raid6 

A replacement does not work this way either:

$ btrfs replace start -f -r /dev/sdk1 /dev/sdk1 /mnt
/dev/sdk1 is mounted

Are there other ways to replace/reintegrate the HDD other than converting to RAID 5?

Oliver R.


Would it be better to post this on Server Fault? If so, could an admin please move this question? – Oliver R. – 2014-11-28T13:59:25.863

What does btrfs scrub say? – basic6 – 2015-07-08T19:07:31.923

Answers


I have repeated this test on a test system running kernel 4.3.

Like you, I have created a BTRFS RAID-6 array with 4 drives:

# mkfs.btrfs -m raid6 -d raid6 /dev/sdb /dev/sdc /dev/sdd /dev/sde

I then mounted it and started writing data to it.
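
Mounting any one of the member devices brings up the whole filesystem, so the mount step was simply something like this (mount point as used below):

# mount /dev/sdb /mnt/tmp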

While that was going on, I removed one of the drives. Of course, this caused plenty of error messages in the logs. But as expected, the write process was not interrupted and no files were damaged.

More importantly, BTRFS increased its error counts (dev stats) for write and flush errors. So if this had been a monitored production system, a cron job such as this one would have generated a notification email:

MAILTO=admin@myserver.com
@hourly /sbin/btrfs device stats /mnt/tmp | grep -vE ' 0$'
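
The counters can also be inspected manually at any time; the same command without the grep filter prints all of them:

# btrfs device stats /mnt/tmp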

I then ran a scrub instead of a balance, because I wanted BTRFS to scan the whole filesystem and fix all errors, which is exactly what a scrub does.

# btrfs scrub start -B /mnt/tmp
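
The -B flag keeps the scrub in the foreground until it is finished. Without it, the scrub runs in the background and its progress can be checked with:

# btrfs scrub status /mnt/tmp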

Finally, I reset the BTRFS error counters back to zero (this would stop the notification emails if the filesystem were being monitored as described above):

# btrfs device stats -z /mnt/tmp

Another scrub found no more errors.

And the file that I was writing during the test is correct. Its MD5 sum matches the original.
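
That check is nothing more than comparing checksums of the written copy against the source; the file names here are only placeholders:

# md5sum /mnt/tmp/testfile /path/to/original/testfile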

Of course, every test run is a little different. If the 3rd drive (sdd) comes back under a new name such as sdf, you can replace it with itself, effectively resilvering it:

# btrfs replace start 3 /dev/sdf /mnt/tmp
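
The rebuild runs in the background by default; its progress can be followed with the status subcommand:

# btrfs replace status /mnt/tmp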

By the way, you mentioned removing the drive. You don't need to do that; it would only shuffle your devids and is less efficient than a replace. The replace command has been available for a long time.
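
If you are unsure which devid to pass to replace, the filesystem show output (as in your question) lists the devid of every member device:

# btrfs filesystem show /mnt/tmp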

By the way, in one case BTRFS caused the test system to crash while I was reading from the damaged filesystem, before I ran the scrub. That is not too surprising: unlike most parts of this filesystem, BTRFS RAID-5/RAID-6 is still considered experimental (it is constantly being improved, so this statement may be outdated; it applies to kernel 4.3). This happened only once; when I repeated the test, it did not crash. Still, it shows that even though BTRFS RAID-6 may crash while it is experimental, it protects your data, and a scrub reliably tells you whether there are errors, because it verifies the files against the stored checksums.

I have also repeated the test causing errors on two drives. Since this is RAID-6, that also worked as expected: everything was fine after a scrub.

basic6
