13

I recently set up RAID 1 on Linux with mdadm. When I added a new HDD to the array, data started syncing between the drives, which is expected. What I didn't expect was that it synced the entire drive, including unused space. The HDDs are 6 TB with only about 1 TB of data, so this took far longer than anticipated. Why did md have to sync the unused space?

Peter Mortensen
idunnololz

1 Answer

31

RAID works below the filesystem level - it doesn't know or care what parts of the disk are "used" or not, it just sees a bunch of blocks and their mirrored counterpart for RAID1.

So it has to sync the entire disk to make sure the two copies match. If it didn't, it couldn't tell which differences are errors and which are just regions the filesystem hasn't used yet.
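
To make that concrete, here is a toy Python model of a mirror, not how md is actually implemented. The block count, the block contents, and the function names are all made up for illustration. It copies only the blocks a filesystem would call "used" and shows that the two members still disagree everywhere else, which is exactly what the block layer cannot distinguish from corruption.

```python
# Toy model only: each "disk" is a list of blocks. All names and sizes here
# are hypothetical and have nothing to do with md's real data structures.
BLOCKS = 8


def full_resync(source, target):
    """Copy every block, used or not, and return how many were copied."""
    for i, block in enumerate(source):
        target[i] = block
    return len(source)


def mismatched_blocks(a, b):
    """Return the indices where the two mirror members differ."""
    return [i for i in range(len(a)) if a[i] != b[i]]


# Existing member: filesystem data in blocks 0-1, old leftovers elsewhere.
old = ["data0", "data1"] + ["stale-A"] * (BLOCKS - 2)
# Newly added member: whatever happened to be on it before.
new = ["stale-B"] * BLOCKS

# Hypothetical "smart" sync that copies only blocks the filesystem uses.
# The RAID layer has no such list; it is supplied here by hand.
used = [0, 1]
for i in used:
    new[i] = old[i]

leftover = mismatched_blocks(old, new)
print("mismatches after used-only copy:", leftover)

# A full resync is the only way the block layer can guarantee a match.
copied = full_resync(old, new)
after = mismatched_blocks(old, new)
print("blocks copied by full resync:", copied)
print("mismatches after full resync:", after)
```

After the used-only copy, blocks 2 through 7 still differ, so a later consistency check would flag them even though no filesystem data is wrong. Only after the full resync is the mismatch list empty.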

There is a `--assume-clean` flag you can use in mdadm to tell it not to do that, but you should only use it if you are certain that the disks contain nothing but zeros. And I think it only works for RAID1, not for RAID5/6.

chicks
Grant
  • This makes sense. I was curious why it did it since I thought there must be a good reason and I think this explained it. Thank you! – idunnololz Jul 20 '22 at 21:18
  • 1
    I don't think `--assume-clean`, or adding a device filled with zeroes can make sense when _extending_ a RAID1 (or replacing a failed drive). The component devices of a RAID1 need to all have the identical data, so that reads can be made from any one of them, or at least the parts that will ever be _read_ will need to be in sync (writes would automatically write to all components). Either that, or the RAID layer needs to have a mapping of the blocks that were ever written to by upper layers so it could sync just the parts that are needed. – ilkkachu Jul 21 '22 at 11:39
  • 5
    If you look at the [man page](https://man7.org/linux/man-pages/man8/mdadm.8.html), the part about devices filled with zeroes under `--assume-clean` is mentioned in the context of _a new array_, where there's no upper-layer structure yet. (Also, in all likelihood, the drives don't need to be exactly _zeroed_, just filled with _identical_ data, ensuring that sync checks don't return false positives.) – ilkkachu Jul 21 '22 at 11:43
  • 1
    It would be great (esp. for modern SSDs) to have a mode that syncs only non-zero blocks and maybe trims the rest. What the raid1 resync does now is to waste precious write cycles. – fraxinus Jul 21 '22 at 12:09
  • 3
    @fraxinus This is an optimization that belongs in the SSD's firmware though (turn a "write a block full of zeros" into a "trim"), because it's a more general solution to the problem, and doesn't "export" assumptions about how the disk works to parts of the OS that don't really care. – Guntram Blohm Jul 21 '22 at 15:41
  • @GuntramBlohm maybe, but disk manufacturers already started exporting a great deal of these things - e.g. RZAT option or the different kinds of TRIM. – fraxinus Jul 21 '22 at 16:45