I have a zpool in which I have just replaced a failed disk and started resilvering onto the new disk.

What I don't understand is why zpool status says it wants to scan 129TB when the size of the vdev is ~30TB. When I look at iostat -nx 1, I can see that the 5 disks in the vdev are getting heavy reads and the new disk equally heavy writes. So ZFS doesn't seem to scan all the data as it says.

# zpool status tank3 |head
  pool: tank3
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Apr 30 09:59:15 2015
    61.2T scanned out of 129T at 3.03G/s, 6h23m to go
    946G resilvered, 47.34% done

Question

I would have thought that vdevs are independent of each other, so resilvering one should not require scanning the others. Why does ZFS scan all used disk space when resilvering?

Jasmine Lognnes
  • To clarify what you are observing, it would help if you tell more about your pool configuration and usage (full "zpool status tank3" and "zpool list tank3" output). – jlliagre Apr 30 '15 at 22:55

2 Answers

Resilvering (like scrubbing) involves walking the pool's entire block tree and resilvering the blocks that would have been on the missing disk.

Without walking through every single txg in the tree, ZFS cannot know which blocks would have been on the missing disk, so it scans all of the pool's metadata.
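
To make the walk described above concrete, here is a minimal toy model in Python. It is not ZFS code: the `Block` class, the `resilver` function, and the vdev labels are all hypothetical names invented for illustration. It only shows why every pointer in the tree has to be examined even though just one vdev needs repair.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy stand-in for a block pointer (hypothetical, not a ZFS structure)."""
    vdev: str                  # label of the top-level vdev holding this block
    size: int                  # bytes referenced by this pointer
    children: list = field(default_factory=list)


def resilver(root, target_vdev):
    """Visit every block pointer; repair only blocks stored on target_vdev."""
    scanned = 0
    resilvered = 0
    stack = [root]
    while stack:
        block = stack.pop()
        scanned += block.size            # counted once the pointer is examined
        if block.vdev == target_vdev:
            resilvered += block.size     # only these are read and rewritten
        stack.extend(block.children)     # children may live on any vdev
    return scanned, resilvered


pool = Block("vdev0", 1, [
    Block("vdev0", 10),
    Block("vdev1", 20, [Block("vdev2", 5), Block("vdev1", 7)]),
    Block("vdev3", 30),
])
scanned, resilvered = resilver(pool, "vdev1")
print(scanned, resilvered)
```

In this sketch the "scanned" total covers the whole tree, while the "resilvered" total covers only the blocks on the vdev being rebuilt, because a block on one vdev can only be found by following pointers stored on others.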

It doesn't necessarily read all the data, only sufficient metadata to determine whether it actually needs to read the corresponding data or not. You'll probably see the progress info go up faster than the actual amount of data being read, as it's really counting the amount of data referred to by the metadata it has read.
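
The figures in the question's zpool status output can be checked against this explanation with a little arithmetic. The sketch below assumes that the T and G suffixes are binary units and that the percentage is simply scanned divided by total; both are assumptions on my part, not something the output states.

```python
# Values copied from the zpool status output in the question.
TIB = 1024 ** 4          # assumption: "T" means tebibytes
GIB = 1024 ** 3          # assumption: "G" means gibibytes

scanned = 61.2 * TIB
total = 129 * TIB
rate = 3.03 * GIB        # bytes per second
resilvered = 946 * GIB

# Assumption: progress is reported as scanned / total.
percent_done = 100 * scanned / total
seconds_left = (total - scanned) / rate
hours, rem = divmod(int(seconds_left), 3600)
share = 100 * resilvered / scanned

print(f"{percent_done:.1f}% done, about {hours}h{rem // 60:02d}m to go")
print(f"{share:.1f}% of the scanned amount was actually rewritten")
```

Under those assumptions the result lands close to the reported 47.34% and 6h23m, and the resilvered amount is only a small fraction of what was counted as scanned, which fits the idea that the counter tracks data referred to by metadata rather than data actually read.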

noitsbecky

Resilvering is a vdev operation; as you implied, only the storage devices in that vdev are used to rebuild the new device. I'm not sure why it quotes the full size of the zpool, but I suspect either that the developers borrowed code from the scrub functions or that it simply quotes the full zpool size because that would be the worst-case scenario.

user
Chris S
  • See my answer for why it quotes the full pool size. tl;dr: it has to examine the whole pool to figure out which blocks need to be resilvered. – noitsbecky Jun 12 '15 at 21:46