
I have two machines: one that has the main 24TB ZFS filestore and one that is a clone of that filestore as a backup. Every now and then, I create a new snapshot on the primary and perform a differential zfs send to the backup machine. The primary is raidz3 but the secondary is just striped.

Recently, a routine scrub found a corrupt file on the backup copy. The file is not corrupt on the primary filestore, so if these were two ordinary filesystems, I could just copy the good file to the backup and be done with it. But with the zfs send / zfs recv workflow, I'm not sure how to handle it. What is the best approach to correcting this corruption so that I keep the ability to do the differential zfs send? I suppose I could modify the file on the primary machine and hope it gets pushed as part of the next backup, but I'm not even 100% sure what kind of change to the file would trigger it being refreshed.
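For context, the routine differential backup described above looks roughly like this (pool, dataset, and snapshot names here are placeholders, not my actual setup):

```shell
# On the primary: take a new snapshot, then send only the delta
# since the last snapshot the backup machine already has.
zfs snapshot tank/data@2020-09-20
zfs send -i tank/data@2020-09-13 tank/data@2020-09-20 | \
    ssh backup zfs recv backuppool/data
```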

Ethan T

1 Answer


You will need to rewrite the entire file for all the blocks to change. Assuming you aren't using deduplication, something like this should work:

# Rewrite the file's contents into a new file, then replace the original,
# so ZFS allocates fresh blocks for all of it:
cat brokenfile > brokenfile2
mv brokenfile2 brokenfile

# Then snapshot and send the increment as usual:
zfs snapshot ...
zfs send

That should get you a complete working copy of the file on the backup server.
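Spelled out with hypothetical pool, dataset, and snapshot names (adjust these to your own setup), the whole repair-and-resync sequence might look like:

```shell
# On the primary (dataset tank/data is an example name):
cat /tank/data/brokenfile > /tank/data/brokenfile.tmp   # rewrites every block
mv /tank/data/brokenfile.tmp /tank/data/brokenfile

# Take a new snapshot and send the increment from the last
# snapshot that the backup machine already holds:
zfs snapshot tank/data@repair
zfs send -i tank/data@last-backup tank/data@repair | \
    ssh backup zfs recv backuppool/data
```

One caveat: `cat ... > ...` copies only the file's contents, so owner, mode, and timestamps come from the new file; if those matter, a `cp -p` of the file before the rename will carry the metadata across.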

Gordan Bobić
  • Sounds reasonable. I will give this a try and report back. – Ethan T Sep 20 '20 at 18:23
  • For anyone's reference ... this worked fine. I had to eventually zap the snapshot containing the corrupt file and force ZFS to re-scrub before all the errors went away. – Ethan T Jul 06 '22 at 17:44