As far as I know, ZFS uses checksums to protect against data loss caused by bitrot.
But what happens if bitrot affects the checksum itself? Does ZFS then think the data is corrupt, or does it think the checksum is corrupt?
Thx 4 any answer,
tbol
ZFS provides fault isolation between data and checksum by storing the checksum of each block in its parent block pointer -- not in the block itself. Every block in the tree contains the checksums for all its children, so the entire pool is self-validating.
Edit: because you asked about the parent:
Observation 1: ZFS detects all [on disk] corruptions due to the use of checksums. In our fault injection experiments on all metadata and data, we found that bad data was never returned to the user because ZFS was able to detect all corruptions due to the use of checksums in block pointers. The parental checksums are used in ZFS to verify the integrity of all the on-disk blocks accessed. The only exception are uberblocks, which do not have parent block pointers. Corruptions to the uberblock are detected by the use of checksums inside the uberblock itself.
End-to-end Data Integrity for File Systems: A ZFS Case Study
You can test this yourself. Overwrite a block in the middle of a ZFS device with random data and see whether the pool maintains integrity.
Note that in the next section of that paper, they show that in-memory corruptions went undetected.
I've found the right explanation:
A ZFS storage pool is really just a tree of blocks. ZFS provides fault isolation between data and checksum by storing the checksum of each block in its parent block pointer -- not in the block itself. Every block in the tree contains the checksums for all its children, so the entire pool is self-validating. [The uberblock (the root of the tree) is a special case because it has no parent; more on how we handle that in another post.]
When the data and checksum disagree, ZFS knows that the checksum can be trusted because the checksum itself is part of some other block that's one level higher in the tree, and that block has already been validated.
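To make that concrete, here is a toy sketch of the idea in Python. It is not ZFS code: the `Block` class, the `verify` function and the use of SHA-256 are all assumptions made for illustration, and the `root_checksum` variable only stands in for whatever protects the root. Each block stores the checksums of its children, so a flipped bit in a stored checksum changes the contents of the block that holds it, and that block then fails its own check one level up.

```python
import hashlib


def checksum(payload: bytes) -> bytes:
    # SHA-256 is used here only as a stand-in checksum function.
    return hashlib.sha256(payload).digest()


class Block:
    """Toy block: a payload plus pointers of the form [child, checksum of child]."""

    def __init__(self, data: bytes, children=()):
        self.data = bytearray(data)
        self.pointers = [[c, checksum(c.serialize())] for c in children]

    def serialize(self) -> bytes:
        # The stored child checksums are part of this block's own contents.
        out = bytes(self.data)
        for _, child_checksum in self.pointers:
            out += child_checksum
        return out


def verify(block, expected, path="root"):
    """Walk down from the root; return the path of the first bad block, or None."""
    if checksum(block.serialize()) != expected:
        return path
    for i, (child, child_checksum) in enumerate(block.pointers):
        bad = verify(child, child_checksum, f"{path}/{i}")
        if bad is not None:
            return bad
    return None


# Build a three-level tree: root -> mid -> leaf.
leaf = Block(b"file contents")
mid = Block(b"indirect block", [leaf])
root = Block(b"root block", [mid])
root_checksum = checksum(root.serialize())  # hypothetical stand-in for the root's protection

clean = verify(root, root_checksum)

# Case 1: bitrot in the data block. The leaf fails against the checksum in mid.
leaf.data[0] ^= 0x01
data_rot = verify(root, root_checksum)
leaf.data[0] ^= 0x01  # undo the damage

# Case 2: bitrot in the checksum stored inside mid. Now mid itself fails
# against the checksum held by root, so the leaf is never wrongly blamed.
damaged = bytearray(mid.pointers[0][1])
damaged[0] ^= 0x01
mid.pointers[0][1] = bytes(damaged)
checksum_rot = verify(root, root_checksum)

print(clean, data_rot, checksum_rot)
```

In this sketch the answer to the original question falls out of the walk order: a block is only trusted to vouch for its children after it has passed its own check, so a rotten checksum shows up as a corrupt parent block rather than as corrupt data. What a real pool does after detection (for example, reading another copy when redundancy is available) is outside what this toy models.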
That leaves just a single point of failure: the root node of the tree could be corrupted, but there should be a solution for this too.
Read @ https://blogs.oracle.com/bonwick/entry/zfs_end_to_end_data