Why is ZFS stored in a tree structure?

1

0

Apparently, in ZFS (the filesystem), there is an uberblock that points to the root of a zpool's tree. Does anyone know why this tree makes things more efficient or reliable, and where the tree itself is stored?

Kaitlyn Mcmordie

Posted 2011-10-27T07:08:55.510

Reputation: 699

Answers

4

The purpose of the tree is to improve data integrity, partly by storing each checksum away from the data block it protects. The entire file system hierarchy forms a self-healing hash tree, or Merkle tree. Here's a simplified description I made earlier:

[Image: ZFS data integrity example]

Going from left to right, directory 1 contains a pointer to file A, file A's checksum, and some other metadata. But dir 1 is itself just another data block, which is pointed to by the uberblock, so the uberblock contains a checksum of dir 1, and so on. This does mean that every write to a file involves recalculating several checksums, all the way back to the root node (the uberblock). But ZFS's copy-on-write policy and transactional nature mitigate the performance penalty. Furthermore, ZFS is designed to take advantage of Moore's law: CPU cycles are cheap, but hard drives are slow.
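To make the "checksum lives in the parent" idea concrete, here is a minimal Python sketch of a hash tree. It is only an illustration: `Block`, `link`, `verify`, `verify_from_root` and `build_example` are hypothetical names, not ZFS's on-disk format or API, and SHA-256 is used simply because it ships with Python's standard library.

```python
import hashlib
from dataclasses import dataclass, field


def checksum(data: bytes) -> str:
    """Hash of a block's bytes (SHA-256 here; an assumption for the sketch)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Block:
    """A hypothetical block: a payload plus pointers to child blocks."""
    payload: bytes
    # Each pointer is (child block, checksum of that child at write time).
    children: list = field(default_factory=list)

    def serialize(self) -> bytes:
        # A block's bytes include the checksums it stores for its children,
        # so changing a child forces the parent's own checksum to change too.
        out = self.payload
        for _, child_sum in self.children:
            out += child_sum.encode()
        return out


def link(parent: Block, child: Block) -> None:
    """Store a pointer to `child`, plus its checksum, inside `parent`."""
    parent.children.append((child, checksum(child.serialize())))


def verify(block: Block, expected: str) -> bool:
    """Check a block against the checksum held by its parent, then recurse."""
    if checksum(block.serialize()) != expected:
        return False
    return all(verify(child, s) for child, s in block.children)


def verify_from_root(root: Block) -> bool:
    """Walk the whole tree, starting from the checksums held in the root."""
    return all(verify(child, s) for child, s in root.children)


def build_example():
    """Build uberblock -> dir 1 -> file A, linking bottom-up."""
    file_a = Block(b"contents of file A")
    dir1 = Block(b"dir 1 metadata")
    uber = Block(b"uberblock")
    link(dir1, file_a)  # dir 1 records file A's checksum
    link(uber, dir1)    # the uberblock records dir 1's checksum
    return uber, dir1, file_a
```

Linking has to happen bottom-up, which mirrors the point above: a change to file A changes the checksum stored in dir 1, which changes dir 1's bytes, which changes the checksum stored in the uberblock. If `file_a.payload` is altered after the tree is built, `verify_from_root(uber)` returns `False`, because the data no longer matches the checksum kept in its parent.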

ZFS also uses ditto blocks to replicate the more important parts of the tree (i.e., the parts closer to the root), to further protect against corruption.
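The self-healing part can be sketched in the same spirit. The function below is a hypothetical illustration, not ZFS code: it assumes several in-memory copies of one block and a trusted checksum taken from the parent, returns the first copy that matches, and overwrites any copy that does not.

```python
import hashlib
from typing import List, Optional


def checksum(data: bytes) -> str:
    """Hash of a block's bytes (SHA-256 here; an assumption for the sketch)."""
    return hashlib.sha256(data).hexdigest()


def read_with_repair(copies: List[bytearray], expected: str) -> Optional[bytes]:
    """Return a copy that matches `expected`, repairing the copies that don't.

    `copies` stands in for replicas of one block (e.g. ditto copies);
    `expected` is the checksum stored in the parent block.
    Returns None if every copy is corrupt.
    """
    good = None
    for copy in copies:
        if checksum(bytes(copy)) == expected:
            good = bytes(copy)
            break
    if good is None:
        return None  # nothing trustworthy to read or repair from

    # Overwrite every bad replica with the known-good data.
    for copy in copies:
        if checksum(bytes(copy)) != expected:
            copy[:] = good
    return good
```

Because the expected checksum comes from the parent rather than from the block itself, a corrupted copy cannot vouch for its own contents; that is what lets the reader tell which replica to trust.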

sblair

Posted 2011-10-27T07:08:55.510

Reputation: 12 231

Just curious, but what is a "checksum", and what is "data integrity" supposed to mean? – Kaitlyn Mcmordie – 2011-10-27T18:57:35.900

What other benefits does this configuration allow zfs to provide, compared to other filesystems? – Kaitlyn Mcmordie – 2011-10-27T19:07:26.573

2

Take "data integrity" at face value: making sure things don't get changed or corrupted. As for a checksum, that's basically a hash taken of a piece of data to verify it matches what you expect (no transcription errors or damage or anything). This is a pretty important concept, so get familiar with it. http://en.wikipedia.org/wiki/Checksum – Shinrai – 2011-10-27T20:10:15.447

To add to @Shinrai's comment, ZFS actually allows you to use either a Fletcher checksum (computationally cheap) or a SHA-2 hash (more secure, i.e., less chance of a false match, but more CPU intensive). – sblair – 2011-10-27T20:42:31.180

I have to admit, I'm the bad kind of computer geek who couldn't care less about *NIX anything, let alone Solaris! (I'm a hardware guy mostly, I guess), so I'm not that familiar with ZFS...but it's a pretty neat implementation now that I get to looking. – Shinrai – 2011-10-27T22:12:19.180