According to The First Google Result for "ZFS Deduplication"
...
What to dedup: Files, blocks, or bytes?
...
Block-level dedup has somewhat higher overhead than file-level dedup when whole files are duplicated, but unlike file-level dedup, it handles block-level data such as virtual machine images extremely well.
...
ZFS provides block-level deduplication
...
According to Wikipedia's ZFS Article
ZFS uses variable-sized blocks of up to 128 kilobytes. The currently available code allows the administrator to tune the maximum block size used as certain workloads do not perform well with large blocks. If data compression (LZJB) is enabled, variable block sizes are used. If a block can be compressed to fit into a smaller block size, the smaller size is used on the disk to use less storage and improve IO throughput (though at the cost of increased CPU use for the compression and decompression operations).
I want to make sure I understand this correctly.
Assuming compression is off:
If I write a randomly filled 1 GB file, then write a second file that is identical except that one byte halfway through is changed, will the second file be deduplicated (everything except the block containing the changed byte)?
If I write a single-byte file, will it take up a whole 128 kilobytes? If not, will its block get larger if the file grows?
If a file takes two 64-kilobyte blocks (would this ever happen?), would an identical file that is stored as a single 128-kilobyte block still get deduped against it?
If a file is shortened, part of its last block is no longer in use, and the stale data there might not be reset to 0x00 bytes. Would such a half-used block still get deduped?
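
To make the first scenario concrete, here is a toy model of what I would expect block-level dedup to do. It is only a sketch of the general technique, not ZFS's actual implementation: it assumes a fixed 128 KiB record size, block-aligned files, and SHA-256 as the block checksum, and the names `ToyDedupStore`, `split_blocks`, and `demo` are made up for illustration. It also uses an 8-block file instead of 1 GB to keep it quick.

```python
import hashlib
import os

# Assumption: every block is exactly 128 KiB. Real ZFS records can be smaller.
RECORD_SIZE = 128 * 1024


def split_blocks(data, size=RECORD_SIZE):
    """Cut a byte string into consecutive fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ToyDedupStore:
    """Hypothetical block store that keeps one copy per distinct block checksum."""

    def __init__(self):
        self.blocks = {}     # checksum -> block contents
        self.refcounts = {}  # checksum -> number of references to that block

    def write(self, data):
        """Store a file and return how many new blocks had to be allocated."""
        new_blocks = 0
        for block in split_blocks(data):
            digest = hashlib.sha256(block).digest()
            if digest not in self.blocks:
                # First time this content is seen: allocate a block for it.
                self.blocks[digest] = block
                self.refcounts[digest] = 0
                new_blocks += 1
            # Either way, the file now references this block.
            self.refcounts[digest] += 1
        return new_blocks


def demo(num_blocks=8):
    """Write a random file, then a copy with one byte flipped in the middle."""
    original = os.urandom(num_blocks * RECORD_SIZE)
    modified = bytearray(original)
    modified[len(modified) // 2] ^= 0xFF  # change exactly one byte

    store = ToyDedupStore()
    first = store.write(original)
    second = store.write(bytes(modified))
    return first, second


if __name__ == "__main__":
    first, second = demo()
    print("blocks allocated for first file:", first)
    print("blocks allocated for second file:", second)
```

In this model the second file allocates only the one block that contains the changed byte, and every other block is shared with the first file. Whether real ZFS behaves the same way is exactly what I am asking.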