Is there a file system that stores files under a hash so there are no duplicates? It can be under any operating system. I know Git does that, but I'm looking for something that can run in real-time.
What problem are you trying to solve? – ewwhite Feb 20 '12 at 20:53
Having many duplicate files, and trying to avoid deleting them or creating links to them. I think a FS should be able to save a hash and the number of links pointing to it... – mik Mar 06 '12 at 17:23
Do you mean "inline" deduplication rather than post-process deduplication? – dmeister Apr 03 '12 at 16:59
2 Answers
6
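ZFS does this, but not at the file level. It goes one step finer: block-level deduplication, so identical blocks are stored only once even when the files containing them differ. (Granularity runs from file-level to block-level to byte-level deduplication, with byte-level being the finest.)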
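To make the block-level idea concrete, here is a minimal in-memory sketch in Python. It is not how ZFS implements deduplication; the `BlockStore` class, its method names, and the choice of SHA-256 are all assumptions made for illustration. Each block is stored under its hash, and a reference count tracks how many file blocks point at it.

```python
import hashlib


class BlockStore:
    """Toy in-memory block-level deduplicating store (illustration only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}    # digest -> block bytes, stored once
        self.refcount = {}  # digest -> number of file blocks referencing it
        self.files = {}     # file name -> ordered list of digests

    def write(self, name, data):
        # Overwriting a file first releases the blocks it used to reference.
        if name in self.files:
            self.delete(name)
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block  # first copy: actually store it
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        # Reassemble the file from its blocks, in order.
        return b"".join(self.blocks[d] for d in self.files[name])

    def delete(self, name):
        # Drop references; free a block once nothing points at it.
        for digest in self.files.pop(name):
            self.refcount[digest] -= 1
            if self.refcount[digest] == 0:
                del self.refcount[digest]
                del self.blocks[digest]


store = BlockStore(block_size=4)
store.write("a", b"AAAABBBBCCCC")
store.write("b", b"AAAABBBBDDDD")
print(len(store.blocks))  # 4 unique blocks stored for 6 written
```

The two files share their first two blocks, so only four blocks are kept. Deleting one file leaves the shared blocks in place until the last reference is gone.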
On Linux, there is SDFS; however, ZFS has some better features, such as the ability to keep the deduplication hash table on a solid-state drive so that it does not eat up enormous amounts of RAM. ZFS calls this SSD-backed cache L2ARC.
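For a rough sense of why the hash table gets large, here is a back-of-the-envelope estimate in Python. The function name is hypothetical, and both defaults are assumptions rather than figures from this answer: a 128 KiB block size and 320 bytes per table entry, a commonly quoted rule of thumb. Plug in the numbers for your own system.

```python
def dedup_table_bytes(pool_bytes, block_size=128 * 1024, entry_bytes=320):
    """Estimate worst-case dedup table size: one entry per unique block.

    block_size and entry_bytes are assumed values, not measured ones.
    """
    blocks = -(-pool_bytes // block_size)  # ceiling division
    return blocks * entry_bytes


TIB = 1024 ** 4
GIB = 1024 ** 3

estimate = dedup_table_bytes(1 * TIB)
print(f"{estimate / GIB:.1f} GiB of table per TiB of unique data")
```

Smaller blocks mean more entries, so the table grows quickly as the block size drops; that is the pressure an SSD-backed store is meant to relieve.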
As of the writing of this post, please do not use ZFS on Linux. It needs to stay in the oven for a few more years. Use a BSD for ZFS.
1
Yes. ZFS does this, though it might 'bend' your definition of 'real-time'.
Chris S