Is there a git-like file system?



Git stores content uniquely in its repo based on the calculated hash of any file. If my directory has two copies of the same file somewhere inside it, git will only actually store it once.
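To make the idea concrete, here is a minimal sketch of content-addressed storage. The `git_blob_id` function follows the way Git names a blob in a SHA-1 repository (a SHA-1 over a `blob <size>` header, a NUL byte, and the file content). The `ContentStore` class and the file paths are hypothetical and exist only for illustration; this is not how Git lays out objects on disk.

```python
import hashlib


def git_blob_id(data):
    """Return the object ID Git gives a blob with this content (SHA-1 repos)."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, illustration only)."""

    def __init__(self):
        self.objects = {}  # object ID -> content, each distinct content kept once
        self.names = {}    # path -> object ID

    def add(self, path, data):
        oid = git_blob_id(data)
        self.objects.setdefault(oid, data)  # no-op if this content is already stored
        self.names[path] = oid
        return oid


store = ContentStore()
store.add("app1/lib/util.dll", b"same bytes")
store.add("app2/lib/util.dll", b"same bytes")
store.add("app2/readme.txt", b"different bytes")
print(len(store.names), "paths,", len(store.objects), "stored objects")
# prints: 3 paths, 2 stored objects
```

The two identical files end up as two names pointing at one stored object, which is the behaviour the question is asking a file system to provide.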

I am wondering if this same concept has been implemented at the operating-system level as some kind of file system?

If a file system acted this way by default, it would go a long way toward solving DLL hell. Essentially, it would link duplicate files automatically on your behalf. Any application could be packaged (like a JAR) in a directory with all of its dependencies at no extra storage cost.
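As a rough user-space approximation of that "link automatically" behaviour, the sketch below walks a directory and replaces byte-identical files with hard links. The function names and the `.dedup-tmp` suffix are made up for this example, and it assumes a file system that supports hard links.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """SHA-256 of a file's content, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace byte-identical files under root with hard links.

    Returns the number of files that were replaced by a link.
    """
    first_seen = {}  # content digest -> first path seen with that content
    replaced = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # leave symlinks alone in this sketch
            original = first_seen.setdefault(file_digest(path), path)
            if original == path or os.path.samefile(original, path):
                continue  # first occurrence, or already linked
            tmp = path + ".dedup-tmp"  # hypothetical temporary name
            os.link(original, tmp)     # create the link next to the duplicate
            os.replace(tmp, path)      # then swap it in over the duplicate
            replaced += 1
    return replaced


with tempfile.TemporaryDirectory() as root:
    for app in ("app1", "app2"):
        os.makedirs(os.path.join(root, app))
        with open(os.path.join(root, app, "dep.bin"), "wb") as f:
            f.write(b"shared dependency")
    print("files replaced by links:", hardlink_duplicates(root))
```

One caveat: hard links are not the same as deduplication. Every linked name shares one file, so a write through any name changes all of them, whereas a deduplicating or copy-on-write file system keeps the copies logically independent.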

Ruby enthusiasts share libraries by publishing them as RubyGems. Still, this effort to share gems resulted in deployment nightmares, which led to the "Vendor Everything" practice of copying all dependencies into local folders.


Posted 2013-08-09T16:22:44.777

Reputation: 437


I'm not an expert, but check out ZFS.

– ForeverWintr – 2013-08-09T16:31:07.720



What you're looking for is called "deduplication". While it's usually implemented by vendors of specialized storage products, the ZFS filesystem implements it as well. Most Unix-derived operating systems can make use of ZFS, and I'd therefore recommend it as the first place to look.

Aaron Miller

Posted 2013-08-09T16:22:44.777

Reputation: 8 849

I see "deduplication" can be implemented at the file level, which is what I was concerned with in particular. – Mario – 2013-08-09T16:45:57.250


Network Appliance, Inc. has offered storage with this ability for many years. In fact, they filed complaints against Sun Microsystems over its ZFS file system, which does what Aaron Miller describes in his accepted answer. For what it's worth, the complaints were settled with Oracle after about 3 years.

I have used this as a corporate solution since 2000, and it works well. Storage beyond the first 'copy' is consumed only once one of the copies is changed. Until then, keeping many 'copies' costs only a slight increase in namespace usage.
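A toy model of that cost behaviour, for illustration only: files are lists of references to shared blocks, a copy duplicates only the references, and a new block is allocated only when something is written. The `CowVolume` class and its methods are hypothetical and say nothing about how any vendor's product is actually implemented.

```python
import itertools


class CowVolume:
    """Toy copy-on-write volume (hypothetical, illustration only)."""

    def __init__(self):
        self.blocks = {}                 # block number -> data
        self.files = {}                  # file name -> list of block numbers
        self._next = itertools.count()   # source of fresh block numbers

    def _alloc(self, data):
        number = next(self._next)
        self.blocks[number] = data
        return number

    def create(self, name, chunks):
        self.files[name] = [self._alloc(chunk) for chunk in chunks]

    def copy(self, src, dst):
        # Only the reference list is duplicated; no data blocks are allocated.
        self.files[dst] = list(self.files[src])

    def write(self, name, index, data):
        # The changed block gets fresh storage; other files keep the old block.
        self.files[name][index] = self._alloc(data)

    def blocks_in_use(self):
        return len({n for refs in self.files.values() for n in refs})


vol = CowVolume()
vol.create("report.doc", [b"part 1", b"part 2", b"part 3"])
vol.copy("report.doc", "copy1.doc")
vol.copy("report.doc", "copy2.doc")
print(vol.blocks_in_use())  # 3: three files, still only three blocks
vol.write("copy1.doc", 1, b"edited")
print(vol.blocks_in_use())  # 4: one extra block for the one change
```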

Strictly speaking, I don't think this answers the question at the "operating system level", but rather at the "file system level".


Posted 2013-08-09T16:22:44.777

Reputation: 286


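The new Apple File System (APFS, so named because there was already an AFS that was something else) does something close to this "auto hardlinking"/"deduplication" magic: when you copy a file, the copy is a copy-on-write clone that shares storage with the original until one of them is modified. As far as I know, it does not go looking for identical files that were written independently. macOS 10.13 supports it natively (on most Macs), as does iOS 11.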

That being said, I don't feel that's enough to make it "git like". If it were "git like", it would also have a cryptographic checksum of the state of my directory structure at given points in time, so that I could be sure no one had hacked my computer or modified my system directories. In fact, I use git repos to track certain critical system directories on my Macs, like Apache config files, LaunchDaemons, LaunchAgents, and a few others. That way, when I install software or run my server for a while, I can see if anything has gotten screwed up.
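For what I mean by a checksum of directory state, here is a simplified sketch: one digest that covers every file name and file content under a directory, so any change underneath changes the top-level value. It is in the spirit of Git's tree objects but does not use Git's actual tree format, and `tree_digest` and the sample file name are just names I picked for the example.

```python
import hashlib
import os
import tempfile


def tree_digest(root):
    """Return one SHA-256 hex digest covering names and contents under root."""
    h = hashlib.sha256()
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            kind, child = b"dir", tree_digest(entry.path)
        elif entry.is_file(follow_symlinks=False):
            with open(entry.path, "rb") as f:
                kind, child = b"file", hashlib.sha256(f.read()).hexdigest()
        else:
            continue  # symlinks and special files are skipped in this sketch
        # Each entry contributes its type, its name, and its child digest.
        h.update(kind + b"\x00" + os.fsencode(entry.name) + b"\x00"
                 + child.encode() + b"\n")
    return h.hexdigest()


with tempfile.TemporaryDirectory() as root:
    conf = os.path.join(root, "httpd.conf")
    with open(conf, "w") as f:
        f.write("Listen 80\n")
    before = tree_digest(root)
    with open(conf, "a") as f:
        f.write("Listen 8080\n")
    after = tree_digest(root)
    print("changed:", before != after)  # prints: changed: True
```

Comparing a stored digest against a fresh one tells you that something changed, but not what. That is where a real git repo earns its keep, since it can also show the diff.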

APFS also has nifty support for offloading things from the file system to the cloud when they haven't been used for a while. They still look like they're there, and they populate back down from the cloud on demand.

You could always build a Hackintosh and muck around with it. BSD is fun.


Posted 2013-08-09T16:22:44.777

Reputation: 111