Hear me out
I have seen this question asked in different forms here, here, and (perhaps the best one I found) here, but I do not think this is a duplicate: quite some time has passed since those questions were asked, and my question has its own nuances that may help others in similar situations. Please hear me out.
Background
There are many distributed filesystems, each advertised as excellent, but I suspect not all of them fit what I need.
I have looked at this awesome list for suggestions on what is available and am not sure which one fits my needs.
Use case
The purpose of this server is to keep my data safe and available for general use. I will be using it to store my personal backups, as well as data stored and used by Nextcloud, Gogs, and anything else I self-host in the future.
What I am looking for
I am looking for a distributed filesystem that:
- protects against bit rot
- has erasure coding (or at least data duplication so drive failure doesn't disrupt usage)
- can scale
  - from 1 server to more later
  - from 2 HDDs to more later
- can be mounted via FUSE
A powerful API and ease of use are big pluses.
My current hardware
This may not be important, but it may help with tips on implementation.
I currently have a Raspberry Pi, one 2 TB HDD, and one 4 TB HDD. I plan to add one more 2 TB HDD in the near future, and more servers with many more HDDs in the far future (money is tight right now; I am a poor college student).
My currently proposed solution
I have researched this a lot, and I get this is a little over my head, but here's what I've got so far:
I think Ceph is currently my best bet for flexibility, and it seems stable.
My plan would be to put BTRFS on the drives to handle bit rot, and then run Ceph as a single-node cluster for later expansion.
Questions about how this would work
Some specific questions I have about my proposed setup:
- I know that BTRFS can protect against bit rot, but is that enabled by default? If not, what do I need to do to enable it?
- I know that the inconsistency in drive size can be a problem (one 2 TB, one 4 TB), but will it work until I get another 2 TB drive?
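To make the drive-size concern concrete, here is a rough sketch of the arithmetic as I understand it. It assumes plain two-copy duplication where each copy must live on a different drive; that is my assumption for illustration, not something I have confirmed for Ceph or BTRFS specifically, and the function name is made up. Erasure coding would give different numbers.

```python
def usable_two_copy_capacity(drive_sizes_tb):
    """Upper bound on usable space (in TB) when every block is kept on two different drives.

    Hypothetical helper for illustration only; real systems reserve extra
    space for metadata and may place data differently.
    """
    if len(drive_sizes_tb) < 2:
        # With a single drive there is nowhere to put the second copy.
        return 0.0
    total = sum(drive_sizes_tb)
    largest = max(drive_sizes_tb)
    # Two copies of everything: at most half the raw total is usable.
    # The largest drive can only be mirrored against the space on all the others.
    return min(total / 2, total - largest)


current = usable_two_copy_capacity([2, 4])     # today: one 2 TB and one 4 TB drive
planned = usable_two_copy_capacity([2, 4, 2])  # after adding another 2 TB drive
print(f"current: {current} TB usable, planned: {planned} TB usable")
```

Under that assumption, half of the 4 TB drive would sit unused until the second 2 TB drive arrives.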
Thanks
I really appreciate you reading this far :)