With Hadoop and CouchDB all over the blogs and related news, what is a distributed, fault-tolerant storage (engine) that actually works?
- CouchDB doesn't actually have any distribution features built in; to my knowledge, the glue to automagically distribute entries or even whole databases is simply missing.
- Hadoop seems to be very widely used, or at least it gets good press, but it still has a single point of failure: the NameNode. Plus, it's only mountable via FUSE; as I understand it, HDFS isn't actually the main goal of Hadoop.
- GlusterFS does have a shared-nothing concept, but lately I have read several posts that lead me to believe it's not all that stable.
- Lustre also has a single point of failure, as it uses a dedicated metadata server.
- Ceph seems to be the player of choice, but the homepage states it is still in its alpha stages.
So the question is, which distributed filesystem has the following feature set (in no particular order):
- POSIX-compatible
- easy addition/removal of nodes
- shared-nothing concept
- runs on cheap hardware (AMD Geode or VIA Eden class processors)
- authentication/authorization built-in
- a network file system (I'd like to be able to mount it simultaneously on different hosts)
Nice to have:
- locally accessible files: I can take a node down, mount the partition with a standard local file system (ext3/xfs/whatever...), and still access the files
I'm not looking for hosted applications, but rather something that will allow me to take, say, 10 GB from each of our hardware boxes and have that storage available on our network, easily mountable on a multitude of hosts.
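
To put rough numbers on that idea, here is a minimal sketch of how much usable space such a pool might offer. It is not tied to any of the filesystems above. It assumes a simple model in which every file is stored as a fixed number of full copies on different nodes; the replica count and the function names are my own assumptions for illustration, and a real system would lose additional space to metadata and overhead.

```python
def usable_capacity_gb(nodes: int, per_node_gb: float, replicas: int) -> float:
    """Usable space if every file is stored `replicas` times across the pool.

    Assumed model: plain full-copy replication, no metadata overhead.
    """
    if nodes < 1 or replicas < 1:
        raise ValueError("nodes and replicas must be at least 1")
    if replicas > nodes:
        raise ValueError("cannot place more replicas than there are nodes")
    return nodes * per_node_gb / replicas


def tolerated_node_failures(replicas: int) -> int:
    """Nodes that can fail at once before the last copy of a file may be lost."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas - 1


if __name__ == "__main__":
    # 10 GB contributed by each box, as in the question; box counts are made up.
    for boxes in (5, 10, 20):
        for copies in (1, 2, 3):
            usable = usable_capacity_gb(boxes, 10, copies)
            spare = tolerated_node_failures(copies)
            print(f"{boxes} boxes, {copies} copies: "
                  f"{usable:.1f} GB usable, survives {spare} node failure(s)")
```

The trade-off is the usual one: with a single copy all of the contributed space is usable but any node failure can lose data, while each extra copy divides the usable space further.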