
I am currently planning the storage infrastructure for our new application. Since performance is high on our list of priorities, I want to use SSDs for the production environment. We sell a product that generates a lot of data (image/video hosting), but since we are still a startup, our budget is not endless.

I want to implement HA for our storage, but running two ~70 TB SSD clusters in distributed replication mode seems like overkill to me, since fortunately a node rarely fails. So I thought about running one SSD cluster in production while an HDD cluster acts as a failover that takes over if one node, or the whole SSD cluster, fails.

Is this feasible with GlusterFS (or a similar scalable distributed file system like cephfs) or is the whole concept stupid? The topic is still quite new to me, so I'm happy to learn something new!

Thank you.

dr-ing

1 Answer


Gluster is probably not the right solution for your problem, because:

  • with only two servers, it is going to have low random I/O performance;
  • your total write bandwidth will be limited by the server with the HDD pool.

I would suggest using ZFS with async send/recv from the SSD server to the HDD one, but be aware that you will need to coordinate a manual failover procedure if the first server dies.
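For illustration, async replication with send/recv can look roughly like this. The pool/dataset names (tank/data, backup/data) and the host name hdd-node are placeholders I made up, not anything from your setup:

```shell
# On the SSD server: take a snapshot named after the current time.
NOW=$(date +%Y%m%d%H%M)
zfs snapshot tank/data@"$NOW"

# First run: full send of the snapshot to the HDD server.
zfs send tank/data@"$NOW" | ssh hdd-node zfs recv -F backup/data

# Subsequent runs: incremental send relative to the previous snapshot
# (replace PREV with the name of the last replicated snapshot).
# zfs send -i tank/data@PREV tank/data@"$NOW" | ssh hdd-node zfs recv -F backup/data
```

You would typically drive this from cron or a tool like sanoid/syncoid, and keep a few snapshots on both sides so incrementals always have a common base.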

shodanshok
  • Thanks for the answer. Some more info, which I probably should have mentioned in the question: my plan was to start with 5 nodes/servers (4 x 3.84 TB SSDs per node) and then scale to X nodes over time. As you mentioned, I also planned to synchronize the data asynchronously between the production cluster and the failover, so that customers get at least the full speed of the SSDs. – dr-ing Nov 20 '20 at 07:47