I'm about to set up a Linux cluster of 5 physical server nodes (more nodes to be added later, probably).
- the cluster will be managed by Proxmox (and yes, it works on software RAID)
- shared storage will be implemented with Gluster in a redundant setup, with each physical server holding a brick (so data will be redundantly available from all machines)
- Percona XtraDB Cluster will be used as the main multi-master database, again with data shared by all physical machines
- each machine will have two HDDs of about 2-3 TB each, in a RAID1 setup
- all machines will be hosted in a large datacenter with redundant power supply etc.
- server specs can be seen here
- the purpose of the whole cluster is to distribute the workload and allow high availability. Any machine can go down at any time without this being a problem for the whole system.
One of the decisions left to make is whether to use software RAID1 or hardware RAID1 + BBU.
Software RAID is the solution I'm very familiar with (I have been managing a number of servers for 15 years and I know how the tools work). I have never had a serious problem with it (mostly just failed HDDs). These are the reasons why I prefer software RAID.
What I dislike about hardware RAID is the incompatibility between controller vendors and my lack of experience with them: different configuration options, different monitoring methods, different utility programs. That is not a good feeling when building a cluster system.
I know that, with a BBU, hardware RAID can be both fast and reliable (battery-protected write-back cache). However, since all data will be stored in a highly redundant manner in the cluster, my idea is to use software RAID1 and disable barriers in the file system to increase write performance. I expect this to give performance similar to hardware RAID1. Of course, I risk data loss due to the volatile write cache; however, IMHO that should be handled by the clustering mechanisms anyway (the whole machine should be able to restore its data from the other nodes after a failure).
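To make the "disable barriers" part concrete, this is roughly the kind of /etc/fstab entry I have in mind. It is only a sketch under assumptions: the file system is assumed to be ext4 (which accepts the `barrier=0` / `nobarrier` mount options), and the device `/dev/md0` and mount point `/srv/data` are placeholders, not my actual layout.

```
# /etc/fstab fragment (hypothetical device and mount point)
# barrier=0 turns off write barriers for this ext4 file system,
# trading crash safety for write speed on the md RAID1 array.
/dev/md0   /srv/data   ext4   defaults,barrier=0   0   2
```

With barriers off, a power loss or crash can leave the file system on that node inconsistent, which is exactly the risk I would be relying on the cluster to absorb.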
I'm not concerned about the CPU resources needed by a software RAID implementation.
Is my assumption correct, or am I missing some important detail that would help me make the right choice?