
I'm familiar with setting up RAID arrays and am running a few in my home environment.

I was wondering if it's possible to have multiple storage servers in a failover configuration.

What I hope to achieve with this is to have a certain redundancy with these servers. When one server breaks down, it can be replaced entirely without loss of data. I don't mean just a disk failure, but something more serious. Maybe a BIOS corruption. I've had those before... It wasn't pretty.

What would be the best way to achieve this and how would I have to set it up?

Mephy
  • Do you mean multiple servers controlling the same disks? – mzhaase Sep 27 '16 at 07:48
  • No, I mean multiple servers with their own disks. – Mephy Sep 27 '16 at 07:50
  • What exactly do you want to achieve? Your question isn't very clear. – mzhaase Sep 27 '16 at 07:51
  • I was afraid of that, yes. And I'm not sure how to clarify. I'd like to have a rack of servers, all with their own disks. If one of these servers fails in one way or another, it must be possible to take the defective server out of the rack and replace it with a new one and have the other servers restore the data back onto the new one. It should work like RAID 10 but not with disks, rather with entire servers, while still having a RAID setup of some kind on those individual servers. – Mephy Sep 27 '16 at 07:59
  • DRBD can be used to keep block devices in sync between hosts. Is that what you're asking about? – EEAA Sep 27 '16 at 08:03
  • You want a failover cluster. What OS are you using? Is it a NAS or SAN? Individual servers or heads with HDD racks? – mzhaase Sep 27 '16 at 08:10
  • [This Q&A](http://serverfault.com/q/805421/37681) might be a duplicate – HBruijn Sep 27 '16 at 08:18
  • Sounds like a job for a regular Failover Cluster and a proper SAN. What you describe sounds overly complicated and like a management hell. – Daniel Sep 27 '16 at 09:18

3 Answers


Yes, there are a whole slew of solutions for this. However, you haven't stated what OS you intend to use, or what scale this will be run at, or how this data is meant to be used. Those are some of the most important questions.

What you're referring to is a replicated filesystem or block device. In almost all cases, a minimum of two nodes is required to do this, with at least three recommended for quorum reasons (you need a tie breaker when nodes vote on whether a file is bad or a node is down; majority rules, so the node total must be three or higher). Quorum and fencing are core concepts in clustering of just about any kind, and three-node (or larger) clusters are the way to go, especially when we're talking about a storage cluster.

Linux has a few great solutions for this. DRBD is a replicated block device that you can use to synchronize one or more partitions or disks from one node to potentially many, and from many to one. DRBD is better suited to failover situations than to simultaneous node access, as it has a master/slave relationship between its nodes.

DRBD makes no assumptions as to what you'll be doing with these block devices after it has synchronized and presented them to you. It doesn't care if you try to mount a filesystem that can't be clustered, so be careful. Filesystems like OCFS2 will allow you to access DRBD devices simultaneously, but in most cases people use DRBD in a failover capacity, mounting a non-clustered filesystem (such as ext4) on only one node at a time. There are very good reasons for doing this. The biggest one is that we get much better performance if we don't have to replicate data immediately after it is written, as we would have to do in a shared-mount scenario (such as with OCFS2). More on DRBD here: https://www.drbd.org
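
For illustration only, a minimal two-node DRBD resource definition might look something like the sketch below. The hostnames, IP addresses, and backing partition are placeholders I've made up, not anything from your environment:

```
# /etc/drbd.d/r0.res -- minimal two-node sketch (DRBD 8.4-style syntax).
# node-a, node-b, the IPs, and /dev/sdb1 are placeholder values.
resource r0 {
    net {
        protocol C;                   # fully synchronous replication
    }
    on node-a {
        device    /dev/drbd0;
        disk      /dev/sdb1;          # backing partition on node-a
        address   192.168.10.11:7789;
        meta-disk internal;
    }
    on node-b {
        device    /dev/drbd0;
        disk      /dev/sdb1;          # backing partition on node-b
        address   192.168.10.12:7789;
        meta-disk internal;
    }
}
```

From there it's roughly `drbdadm create-md r0` and `drbdadm up r0` on both nodes, then `drbdadm primary --force r0` on whichever node should mount the (non-clustered) filesystem. A cluster manager such as Pacemaker is usually what promotes the surviving node on failover.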

For a replicated and distributed "cloud" model that many nodes can access simultaneously, GlusterFS is a great solution that is quite simple to set up. It's best used in cloud environments such as storage array clusters and VM host clusters that need a shared filesystem between all nodes.

With Gluster, we replicate filesystems across nodes, and interacting with it feels more like file sharing with a NAS (with one important difference: your "server" is many nodes rather than one big storage array). This is fantastic for highly available storage arrays running and supporting Linux systems, and it performs extraordinarily well at scale. It has no concept of master/slave; all nodes participate equally and at the same time. Adding more nodes typically means more performance in reasonable configurations. Gluster is highly modular and configurable.

Gluster is more complicated when you begin connecting Windows systems to it, because Windows can't talk to Gluster directly (the filesystem instead has to be re-shared over CIFS via Samba, and Samba can get complicated). However, for Linux-to-Linux sharing, Gluster is incredibly easy to deploy and manage. More on that here: https://www.gluster.org/
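
To give a feel for how simple the Linux-to-Linux case is, a three-node replicated volume looks roughly like this (hostnames gluster1-3 and the brick path /data/brick1 are just example values):

```
# On every node: install and start the Gluster daemon
# (package name shown for Debian/Ubuntu; use yum/dnf on EL systems).
apt-get install glusterfs-server
systemctl enable --now glusterd

# From gluster1: form the trusted pool and create a 3-way replicated volume.
gluster peer probe gluster2
gluster peer probe gluster3
gluster volume create gv0 replica 3 \
    gluster1:/data/brick1/gv0 \
    gluster2:/data/brick1/gv0 \
    gluster3:/data/brick1/gv0
gluster volume start gv0

# Any Linux client mounts it with the native FUSE client:
mount -t glusterfs gluster1:/gv0 /mnt/gv0
```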

In a similar vein to Gluster, you can look into Ceph. This is a distributed object store that can have block and filesystem translators running in front of it. It's very commonly used in OpenStack clouds, currently even more so than Gluster. It has filesystem capabilities as well as object storage and block storage. It's more complicated than Gluster, but currently more feature-rich. http://docs.ceph.com/docs/jewel/
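
Just to show the block-storage side once a Ceph cluster is already up and running (building the cluster itself is the involved part; the pool and image names here are made up), carving out a block device looks roughly like:

```
# Create a pool and a 10 GB RADOS Block Device (RBD) image in it,
# then map it on a client and use it like any other block device.
ceph osd pool create rbd-pool 128          # 128 placement groups
rbd create rbd-pool/vol1 --size 10240      # size given in MB
rbd map rbd-pool/vol1                      # shows up as /dev/rbd0 (or similar)
mkfs.ext4 /dev/rbd0
mount /dev/rbd0 /mnt/vol1
```

Depending on your kernel, you may have to disable some newer image features before the kernel RBD client will map the image.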

And finally, for Windows we have XtreemFS, which can provide a replicated block store (kind of like Ceph), and DFS.

DFS performs acceptably at small scales and can be used to synchronize filesystems across many simultaneous nodes. It's similar to Gluster, if you will. It does not perform as well and isn't as feature-rich, but at small scale it gets the job done quickly (and is very easy to set up on a Windows server). Windows clients can connect to it natively, unlike the solutions above.

None of these technologies have anything to do with RAID. If you wish to have a RAID array on the constituent nodes within your storage cluster, you can. However, many deployments of larger cloud filesystems only worry about redundancy at the node level, opting not to use RAID at all. When a node's disk goes bad, take the node offline, replace the disk, bring the node back online, and resync the whole node. That's the idea, anyway. At smaller scales, RAID still makes sense.

In general, yeah. What you're asking is possible.

Spooler

Take a look at this cool project: Ceph. It gives you a fully distributed storage pool with good performance as well.

If you want to stick with old-style solutions anyway, DRBD does block-to-block synchronization between servers, RAID 1-style. But it is resource-intensive.

moutonjr

This is called a failover cluster. There are multiple ways to achieve it. Some work at the filesystem level, like GlusterFS or ZFS replication. Some work at the hardware level, like driving HDD trays with multiple head units. Usually you want to do both, to achieve both geographic and hardware redundancy.

A head unit is basically the file server; the HDD trays are just dumb enclosures full of HDDs connected to the head. The head unit does all of the work. Usually you can pair these head units together, so that if one fails, the other takes over transparently. Even some entry-level NAS units support failover.

Aside from that, you probably do want to sync to a different geographic location. This is usually done at the FS level, but it can be problematic depending on usage and network connectivity.
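
If you go the ZFS replication route for that, the basic mechanism is snapshot-based send/receive, typically piped over SSH. A rough sketch (the pool and dataset names are placeholders):

```
# Initial full replication of tank/data to a pool "backuppool" on backup-host.
zfs snapshot tank/data@monday
zfs send tank/data@monday | ssh backup-host zfs receive backuppool/data

# Later runs only send the delta between the last two snapshots.
zfs snapshot tank/data@tuesday
zfs send -i tank/data@monday tank/data@tuesday | \
    ssh backup-host zfs receive backuppool/data
```

Wrap that in a cron job (or use one of the existing tools that automate it) and you have asynchronous offsite replication.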

mzhaase