Failover strategies for home server (RAID, gluster, etc.)

1

1

I have a Raspberry Pi at home, running Raspbian and some servers like Apache, MySQL and ssh. The Raspberry is directly connected (LAN) to the home router and to a 2TB external ext4-formatted hard drive. As there is important stuff on it (backups, pictures, documents, etc.) I rsync the whole external drive every 2 weeks to another external drive. Everything has worked fine up to now, but lately the main drive seems to be having trouble (it gets remounted read-only, and fsck fixes several errors).

Because of this (and also because storage will soon run out) I'm currently looking for safer - if possible automatic - methods to have the data saved securely.

First I thought of a RAID that would save files and backups over multiple drives. Although I'm not sure how I would implement this...

Later I found glusterfs which seemed to have some advantages:

  • Gluster can split up large files (AFAIK improves access speed)
  • Gluster can save files on multiple volumes, and is able to manage a drive failure automatically

However I'm again not sure if my Raspberry Pi could act as a gluster master as well as a gluster slave; still running the other services as well.

I'd like to be able to "hot-swap" a broken drive and let the system recover itself, without having to care about data integrity. Safety and availability are more important than the access speed. Storage capacity should be between 2TB and 4TB.

How many drives and what software configuration would I need to set up to have this comfort?

Thank you for any suggestions!

pentix

Posted 2016-03-17T00:29:17.017

Reputation: 113

Answers

2

First off - RAID IS NOT BACKUP. RAID protects against hard drive failure, and that's about it. It's worthwhile to do, but it does not provide protection against data corruption, theft, accidental deletes, or Cryptolocker-type attacks. You do want to use RAID if you intend to do hotswap.

I'd be inclined to look at a 2 part solution - Use RAID to increase reliability of your disks and provide availability and hotswap. (Note you probably need to use RAID1, so a couple of 2TB or 4TB disks - DO NOT USE RAID 5).

In order to use RAID you would implement "Software RAID" - typically provided by "mdadm".
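Once a software RAID array exists, the Linux kernel reports its state in /proc/mdstat, which is the usual place to spot a failed member that needs swapping. Below is a minimal Python sketch of that check. The sample text, the array name md0 and the device name sda1 are made up for illustration, and the parser only looks at the [UU] / [U_] member map, so treat it as a starting point rather than a complete monitor.

```python
import re

# Hypothetical /proc/mdstat content for a two-disk RAID1 with one member missing.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sda1[0]
      2930134016 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose member map shows a missing disk."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line ends with a map such as [UU] (healthy)
        # or [U_] (one member missing or failed).
        status = re.search(r"\[([U_]+)\]\s*$", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real system you would pass the contents of /proc/mdstat instead of the sample string, for example from a cron job that sends a mail when the list is not empty.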

I'd then look at a way of doing offsite/offline mirroring/archiving - there are a number of ways of doing this - RSnapshot is a good idea which allows incremental backups, or maybe setting up Owncloud in case your gear is nicked or you do something stupid.
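To illustrate what "incremental" means here: snapshot-style backups usually keep one directory per run and hard-link unchanged files to the previous run, so each snapshot looks complete but only changed files take up new space. The sketch below builds an rsync command using its --link-dest option to get that effect. It is not rsnapshot's own configuration, and the paths and snapshot names are hypothetical.

```python
import shlex


def snapshot_command(source, snapshot_root, new_name, previous_name=None):
    """Build an rsync command that writes a new snapshot directory.

    Files unchanged since `previous_name` are hard-linked instead of copied.
    Absolute paths are assumed, since rsync resolves a relative --link-dest
    against the destination directory.
    """
    cmd = ["rsync", "-a", "--delete"]
    if previous_name is not None:
        cmd.append(f"--link-dest={snapshot_root}/{previous_name}")
    # Trailing slash on the source: copy its contents, not the directory itself.
    cmd += [f"{source.rstrip('/')}/", f"{snapshot_root}/{new_name}/"]
    return cmd


if __name__ == "__main__":
    # Hypothetical mount points and snapshot names.
    cmd = snapshot_command("/mnt/raid", "/mnt/backup", "2016-03-18", "2016-03-04")
    print(shlex.join(cmd))
```

The function only builds the command; running it (for example with subprocess.run) is left out on purpose so nothing gets touched by accident.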

davidgo

Posted 2016-03-17T00:29:17.017

Reputation: 49 152

Thank you, I'll have some thoughts about it and share it again with you :) Due to a lack of reputation on SF I can't upvote your solution... – pentix – 2016-03-17T06:03:04.443

So am I right in the conclusion, glusterfs would not be an alternative to RAID? Doesn't it also provide the hot-swap possibility? – pentix – 2016-03-17T06:03:43.957

I have not used GlusterFS, but yes, it would be an alternative to RAID. I very much doubt it would offer hot-swap, but it would allow you to continue to operate even if a drive fails, and allow you to recreate the data from a backup, giving you similar functionality. I confess to being surprised that a Pi is powerful enough to run GlusterFS - although it appears this is the case. You would, I believe, need more than 1 Pi to make Gluster work, however. – davidgo – 2016-03-17T09:12:01.903

Thanks again for your answer! I'm currently thinking about setting up a RAID-1 using 2x3TB, using OwnCloud as "version control" software to have the possibility to jump back to an old state of a file. Once I have connected the two drives (A, B), created a RAID and mounted it, is there a possibility to unmount drive (B), connect the old 2TB drive (C) to copy all the data from (C) to (A), and then unmount (C) and remount (B) to let mdadm mirror all the files from (A) to (B)? The reason I'm asking this is because my Pi has only got 2 USB hubs... – pentix – 2016-03-18T00:14:44.757

1

You can do this as a one off - although you would be much better served by setting the RAID up initially as a degraded array, copying the data to it and then building the array - this will save days of building and rebuilding it. (You don't want to do this on a regular basis.) https://zmonkey.org/blog/content/create-degraded-raid1-array shows 1 way to do this, but I typically use 2 devices and set 1 drive as missing, as per the accepted answer at http://unix.stackexchange.com/questions/63928/can-i-create-a-software-raid-1-with-one-device, and add it afterwards. (cont.)

– davidgo – 2016-03-18T02:06:50.340
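The "create with one drive missing, add the second later" approach can be sketched as two mdadm invocations. The Python below only builds and prints the command lines; it does not run them. The device names /dev/md0, /dev/sda1 and /dev/sdb1 are assumptions for illustration, so check them against your own system first, because creating an array on the wrong partition destroys its contents.

```python
import shlex


def degraded_raid1_commands(array, first_member, later_member):
    """Return (create, add) argument lists for a two-step RAID1 setup.

    Step 1 creates a two-device RAID1 with the second slot given as the
    keyword 'missing', so the array starts out degraded but usable.
    Step 2 adds the second partition later, which triggers the resync.
    """
    create = [
        "mdadm", "--create", array,
        "--level=1", "--raid-devices=2",
        first_member, "missing",
    ]
    add = ["mdadm", "--manage", array, "--add", later_member]
    return create, add


if __name__ == "__main__":
    # Hypothetical device names.
    create, add = degraded_raid1_commands("/dev/md0", "/dev/sda1", "/dev/sdb1")
    print(shlex.join(create))
    print("# ... make a filesystem, mount it, copy the old data across ...")
    print(shlex.join(add))
```

Between the two steps the array holds the only copy of whatever you put on it, so keep the old drive untouched until the resync has finished.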

Also note that RAID works at a BLOCK level - ideally at a partition level in the case of MDADM. This means it does not mirror files, it mirrors the entire block device. This is a lot slower because - at least until you become a RAID Ninja - it needs to sync the entire disk, which happens slowly in the background. Also, Owncloud is good for offsite backup (and maybe, or maybe not, version control). If you need point-in-time recovery, RSNAPSHOT may be a better option. – davidgo – 2016-03-18T02:09:22.117
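To put a rough number on "sync the entire disk": because the whole block device is copied no matter how much data is on it, the time is simply device size divided by sustained throughput. The figure of 30 MB/s below is an assumption for drives on a Pi's USB 2.0 bus, not a measurement, and background syncs that share the bus with normal use can be considerably slower.

```python
def full_sync_hours(device_bytes, throughput_bytes_per_s):
    """Hours needed to copy every block of a device at a steady throughput."""
    return device_bytes / throughput_bytes_per_s / 3600


if __name__ == "__main__":
    three_tb = 3 * 10**12      # a 3TB drive, in bytes
    assumed_rate = 30 * 10**6  # assumed 30 MB/s sustained over USB 2.0
    print(f"{full_sync_hours(three_tb, assumed_rate):.1f} hours")
```

Doing this twice (build, then rebuild after swapping drives around) is where the saving from starting with a degraded array comes from.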

Owncloud would more be used as a convenient way to prevent the accidental delete of a file. And instead of sftp I would rather use WebDAV to provide access for my Windows users... – pentix – 2016-03-18T08:37:30.000

The accepted answer in your second link seems to be very useful for me! Another question: Would it also be possible to set one disk as missing, remove it, attach another disk to copy some files from the (incomplete) RAID to the newly attached disk, unmount it again and then re-add the other drive (without having to wait days to rebuild the entire RAID array)? – pentix – 2016-03-18T08:40:34.577

No. When you set a disk as missing you remove it from the RAID setup, and it will need to resync. I've never heard of anyone other than myself doing it (so higher risk), but I have in the past used DRBD (which is designed for network RAID) to achieve something similar, and I believe it would work here - http://my.host.net.nz/2012/09/30/on-demand-raid-for-laptop-with-ssd-and-usb-disk/ details what I did, how I did it and how it performed in my use case. Unlike MDADM, DRBD tracks the changed blocks to allow resyncing [because network failures are common].

– davidgo – 2016-03-18T18:37:33.043

Thank you, I'll have a look at it. I might use a USB replicator to mount the two disks and be able to mount another drive as well... – pentix – 2016-03-18T20:10:43.480