What is the best way to synchronize a huge amount of data from a running production server?
Our server has over 20 million files (ranging from small files of around 10 KB up to larger files of 50 MB) stored in about 1 million directories. The total size of the data is about 5 TB and steadily increasing.
Is it possible to synchronize the data with lsyncd, and what are the limits (especially of inotify)? How much additional space does lsyncd need? What about the load (CPU and memory), and how quickly are changes replicated?
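For scale, here is a rough back-of-envelope for lsyncd's inotify footprint on this data set. The ~1 kB-per-watch figure is an assumption (a commonly quoted estimate; the real cost depends on kernel version and architecture), and the watch count simply reuses the ~1 million directories from above:

```shell
# Back-of-envelope: lsyncd registers one inotify watch per directory.
# Assumption: roughly 1 kB of kernel memory per watch (rule of thumb,
# not an exact figure for any particular kernel).
DIRS=1000000          # ~1 million directories, as stated above
BYTES_PER_WATCH=1024  # assumed per-watch kernel cost
echo "watches needed:       $DIRS"
echo "approx kernel memory: $(( DIRS * BYTES_PER_WATCH / 1024 / 1024 )) MB"

# The default fs.inotify.max_user_watches is usually far lower than a
# million (often 8192 or 65536), so it would have to be raised before
# lsyncd could watch the whole tree:
if [ -r /proc/sys/fs/inotify/max_user_watches ]; then
    echo "current limit: $(cat /proc/sys/fs/inotify/max_user_watches)"
fi
# As root, and persisted in /etc/sysctl.conf:
# sysctl -w fs.inotify.max_user_watches=1200000
```

So on top of lsyncd's own memory, roughly a gigabyte of kernel memory would go to the watches alone under this assumption.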
Another solution would be GlusterFS. Is it possible to introduce GlusterFS on a production system with no or minimal downtime? GlusterFS stores a lot of metadata in extended attributes, and the storage volume ends up about 15 to 20% larger than on systems without GlusterFS. That seems like a huge amount of waste...? And what about the load?
And finally, rsync driven by cron jobs could do the job. rsync would only run on the slave, so no additional space is needed on the primary server, but rsync must read the full directory tree every time the cron job runs...
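The cron variant might look like this hypothetical crontab fragment on the slave (the host name, module, paths, and schedule are all placeholders); `flock -n` skips a run if the previous pass is still scanning the 20-million-file tree, which avoids overlapping rsyncs:

```
# /etc/cron.d/sync-from-primary  (hypothetical; adjust paths and schedule)
# flock -n: skip this run if the previous one has not finished yet.
# --archive  preserve permissions, times, symlinks, etc.
# --delete   remove files on the slave that were deleted on the primary
# --partial  keep partially transferred large files across runs
0 3 * * * root flock -n /run/sync-from-primary.lock rsync --archive --delete --partial rsync://primary.example.com/data/ /srv/data/
```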