
By large file tree I mean about 200k files, and growing all the time. A relatively small number of files are being changed in any given hour though.

By bidirectional I mean that changes may occur on either server and need to be pushed to the other, so rsync doesn't seem appropriate.

By distant I mean that the servers are both in data centers, but geographically remote from each other. Currently there are only 2 servers, but that may expand over time.

By real-time I mean it's OK for there to be a little latency between syncs, but running a cron job every 1-2 minutes doesn't seem right, since only a very small fraction of files may change in any given hour, let alone any given minute.

EDIT: This is running on VPSes, so I might be limited in the kinds of kernel-level things I can do. Also, the VPSes are not resource-rich, so I'd shy away from solutions that require a lot of RAM (like Gluster?).

What's the best / most "accepted" approach to get this done? This seems like it would be a common need, but I haven't been able to find a generally accepted approach yet, which was surprising. (I'm seeking the safety of the masses. :)

I've come across lsyncd, which triggers a sync at the filesystem-change level. That seems clever though not super common, and I'm a bit confused by the various lsyncd approaches. The simplest is lsyncd with rsync, but that seems like it could be fragile for bidirectionality, since rsync has no notion of memory (e.g. to know whether a file deleted on A should be deleted on B, or whether it's a new file on B that should be copied to A). lipsync appears to be just an lsyncd+rsync implementation, right?
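For what it's worth, a basic one-way lsyncd+rsync setup is just a small Lua config file. The hostnames and paths below are placeholders, so this is only a sketch of the general shape, not a tested config:

    -- /etc/lsyncd.conf (sketch; paths and hosts are made up)
    settings {
        logfile        = "/var/log/lsyncd.log",
        statusFile     = "/var/log/lsyncd-status.log",
        statusInterval = 10,
    }

    -- Watch /data and push changes to the other server over rsync+ssh.
    -- Note this is strictly one-way, which is exactly the bidirectionality
    -- problem described above.
    sync {
        default.rsyncssh,
        source    = "/data",
        host      = "serverB.example.com",
        targetdir = "/data",
        delay     = 5,  -- batch filesystem events for a few seconds before syncing
    }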

Then there's using lsyncd with csync2, like this: https://icicimov.github.io/blog/devops/File-system-sync-with-Csync2-and-Lsyncd/ ... I'm leaning towards this approach, but csync2 is a little quirky, though I did do a successful test of it. I'm mostly concerned that I haven't been able to find a lot of community confirmation of this method.

People on here seem to like Unison a lot, but it seems that it is no longer under active development and it's not clear that it has an automatic trigger like lsyncd.

I've seen Gluster mentioned, but maybe overkill for what I need?

UPDATE: FYI, I ended up going with the original solution I mentioned: lsyncd+csync2. It works quite well, and I like the architectural approach of keeping the servers very loosely coupled, so that each server can operate indefinitely on its own regardless of the link quality between them. A sketch of the configuration is below.
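For reference, the csync2 side roughly follows the article linked above. The group name, hostnames, and paths here are placeholders, so treat this as a sketch rather than a drop-in config:

    # /etc/csync2.cfg (sketch; identical on both servers, which also share the key)
    group mycluster {
        host serverA.example.com serverB.example.com;
        key  /etc/csync2.key;      # generate once with: csync2 -k /etc/csync2.key
        include /data;
        exclude *.tmp;
        auto younger;              # on conflict, the newer copy wins
    }

lsyncd is then configured to run csync2 -x whenever inotify reports changes under /data (see the linked article for that part).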

dlo
  • What kind of changes do you need to handle? E.G. creation, deletion, modification. – sciurus Sep 16 '11 at 22:22
  • Also, do you expect conflicts? Could the same file be modified on both servers? – sciurus Sep 16 '11 at 22:24
  • All changes: creation, deletion, modification. There is a potential for conflicts, but they should be rare. I wouldn't mind if I simply receive an alert on a conflict that I then have to resolve manually. – dlo Sep 17 '11 at 12:30

6 Answers

5

DRBD in Dual-primary mode with a Proxy is an option.
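In case it helps, dual-primary is essentially two directives in the resource definition. The hostnames, devices, and addresses below are placeholders, and you would still need a cluster filesystem (GFS2/OCFS2, as another answer notes) on top, so this is only a sketch:

    # /etc/drbd.d/r0.res (sketch for a DRBD 8.3-era setup; names and devices are made up)
    resource r0 {
        protocol C;                  # synchronous replication; required for dual-primary
        net {
            allow-two-primaries;     # let both nodes hold the Primary role
        }
        startup {
            become-primary-on both;
        }
        on serverA {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.1:7788;
            meta-disk internal;
        }
        on serverB {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.2:7788;
            meta-disk internal;
        }
    }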

quanta
  • 50,327
  • 19
  • 152
  • 213
  • The Proxy appears to be neither open source nor free, right? I'm not sure I understand the consequence of not having a Proxy in async mode: during an extended downtime, if there's no Proxy, the [small?] output buffer could fill up and we would lose the sync? Is it hard to recover from that? – dlo Sep 17 '11 at 12:21
  • See my answer above. I don't think the proxy is the thing you need. Even during a small downtime the DRBD meta-device will mark "dirty" blocks and will transfer them after the connection is up again. I think the main difference between the proxy and async mode is that async mode uses a maximum buffer of some MBs; after that it syncs up before filling the buffer again. The proxy probably allows for a bigger buffer (needed if you have high latency or can write much faster locally than remotely). – Nils Sep 22 '11 at 20:37
3

In your case I would recommend a combination of DRBD in dual-primary mode and GFS or OCFS.

The drawback of DRBD in dual-primary is that it will be running in synchronous mode. But write speed does not seem to be important here, right?

An alternative to DRBD might be a software RAID 1 across several (2+) iSCSI targets - but I would prefer DRBD with two nodes.
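To make that concrete, once the DRBD device is Primary on both nodes you would format it with a cluster filesystem rather than ext4. A sketch only; it assumes /dev/drbd0 is up and that the OCFS2 o2cb cluster stack is already configured, which is omitted here:

    # Format once, from ONE node only (2 node slots for the two servers)
    mkfs.ocfs2 -N 2 -L datavol /dev/drbd0

    # Mount on BOTH nodes; a plain ext4/xfs here would corrupt under concurrent writes
    mount -t ocfs2 /dev/drbd0 /data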

Nils
  • Synchronous mode would be bad--I don't need it, and I wouldn't want to undermine performance since the servers are connected over a WAN across continents. But can't you have dual-primary in async mode? – dlo Sep 17 '11 at 12:28
  • I am currently using DRBD 8.3.5 - there you have to be in sync mode ("C") to get into dual-primary mode. I have no personal experience with DRBD Proxy, but it seems to be similar to Veritas Volume Replicator - and it is probably not suited here since you want write access on both sides. Sync mode at block level might not be as bad as you think - perhaps GFS and/or OCFS can buffer writes. – Nils Sep 17 '11 at 20:25
  • I just checked a [german article](http://www.linux-magazin.de/Online-Artikel/GFS2-und-OCFS2-zwei-Cluster-Dateisysteme-im-Linux-Kernel) comparing GFS2 and OCFS2. From that at least OCFS2 seems to support buffered file-system-access. GFS2 is recommended in that article since it is older. See [RedHat documentation](http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Global_File_System_2/s1-ov-lockbounce.html) on GFS2 for details about GFS2 - it uses buffering, too - but you should use different dirs for concurrent writes to get the best performance. – Nils Sep 17 '11 at 20:36
2

Rather than syncing, why not share the same filesystem over NFS?
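For completeness, that would just be an export on one server and a mount on the other; the paths and addresses below are placeholders:

    # On the server holding the data: /etc/exports
    /data  10.0.0.2(rw,sync,no_subtree_check)

    # On the other server:
    mount -t nfs 10.0.0.1:/data /data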

Bart B
  • NFS is awful, just awful. Anything would be better than NFS – AliGibbs Sep 14 '11 at 14:31
  • One of the main points of the multi-server setup is failover/redundancy. So one server must be able to continue without the other. – dlo Sep 14 '11 at 15:01
  • You should have mentioned that in your question then - no need to vote down a perfectly reasonable answer! – Bart B Sep 14 '11 at 16:20
  • fyi I didn't downvote it--somebody else did. But yes, I should have mentioned that to begin with. – dlo Sep 14 '11 at 16:41
  • @Bart: Well - he did mention that there is concurrent access at two distant sites. So even if you put up HA NFS, that would be a bad solution, since one side would suffer from latency during NFS access. And I did not downvote either. But I've been an NFS admin long enough to support AliGibbs. :-/ – Nils Sep 20 '11 at 19:28
2

Implementing a distributed filesystem is probably better than hacking this together with tools and scripts, especially if the cluster of servers will grow. You'll also be able to handle a downed node better.

I don't think Gluster (or AFS) is overkill at all.
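As a rough idea of what a two-node replicated Gluster volume looks like (hostnames and brick paths are placeholders, and this assumes glusterd is already running on both servers):

    # Run once, from serverA
    gluster peer probe serverB
    gluster volume create datavol replica 2 serverA:/bricks/data serverB:/bricks/data
    gluster volume start datavol

    # On each node, mount the volume where the application expects the tree
    mount -t glusterfs localhost:/datavol /data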

  • Gluster requires 1GB ram? http://www.gluster.com/community/documentation/index.php/Gluster_3.2:_Checking_Minimum_Requirements ... I'm also on a VPS, so I'm not sure about making kernel level changes that AFS might require. But I'm starting to see that a proper distributed fs is the better path. – dlo Sep 17 '11 at 12:23
  • Yeah, sorry I didn't catch earlier that you were using VPS hosts. Gluster memory footprints, both server and client, are not small and they can grow substantially. DRBD sounds more appropriate. –  Sep 17 '11 at 20:17
  • AFS is the way to go. – Anthony Giorgio Sep 23 '11 at 13:34
0

As the other answers show, many solutions are available, each with its advantages and drawbacks.

I think I would consider placing the whole tree under version control (Subversion, for instance) and periodically checking in/updating from both servers in cron jobs.
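Something along these lines - a sketch only, with placeholder paths and interval, and note that deletions and conflicts would need extra handling:

    # crontab entry on each server: commit local changes, then pull in remote ones.
    # 'svn add --force' picks up new files; deleted files would still need 'svn rm'.
    */5 * * * * cd /data && svn add --force -q . && svn commit -q -m "auto-sync" && svn update -q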

0

Having just finished something of a quest over the same question, I'm going with Gluster. However, I haven't done or found any performance tests.

cbaltatescu