My problem concerns a master-master file synchronization setup with three master nodes, each in a different data center. I have three application servers where users can create, modify, and delete files, and I need to keep them in sync, ideally with low latency (2 minutes is acceptable, real time is ideal). We have 376,136 files totaling 100 GB, with at most 1,000 files changed (created, deleted, or modified) per day. It's a fair assumption that a file won't be modified on two different servers at the same time.
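
As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. It assumes 1 GB = 10^9 bytes and that changed files are about average in size; neither has been measured, so treat the output as an order of magnitude only.

```python
# Back-of-the-envelope estimate of the daily sync volume.
# Assumptions (not measured): 1 GB = 10**9 bytes, and changed files
# are roughly the same size as the average file.
TOTAL_FILES = 376_136
TOTAL_BYTES = 100 * 10**9
CHANGES_PER_DAY = 1_000  # stated upper bound

avg_file_bytes = TOTAL_BYTES / TOTAL_FILES
daily_bytes = avg_file_bytes * CHANGES_PER_DAY
seconds_per_change = 24 * 60 * 60 / CHANGES_PER_DAY

print(f"average file size: {avg_file_bytes / 1e3:.0f} kB")          # 266 kB
print(f"daily change volume: {daily_bytes / 1e6:.0f} MB")           # 266 MB
print(f"one change every {seconds_per_change:.0f} s on average")    # 86 s
```

If those assumptions hold, the daily volume is only a few hundred megabytes, so the harder part is probably detecting changes in a tree of this size rather than moving the data.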

I have googled this a lot over the last week, and I have yet to find a "THIS IS IT!" solution.

The options I have seen are:

  • Unison: Abandonware (my sysadmin claims it isn't reliable)
  • Rsync: Doesn't handle deletes in a two-way setup, and it's not meant to be bidirectional
  • Osync: It could work, but it seems it may be hindered by a large file tree
  • lsyncd: Judging from its GitHub page, it seems the best option so far
  • Minio (used as S3-style object storage): It's not designed for a master-master setup, but as a distributed storage solution
  • Cloud storage: It would be ideal, but there isn't a good cloud provider in our country, and international internet speeds are poor here, so off-country storage doesn't work for us
  • GlusterFS / Ceph / DRBD: Black magic that is hard to configure, maintain, control, and debug, and not really suited for sync between data centers (from my experience; additional insights would be welcome)
  • Mirror: It seems like a nice option, but it appears to be designed for intranets and small files

We work with Docker, but I haven't found a Docker volume plugin that would solve this either.

Is anyone else facing this issue, or has anyone solved it? Which tool is best? Is there any other tool that would be better suited to this problem?

Jimmy
  • The first result on Google for "docker distributed file system" led to https://github.com/moosefs/moosefs . I have no experience with it, let me know if it's any good :) – axus Jul 31 '18 at 17:33
  • Looks like we're researching almost exactly the same thing and went down all the same ratholes. While I'd like a master-master solution, my use case allows me to go with lsyncd as well, though I'm not quite sure how split-brain recovery works there. All the distributed filesystems do look incredibly complex to set up, and I haven't seen a true master-master scenario there either anyway. – duncanwilcox Aug 12 '18 at 15:34

1 Answer

I'd go with GlusterFS (which is not so difficult to set up), but you can also try csync2:

https://github.com/LINBIT/csync2

I've used it to replicate a set of files across a cluster, with nice results.
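
For intuition about what any multi-master file sync has to decide, here is a minimal Python sketch of a "newest copy wins" rule across three nodes. It is not how csync2 or GlusterFS is implemented; the `Entry`, `reconcile`, and `plan` names and the node names are hypothetical. It relies on the question's assumption that a file is not modified on two servers at the same time, and it assumes clocks are reasonably in sync and that deletions are recorded as tombstones.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    mtime: float                  # time of the last change (edit or deletion)
    deleted: bool = False         # tombstone: the file was removed at mtime
    digest: Optional[str] = None  # content hash, None for tombstones


Replica = Dict[str, Entry]        # path -> state on one node


def reconcile(replicas: Dict[str, Replica]) -> Dict[str, Entry]:
    """Pick the newest entry per path across all nodes."""
    winners: Dict[str, Entry] = {}
    for files in replicas.values():
        for path, entry in files.items():
            current = winners.get(path)
            if current is None or entry.mtime > current.mtime:
                winners[path] = entry
    return winners


def plan(replicas: Dict[str, Replica],
         winners: Dict[str, Entry]) -> List[Tuple[str, str, str]]:
    """List (node, action, path) steps that bring every node up to date."""
    steps = []
    for node, files in replicas.items():
        for path, winner in winners.items():
            local = files.get(path)
            if local == winner:
                continue  # already current
            if winner.deleted:
                # Only delete where a live copy still exists.
                if local is not None and not local.deleted:
                    steps.append((node, "delete", path))
            else:
                steps.append((node, "copy", path))
    return sorted(steps)


# Hypothetical state: a.txt was edited on dc2, b.txt was deleted on dc3.
replicas = {
    "dc1": {"a.txt": Entry(100.0, digest="v1"), "b.txt": Entry(50.0, digest="b")},
    "dc2": {"a.txt": Entry(200.0, digest="v2"), "b.txt": Entry(50.0, digest="b")},
    "dc3": {"a.txt": Entry(100.0, digest="v1"), "b.txt": Entry(300.0, deleted=True)},
}
steps = plan(replicas, reconcile(replicas))
for step in steps:
    print(step)
```

Without tombstones, a deletion on one node is indistinguishable from a file that simply has not arrived there yet, which is one reason plain one-way copying in both directions does not behave like a real master-master setup.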

SimoneLazzaris