
I'm looking for a way to distribute a big folder (~40-60 GB) to multiple servers (4 or more). A simple scp command in a loop already works. I would like a faster method, but simply parallelizing the loop with `&` and `wait` (or GNU parallel) won't improve much because the bandwidth of the source is limited. I also want a simple method: no distributed file system setup should be involved.

I have read that NFS will be faster than scp or special rsync, which is all good, but I think a faster approach is possible if there are more than 3 target servers, i.e. a "tree copy" mechanism: copy from the source to servers A and B, then copy from A to C and D and, in parallel, from B to E and F, and so on.

                C ...
              /
            A 
          /   \ D ...
         /
source --       E ...
          \   / 
            B 
              \ F ...

Is there already a tool where I can just provide the IPs or hostnames and it does this efficient "tree"-copying? Or a simple script which does this via scp, pssh, sshfs or similar?
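To make the idea concrete, here is a minimal sketch in Python of how such a "tree copy" could be planned. It is not an existing tool: the function names `plan_tree_copy` and `copy_command`, the hostnames, and the paths `/data/big` and `/data/` are hypothetical. It also assumes that every server can reach every other server over ssh with key-based authentication, and that a server only starts forwarding once it has received the full folder.

```python
import shlex
from collections import deque


def plan_tree_copy(source, targets, fanout=2):
    """Return a list of rounds; each round is a list of (sender, receiver) pairs.

    Every host that received the folder in one round becomes a sender
    in the next round, serving up to `fanout` new receivers.
    """
    rounds = []
    remaining = deque(targets)
    senders = [source]
    while remaining:
        current = []
        next_senders = []
        for sender in senders:
            for _ in range(fanout):
                if not remaining:
                    break
                receiver = remaining.popleft()
                current.append((sender, receiver))
                next_senders.append(receiver)
        rounds.append(current)
        senders = next_senders
    return rounds


def copy_command(sender, receiver, path, dest_dir, source):
    """Build the argument list for one copy (assumes passwordless ssh between hosts)."""
    copy = ["scp", "-r", path, f"{receiver}:{dest_dir}"]
    if sender == source:
        return copy  # runs directly on the source machine
    # Otherwise ask the intermediate server to run scp itself.
    return ["ssh", sender, shlex.join(copy)]


if __name__ == "__main__":
    hosts = ["A", "B", "C", "D", "E", "F"]  # placeholder hostnames
    for number, copies in enumerate(plan_tree_copy("source", hosts), start=1):
        print(f"round {number}:")
        for sender, receiver in copies:
            cmd = copy_command(sender, receiver, "/data/big", "/data/", "source")
            print("  ", shlex.join(cmd))
```

The commands within one round could be started together (for example with `subprocess.Popen`) and waited on before the next round begins, since a sender needs a complete copy before it forwards anything.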

Karussell
  • I found this inactive project, but the setup also seems relatively complex and involves OpenMPI: https://github.com/hpc/dcp – Karussell Jun 12 '16 at 12:27
  • On Twitter someone suggested Syncthing: https://docs.syncthing.net/users/syncthing.html – Karussell Jun 12 '16 at 19:14
  • Other options I found are https://aria2.github.io/ and the standard torrent client on Ubuntu: https://help.ubuntu.com/community/TransmissionHowTo (for more, see http://askubuntu.com/questions/65387/is-there-bittorrent-software-that-runs-in-a-terminal or http://askubuntu.com/questions/29872/torrent-client-for-the-command-line) – Karussell Jun 12 '16 at 19:26

2 Answers


Use BitTorrent or another peer-to-peer file sharing protocol. Setting up the tracker may take some work, but the transfer will use every host's upload bandwidth, not just the source's.

You will need to test to see what is faster in your environment.

John Mahowald
  • Thanks - nice idea! I will see how simple it is to set this up – Karussell Jun 12 '16 at 15:01
  • Facebook had the idea some time ago. http://arstechnica.com/business/2012/04/exclusive-a-behind-the-scenes-look-at-facebook-release-engineering/ – John Mahowald Jun 12 '16 at 15:07
  • Hmm, I cannot work out which BitTorrent client I should use. uTorrent, rTorrent and the others all seem to have ads included. Do I just need the client and to serve the file over HTTP? HTTP is a lot slower; is ssh possible too? – Karussell Jun 12 '16 at 15:44
  • Here is another link about BitTorrent, where they call the technique 'seeding' and 'fan-out': http://www.ebaytechblog.com/2012/01/31/bittorrent-for-package-distribution-in-the-enterprise/ It also mentions 'HDFS seeding' – Karussell Jun 12 '16 at 16:09
  • "Web seeding" lets clients download a torrent's contents from a copy you serve over HTTP or FTP. It is not required for a BitTorrent swarm. A BitTorrent tracker is required, however the torrent is seeded. – John Mahowald Jun 12 '16 at 16:55

If the servers are spread worldwide (i.e. not all on your local 10 Gbps network), then https://storj.io might be a solution, too.

Ole Tange