Parallel file copy from single source to multiple targets?

15

5

I have several large files on optical media that I would like to copy to multiple targets. In this case I have two hard drives attached to the same computer. Is there a utility that can function like:

copy source target1 target2 ... targetN

Goyuix

Posted 2009-08-31T01:25:21.740

Reputation: 6 021

Answers

23

For single files you can use tee to copy to multiple places:

cat <inputfile> | tee <outfile1> <outfile2> > <outfile3>

or if you prefer the demoggified version:

tee <outfile1> <outfile2> > <outfile3> < <inputfile>

Note that, as Dennis points out in the comments, tee writes to stdout as well as to the listed files, which is why the examples above redirect stdout to the third file. You could instead redirect stdout to /dev/null, as below. This keeps the file list on the command line consistent (which may make it easier to script a solution for a variable number of files) but is slightly less efficient. The difference is small: about the same as the difference between the cat version and the version without cat:

cat <inputfile> | tee <outfile1> <outfile2> <outfile3> > /dev/null
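
The /dev/null form lends itself to a small wrapper that accepts any number of targets. A minimal sketch, assuming a POSIX shell; the function name copy_to_many and the demo file names are made up for illustration:

```shell
# copy_to_many SOURCE TARGET... : read SOURCE once, write every TARGET.
# (copy_to_many is a hypothetical helper name, not a standard utility.)
copy_to_many() {
    src=$1
    shift
    # tee writes its stdin to every remaining argument; its own stdout
    # copy is discarded so that all targets can sit in one list.
    tee -- "$@" < "$src" > /dev/null
}

# Demo with throwaway files in a temporary directory.
demo_dir=$(mktemp -d)
printf 'example payload\n' > "$demo_dir/source.bin"
copy_to_many "$demo_dir/source.bin" \
    "$demo_dir/copy1" "$demo_dir/copy2" "$demo_dir/copy3"
```

Because the targets are passed straight through as "$@", names containing spaces should survive intact.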

You could probably combine one of the above with find quite easily to operate on multiple files in one directory, and less easily to operate on files spread over a directory structure. Otherwise you might just have to set the multiple copy operations off in parallel as separate tasks and hope that the OS disk cache is bright and/or big enough that each of the parallel tasks uses cached read data from the first instead of causing drive-head thrashing.
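
For the case of multiple files in one directory, one possible way to combine find with tee is sketched below. The directory and file names are placeholders, and the sketch assumes the target directories already exist:

```shell
# Placeholder layout: one source directory, two target directories.
work=$(mktemp -d)
mkdir "$work/src" "$work/target1" "$work/target2"
printf 'first\n'  > "$work/src/a.txt"
printf 'second\n' > "$work/src/b file.txt"

# For each regular file directly inside src, read it once and let tee
# write it under the same name in both target directories.
# -maxdepth 1 (a GNU/BSD find extension) stops find from descending.
find "$work/src" -maxdepth 1 -type f -exec sh -c '
    t1=$1; t2=$2; shift 2
    for f in "$@"; do
        name=${f##*/}                 # strip the directory part
        tee -- "$t1/$name" < "$f" > "$t2/$name"
    done
' sh "$work/target1" "$work/target2" {} +
```

Unlike cp -p or tar, tee does not carry over timestamps or permissions, so this only reproduces file contents.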

AVAILABILITY: tee is commonly available on standard Linux setups and other unix or unix-alike systems, usually as part of the GNU "coreutils" package. If you are using Windows (your question doesn't specify) then you should find it in the various Windows ports such as Cygwin.

PROGRESS INFORMATION: Copying a large file off optical media can take some time (as can copying over a slow network, or copying an even larger file from fast local media), so progress information can be useful. On the command line I tend to use pipe viewer (pv) for this. It is available in most Linux distros and many Windows port collections, and is easy to compile yourself where it is not available directly. Just replace cat with pv like so:

pv <inputfile> | tee <outfile1> <outfile2> > <outfile3>

David Spillett

Posted 2009-08-31T01:25:21.740

Reputation: 22 424

For directories and multiple files just use tar instead of cat. For example tar cf - file1 file2 | tee >(tar xf - -C output1) | tar xf - -C output2 – CR. – 2018-02-24T15:19:17.330

Note that tee will also output to stdout, so you may want to do tee outputfile1 outputfile2 < inputfile > /dev/null since outputting a binary file to the terminal could be noisy and mess with its settings. – Paused until further notice. – 2010-10-26T20:24:26.087

I found tee.exe to be part of the UnxUtils package. Thanks for the great tip! – Goyuix – 2009-08-31T14:40:52.290

5

For Windows:

n2ncopy will do this:

For Linux:

The cp command alone can copy from multiple sources but unfortunately not multiple destinations. You will need to run it multiple times in a loop of some sort. You can use a loop like so and place all directory names in a file:

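# Split the file's contents on newlines only, so names may contain spaces
OLDIFS=$IFS
IFS=$'\n'

for line in $(cat file.txt)
do
   cp file "$line"
done

# Restore the original field separator
IFS=$OLDIFS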

or use xargs:

echo dir1 dir2 dir3 | xargs -n 1 cp file1

Both of these also work for multiple files, and for entire directories if you add -r to cp. This is also discussed in this StackOverflow article.

John T

Posted 2009-08-31T01:25:21.740

Reputation: 149 037

1

Google Fu - http://sourceforge.net/projects/n2ncopy/

– Fake Name – 2010-07-27T11:39:33.753

N2NCopy link appears to be broken. – Wesley – 2009-12-28T19:38:57.233

4

Based on the answer given for a similar question, another way is to use GNU Parallel to run multiple cp instances at once:

parallel -j 0 -N 1 cp file1 ::: Destination1 Destination2 Destination3

The above command will copy file1 to all three destination folders in parallel.

rmiesen

Posted 2009-08-31T01:25:21.740

Reputation: 331

2

In bash (Linux, Mac or Cygwin):

cat source | tee target1 target2 >targetN

(tee copies its input to STDOUT, so use redirection on the last target).

In Windows, Cygwin is often overkill. Instead, you can just add the exes from the UnxUtils project, which include cat, tee, and many others.

mivk

Posted 2009-08-31T01:25:21.740

Reputation: 2 270

1

According to this answer: https://superuser.com/a/1064516/702806

A better solution is to use tar and tee. The command is more complicated, but tar is well suited to transfers and the source only needs to be read once.

tar -c /source/dirA/ /source/file1 | tee >(cd /foo/destination3/; tar -x) >(cd /bar/destination2/; tar -x) >(cd /foobar/destination1/; tar -x) > /dev/null

To use it in a script, you may need to launch your script with bash -x script.sh
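
The >( ) construct is bash process substitution, which plain sh does not provide. If the script has to run under sh, named pipes can play the same role. A sketch with placeholder paths, offered as one possible approach rather than a drop-in replacement:

```shell
# Placeholder layout for the demo.
work=$(mktemp -d)
mkdir -p "$work/source/dirA" "$work/dest1" "$work/dest2"
printf 'alpha\n' > "$work/source/dirA/file_a"
printf 'beta\n'  > "$work/source/file1"

# One named pipe per destination.
mkfifo "$work/pipe1" "$work/pipe2"

# Start one extracting tar per destination, each reading from its pipe.
(cd "$work/dest1" && tar -xf -) < "$work/pipe1" &
(cd "$work/dest2" && tar -xf -) < "$work/pipe2" &

# Read the source tree once; tee feeds pipe1 and its stdout feeds pipe2.
(cd "$work/source" && tar -cf - dirA file1) |
    tee "$work/pipe1" > "$work/pipe2"

# Block until both extractions have finished, then clean up the pipes.
wait
rm -f "$work/pipe1" "$work/pipe2"
```

The explicit wait means the script does not continue until every destination has been written, which is harder to guarantee with >( ).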

Rémi Girard

Posted 2009-08-31T01:25:21.740

Reputation: 11

Funny. I thought "this does make sense, have an upvote". Upvoted. Then I checked the link… :D – Kamil Maciorowski – 2017-04-10T17:15:47.087

This is fairly clearly superior to David Spillett’s (accepted) answer if you are copying multiple files at once. For a single source file, the only advantage I can see to adding tar is that it will automatically copy (preserve) file attributes (e.g., modification date/time, (protection) mode and potentially ACLs, owner/group (if privileged), SELinux context (if applicable), extended attributes (if applicable), etc.). P.S. Why would the user need to use bash -x?

– Scott – 2017-04-10T19:10:03.583

I used #!/bin/sh at the beginning of my script but the syntax of the command is not accepted. You can use bash -x or #!/bin/bash at the beginning of your file. I don't know why there is a difference between sh and bash interpretations. – Rémi Girard – 2017-04-11T15:34:47.897

Kamil Maciorowski - I don't know why your answer is not upvoted. It is the perfect solution. I wanted to share it. – Rémi Girard – 2017-04-11T15:52:45.447

1

Ryan Thompson's solution:

for x in dest1 dest2 dest3; do cp srcfile "$x" &>/dev/null & done; wait

makes a lot of sense: If write speed of the destination dirs is approximately the same then srcfile will only be read once from disk. The rest of the time it will be read from cache.

I would make it a bit more general, so you also get subdirs:

for x in dest1 dest2 dest3; do cp -a srcdir "$x" & done; wait

If the write speeds of the dest dirs are very different (e.g. one is on a RAM disk and the other on NFS), then you may find that the parts of srcdir read while copying to dest1 are no longer in the disk cache by the time they are needed for dest2.
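
Since the first loop sends cp's messages to /dev/null, a failed copy can go unnoticed. One option is to collect each background job's exit status instead. A minimal sketch with placeholder names (work, dest1 to dest3, srcfile), assuming a POSIX shell:

```shell
# Placeholder layout for the demo.
work=$(mktemp -d)
mkdir "$work/dest1" "$work/dest2" "$work/dest3"
printf 'payload\n' > "$work/srcfile"

# Start one cp per destination in the background and remember its PID.
pids=""
for x in "$work/dest1" "$work/dest2" "$work/dest3"; do
    cp "$work/srcfile" "$x" &
    pids="$pids $!"
done

# "wait PID" returns that job's exit status, so failures are not lost.
failed=0
for pid in $pids; do
    wait "$pid" || failed=1
done
echo "failed=$failed"
```

A script could then exit non-zero when failed is 1, rather than assuming every destination received its copy.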

Ole Tange

Posted 2009-08-31T01:25:21.740

Reputation: 3 034

0

If you want to do this in Windows from PowerShell, it is not possible by default because, unlike the -Path parameter, -Destination doesn't accept multiple values. However, you can use -PassThru and daisy-chain the commands. (But that's no fun.)

The best solution is to make your own, as is shown here.

not2qubit

Posted 2009-08-31T01:25:21.740

Reputation: 1 234

0

In bash:

for x in dest1 dest2 dest3; do cp srcfile "$x" &>/dev/null & done; wait

Ryan C. Thompson

Posted 2009-08-31T01:25:21.740

Reputation: 10 085

I don't think this will perform well. In an ideally parallel copy you'd be reading once, writing many times. I think this will do 1:1 reads:writes. Maybe if the copies start fast enough and the drive cache is big enough you won't actually need to seek the read heads. – quack quixote – 2009-10-04T23:23:17.833