
I've got to move around 320,000 files and 80,000 folders (only about 100 GB) of data. Some files are > 1 GB, but most are < 1 kB.

I've had a look at Fastest method of copying files, but I'm not sure how useful any of those suggestions will be: my problem isn't raw transfer speed so much as how quickly Windows can deal with the I/O of 320,000 individual files.

Do you think I would see any speed benefit from using xcopy, robocopy, TeraCopy or FastCopy?

It took us a long time (12+ hours) to copy them once (using robocopy), and I'd hate to have to do that again. What can I do to speed it up?

The stuff is on a USB 2.0 External Drive.

Ian G

7 Answers


Something like robocopy will be your best bet. USB drives can't handle a whole lot of IO to begin with.

I've pushed millions of small files to and from USB drives using robocopy. It takes time, but it gets the job done.
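
Something along these lines usually does the job for me (paths here are just placeholders; tune the switches to taste):

    robocopy E:\source D:\dest /E /COPY:DAT /R:1 /W:1 /NFL /NDL /LOG:copy.log

/NFL and /NDL stop robocopy from logging every single file and directory name, which matters when there are 320,000 of them. Newer robocopy builds also have /MT for multithreaded copying, but that won't buy you much against a USB 2.0 bottleneck.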

mrdenny

As mrdenny said, Robocopy would be best, partly because of its robustness. The best advice I can offer, provided you're sure about the cleanliness of the files being moved, is to make sure antivirus software is disabled while they are being moved. You really don't want the overhead of having all those files scanned.

John Gardeniers

Obviously there is a quicker way than anything mentioned here. Quicker, but less flexible :-) If you have the files on a separate partition, you can copy the entire partition to the target disk.

I'm not familiar with any free Windows tool for the job (a tool with VSS support would be perfect), but you can certainly boot off a Ghost CD, a Partition Magic CD, or a standalone Linux live CD. In Linux you just dd the partition, or ntfsclone it if it happens to be an NTFS partition.
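
From a Linux live CD it would look roughly like this (the device names below are only examples — check yours with fdisk -l first, because getting them wrong is destructive):

    # raw block-for-block copy of the partition
    dd if=/dev/sda1 of=/dev/sdb1 bs=1M

    # or, for NTFS, copy only the blocks that are actually in use
    ntfsclone --overwrite /dev/sdb1 /dev/sda1

ntfsclone is the quicker of the two on a partly empty partition because it skips unallocated space.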

kubanczyk

You will almost certainly experience better overall performance for the transfer sequence if you first pack up the source files into a single archive (tar, or compressed into zip, etc), then transfer the archive over the network, and then unpack the archive at the destination.

Please don't forget that when you're transferring the archive over the network, you will be better off using ftp (or another file-oriented transfer protocol) than a simple SMB file copy.
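
Roughly, the whole sequence looks like this (hostnames and paths are placeholders, and zip instead of tar/gzip works just as well):

    # pack everything into one archive on the source side
    tar czf archive.tar.gz /path/to/files

    # push the single archive with ftp, in binary mode
    ftp destination-host
    ftp> binary
    ftp> put archive.tar.gz

    # unpack on the destination side
    tar xzf archive.tar.gz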

Using a process like the above, I've routinely transferred application directories of about 60 GB (with about 50,000-75,000 files) between multiple, geographically separated datacenters (US, Europe, Asia). Transferring a compressed archive over FTP works out 10-40 times faster than transferring the files one at a time.

Rsync can be your friend here too (as it can in many other file transfer scenarios).
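
For example (source path and destination host are placeholders):

    rsync -az --partial /path/to/files/ user@destination:/path/to/files/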

If you are open to commercial options, a UDP-based binary streaming solution that can push bits over multiple UDP streams might be of value to you. Take a look at http://www.filecatalyst.com/

user18764

Another option would be to use BitTorrent with a built-in tracker or DHT turned on. The client groups all the files together into blocks (use 2 MB or bigger if available). On the receiving end you get the data in big block chunks as it is written to the hard drive. This consolidates your small files into 2 MB chunks, and you get better transfer rates.

Sun

If I were you I would change the external hard drive to FireWire; the transfer speed is a lot faster than USB 2.0.

I think that packing the files into a single tar and then transferring that would save a bit of time, because it reduces the I/O overhead: you're only copying one giant file over instead of thousands of small ones, and I believe it also consumes fewer resources during the copy phase. Alternatively, you can pipe the tar stream straight to your USB drive.
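
Roughly like this (assuming a Unix-style environment such as Cygwin, with the USB drive mounted at /mnt/usb — both of those are assumptions):

    # write one big tar file to the drive instead of thousands of small files
    tar cf /mnt/usb/backup.tar /path/to/files

    # or equivalently, pipe the tar stream straight to a file on the drive
    tar cf - /path/to/files > /mnt/usb/backup.tar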

chutsu
  • Yes, I know the single tar approach will definitely yield results, but wasn't sure if building the single file would take a significant amount of time. – Ian G Aug 17 '09 at 10:59
  • Firewire's good but eSATA can be faster too. – Chopper3 Aug 17 '09 at 11:17

I found the most efficient way to copy large numbers of files was to stream them into ISO files first and then copy the ISOs instead. That way the disk isn't tied up servicing the thousands of individual commands required to copy the files one by one.

Of course this depends on your directory structure. I was fortunate enough to have a new directory created at 4.7 GB intervals, which made it a lot easier to write a VBScript to automate the process.
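
As a rough illustration of the idea (mkisofs is just one tool that can do it; the volume label and paths are placeholders):

    mkisofs -J -R -V BATCH01 -o batch01.iso /path/to/batch01

-J and -R add Joliet and Rock Ridge extensions so that long file names survive the trip into the ISO.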

  • I don't follow your logic. To create the ISO all those files need to be read and written into the ISO. Then the ISO is read to extract the files and finally write them to the destination. How is that more efficient than reading and writing them once? – John Gardeniers Aug 17 '09 at 22:41
  • Any sample code in VBScript, C#, or another scripting language? – Kiquenet Jul 21 '11 at 20:29
  • @JohnGardeniers Lewis is betting that the time it takes to copy small files across the network carries much greater overhead than the time it takes to put them into an ISO and then transfer that one file. It really depends on your particular environment, but as unintuitive as it sounds, it can be faster to transfer one consolidated file than thousands of tiny ones. – Sun Feb 27 '13 at 18:26