
I've got an old Windows Server 2003 VM (VMware) on a blade cluster, and I'm moving a considerable number of files from it to our new DFS array. There are two main folders, with about 1.7 million and half a million small files (letters, memos, and the like) respectively. Their total sizes are ~420 GB and ~100 GB.

We're using the GUI version of robocopy on the server to copy the files. We ran a test copy about a month ago and found that it took around 4 hours for the larger folder. Now that I'm actually switching the files over, it has been taking 18-20 hours. Nothing has changed on the server side, and nothing has changed in the copy settings (no logs, 1 retry with a wait of 1 second).

Our intent is to shut off the share and run the copy again to pick up all the files that were skipped because users had them locked. I can't take a 20-hour outage to do that, though.

Does anyone have any theories about what could be causing such a delay for robocopy compared to previously shorter runs?

Edit: I discovered last night that the copy must be freezing. It stops at some point (the time seems to vary) and doesn't continue at all.

  • Are you saying that you did a test run with the 400GB directory and it only took 4 hours to complete? Was this a test between the same two devices? – tony roth Dec 05 '12 at 16:48
  • @tony roth Yep, the test run was the same devices/folders and locations. We even left the robocopy window open to preserve the settings just in case. Still wound up starting at 8am and running until almost midnight (the last time I checked it). – user1588867 Dec 05 '12 at 17:07
  • Is the DFS array a new piece of hardware or just a bunch of VMs on your current cluster? – tony roth Dec 05 '12 at 17:30
  • @tony roth The DFS array is a completely different piece of hardware. It's not on the same subnet as the cluster, though. – user1588867 Dec 05 '12 at 17:34
  • 400 GB of small files will take much longer to copy than a single 400 GB file. I think this is because of extra filesystem overhead. – 1.618 Dec 17 '12 at 15:30

1 Answer


We've been fiddling with this one. It looks like there was something odd going on with the GUI. When I explicitly set the retry and wait values to "1", the copy went through normally. This may have been a coincidence, but it began working normally on the first try with those settings.
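
For anyone who would rather take the GUI out of the picture, the same retry and wait values can be passed to robocopy directly. Robocopy's own defaults are 1,000,000 retries with a 30 second wait, so if the GUI was not actually passing the values through, a single locked file could plausibly stall the copy for a very long time; that is a guess, not something we confirmed. Below is a minimal sketch in Python that only builds the command line and does not run it. The function name and both paths are hypothetical placeholders; `/E`, `/R:n` and `/W:n` are standard robocopy switches.

```python
def build_robocopy_command(source, destination, retries=1, wait_seconds=1):
    """Return a robocopy argument list with explicit retry and wait values.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    return [
        "robocopy",
        source,
        destination,
        "/E",                  # copy subdirectories, including empty ones
        f"/R:{retries}",       # number of retries on a failed copy
        f"/W:{wait_seconds}",  # seconds to wait between retries
    ]


if __name__ == "__main__":
    # Placeholder paths, not taken from the original post.
    cmd = build_robocopy_command(r"D:\Shares\Letters", r"\\dfs-array\Letters")
    print(" ".join(cmd))
```

Running the printed command from a command prompt (or passing the list to `subprocess.run`) makes the retry and wait settings explicit, so there is no question of what the GUI did or did not hand to robocopy.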