At what point does Robocopy overwrite a file?

1

I'm running a .bat containing:

robocopy \\server\directory\ .\localdir /MIR

This copies a single file from a server to my local directory.

It's a large file and will take a while to finish. However, I can achieve what I need with the existing copy of the file at that location.

At what point exactly will the file be overwritten? Can I ^C out of it and use the file that was in \localdir before the .bat started running? Or do I need to wait for robocopy to finish?

StuperUser

Posted 2016-05-25T14:26:57.393

Reputation: 552

Any reason for the downvote? Is there an existing question I couldn't find or anything in the docs? – StuperUser – 2016-05-25T14:30:29.003

Answers

1

robocopy is no different than any other file copy program. An overwriting copy "truncates" the existing data in the file and then begins writing the new data. You can observe this with a program that monitors the system APIs called by robocopy, such as one of the options listed here. Then you only need to look up each API call on MSDN to understand exactly what's happening.

First, it calls SetFilePointerEx to seek to the beginning of the file (if it's not already there). Then it calls SetEndOfFile to truncate the file to the BOF marker (making it a zero-byte file). This all happens so fast that effectively as soon as this command begins executing, your original file data is "lost" for all intents and purposes (the blocks that represented the file's data stream will very soon be overwritten in whole or part by future NTFS write commands).
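
To see the same "truncate first, write later" behaviour without tracing robocopy itself, here is a minimal Python sketch. It is not what robocopy runs internally; it only relies on the fact that opening a file in mode "wb" truncates it as part of the open. The file name foo and both function names are made up for the illustration.

```python
import os
import tempfile


def size_after_open_for_overwrite(path):
    """Open an existing file for overwriting and report its size
    before a single byte of new data has been written."""
    with open(path, "wb") as f:  # mode "wb" truncates as part of the open
        return os.fstat(f.fileno()).st_size


def demo_truncation():
    """Create a throwaway file, then start (but never finish) an overwrite."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foo")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"original contents")
        size_before = os.path.getsize(path)
        size_during = size_after_open_for_overwrite(path)
        return size_before, size_during


if __name__ == "__main__":
    before, during = demo_truncation()
    print("size before overwrite started:", before)
    print("size right after overwrite started:", during)
```

The second size is 0 even though no new data was written, which is the situation you'd be in if you interrupted the copy right after it started.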

If you wanted to do this "atomically", you could do a rename operation, like this:

  1. Copy the file contents (using robocopy) from the server to a unique pathname on your local system that does NOT currently exist (for example, if the file is named foo, you could copy it to foo.bak.)

  2. Do an overwriting rename using the transactional Windows API function MoveFileTransacted(), which was added in Windows Vista (and is therefore unavailable on XP and earlier, which are left with a non-atomic, possibly buggy or broken, rename; ouch). In this way, you can instantaneously "swap out" the data contained in the file, from the original file's contents to the new file's contents. This means it is impossible for any program to ever read a partially complete file: either it reads the complete, unaltered original copy, or it reads the complete, unaltered updated copy. It'll never read an "incomplete" copy or a garbled combination of both files.
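
The two steps above can be sketched in Python roughly as follows. Note the substitution: this uses os.replace() rather than MoveFileTransacted(), so it shows the copy-then-rename pattern, not Transactional NTFS. Python documents os.replace() as atomic on POSIX; I'm not claiming the same guarantee on Windows. The function name and the ".tmp" suffix are assumptions made for the example.

```python
import os
import shutil
import tempfile


def replace_via_temp_copy(source_path, dest_path):
    """Copy source_path to a unique temporary name next to dest_path,
    then rename the finished copy over dest_path in one step."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Unique name in the destination directory, so the rename stays on one volume.
    fd, temp_path = tempfile.mkstemp(
        dir=dest_dir,
        prefix=os.path.basename(dest_path) + ".",
        suffix=".tmp",
    )
    os.close(fd)
    try:
        # Step 1: the slow part. dest_path is untouched while this runs.
        shutil.copyfile(source_path, temp_path)
        # Step 2: swap the completed copy into place.
        os.replace(temp_path, dest_path)
    except BaseException:
        # On failure or Ctrl+C, drop the partial copy; the original stays intact.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

If you interrupt this during step 1, the file at dest_path is still the old, complete one, which is the property the question is after.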

The downsides of this alternate method are:

  • I don't know of a program that already does this out of the box;
  • Since you have the original file still there while you're doing your copy, you need twice as much storage. So if you're intending to overwrite a 1 GiB file with another 1 GiB file, you need a total of 2 GiB of storage on your local system, albeit temporarily.
  • It only works with Transactional NTFS. That means if your source filesystem or destination filesystem for the transactional overwrite isn't NTFS, it won't work. You can still robocopy the file from a network share down to your local system, but the atomic rename won't work on a network share, or (unless they add it in the future) on ReFS, which may someday become our default filesystem like NTFS did over FAT32 years ago.
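
For the storage downside, a quick pre-flight check is easy to sketch. This assumes the temporary copy lands on the same volume as the destination directory and needs about as many bytes as the source file; both function names are hypothetical.

```python
import os
import shutil


def has_room_for(needed_bytes, dest_dir):
    """Return True if the volume holding dest_dir reports at least
    needed_bytes of free space."""
    return shutil.disk_usage(dest_dir).free >= needed_bytes


def has_room_for_second_copy(source_path, dest_dir):
    """Return True if dest_dir can hold a second, temporary copy of
    source_path while the original is still in place."""
    return has_room_for(os.path.getsize(source_path), dest_dir)
```

It is only a rough guard: free space can change between the check and the copy, so the copy itself still has to handle running out of room.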

allquixotic

Posted 2016-05-25T14:26:57.393

Reputation: 32 256