How to copy files and create hardlinks instead of copying when files are identical

3

4

Are there any tools for Windows that can

  • copy one dir to another
  • read copied content
  • generate MD5
  • if the current file is identical to a previously copied one, create a hardlink in destination dir instead of writing the content?

est

Posted 2011-10-12T13:44:15.347

Reputation: 536

Answers

5

If you're copying the content just to hardlink it immediately afterwards, why not just generate the hardlinks straight away? Link Shell Extension makes this particular job easy.

If there's a reason you need to go through that particular sequence of actions, LSE's author also wrote a command line tool called dupemerge to do almost exactly what you're asking.

One thing to keep in mind is that NTFS does not do "copy-on-write" semantics for hardlinks. If something modifies the contents of a file, all hardlinked versions are immediately "updated", since they're all essentially directory entries to the same data extent on disk. What's more, many programs do a "save to temp file, delete original, rename temp to old name" procedure rather than overwriting a file, which will effectively break other hardlinks to the data, since they're pointing at the old data extent.
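The two behaviours described above can be checked with a small experiment. This is only a sketch, assuming Python 3 and a filesystem that supports hardlinks (such as NTFS); the function and file names are made up for illustration.

```python
import os
import tempfile


def demo_hardlink_semantics(workdir):
    """Show how in-place writes and replace-by-rename affect a hardlink."""
    original = os.path.join(workdir, "original.txt")  # hypothetical file names
    link = os.path.join(workdir, "link.txt")

    with open(original, "w") as f:
        f.write("v1")
    os.link(original, link)  # second directory entry for the same data

    # In-place overwrite: the change is visible through both names.
    with open(original, "w") as f:
        f.write("v2")
    with open(link) as f:
        after_overwrite = f.read()

    # "Save to temp file, then rename over the original": the name
    # original.txt now refers to new data, while link.txt keeps the old data.
    tmp = os.path.join(workdir, "original.tmp")
    with open(tmp, "w") as f:
        f.write("v3")
    os.replace(tmp, original)
    with open(link) as f:
        after_replace = f.read()
    with open(original) as f:
        new_original = f.read()

    return after_overwrite, after_replace, new_original


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo_hardlink_semantics(d))
```

The first value shows the link following an in-place edit; the second and third show the link and the original diverging once the original is replaced by a rename.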

afrazier

Posted 2011-10-12T13:44:15.347

Reputation: 21 316

"why not just generate the hardlinks straight away?"

Because there are two HDDs: one is removable, and the other has limited free space left. – est – 2011-10-17T06:25:25.120

1

You should edit your question with your restrictions and the particular sequence of actions you'd like to take. Are you sure there are enough duplicate files to allow for what you want to do? If so, you could use dupemerge on the source, then LSE or its CLI companion ln to do a Smart Copy that will preserve the interior links.

– afrazier – 2011-10-17T12:57:19.603

1

You can do this using FINDDUPE, which you can find here.
With src as your source folder and dest as your destination folder, you can run:

xcopy /I /E src dest
finddupe -hardlink -ref src dest

Note: Hardlinks only work on NTFS

qwertzguy

Posted 2011-10-12T13:44:15.347

Reputation: 1 394

Thanks, I know finddupe, but the target HDD has limited free space and the source HDD has lots of duplicate files. How can I check BEFORE copying? – est – 2011-10-17T06:26:44.277

I don't think you can make hardlinks between two different HDDs. Is that what you want to do? – qwertzguy – 2011-10-17T10:04:31.717

Copy files to another HDD; if a file is a duplicate, create a hardlink instead of copying the content. – est – 2011-10-18T05:24:37.897

1

Ah ok... now I get it: you want to check for duplicate files, then copy only the non-duplicates and make hardlinks to recreate the rest of the files. Well, you can run finddupe -bat find.bat src; then, in the generated find.bat, remove all the del lines, build a unique list of the hardlink targets (the last parameter of each fsutil line), copy those files, and then run the script. Of course a script to do this would be best, but I don't have time to do it right now. – qwertzguy – 2011-10-18T09:36:31.080
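A script along those lines could look roughly like the following. It is a minimal sketch rather than a tested tool, and it does not use finddupe: it hashes each source file with MD5 before copying, writes the content only the first time it is seen, and creates a hardlink inside the destination for every later duplicate. It assumes Python 3, an empty destination folder on an NTFS volume, and hypothetical names (`file_md5`, `copy_with_hardlinks`).

```python
import filecmp
import hashlib
import os
import shutil


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_hardlinks(src, dest):
    """Copy src into dest, hardlinking files whose content was already copied.

    Returns (files_copied, files_hardlinked). Assumes dest starts out empty.
    """
    seen = {}  # (size, md5) -> first destination path holding that content
    copied = linked = 0
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src_path = os.path.join(root, name)
            dest_path = os.path.join(target_dir, name)
            key = (os.path.getsize(src_path), file_md5(src_path))
            first = seen.get(key)
            # Byte-compare as well, so a hash collision never merges files.
            if first and filecmp.cmp(src_path, first, shallow=False):
                try:
                    os.link(first, dest_path)  # link stays within dest's volume
                    linked += 1
                    continue
                except OSError:
                    pass  # e.g. link limit reached: fall back to a real copy
            shutil.copy2(src_path, dest_path)
            seen.setdefault(key, dest_path)
            copied += 1
    return copied, linked
```

Calling `copy_with_hardlinks(r"D:\src", r"E:\dest")` (placeholder paths) would return how many files were written and how many were linked. Because every hardlink points at a file already inside the destination, the source and destination can sit on different disks. The usual caveat applies: editing one linked file in place changes all of its links.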