
I just joined an SMB as their first full-time tech guy, and the company's backup architecture is a mess. Leaving aside all other issues (of which there are many), the office has multiple different NAS devices and disorganised backups all over the place, about a third of which are duplicates by volume.

I want to clean that up, but without disrupting the existing file structure. (I'll worry about reinventing the filesystem after I have copies of everything.) So, before I set up automated backups, I intend to go through the different NASs and:

  1. Consolidate their contents onto one volume.
  2. Replace as many duplicates as I can with hard links.
  3. Compress the older backups into archive files and back those up.

If I compress hard-linked files, then move them to a different device and extract the archives, the links should still point to the correct files (unlike Windows shortcuts, Mac aliases or symbolic links). My question is: am I right? Is there a better method of consolidation than this?

Also, if I replace dupes with hard links on one server, move the resulting fileset over to another server, then replace all dupes across the new collective server, will there be any resultant issues that I need to watch out for?

  • Your question about hard links in archives is answered here: https://unix.stackexchange.com/questions/43037/dereferencing-hard-links – Gerald Schneider Aug 11 '22 at 09:01
  • The main thing to consider when relying on hard links: not all tools properly recognise hard links as such (sometimes not at all, sometimes their default behaviour needs adjusting with switches/flags). When they don't, you end up where you started: each hard link becomes another full-sized copy of the original when you copy the hard-linked files or make/restore a backup of them. – HBruijn Aug 12 '22 at 10:17

1 Answer

Replace as many duplicates as I can with hard links

You can do that only if the duplicate files are never going to be changed; otherwise, a change made in place through one name would affect every hard-linked name. To avoid that, rsync by default breaks the hard link before overwriting a file: it writes the new version to a temporary file and renames it over the old name.
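To make the difference concrete, here is a minimal Python sketch (not rsync itself) of the two update styles. The file names are made up for illustration, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "report.txt")      # hypothetical file names
    link = os.path.join(d, "report-copy.txt")

    with open(original, "w") as f:
        f.write("v1")
    os.link(original, link)  # second name for the same inode

    # In-place edit through one name: the shared inode is rewritten,
    # so the change is visible through the other name too.
    with open(link, "w") as f:
        f.write("v2")
    with open(original) as f:
        after_in_place = f.read()

    # Write to a temporary file, then rename it over the old name:
    # 'link' now refers to a new inode and 'original' is left alone.
    tmp = link + ".tmp"
    with open(tmp, "w") as f:
        f.write("v3")
    os.replace(tmp, link)
    with open(original) as f:
        after_replace = f.read()
    still_linked = os.path.samefile(original, link)

print(after_in_place, after_replace, still_linked)
```

Any tool that edits files in place (rather than replacing them) will behave like the first case, which is why deduplicating with hard links is only safe for files that never change.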

If I compress hard-linked files, then move them to a different device and extract the archives, the links should still point to the correct files

Hard links do not work as you suggested above. A hard link is nothing more than another name for a single file, or inode, on a single filesystem. Hard-linking a file and then moving the hard link to another filesystem will leave you with a copy of the original file.
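A quick way to see this is to compare inode numbers. The sketch below is only an illustration with made-up file names; it uses an ordinary copy within one temporary directory to stand in for what a move to another filesystem effectively produces.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a.bin")   # hypothetical file names
    b = os.path.join(d, "b.bin")
    c = os.path.join(d, "c.bin")

    with open(a, "wb") as f:
        f.write(b"x" * 1024)

    os.link(a, b)        # b is another name for a's inode
    shutil.copy2(a, c)   # c is an independent copy with its own inode

    same_inode = os.stat(a).st_ino == os.stat(b).st_ino
    link_count = os.stat(a).st_nlink           # number of names for the inode
    copy_is_separate = not os.path.samefile(a, c)

print(same_inode, link_count, copy_is_separate)
```

Inode numbers are only meaningful within one filesystem, so a link relationship cannot survive a plain move to a different device.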

I strongly suggest practicing with hard links before using them for backup purposes; otherwise you will end up with very unexpected results. As an alternative, you can try rsnapshot, which does incremental backups via rsync and hard links.
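If you want something small to practice on, the following is a rough sketch of the general idea behind hard-link snapshots: unchanged files are linked to the previous snapshot, changed files are copied fresh. It is not rsnapshot's actual implementation; the `snapshot` function and the directory names are hypothetical, and it ignores deletions, permissions and symlinks.

```python
import filecmp
import os
import shutil
import tempfile


def snapshot(source, dest, previous=None):
    """Copy source into dest, hard-linking files unchanged since previous."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the inode
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy


with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "data")
    os.makedirs(src)
    for name, text in (("keep.txt", "same"), ("edit.txt", "old")):
        with open(os.path.join(src, name), "w") as f:
            f.write(text)

    snap1 = os.path.join(d, "snap.1")
    snap2 = os.path.join(d, "snap.2")
    snapshot(src, snap1)

    # Change one file, then take a second snapshot against the first.
    with open(os.path.join(src, "edit.txt"), "w") as f:
        f.write("new")
    snapshot(src, snap2, previous=snap1)

    unchanged_shared = os.path.samefile(
        os.path.join(snap1, "keep.txt"), os.path.join(snap2, "keep.txt"))
    changed_shared = os.path.samefile(
        os.path.join(snap1, "edit.txt"), os.path.join(snap2, "edit.txt"))
    with open(os.path.join(snap1, "edit.txt")) as f:
        old_preserved = f.read()

print(unchanged_shared, changed_shared, old_preserved)
```

Each snapshot directory looks like a full copy, but unchanged files cost no extra space. This only works because the snapshots are never modified after they are written.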

For transparent compression, I suggest using a filesystem with native compression, such as ZFS or Btrfs. But, again, be sure you understand what you are doing before implementing it to back up production data.

shodanshok