-1

I need a method that can accomplish what I am trying to do. This is a one-time thing, so any direction would be appreciated.

I need to archive a massive Windows share, but I would like to be able to reference the structure on my backup by year. Let me explain in a little more detail.

  1. Move all 2006 files/directories to a backup directory while maintaining file structure.
  2. Run my archive on that backup location.
  3. Repeat steps 1 & 2 for each year.

This way, if a user comes to me and says "I need XYZ client files from 2010", I can simply pull that year's archive. At the end I think I will just back up using Backup Exec and send it off to tape.

What can I do to accomplish this? I may be asking this question incorrectly...

RogueSpear00
  • I'll be looking at Robocopy in the meantime, but I'm not clear as to whether it will accomplish exactly what I require. – RogueSpear00 May 10 '13 at 15:36
  • Do you really mean "move"? If you move all of 2006's files to (e.g.) `/backup/2006/......` then they wouldn't be available on `/share/wherever/.....` anymore. – voretaq7 May 10 '13 at 15:47
  • Yes, move. I don't need users to have files from the dawn of time. – RogueSpear00 May 10 '13 at 15:50
  • Hmmm.... can you tell us a little more about the nature of the data in question? (Especially how often you have a project/file that spans multiple years -- that often screws up this sort of backup strategy.) [Normally this is something sysadmins dump on the users](http://chat.stackexchange.com/transcript/message/9369974#9369974) -- they maintain the directory structure, and we just back it up. If they need a restore they tell us what and when, and we go get it. Every environment is different though - knowing more about your situation will help us give better suggestions :-) – voretaq7 May 10 '13 at 15:57
  • Sure! So this share contains invoices for customers. The customers all have directories, and then their particular files are stored accordingly. Archiving these files off my SAN unloads about...500GB / 10Mil files. These are usually invoices that accounting would reference if they were required by a government audit. Any audit would ask for a particular set of years to retrieve. Was this helpful? – RogueSpear00 May 10 '13 at 16:02

2 Answers

5

Any backup software can do this. You don't have to write it to tape. You can use dedupe pools, which are common in modern solutions, and write it to disk at an off-site server (or an on-site server if you have no DR requirements).

Backup Exec, NetBackup, Commvault, TSM, Avamar, even the built-in Windows Backup can do this.


It was pointed out in chat that I may have misunderstood your question. I don't see a benefit to doing tapes based on year. If I were designing a solution for this, I'd get an ocean of slow, large disks and use them as data storage for a dedupe pool. Then, I'd do a full backup once and incrementals forever. You can restore right from the disk pool. Write the whole thing off to tape occasionally for DR and call it a day. It's getting far less common to be pulling tapes for restores nowadays. People are shifting to D2D2T strategies where tape is only for DR.

MDMarra
  • I use Backup Exec now, however I clearly missed this part... – RogueSpear00 May 10 '13 at 15:41
  • @RogueSpear00 Usually it's an option in the restore process (you can browse files by date, or restore a specific file at a specific date) - It's been a while since I've used Backup Exec, but I know Bacula can do this so I assume Backup Exec can too. The caveat is it means you need to keep tapes around as far back as you want to be able to restore (which can mean a lot of tapes, or a lot of big slow disks) – voretaq7 May 10 '13 at 15:49
  • I don't have any budget to change methodologies or acquire anything at this time. I suppose the short-hand explanation is that this company has files from about 15 years ago, of which they only actually access the last two years of data. If we were audited, then I would pull whatever data is required. The reason it would be easier to section off each year is that this is how the data would be referenced: "Hey, I need files from 2005 for XYZ client." Also, this is a one-time shot. Once I've **moved** the files off my SAN, I will only ever have to get them for an audit. – RogueSpear00 May 10 '13 at 15:55
4
  1. Move all 2006 files/directories to a backup directory while maintaining file structure.

This is obviously the tricky part. The rest you can do as MDMarra points out.

From an IT ownership standpoint, you'd need a staging area with enough space, plus a script that grabs the files and folders based on timestamps. Something like robocopy with `/MINAGE` and `/MAXAGE` set to bracket a single year would place the files in the staging area, and then you run the backup against that staging area.
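If robocopy's age filters turn out to be awkward, the same idea can be sketched in a few lines of Python. This is only a rough illustration, not a tested migration tool: `move_year`, `source`, `staging`, and the `dry_run` switch are names made up for the example, and it assumes "a 2006 file" means a file whose last-modified time falls in 2006 and that the staging directory lives outside the share.

```python
import os
import shutil
from datetime import datetime
from pathlib import Path


def move_year(source: Path, staging: Path, year: int, dry_run: bool = True) -> list[Path]:
    """Move files last modified in `year` into staging/<year>/,
    keeping each file's path relative to `source`.

    With dry_run=True (the default) nothing is moved; the function only
    returns the destinations it would have used.
    """
    destinations = []
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            src = Path(dirpath) / name
            # Assumption: the last-modified timestamp decides the year.
            mtime_year = datetime.fromtimestamp(src.stat().st_mtime).year
            if mtime_year != year:
                continue
            dest = staging / str(year) / src.relative_to(source)
            if not dry_run:
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(src), str(dest))
            destinations.append(dest)
    return destinations
```

Run it once with the default `dry_run=True` and eyeball the returned list before letting it move anything. Empty directories left behind on the share are not cleaned up by this sketch.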

However, honestly, your best bet here is to:

  1. Move the whole thing as is and back it up. Then mess around with the robocopy and see if it gets you what you need. Then back it up again.
  2. Delegate setting up the production share to the end user/dept with a new "yearly" structure. Make them responsible for their own file structure. Task them with breaking out the structure based on year if that's the best tree topology. Then they won't have to even ask you for a restore unless the files/folders are literally gone. And then you can say "hey it's 2014, on x day I'm archiving off the 2011 folder and its contents".

As was pointed out in chat though...projects/tasks/etc. typically don't just wrap up with a nice bow on 12/31. Things that span over years or between Dec-Jan will cause you headaches, which is why the onus should be on the user that actually uses the data on how the folder/file structure should exist and not IT.
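To get a feel for how bad the year-spanning problem is before moving anything, a read-only survey along these lines might help. Again, this is just a sketch under the assumption that last-modified time is the date that matters; `years_per_directory` and `spanning_directories` are hypothetical helper names.

```python
import os
from collections import defaultdict
from datetime import datetime
from pathlib import Path


def years_per_directory(source: Path) -> dict[Path, set[int]]:
    """Map each directory under `source` to the set of last-modified
    years found among the files directly inside it."""
    years = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            path = Path(dirpath) / name
            years[Path(dirpath)].add(datetime.fromtimestamp(path.stat().st_mtime).year)
    return dict(years)


def spanning_directories(source: Path) -> dict[Path, list[int]]:
    """Return only the directories whose files cover more than one year."""
    return {
        directory: sorted(found)
        for directory, found in years_per_directory(source).items()
        if len(found) > 1
    }
```

Directories that show up in `spanning_directories` are the ones that would be split across several yearly archives, so they are worth a look before committing to a per-year layout.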

TheCleaner
  • @RogueSpear Given your "invoices and audits" use case I would suggest Option (2) from this answer -- invoices fortunately DO have nice defined dates, and don't tend to run on into the next year the way project-related files do :-) – voretaq7 May 10 '13 at 16:20
  • And "usually" invoices are stored within an ERP system (or at least the raw data is), and can be retrieved fairly easily through a report/query. – TheCleaner May 10 '13 at 16:22
  • That's the ideal scenario -- I work for a company without an ERP system, so I feel the pain of "We're storing PDF invoices by year" myself. I dream of the day when I have time to work with finance to deploy a proper billing system :-) – voretaq7 May 10 '13 at 16:23
  • @voretaq7 - That's exactly what I have going on...Invoices on PDFs... – RogueSpear00 May 10 '13 at 16:26
  • @RogueSpear00 - this is where the Quality dept. should be stepping in and requiring doc control, naming conventions, etc. Especially if you are under audit requirements. – TheCleaner May 10 '13 at 17:59
  • @TheCleaner - I agree with you. However, I've walked into this mess of an environment recently, and right now the project I have is to migrate from one failing SAN to a new NetApp. In order to reduce load, migration, and overall backups, I would like to do this. I suppose if there are no options, then I will simply just archive the entire structure and then delete files that I do not need locally. – RogueSpear00 May 10 '13 at 18:02
  • I gave you an option with the robocopy minage/maxage, but you'll have to play with it and see if it suits you. It might work, but I wouldn't rely on it being perfect for an audit – TheCleaner May 10 '13 at 18:57