
A company performs a full backup of its data on a daily basis for disaster recovery purposes. However, the backup cannot be completed within the assigned backup window.

What would you recommend this company do to restructure its backup environment in order to minimize backup time? There are four candidate answers:

1. Perform LAN based backup

2. Weekly full backup and daily incremental

3. Weekly full backup and daily cumulative

4. Add more ISL to increase bandwidth

When comparing incremental backup with cumulative backup, the incremental backup time is surely shorter than the cumulative backup time. But I don't know whether adding more ISLs (inter-switch links) is allowed in an existing storage system, or whether this operation can really shorten the backup time.

wuchang

2 Answers


I'm not sure what you mean by option 1.

I don't know what option 4 is.

Option 3 is going to mean increasing backup times and increasing backup storage capacity utilization as the week progresses.

Option 2 is the option that is going to cover 95% or more of your recovery needs. In the event that you need to restore a complete system, you'll need the latest Full backup plus all of the subsequent Incremental backups, but IMO that drawback is far outweighed by what you avoid: the growing backup windows and increased backup storage capacity utilization of daily Cumulative backups.


EDIT

IMO = In My Opinion.

A Cumulative backup is equivalent to a Differential backup, so every successive Cumulative backup will take longer and use more backup media capacity. A Cumulative backup backs up all data that has changed since the last Full backup, so if your Full backup runs on Saturday, then a Cumulative backup on Sunday will back up all of the data changed since Saturday. A Cumulative backup on Monday will back up all of the data changed since Saturday, including the same data that was backed up on Sunday, and so on until the next Full backup.

As you can see, every successive Cumulative backup will be larger and take more time, since you're backing up all of the data that has changed since the last Full backup, essentially backing up the same changed data with every iteration. If fileA changes on Sunday but not on Monday, Tuesday, etc., it will still be backed up by the Cumulative backup on Sunday, Monday, Tuesday, etc. You're backing up the same file with each Cumulative backup even though the file has only changed once, on Sunday.

If you need to restore a system completely, you only need to restore the latest Full backup and the latest Cumulative backup, since the latest Cumulative backup will contain all of the data that has changed since the latest Full backup.

An Incremental backup only backs up data that has changed since the last Full or Incremental backup, so if your Full backup runs on Saturday, then an Incremental backup on Sunday will back up all of the data changed since Saturday. An Incremental backup on Monday will back up only the data changed since Sunday, excluding the data that was backed up on Sunday. An Incremental backup on Tuesday will back up only the data changed since Monday, excluding the data backed up on Sunday and Monday.

With an Incremental backup you're only backing up changed data once, so, as you can see, if you need to completely restore a system it may be necessary to restore the Full backup and every Incremental backup taken since that Full backup to make sure that every changed file is restored.
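The growth difference between the two schemes can be sketched with a small simulation (the file names and change days below are hypothetical, purely to illustrate the definitions above):

```python
# Hypothetical changed-file sets for the days after a Saturday Full backup.
changed = {
    "Sun": {"fileA"},
    "Mon": {"fileB"},
    "Tue": {"fileC"},
}

# Cumulative (differential): everything changed since the last Full backup,
# so each day's backup repeats all previously changed files.
cumulative, since_full = {}, set()
for day, files in changed.items():
    since_full |= files
    cumulative[day] = set(since_full)

# Incremental: only what changed since the previous backup of any kind.
incremental = {day: set(files) for day, files in changed.items()}

print(sorted(cumulative["Tue"]))   # ['fileA', 'fileB', 'fileC'] -- grows every day
print(sorted(incremental["Tue"]))  # ['fileC'] -- stays small

# Full restore: Full + latest cumulative vs. Full + every incremental.
# Both chains recover the same set of changed files.
restore_cumulative = cumulative["Tue"]
restore_incremental = set().union(*incremental.values())
print(restore_cumulative == restore_incremental)  # True
```

Either way the same data is recoverable; the trade-off is daily backup size (incremental wins) versus restore simplicity (cumulative wins).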

http://en.wikipedia.org/wiki/Differential_backup

http://en.wikipedia.org/wiki/Incremental_backup

joeqwerty

I'd highly suggest using rsync's "--link-dest" option to do a sync (or backup) of all of the data each time you would normally do an incremental backup. By utilizing the hard-linking option, you're not only saving destination disk space, but drastically cutting down the number of duplicate files being transferred across the network.

Given a directory tree of backups, hard-linking simply links files that have not changed on the source machine to the copies that already exist on the destination machine.

More than anything though, you'll need to provide more metrics on what your current bottleneck is (CPU, disk IO, network...).

dayid