I have recently been assigned as a systems administrator to look after a streaming service and its related server architecture.

Once a week I travel 3 hours to rotate the tapes in a magnetic tape library that holds server files.

To save on driving, I want to back up and transfer files over SSH (or an equivalent) as tar or zip archives. I am considering either IaaS or simply a periodic script that backs up and transfers the data to a separate location.

What are the considerations if I were to implement this kind of alternative to magnetic tape backups?

I am unsure of the business requirements, data volumes etc. I am currently in the literature review stage and know this site is such a wealth of helpful knowledge and reference.

I now have more information.

  1. 3-2-1 rule for backups: 3 copies, on 2 media, 1 off-site.
  2. 14.1 TB consisting of server images, volumes and folders.
  3. 20% additional capacity
  4. RTO 48 hours (desired: 24)
  5. RPO 1 hour
  6. Daily change in data is 5%
  7. Currently 15 Mbps of a 40 Mbps pipe is dedicated to backup traffic
  • Please provide more information. What are your business requirements? Data volume, point in time recovery, acceptable delta? What are your business reasons for wanting to get away from using magnetic tape? – Tilman Schmidt Jun 25 '22 at 14:35
  • This feels like a student assignment to me – Chopper3 Jun 25 '22 at 16:52
  • I can assure you it's not. I'm a network engineer who has now moved over to sysadmin work and am looking for ways to make effective differences to how we work. – python_starter Jun 25 '22 at 17:25
  • "I am unsure of the business requirements, data volumes etc." - you need to pick up on that then, check your business's requirements first. Then you can think about a better concept. – Zac67 Jun 25 '22 at 17:41
  • As Andrew S. Tanenbaum said in one of my books: "Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway." My point being: the time to create the backup on tape, transport the tapes to the remote location and load the backup at the remote site can actually be shorter than doing the whole thing over the Internet. – Lasse Michael Mølgaard Jun 26 '22 at 15:12

3 Answers

Not having business requirements is a big red flag - so I strongly suggest understanding what is required before investing in any solution.

That said, to store large amounts of data off-site, I can suggest using rsync or zfs send/recv.

rsync is most efficient when you are dealing with many small files which rarely change - in this case, it simply skips the unchanged files. It is also quite efficient when dealing with small changes to bigger files, transferring only the deltas. Coupled with rsnapshot (for periodic snapshot rotation), it can work really well. However, it suffers greatly when very big files (e.g. virtual machine disks) accumulate changes.
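As a rough illustration of the snapshot-rotation idea, here is a minimal Python sketch that only builds an rsync command line using `--link-dest`, so unchanged files are hard-linked against the previous snapshot instead of being stored again (similar in spirit to the rotation rsnapshot does). The host name, paths and snapshot names are hypothetical placeholders, not anything from the question.

```python
import shlex

def rsync_snapshot_cmd(src, remote, dest_root, new_snap, prev_snap=None):
    """Build an rsync argv for a dated snapshot on a remote host.

    All arguments are illustrative assumptions: `remote` is user@host,
    `dest_root` is a directory on that host holding one folder per snapshot.
    """
    cmd = ["rsync", "-a", "--delete"]
    if prev_snap:
        # Unchanged files become hard links to the previous snapshot,
        # so only changed files consume new space on the target.
        cmd.append(f"--link-dest={dest_root}/{prev_snap}")
    cmd += [src, f"{remote}:{dest_root}/{new_snap}/"]
    return cmd

cmd = rsync_snapshot_cmd(
    src="/srv/data/",                  # hypothetical source directory
    remote="backup@offsite.example",   # hypothetical backup host
    dest_root="/srv/backups",          # hypothetical snapshot root
    new_snap="2022-06-26",
    prev_snap="2022-06-25",
)
print(shlex.join(cmd))
```

The sketch prints the command rather than running it, so you can review it before wiring it into cron or a systemd timer.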

zfs send/recv works at a lower level - the block layer. Because of this, it can efficiently transfer the deltas of bigger files with no significant overhead. However, it requires a ZFS-capable sender and target.
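For comparison, an incremental ZFS replication is usually a `zfs send -i` piped into `zfs recv` over SSH. The Python sketch below only assembles the two halves of that pipeline. The dataset, snapshot and host names are made up for illustration, and it assumes snapshots already exist on the sender and the older one is present on the target.

```python
import subprocess

def zfs_incremental_pipeline(dataset, prev_snap, new_snap, remote, target):
    """Return (send_argv, recv_argv) for an incremental send over SSH.

    All names are hypothetical; adjust to your own pools and hosts.
    """
    send = ["zfs", "send", "-i",
            f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", remote, "zfs", "recv", target]
    return send, recv

def run_pipeline(send, recv):
    """Pipe the send stream into the remote receive; True on success."""
    sender = subprocess.Popen(send, stdout=subprocess.PIPE)
    receiver = subprocess.run(recv, stdin=sender.stdout)
    sender.stdout.close()
    return sender.wait() == 0 and receiver.returncode == 0

send, recv = zfs_incremental_pipeline(
    dataset="tank/vmstore",            # hypothetical source dataset
    prev_snap="hourly-0900",
    new_snap="hourly-1000",
    remote="backup@offsite.example",   # hypothetical backup host
    target="backup/vmstore",           # hypothetical target dataset
)
print(" ".join(send), "|", " ".join(recv))
```

`run_pipeline` is deliberately not called here. On a real system you would take the new snapshot first and invoke it only after checking the printed command.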

Another possibility is to use a (relatively) low-cost cloud storage service, such as Amazon Glacier. These services can be very helpful, but they generally come with some significant caveats (e.g. append-only storage), meaning they are not a silver bullet at all.

shodanshok

First, figure out the rough amount of data you'd need to transfer, and compare it to the network bandwidth available on-site. If you find you'd need to get a new fiber-optic line into a site that's "in the middle of nowhere" to achieve the necessary bandwidth, you've probably discovered the reason why it was not done that way before.
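A back-of-the-envelope version of that comparison, using the figures the asker added to the question, might look like the Python sketch below. It assumes "15Mbs" means 15 megabits per second, uses decimal terabytes, and treats the 5% daily change as raw bytes with no compression or deduplication, so take the result as an order-of-magnitude estimate only.

```python
def transfer_hours(data_bytes: float, link_bits_per_s: float) -> float:
    """Hours needed to push data_bytes through a link of the given speed."""
    return data_bytes * 8 / link_bits_per_s / 3600

TB = 1e12    # decimal terabyte (assumption)
MBIT = 1e6   # one megabit per second, in bits per second

total = 14.1 * TB             # data set size from the question
daily_change = 0.05 * total   # 5% daily change from the question
backup_link = 15 * MBIT       # assumes "15Mbs" means 15 Mbit/s

full_seed_days = transfer_hours(total, backup_link) / 24
daily_delta_hours = transfer_hours(daily_change, backup_link)
# Sustained rate needed to move one day's changes within 24 hours
required_mbit = daily_change * 8 / 86400 / MBIT

print(f"Initial full copy: {full_seed_days:.0f} days")
print(f"One day's changes: {daily_delta_hours:.0f} hours")
print(f"Rate needed to keep up: {required_mbit:.0f} Mbit/s")
```

Under these assumptions the initial copy takes roughly 87 days, one day's changes take about 104 hours to send, and keeping up would need around 65 Mbit/s sustained, which is more than the whole 40 Mbit/s pipe. Compression, deduplication or block-level deltas could change the picture considerably, which is why the real change rate is worth measuring first.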

A classic quote from Andrew S. Tanenbaum (the father of Minix) is: "Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway."
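To put a number on the quote, here is a small Python sketch of the effective bandwidth of simply carrying the data. It assumes, purely for illustration, that one trip moves the full 14.1 TB mentioned in the question and that the 3 hours is the whole transit time; neither is confirmed by the asker.

```python
def effective_mbit_per_s(data_bytes: float, trip_hours: float) -> float:
    """Average throughput of physically moving data_bytes in trip_hours."""
    return data_bytes * 8 / (trip_hours * 3600) / 1e6

carried_bytes = 14.1e12   # assumes a full 14.1 TB set travels per trip
trip_hours = 3            # assumes 3 hours is the total transit time
link_mbit = 15            # backup share of the link, assumed to be Mbit/s

car_mbit = effective_mbit_per_s(carried_bytes, trip_hours)
print(f"Car: {car_mbit:,.0f} Mbit/s, about {car_mbit / link_mbit:,.0f}x the backup link")
```

With those assumptions the car works out to roughly 10 Gbit/s. The latency is terrible, but the throughput is hard to beat.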

The second thing you need to figure out is why you are traveling to the site once a week: if the primary purpose is to get the tapes rotated to an alternate location so that any major disaster on-site (think fire or flooding) will not destroy the data, the answer to your question is going to be entirely different from the case where it's just an old tape library that is running out of capacity. In the former case, visiting the site is part of an existing Disaster Recovery solution; in the latter case, it might be just a matter of getting a new, higher-capacity tape library.

Tape libraries are still quite strong contenders for archival-style bulk storage, particularly if getting high-bandwidth connections on site would be too expensive.

telcoM

If your main concern is the valuable time you spend driving, consider simply outsourcing the task of changing the tapes.

Most datacenters offer a remote hands service for what is probably no more than a 10-minute job of switching tapes.

A courier / parcel service can then come to collect the tapes and ship them to your office.

Although that comes with a price tag, it will be cheaper than your time.

But consider that, after a while, you may come to like having a regular opportunity to leave your office, avoid boring meetings and spend a couple of hours driving while listening to your favourite podcasts...

Rob