I need a better storage and archive system for my small business's files. Specifically, the files are completed video projects. Beyond time and cost limitations, what is holding me back is that I don't believe in any of the solutions I have pondered. Therefore I am laying out the problem and my thoughts. I would appreciate any opinions.

Budget: I believe in spending what it takes. That being said, we are a small business. I am hoping I can get out of this for under 5k, ideally closer to 1-3k. That might be a pipe dream. Just tell me so.

The Problem:

  • Raw video files are huge. We have accumulated probably 10+ TB so far and that is growing fast.
  • Video editing requires fast read/write access to files, so a central or cloud-based file server will not be fast enough. Therefore we probably need an archive solution for old projects, and current projects will have to stay local.
  • We want some sort of redundancy and offsite solution.

What we currently do:

  • We use large, high quality, external hard drives.
  • We always buy in pairs and manually duplicate content. In other words, we work off of one, and duplicate the files to the other which serves as a backup/fall back.
  • These HDs are fast enough over FireWire 800 or USB3 to work off of directly.
  • Once filled, we set the pair aside.

What's wrong with the current solution:

  • Although the data is duplicated across two drives, these drives are not "backed-up" or stored offsite.
  • Organization across these many external HDs is hard. What project is on what drive? etc.
  • Eventually we are going to have a ridiculous amount of hard drives.
  • Duplication is not RAID.
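
The "what project is on what drive?" problem above could be eased with a small catalog, independent of whichever storage option is chosen. This is only a minimal sketch: it assumes each project lives in its own top-level folder on a drive, and the function names, the CSV layout, and the drive labels are all hypothetical.

```python
import csv
from pathlib import Path


def catalog_drive(drive_root, drive_label):
    """Return one (project, drive_label, size_bytes) row per top-level folder."""
    rows = []
    for entry in sorted(Path(drive_root).iterdir()):
        # Skip loose files and hidden system folders (assumption: projects are folders).
        if not entry.is_dir() or entry.name.startswith("."):
            continue
        size = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
        rows.append((entry.name, drive_label, size))
    return rows


def append_to_catalog(catalog_csv, rows):
    """Append rows to a shared CSV index, writing a header if the file is new."""
    catalog_path = Path(catalog_csv)
    is_new = not catalog_path.exists()
    with catalog_path.open("a", newline="") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["project", "drive", "size_bytes"])
        writer.writerows(rows)


def find_project(catalog_csv, name):
    """Return the label of every drive that holds a project with this name."""
    with Path(catalog_csv).open(newline="") as handle:
        return [row["drive"] for row in csv.DictReader(handle) if row["project"] == name]
```

Running `catalog_drive` once on each drive of a pair before it is set aside, with a label that matches the sticker on the drive, would let `find_project` answer which drives to pull off the shelf.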

Options:

A Local Server

  • Buy a rack mount server and a rack mounted hard drive array enclosure, such as a 20-bay SAS enclosure from Norco.
  • All video files would be stored on this server. We could install and pay for a cloud service to back up this one computer/server. CrashPlan works on Linux and has no limit on how much data you can store. The hard drives would be physically connected to the server, so we get around the "no NAS" rules companies like CrashPlan have. It is not a personal computer, so the syncing can run 24/7/365. This would solve the offsite issue.
  • Instead of using an online backup service like CrashPlan we could write a script to sync these files to an Amazon Glacier account.
  • A policy that video peeps work off of external hard drives for current projects but must put the project on this new computer when complete. In other words, continue using external hard drives for current projects and store archived projects on this server.

Cloud based backup services (CrashPlan.com, BackBlaze.com, Carbonite.com)

  • Typically only let you back up an external hard drive that is physically connected to a computer (no NAS or network drives).
  • Typically they expect a backed-up external drive to stay connected to your computer and all data to remain on the drive. If you don't hook up an external hard drive for months, what happens to the backups? If you free up space by deleting old projects, they will be deleted from the online service too.
  • Requires our users to leave the external hard drives connected to their computers until all data is in the cloud. This can take weeks for a big project.
  • Restoring a project would be very slow due to internet transfer speeds.
  • These cloud backup accounts are usually specific to one user/one computer. So if a hard drive is backed up by one user and then a second user works on the project, what does that mean?

A Big NAS

  • A NAS is "Network Attached Storage". You stick in as many hard drives as it will hold and it will RAID them. You can access it via the network connection or maybe USB3/FireWire.
  • Most have an operating system baked in, so you can't run other software like cloud-based backup services. Nor can you do any customization or run your own software. You get what you buy.
  • Big NASs are pretty expensive and not really that big. You don't find many with more than 4 bays. Currently a big HD is 3 TB, so 4 bays might be somewhere under 12 TB of storage. Not super comfy for the future.
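
To put a number on "somewhere under 12 TB": once the drives are in a redundant RAID set, some of the raw space goes to parity or mirroring. The sketch below uses the standard rule of thumb for each RAID level and ignores filesystem overhead and hot spares; the function name `usable_tb` is just illustrative.

```python
def usable_tb(drive_count, drive_tb, raid_level):
    """Approximate usable capacity in TB, before filesystem overhead."""
    if raid_level == "raid5":
        # One drive's worth of space holds parity.
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        data_drives = drive_count - 1
    elif raid_level == "raid6":
        # Two drives' worth of space hold parity.
        if drive_count < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        data_drives = drive_count - 2
    elif raid_level == "raid10":
        # Every drive is mirrored, so half the raw space is usable.
        if drive_count < 4 or drive_count % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        data_drives = drive_count // 2
    else:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    return data_drives * drive_tb


# The 4-bay NAS with 3 TB drives described above.
for level in ("raid5", "raid6", "raid10"):
    print(level, usable_tb(4, 3, level), "TB")
```

For four 3 TB drives this gives 9 TB with RAID 5 and 6 TB with RAID 6 or RAID 10, so a 4-bay box would already be smaller than the 10+ TB on hand. The same function applied to a 20-bay enclosure, `usable_tb(20, 3, "raid6")`, gives 54 TB.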

Other ideas are:

  • Tape Backups.
  • Just archive the older projects directly to Amazon Glacier and skip building a local server to store them.

Thanks for any advice!!! Jed

maestrojed
  • Once a project is finished, how often do you need to go back and work with its data? – Michael Hampton Feb 19 '14 at 19:45
  • Your understanding of "Big NAS" is remarkably small. There are companies ranging from NetApp to EMC to IBM to Isilon to many, many others that would like to sit down with you. – mfinni Feb 19 '14 at 19:58
  • We will need to go back to this data once or twice a month. If there was a delay in accessing this data (Amazon Glacier) I think it would be acceptable if we were talking hours and not days. – maestrojed Feb 19 '14 at 20:34
  • Don't use Glacier, then. Retrieval fees will be high, and your accountant will stroke out when he gets the bill. – HopelessN00b Feb 19 '14 at 20:36
  • @mfinni I am sure EMC and IBM would have great ideas and awesome hardware (didn't know of NetApp). Not sure that would fit into my described budget, nor does it seem appropriate for a small business. It looks like NetApp makes hardware similar to what I was calling a "Hard Drive Array" made by Norco. I mentioned this in the build-my-own-server idea. Is that the approach you are suggesting? – maestrojed Feb 19 '14 at 20:39
  • @HopelessN00b Thx for the feedback. Good to know. – maestrojed Feb 19 '14 at 20:39
  • Do you need off-site backups? – Martin Schröder Feb 26 '14 at 23:27

3 Answers

Tape. Simple as that. Quantum has a SuperSTore system that can handle way more than that, and I have seen them new for less than your 5000 price point. The good thing is that you can pull tapes out for storage, so scaling this is going to be quite cost efficient, and tapes last.

TomTom
  • Since you're happy with retrieval times on the order of hours, I'm completely with TomTom on this. Tape is much underappreciated, and excellent for this kind of thing. – MadHatter Feb 19 '14 at 20:46
  • If you go with tape, just make sure you have something that allows you to test the tape frequently. In my experience, about 75% of tape backups don't work because people use the same cassettes multiple times and are surprised when they can't retrieve data off of them 3 years later. – Matthew Feb 19 '14 at 20:52
  • Surprising, given that proper tape, properly stored (cough), has archival lifetime guarantees (I think 30 years). And you could easily make 2 copies. Tape scales really well. It is more the (sorry) idiocy of people who likely never test the restore even once... although I would do that on a schedule (1 week, 1 month, 1 year). – TomTom Feb 19 '14 at 21:19
  • +1 - This is practically a textbook case for high-capacity tape. The incremental cost of adding storage to a tape-based archive (even with cutting two tapes to store on and off-site) is lower than hard disk drives and tapes are _meant_ for archiving. LTO is backed by an industry association that has shown a commitment to building products that allow for access on older media. Even so, in a few years, when you replace the tape element, you should probably migrate the old data to new tape formats, if only to combat potential bit rot. If you need this to be "OPEX" consider a leasing option. – Evan Anderson Feb 19 '14 at 21:44
  • Tape is good but it requires discipline - regular testing, offsite rotation, etc. I would personally go with near-line NAS (probably nas4free) and a rate-limited rsync to an offsite identical box. – quadruplebucky Feb 20 '14 at 08:09
  • Yes. You would do that. Many of us would consider that a rotten decision ignoring the little reality issue called money. Want to rsync offsite? What about putting the tape offsite? Besides, offsite is overrated for smaller setups - a nice fireproof data safe (physical) really covers most things. Replace magazines once per week. Yes, requires discipline. Like paying employees, taxes, doing what you were contracted for. And seriously, expect a lot fewer problems than with your rate-limited rsync setup. – TomTom Feb 20 '14 at 08:32
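
The restore testing urged in the comments above can be made mechanical, whatever the medium. The following is a rough sketch, not any backup product's own verification: record a SHA-256 checksum per file before a project goes to tape, then periodically restore it to a scratch folder and compare. The function names and the manifest format (a plain dict) are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(project_dir):
    """Map each file's path (relative to the project folder) to its checksum."""
    root = Path(project_dir)
    return {
        f.relative_to(root).as_posix(): sha256_of(f)
        for f in sorted(root.rglob("*"))
        if f.is_file()
    }


def verify_restore(manifest, restored_dir):
    """Return the relative paths that are missing or differ after a restore."""
    root = Path(restored_dir)
    problems = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(rel_path)
    return problems
```

An empty list from `verify_restore` means every file in the manifest came back intact. The manifest itself would need to be stored somewhere other than the tape it describes.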

First, I would advise avoiding Glacier. It sounds good, until you crunch the costs on actually restoring a large amount of data. This is an unofficial calculator you can use to calculate Glacier storage and retrieval costs, and judge for yourself. Restoring terabytes of data from Glacier is a pretty unattractive prospect.

Second, I would advise that for simple backup purposes, you could get away with a single NAS server with a lot of drives. It sounds to me like you've only looked at home and small office NAS options, and you should consider a proper NAS offering. Preferring Dell, I would point out Dell's PowerVault NAS Servers, but HP, IBM, SuperMicro, and just about everyone else have similar offerings. I have an older Dell PowerVault NX at home that's serving as my media library, and has twelve 2 TB near-line SAS disks in it. 4 TB near-line SAS drives are available these days too, so you could always fill up a proper NAS server with those. (Or buy a couple of NAS servers.)

You could easily use one of these on your local LAN, install backup software of your choice (such as Bacula, if you like free, or any one of a dozen commercial offerings if you want vendor support) and use a large RAID volume as your backup target. You could then use a cloud backup service to back up this NAS server, and have the benefits of local and remote backups. Again, this is what I do at home. Proper NAS server, terabytes of data backed up to a cloud service.

And of course, you could use tape too... buy an LTO tape drive or library - personally, I'll go to great lengths to avoid tape or optical disc media, but they are legitimate options, and may be cheaper than a disk-to-disk solution.

Finally, I would suggest that you need to consider the main drawback of cloud backup services, which is the size of your internet pipe. It may take weeks or months to upload terabytes of data over your internet connection, and/or incur extra fees from your ISP. So while they are a viable option for backing up data, even enterprise data, that's a constraint most people don't consider until they've already hit it.
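
To make the "weeks or months" point concrete, here is a back-of-the-envelope sketch of upload time. The 10 TB figure comes from the question; the uplink speeds and the 0.8 efficiency factor (protocol overhead, other traffic on the line) are purely illustrative assumptions, so plug in your own numbers.

```python
def upload_days(data_tb, uplink_mbps, efficiency=0.8):
    """Estimate days needed to upload data_tb terabytes over an uplink.

    efficiency is an assumed fraction of the nominal uplink speed that is
    actually available for the backup.
    """
    bits_to_send = data_tb * 1e12 * 8          # decimal terabytes to bits
    usable_bits_per_second = uplink_mbps * 1e6 * efficiency
    return bits_to_send / usable_bits_per_second / 86400


# Hypothetical uplink speeds in Mbit/s for the 10 TB mentioned in the question.
for mbps in (5, 10, 100):
    print(f"{mbps} Mbit/s: about {upload_days(10, mbps):.0f} days")
```

Under these assumptions, 10 TB takes roughly 116 days of continuous uploading at 10 Mbit/s and about 12 days at 100 Mbit/s, before counting any new footage added in the meantime.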

HopelessN00b
  • +1. Glacier is archival - it seriously S++++ for backup. Backup means no restore for ages, then a LOT - and the 5% quota means paying in case of a restore. – TomTom Feb 20 '14 at 08:41

I think it depends on your budget. If you can only spend ~$6k you'll probably need to build your own NAS. I'd look at nas4free and what a server costs you. If you can spend $20k, you can probably fill a server with a bunch of disks and a decent RAID card, or use software RAID under Linux or whatever.

For about $40k you can have a highish-end 1U server (IBM x3550 M4, 2-port Emulex 10 Gbit NIC, 4 Gbit NIC, 128 GB RAM, 2 local 10k SAS disks) with 10 Gbit iSCSI to an Infortrend SAN box with 24 4 TB SAS disks you can slice and dice however you want. RAID6 is a reasonable config.

Tape is also a good idea, but I don't know how cheap it is really. It depends on how big a library you get. If a 48 tape library is good, you can again do that with a 1U and external SAS card for maybe $30k and 2 LTO6 drives... But then you need software licenses to manage tape backups or something. I've only used NetBackup, which probably isn't a great fit for you here. Just don't forget you'll probably want to drive the tape library some way in software. But once you're out of the library, don't forget about going to find the tape and load it up, plus a staging area for the access...

jmp242