If you run a SaaS app, or work on one, I would love to hear from you. When the safety and security of your customers' data is paramount, how do you secure it and back it up? I would love to know your main host (e.g. Heroku, Engine Yard, Rackspace, MediaTemple) and who you use for your backups.

Be as detailed as possible: give a quick overview of your service and the data you store (images, for instance), and describe what happens to those images after a user uploads them. For example, they go to your Linode VPS and are posted to the site for the user to see, then they are automatically sent to AWS or wherever, then once a week they are backed up to tape by the managed hosting provider, and you also back them up to your house or office.

If you could also give some idea of what your average unit cost of storage is (per GB, per user, per month), I would really appreciate that.

I am getting ready to launch my app, and I would love to get some more perspective on the nitty-gritty details involved.

Thanks!

marc.gayle

1 Answer

I handle operations for a relatively large online collaboration service. We have nearly 200GB of relational data in MySQL databases, and about 20TB of user-generated content.

We own our own server hardware, so we have a few file storage servers for the user-generated content. These servers run MogileFS, which is configured to replicate 3 copies of every file across the cluster. Since we have copies on 3 different servers, we don't use RAID on the file servers. Storing 3 copies means that we can handle a drive failure, or take a server down for maintenance, and still have 2 copies of every file.

Every hour, we perform offsite backups with a homegrown script. The script looks for new files and makes a backup on an offsite server. This script keeps its own database of exactly which files have already been backed up. (We could just use the timestamp to determine which files to copy, but that wouldn't allow us to "backfill" backups when needed. We could also just check the remote server for the presence of each file, but with 150 million files in MogileFS, that would take forever!)
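
To illustrate the "keep our own database of what has been backed up" idea, here is a minimal sketch in Python. It is not our actual script: the function names, the SQLite table, and the use of a local directory copy in place of a real offsite transfer are all assumptions made for the example, and walking a directory tree is a simplification that would not scale to a MogileFS cluster.

```python
import shutil
import sqlite3
from pathlib import Path


def open_state(db_path):
    """Open (or create) the local database that records completed backups."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS backed_up (path TEXT PRIMARY KEY)")
    return conn


def backup_new_files(conn, source_dir, dest_dir):
    """Copy every file not yet recorded in the state database.

    Returns the relative paths that were copied on this run.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    copied = []
    for path in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = path.relative_to(source).as_posix()
        already = conn.execute(
            "SELECT 1 FROM backed_up WHERE path = ?", (rel,)
        ).fetchone()
        if already:
            continue  # backed up on an earlier run; no remote check needed
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # hypothetical stand-in for the offsite transfer
        conn.execute("INSERT INTO backed_up (path) VALUES (?)", (rel,))
        conn.commit()  # record the file only after the copy succeeded
        copied.append(rel)
    return copied
```

Because the record lives in a local table rather than in timestamps, "backfilling" is just a matter of deleting the relevant rows: the next run sees those files as not yet backed up and copies them again.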

Our remote backups are encrypted using public-key (asymmetric) encryption. This is actually quite elegant. We have a single master backup key pair. The private key is stored in a safe deposit box, and the public key is available on all of our file servers. Backups are encrypted using the master public key. This means they can all be decrypted with the master private key, but even if our servers are all compromised, the attacker would not be able to decrypt the backups.
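
As a rough sketch of the public-key approach, the following Python wraps the GnuPG command line. Using GnuPG at all is an assumption made for illustration, as are the function names and the recipient ID; the point is only that the server needs the public key in its keyring and never the private one.

```python
import subprocess


def gpg_encrypt_command(recipient, source_path, output_path):
    """Build the gpg argv that encrypts source_path to the recipient's public key."""
    return [
        "gpg", "--batch", "--yes",
        "--recipient", recipient,      # hypothetical ID of the master backup key
        "--output", str(output_path),
        "--encrypt", str(source_path),
    ]


def encrypt_backup(recipient, source_path, output_path):
    """Run gpg; raises CalledProcessError if encryption fails."""
    subprocess.run(
        gpg_encrypt_command(recipient, source_path, output_path),
        check=True,
    )
```

For example, `encrypt_backup("backup-master@example.com", "db.sql", "db.sql.gpg")` would produce a file that only the holder of the matching private key can decrypt. Note that gpg in batch mode may refuse a public key it has not been told to trust, so the key's trust level needs to be set on each server first.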

We use Backup Manager to handle our MySQL database backups and other miscellaneous server backups. Backup Manager is awesome -- it handles everything automatically, like encrypting the backups and sending them off-site. Backup Manager performs incremental backups of specified directories on the servers, and runs mysqldump for our databases.

You can even pipe other tools through Backup Manager -- for example, I'm testing Percona XtraBackup with Backup Manager for faster MySQL backups. (Technically, XtraBackup doesn't speed up the backups -- it speeds up restores. Without XtraBackup, it was taking almost a week to restore our 40GB MogileFS database. It helped to tune MySQL, but even then the restore performance was not adequate. I've just started testing XtraBackup, which may allow us to restore the database in hours, not days.)

I strongly recommend using Backup Manager -- it's ridiculously quick to set up, and handles all the mechanics of establishing remote backup. I've even started using it on two of my own personal servers, storing encrypted backups on Amazon S3. Until your data set is very large, Backup Manager should be able to handle everything for you, including server config, source code, and user data. It's automatic, encrypted, and very simple to use.

Ryan