
We have been doing incremental backups using Duply on our main server, to an S3 bucket. However, we've found that there is a high server load during the backups (it's an Amazon EC2 server).

We suspect this is because it checks every file against S3 to see whether anything has changed.

What ways could we reduce the server load?

Since we are doing the backups every four hours, perhaps we could only back up files/folders modified in the last four hours, something along the lines of the sketch below.
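A rough illustration of that idea (the path is a placeholder, and this assumes GNU find plus duplicity's --include-filelist option rather than a stock Duply profile):

# List files changed in the last 4 hours (240 minutes)
find /your/path/to/backup -type f -mmin -240 > /tmp/recent-files.txt
# Then point duplicity's --include-filelist at /tmp/recent-files.txt so only those files are scanned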

adam e247
  • Check hints about tuning system for read/write intensive loads at http://serverfault.com/questions/639921/running-find-command-generates-high-load – Hrvoje Špoljar Oct 29 '14 at 10:01

1 Answer


If you need a full system backup, you could switch to EBS snapshots of the instance's volumes.
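For example, a snapshot can be taken from the CLI (the volume ID and description here are placeholders):

aws ec2 create-snapshot --volume-id vol-1234abcd --description "4-hourly backup of main server"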

If you need individual files, have you looked at the AWS CLI tools? If it's an Amazon Linux instance, they're already installed. If not, see the install instructions here.

You could set up a scheduled job using a command along these lines:

aws s3 sync /your/path/to/backup s3://yourbucket/path
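(Note that sync is recursive by default, so no extra flag is needed.) To match the four-hourly schedule mentioned in the question, a crontab entry along these lines would work (the log path is just an example):

0 */4 * * * aws s3 sync /your/path/to/backup s3://yourbucket/path >> /var/log/s3-sync.log 2>&1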

In a sync operation, a source file is only uploaded if its size differs from that of the S3 object, its last modified time is newer than that of the S3 object, or the file does not exist under the specified bucket and prefix.
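If you want to preview what would be transferred before relying on it, the sync command accepts a --dryrun flag:

aws s3 sync /your/path/to/backup s3://yourbucket/path --dryrun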

For extra recoverability, enable versioning on the S3 bucket and you'll be able to recover older versions of files if something is inadvertently modified. You can then use S3 lifecycle policies to limit the number and age of the versions you keep.
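Both can also be done from the CLI; a minimal sketch, where the bucket name, policy file name, and 30-day retention are placeholders:

aws s3api put-bucket-versioning --bucket yourbucket --versioning-configuration Status=Enabled
aws s3api put-bucket-lifecycle-configuration --bucket yourbucket --lifecycle-configuration file://lifecycle.json

where lifecycle.json could contain something like:

{
  "Rules": [
    {
      "ID": "expire-old-versions",
      "Prefix": "",
      "Status": "Enabled",
      "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
    }
  ]
}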

TrackZero