The page at http://aws.amazon.com/ebs/pricing/ says, under "EBS Snapshots":

"[...] For the first snapshot of a volume, Amazon EBS saves a full copy of your data to Amazon S3. For each incremental snapshot, only the changed part of your Amazon EBS volume is saved."

I intend to snapshot some of my instances daily and keep each snapshot for 7 days, after which it is deleted.

What happens when I eventually delete the first snapshot? Will the subsequent snapshots be worthless once the first is no longer available?

Jepper

1 Answer

When you delete an EBS snapshot and a later snapshot of the same volume exists, every block in the deleted snapshot that wasn't re-saved in the later one (because it hadn't changed) is, in a sense, logically rolled forward into the later snapshot. The later snapshot is therefore still perfectly valid when any or all earlier snapshots are deleted.

Conceptually, you can also think of a snapshot as consisting of two separate pieces: the compressed,¹ backed-up data, which is stored in chunks in Amazon S3² during the snapshot process, and the snapshot itself, which is only a container of pointers to those chunks of raw data. Every snapshot references a particular archived data chunk for each block of the volume, and those chunks are fetched and reassembled when a snapshot is restored. When a snapshot is deleted, any archived data chunk that is no longer referenced by any other snapshot is purged, but any chunk referenced by any other snapshot... isn't.
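The pointer-and-chunk picture above can be sketched as a toy model. This is only an illustration of the concept, not how EBS is actually implemented (its internals are not public); the `SnapshotStore` class, its method names, and the four-block volume are all hypothetical.

```python
from collections import Counter


class SnapshotStore:
    """Toy model: a snapshot is a map of block index -> chunk id (hypothetical)."""

    def __init__(self):
        self.chunks = {}       # chunk id -> stored bytes
        self.refs = Counter()  # chunk id -> number of snapshots pointing at it
        self.snapshots = {}    # snapshot name -> {block index: chunk id}
        self._next_id = 0

    def create(self, name, volume, parent=None):
        """Snapshot `volume` (a list of blocks), reusing unchanged chunks of `parent`."""
        base = self.snapshots.get(parent, {})
        pointers = {}
        for index, data in enumerate(volume):
            chunk_id = base.get(index)
            if chunk_id is None or self.chunks[chunk_id] != data:
                # Block is new or changed: store a fresh chunk.
                chunk_id = self._next_id
                self._next_id += 1
                self.chunks[chunk_id] = data
            pointers[index] = chunk_id
            self.refs[chunk_id] += 1
        self.snapshots[name] = pointers

    def delete(self, name):
        """Drop the snapshot; purge only chunks nothing else references."""
        for chunk_id in self.snapshots.pop(name).values():
            self.refs[chunk_id] -= 1
            if self.refs[chunk_id] == 0:
                del self.refs[chunk_id]
                del self.chunks[chunk_id]

    def restore(self, name):
        """Reassemble the volume by following the snapshot's pointers."""
        pointers = self.snapshots[name]
        return [self.chunks[pointers[i]] for i in sorted(pointers)]


store = SnapshotStore()
day1 = [b"a", b"b", b"c", b"d"]
day2 = [b"a", b"B", b"c", b"d"]  # only block 1 changed

store.create("day1", day1)
store.create("day2", day2, parent="day1")
stored_before = len(store.chunks)

store.delete("day1")             # delete the "full" first snapshot
restored = store.restore("day2")
stored_after = len(store.chunks)
```

In this model, deleting `day1` purges only the old version of block 1, and `day2` still restores the complete volume because the three unchanged chunks remain referenced.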

So you can freely delete any snapshot in a series without impacting the validity of earlier or later snapshots. Note, though, that purging snapshots of volumes that don't change much will not save you much in monthly snapshot storage fees: you are already paying very little for them, since they contain very little unique data.
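To make the storage-fee point concrete, here is a rough sketch that counts how much data each snapshot holds uniquely, using the same pointer model. The numbers (a 10 GiB volume, 100 MiB changed between two snapshots) and the 1 MiB chunk size are illustrative assumptions, and `unique_bytes` is a hypothetical helper, not an AWS API.

```python
MIB = 1024 * 1024
CHUNK_SIZE = 1 * MIB     # assumed chunk size, for illustration only
VOLUME_BLOCKS = 10 * 1024  # 10 GiB volume in 1 MiB blocks
CHANGED_BLOCKS = 100       # 100 MiB changed before the second snapshot


def unique_bytes(snapshots, name, chunk_size):
    """Bytes freed by deleting `name`: chunks no other snapshot references."""
    referenced_elsewhere = set()
    for other, pointers in snapshots.items():
        if other != name:
            referenced_elsewhere.update(pointers.values())
    own = set(snapshots[name].values())
    return len(own - referenced_elsewhere) * chunk_size


# Each snapshot maps block index -> chunk id; a chunk id is (version, block).
first = {i: ("v1", i) for i in range(VOLUME_BLOCKS)}
second = dict(first)
for i in range(CHANGED_BLOCKS):
    second[i] = ("v2", i)  # changed blocks point at newly stored chunks

snapshots = {"first": first, "second": second}

all_chunks = set(first.values()) | set(second.values())
total_stored = len(all_chunks) * CHUNK_SIZE
freed_by_first = unique_bytes(snapshots, "first", CHUNK_SIZE)
freed_by_second = unique_bytes(snapshots, "second", CHUNK_SIZE)

print(total_stored // MIB, freed_by_first // MIB, freed_by_second // MIB)
```

Under these assumptions, roughly 10 GiB plus 100 MiB is stored in total, and deleting either snapshot frees only the 100 MiB it does not share with the other.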

See also:

https://stackoverflow.com/questions/19501192/confusion-over-ebs-snapshots-in-amazon/19503615#19503615

How to determine actual size of an amazon snapshot?


¹ compressed ...well, maybe. The internal workings of EBS snapshots have always been a black box, but when this answer was originally written in 2014, the conventional wisdom held that EBS compressed the backup chunks before storing them in S3. This may have been incorrect all along, or it may have changed since. One possible reason: compressing already-encrypted data is much less efficient than compressing unencrypted data, and EBS volumes are so easy to create with transparent encryption that the prevalence of encrypted volumes may have made compression unproductive. Either way, there does not appear to be any official, documented source currently indicating that the backup data is actually compressed. The actual charge for snapshot storage is almost always substantially less than the logical size of the snapshot, so this particular detail is not necessarily important, but I've stricken the word from the original answer in the interest of accuracy.

² Snapshot data is stored in Amazon S3, but the buckets are owned and controlled by the EC2/EBS service, so they are not visible in the console and the raw snapshot data isn't accessible to the end user.

Michael - sqlbot
  • So the Amazon host does an LVM (?) snapshot of, let's say, a 10GB volume. The entire volume (or at least its non-zero blocks) is copied to S3. But so far it's not costing us anything..? Later we take another snapshot, where 100MB has changed. We are now charged for 100MB of storage? – Jepper Aug 21 '14 at 20:49
  • Not LVM; it's a raw block device snapshot, with no awareness of the actual filesystem. The first snapshot stores the entire volume, and subsequent snapshots store the changes. You pay storage for both. Deleting either of the two will only reduce your cost by the cost of storing the changed blocks. – Michael - sqlbot Aug 21 '14 at 21:29