I have an S3 bucket named media, and in it a folder named contracts that stores PDF files of about 400 KB each. For each file in this folder I also keep a database record with the following information:

  1. ID
  2. Filename in the S3 bucket
  3. Date uploaded into the S3 bucket
  4. Date archived into Glacier
  5. Glacier filename

I am writing a script that copies files from the contracts folder of the media bucket into an S3 Glacier vault, and I wonder what the best way to store them is. Should I zip all the files that have not yet been moved into Glacier and upload them together as one archive, or just download each file's contents and save it into Glacier as is?
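For reference, the zip-and-upload option could be sketched roughly as follows. This is only a minimal sketch: `s3` and `glacier` are assumed to be boto3 clients passed in by the caller, the vault name `contracts-vault` is hypothetical, and the helper names `build_zip` and `archive_batch` are made up for illustration.

```python
import io
import zipfile


def build_zip(files):
    """Pack a {filename: bytes} mapping into one in-memory zip; return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def archive_batch(s3, glacier, keys, bucket="media", vault="contracts-vault"):
    """Download the given S3 keys, zip them, and upload a single Glacier archive.

    s3 and glacier are assumed to be boto3 clients (boto3.client("s3") and
    boto3.client("glacier")). The vault name is a placeholder.
    Returns the Glacier archive ID, to be stored in each file's database row.
    """
    files = {}
    for key in keys:
        # get_object returns a streaming body; read() loads it fully into memory,
        # which is acceptable for a handful of ~400 KB PDFs.
        files[key] = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    response = glacier.upload_archive(
        vaultName=vault,
        archiveDescription=f"{len(keys)} contracts",
        body=build_zip(files),
    )
    return response["archiveId"]
```

With this layout every file in a batch shares one archive ID, so retrieving a single contract later means retrieving and unzipping the whole batch.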

If I save the files one by one, what are the benefits and drawbacks? The script will run every 30 days, and I estimate it will back up about 5-6 files per month.
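To put rough numbers on the two options, here is a back-of-the-envelope sketch. It assumes that Glacier bills about 32 KB of index and metadata overhead per archive (my reading of the Glacier documentation, worth double-checking), it ignores any size reduction from zip compression, and the prices are parameters to fill in from the pricing page rather than real quotes. The function name `monthly_cost` is made up for illustration.

```python
OVERHEAD_KB = 32  # assumed per-archive index/metadata overhead billed by Glacier


def monthly_cost(file_count, file_kb, zipped, price_gb_month, price_per_1000_uploads):
    """Rough cost of one month's batch.

    Returns (storage cost per month, one-off upload request cost).
    Prices are caller-supplied; compression savings are ignored.
    """
    # One archive for the whole batch when zipped, otherwise one per file.
    archives = 1 if zipped else file_count
    stored_kb = file_count * file_kb + archives * OVERHEAD_KB
    storage = stored_kb / (1024 * 1024) * price_gb_month
    uploads = archives / 1000 * price_per_1000_uploads
    return storage, uploads
```

For 6 files of 400 KB, uploading one by one stores 5 extra overhead blocks (160 KB) and makes 5 extra upload requests compared with a single zip, so under these assumptions the difference is tiny at this volume and grows linearly with the number of files.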

Dimitrios Desyllas
  • The cost is so low either way, that I don't think it makes much difference. Do what makes most sense to you. – Ron Trunk Apr 05 '21 at 17:28
  • If I had to archive more than 5-6 PDF files, for example 1000 PDF files per month, would zipping them make a difference cost-wise? – Dimitrios Desyllas Apr 05 '21 at 17:45
  • Perhaps you should look at [Glacier pricing](https://aws.amazon.com/glacier/pricing/) to make your own estimates. But we're still talking about trivial costs. – Ron Trunk Apr 05 '21 at 17:50
  • The "S3 Glacier" service (not the storage class, which is quite confusing) is mainly useful for compliance, as you can do things like vault lock. S3 has many similar features now, and the Glacier service is fairly stagnant. Do you need to use the Glacier service, or can you use an S3 bucket and simply use a lifecycle policy to change your objects' storage class to Glacier / Deep Archive? Deep Archive would be cheaper. If you actually do need the S3 Glacier service, it's made more for batch operations; adding one file at a time would be fairly fiddly, wouldn't it? – Tim Apr 05 '21 at 19:00
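The lifecycle-policy approach mentioned in the last comment could look roughly like this. It is a minimal sketch, assuming `s3` is a boto3 S3 client supplied by the caller; the 30-day threshold, the rule ID, and the helper names `lifecycle_rule` and `apply_lifecycle` are illustrative choices, not something from the thread.

```python
def lifecycle_rule(prefix="contracts/", days=30, storage_class="DEEP_ARCHIVE"):
    """Build a lifecycle configuration that transitions objects under a prefix."""
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/')}",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move objects to the colder storage class once they are `days` old.
                "Transitions": [{"Days": days, "StorageClass": storage_class}],
            }
        ]
    }


def apply_lifecycle(s3, bucket="media"):
    """Apply the rule to the bucket; s3 is assumed to be boto3.client("s3").

    Note: this call replaces any lifecycle configuration already on the bucket.
    """
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=lifecycle_rule(),
    )
```

With a rule like this the objects keep their S3 keys, so no separate Glacier filename would need to be tracked in the database, and no monthly script is required.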

0 Answers