How to delete all glacier data?

29

12

I was using a tool on Mac OS X called Arq to back up my data, but I found it too hard to upload everything, since I don't have, and can't get, an internet connection that is fast enough.

So I decided to delete all my backups, but whenever I try from the software itself, nothing happens.

I also tried FastGlacier on my other Windows machine, but it hangs and uses too many resources.

I was wondering if there is an easy way to do this.

P.S. My Glacier storage holds ~450 GB in 341907 archives.

Shereef Marzouk

Posted 2013-12-13T06:11:39.997

Reputation: 432

Note to Arq users - see the answer from Arq developer Stefan Reitshamer below. Avoid the headache of setting up mtglacier, and just use the tool built into Arq! – joewiz – 2016-11-22T04:58:58.343

Answers

0

How to delete a vault (AWS Glacier)

This gist gives some tips for removing an AWS Glacier vault with the AWS CLI (i.e. https://aws.amazon.com/en/cli/).

Step 1 / Retrieve inventory

$ aws glacier initiate-job --job-parameters "{\"Type\": \"inventory-retrieval\"}" --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION

Wait 3 to 5 hours… :-(

For the next step you need the JobId. Once the inventory retrieval has finished, you can get it with the following command: aws glacier list-jobs --account-id YOUR_ACCOUNT_ID --vault-name YOUR_VAULT_NAME --region YOUR_REGION
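Picking the right JobId out of the list-jobs output can be scripted. The following is a minimal Python sketch, not part of the original gist. It assumes the JSON printed by list-jobs has already been captured and that each JobList entry carries the usual JobId, Action, Completed, and StatusCode fields. The function name and the sample data are hypothetical.

```python
import json
from typing import Optional


def find_completed_inventory_job(list_jobs_output: dict) -> Optional[str]:
    """Return the JobId of the first finished inventory-retrieval job, or None."""
    for job in list_jobs_output.get("JobList", []):
        if (
            job.get("Action") == "InventoryRetrieval"
            and job.get("Completed") is True
            and job.get("StatusCode") == "Succeeded"
        ):
            return job["JobId"]
    return None


# Hypothetical sample shaped like the output of `aws glacier list-jobs`.
sample = json.loads("""
{
  "JobList": [
    {"JobId": "job-still-running", "Action": "InventoryRetrieval",
     "Completed": false, "StatusCode": "InProgress"},
    {"JobId": "job-done", "Action": "InventoryRetrieval",
     "Completed": true, "StatusCode": "Succeeded"}
  ]
}
""")

print(find_completed_inventory_job(sample))
```

If the function returns None, the inventory job has probably not finished yet and it is worth checking again later.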

Step 2 / Get the ArchivesIds

$ aws glacier get-job-output --job-id YOUR_JOB_ID --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION ./output.json

See: Downloading a Vault Inventory in Amazon Glacier

All the ArchiveId values are now in the ./output.json file.
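Before deleting anything, it can help to check what the inventory actually contains. This is a rough Python sketch under the assumption that output.json follows the usual inventory layout, with an ArchiveList array whose entries have ArchiveId and Size fields. The helper names and the sample inventory are made up for illustration.

```python
import json


def summarize_inventory(inventory: dict) -> tuple:
    """Return (archive_ids, total_size_in_bytes) for a parsed vault inventory."""
    archives = inventory.get("ArchiveList", [])
    archive_ids = [a["ArchiveId"] for a in archives]
    total_bytes = sum(a.get("Size", 0) for a in archives)
    return archive_ids, total_bytes


def load_inventory(path: str) -> dict:
    """Load the whole inventory file at once; fine for small vaults only."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Hypothetical two-archive inventory standing in for ./output.json.
sample_inventory = {
    "VaultARN": "arn:aws:glacier:YOUR_REGION:YOUR_ACCOUNT_ID:vaults/YOUR_VAULT_NAME",
    "ArchiveList": [
        {"ArchiveId": "id-one", "Size": 1024},
        {"ArchiveId": "-id-two", "Size": 2048},
    ],
}

ids, size = summarize_inventory(sample_inventory)
print(len(ids), "archives,", size, "bytes")
```

For a real vault, `summarize_inventory(load_inventory("./output.json"))` would give the same summary. With hundreds of thousands of archives, loading the file in one go may use a lot of memory.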

Step 3 / Delete Archives

Powershell

from @vinyar

$input_file_name = 'output.json'
$vault_name = 'my_vault'
# $account_id = 'AFDKFKEKF9EKALD' #not used. using - instead

$a = ConvertFrom-Json $(get-content $input_file_name)

$a.ArchiveList.archiveid | %{
write "executing: aws glacier delete-archive --archive-id=$_ --vault-name $vault_name --account-id -"
aws glacier delete-archive --archive-id=$_ --vault-name $vault_name --account-id - }

Python

from @robweber

This script uses ijson, which reads the file as a stream instead of loading it into memory all at once. You can install it with pip.

import ijson, subprocess

input_file_name = 'output.json'
vault_name = ''
account_id = ''

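# ijson works best with a file opened in binary mode; the with-block closes it
with open(input_file_name, 'rb') as f:
    # Stream the entries of ArchiveList one at a time
    for archive in ijson.items(f, 'ArchiveList.item'):
        print("Deleting archive " + archive['ArchiveId'])
        # The source line was cut off after "--acc"; completing it with
        # "--account-id" and account_id is an assumption.
        command = "aws glacier delete-archive --archive-id='" + archive['ArchiveId'] + "' --vault-name " + vault_name + " --account-id " + account_id
        subprocess.run(command, shell=True, check=True)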
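An alternative to assembling one shell string is to pass the arguments as a list, which avoids shell quoting problems. The sketch below is an illustration rather than part of the gist: the function names are hypothetical, and the default account ID of "-" follows the PowerShell example above. Like the other scripts, it keeps the --archive-id=VALUE form, which is a safe choice if an archive ID happens to begin with a dash.

```python
import subprocess
from typing import Callable, Iterable, List


def build_delete_command(archive_id: str, vault_name: str, account_id: str = "-") -> List[str]:
    """Build the argument list for one `aws glacier delete-archive` call."""
    return [
        "aws", "glacier", "delete-archive",
        "--archive-id=" + archive_id,
        "--vault-name", vault_name,
        "--account-id", account_id,
    ]


def delete_archives(
    archive_ids: Iterable[str],
    vault_name: str,
    account_id: str = "-",
    runner: Callable = subprocess.run,
) -> int:
    """Call runner once per archive ID and return the number of calls made."""
    count = 0
    for archive_id in archive_ids:
        runner(build_delete_command(archive_id, vault_name, account_id), check=True)
        count += 1
    return count


# Dry run: collect the commands in a list instead of calling AWS.
planned = []
delete_archives(
    ["id-one", "-id-two"],
    "my_vault",
    runner=lambda cmd, check: planned.append(cmd),
)
print(planned)
```

Leaving out the runner argument makes the function call the real CLI through subprocess.run. The dry run is a cheap way to inspect the commands first.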

PHP

from @Remiii

<?php

$file = './output.json' ;
$accountId = 'YOUR_ACCOUNT_ID' ;
$region = 'YOUR_REGION' ;
$vaultName = 'YOUR_VAULT_NAME' ;

$string = file_get_contents ( $file ) ;
$json = json_decode($string, true ) ;
foreach ( $json [ 'ArchiveList' ] as $jsonArchives )
{
    echo 'Delete Archive: ' . $jsonArchives [ 'ArchiveId' ] . "\n" ;
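    $output = array ( ) ; // exec() appends to $output, so reset it for each archive
    exec ( 'aws glacier delete-archive --archive-id="' . $jsonArchives [ 'ArchiveId' ] . '" --vault-name ' . $vaultName . ' --account-id ' . $accountId . ' --region ' . $region , $output ) ;
    echo implode ( "\n" , $output ) . "\n" ; // $output is an array of lines, so join it before printing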
}

Note: After you delete an archive, if you immediately download the vault inventory, it might still include the deleted archive, because Amazon Glacier prepares the vault inventory only about once a day.

See: Deleting an Archive in Amazon Glacier

Step 4 / Delete a Vault

$ aws glacier delete-vault --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION

Gist originally by @Remiii

OK, so a few years ago I closed my account and reopened it a few months ago, and guess what: Amazon still has my 3 TB in the account, and I have been billed for it for the last few months.

So I came back to this question and found that:

  • mt-aws-glacier is almost impossible to set up on the latest Ubuntu. I then went to 12.04, where awscli is not available, then to 14.04, where I got an error about my signature...
  • The Arq answer is no longer relevant in Arq 5.
  • Then I found the above gist and copied it here because it is better for the community.
  • I tried CloudBerry and it looks like it should work; I will update here in 4~10 hours.

Shereef Marzouk

Posted 2013-12-13T06:11:39.997

Reputation: 432

26

The purge-vault from this project works nicely: https://github.com/vsespb/mt-aws-glacier

Install, then run these commands (replace vault-name with the name of your vault):

mtglacier retrieve-inventory --config glacier.cfg --vault vault-name

wait for about 2 hours, and then

mtglacier download-inventory --config glacier.cfg --vault vault-name --new-journal vault-name.log
mtglacier purge-vault --config glacier.cfg --vault vault-name --journal vault-name.log

Ran Rubinstein

Posted 2013-12-13T06:11:39.997

Reputation: 361

I had to wait closer to 4 hours to be able to download-inventory – Parag – 2015-11-29T03:49:01.853

This method seems to be much faster compared to glacier-vault-remove. This method was able to remove 350GB of data in a few hours, while glacier-vault-remove was removing only about 30GB of data every 12 hours. – gbmhunter – 2016-11-07T07:30:06.990

I realize this answer is marked as the confirmed solution, but for Arq users like the original poster, Stefan Reitshamer's answer below is the best, hands down. Arq has a built-in tool for deleting Glacier Vaults. No need to mess around with mtglacier. Just read that answer, and you're done. – joewiz – 2016-11-22T04:54:36.280

Thank you very much for this, but sadly I don't have any glacier storages to test with it, so please if anyone tests it let me know to mark it the correct answer. – Shereef Marzouk – 2014-03-18T11:37:43.327

Thanks for the feedback @CamiloNova I have chosen this as best answer based on your feedback ^_^ – Shereef Marzouk – 2014-05-21T08:14:51.830

16

https://github.com/leeroybrun/glacier-vault-remove was created for this exact purpose.

To remove a vault, first install the dependencies:

$ git clone https://github.com/leeroybrun/glacier-vault-remove.git
$ cd glacier-vault-remove
$ python setup.py install

Then create a credentials file, credentials.json in the same directory:

{
  "AWSAccessKeyId": "YOURACCESSKEY",
  "AWSSecretKey":   "YOURSECRETKEY"
}
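If you would rather not type the keys into the file by hand, it can be generated. This is a small Python sketch that writes a file in the shape shown above. The function name is hypothetical, and reading the keys from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables in the usage note is an assumption about your setup, not something glacier-vault-remove requires.

```python
import json
import os
import tempfile


def write_credentials(path: str, access_key: str, secret_key: str) -> None:
    """Write a credentials.json with the two keys the script reads."""
    payload = {"AWSAccessKeyId": access_key, "AWSSecretKey": secret_key}
    # Create the file with owner-only permissions where the OS supports it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=2)
        f.write("\n")


# Demo in a temporary directory with placeholder keys, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "credentials.json")
    write_credentials(demo_path, "YOURACCESSKEY", "YOURSECRETKEY")
    with open(demo_path, encoding="utf-8") as f:
        loaded = json.load(f)

print(loaded)
```

In the cloned directory, a call such as `write_credentials("credentials.json", os.environ["AWS_ACCESS_KEY_ID"], os.environ["AWS_SECRET_ACCESS_KEY"])` would produce the real file, assuming those variables are set.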

Then run the script like this

$ python removeVault.py REGION-NAME VAULT-NAME

Example :

$ python removeVault.py us-east-1 my_vault

onionjake

Posted 2013-12-13T06:11:39.997

Reputation: 323

This script is much slower than mt-aws-glacier at the current time – Dan Poltawski – 2015-09-22T12:18:48.120

Also, it eats a lot of RAM. I’m trying to delete roughly 120.000 archives—at 1142 of 125413 it already uses more than 1 GB of memory (and it’s increasing with each archive). – aaronk6 – 2017-01-06T09:02:15.600

7

If you remove a Glacier-backed folder in Arq it goes into Arq's trash. If you select it in Arq's trash and click "Delete Permanently", Arq will delete all the Glacier archives and attempt to delete the Glacier vault. The vault delete might fail because Amazon has to update its "inventory", which it does once/day. The next day, browse under "Other Backup Sets" in Arq, find that vault, select it and click "Delete" to delete it.

If you have a vault that's not associated with any Arq backups, pick "Legacy Glacier Vaults" from Arq's menu, select the vault, and click the button to delete.

Stefan Reitshamer

Posted 2013-12-13T06:11:39.997

Reputation: 71

Thanks, Stefan! I struggled for days to figure out how to delete my Arq vaults—failing to install mtglacier on my Mac, creating a dropcloud ubuntu instance to run mtglacier—and this whole time, the solution was right there in Arq. – joewiz – 2016-11-22T04:57:47.040

5

You can use a freeware product like CloudBerry Explorer http://www.cloudberrylab.com/free

Note: Glacier data doesn't become available immediately. You need to wait 24 hours for the global inventory to run on the Amazon side, then click the Get Inventory button and wait another 5 hours to get the inventory for your account.

Thanks

Marc Jacobsohn

Posted 2013-12-13T06:11:39.997

Reputation: 61

I had nothing but Glacier on that account, so I just deleted my AWS account. I will mark this as the correct answer, since I think it would have worked if I had tried it. – Shereef Marzouk – 2013-12-16T11:10:45.250

Not really a good answer because this product doesn't run on OSX. – user3353 – 2014-04-28T01:08:19.613

1

I know this question was answered a while ago, but I think this might help some people, since deleting Glacier data is still extremely cumbersome.

I didn't see this suggested anywhere... but if you're only using AWS for Glacier (which I assume must be the case for many), you might consider simply closing your AWS account! That's what I did after days of mind-bendingly ineffective tries at deleting the data with various tools.

When you close your account, Amazon deletes your data (supposedly; they should eventually reclaim the disk space at least) and you get a final receipt for the month in progress. Goodbye Amazon!

Form

Posted 2013-12-13T06:11:39.997

Reputation: 198

I was using many other Amazon services and didn't want to lose them, and I guess many people use Amazon for buying stuff, but it's good to have this written somewhere for people who never used Amazon for anything else. – Shereef Marzouk – 2016-08-22T22:59:28.263

@ShereefMarzouk Well, when you close your account in the AWS control panel, it's actually your AWS account you're closing, not your Amazon account that you're using to make purchases. So you'll still be able to use the other Amazon services (as long as they're not part of AWS) as usual. – Form – 2016-08-24T00:23:12.400

-1

On Mac you could try the ForkLift app (free for evaluation), which can connect to Amazon S3.

Marius

Posted 2013-12-13T06:11:39.997

Reputation: 891

I connected to Amazon S3 but it doesn't show me anything. Do I have to specify a server other than s3.amazonaws.com to access glacier? – Kevin – 2014-07-25T17:55:25.170

Sorry it was a while ago for me now... I can't quite recall how I eventually fixed it... I think it might have been via these command-line tools listed in one of these other posts. – Marius – 2014-09-21T17:50:15.883

Glacier is not S3. They're both part of Amazon Web Services and they're both used to store files, but they have different use-cases, payment structures, restrictions and APIs. Because of this, S3 tools don't work with Glacier and Glacier tools don't work with S3 (though that's not to say there aren't tools out there that are both S3- and Glacier-compatible, written with distinct network handlers and app logic for each service). – Slipp D. Thompson – 2015-08-17T04:00:59.407