Using AWS Glacier

Using Glacier is surprisingly difficult, and far harder than using AWS S3. So why store stuff in Glacier? Because it’s cheap.

Here’s how much you would pay per month to store a 1TB (1000GB) file at the time of writing:

Storage         Cost per GB    Monthly Cost
EBS SSD         $0.10/GB       $100/month
EBS Snapshot    $0.05/GB       $50/month
S3              $0.023/GB      $23/month
Glacier         $0.004/GB      $4/month
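
The monthly figures are just size times the per-GB price. Here is a minimal sketch of that arithmetic, assuming the prices listed above; the helper name monthly_cost is made up for illustration.

```shell
#!/bin/sh
# monthly_cost SIZE_GB PRICE_PER_GB
# Hypothetical helper: prints the monthly storage cost in dollars.
monthly_cost() {
    awk -v gb="$1" -v price="$2" 'BEGIN { printf "%.2f\n", gb * price }'
}

monthly_cost 1000 0.023   # S3: prints 23.00
monthly_cost 1000 0.004   # Glacier: prints 4.00
```

Swap in current prices before trusting the result, since AWS changes them from time to time.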

What else do you need to know about it? Glacier retrieval is not instant. You make a request, it takes a while to get fulfilled, and then you pick it up when it’s ready. That’s what makes it different from other storage models.

Anyway, here are the steps:

1) In the AWS Console, go to the Glacier service and create a vault (e.g. my-vault)

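2) Make sure you have a user that has permissions on the vault. Here’s the shape of a sample policy; list the Glacier actions you need under Action and your vault’s ARN under Resource:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Stmt1376667184000",
                "Effect": "Allow",
                "Action": [
                    ...
                ],
                "Resource": [
                    ...
                ]
            }
        ]
    }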
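
For a concrete starting point, here is one way such a policy could be filled in. This is a sketch, not the policy from the original listing. It assumes the four Glacier actions used by the commands in this post, and the region (us-east-1), file name, user name, and policy name are all placeholders to replace with your own.

```shell
#!/bin/sh
# Write a hypothetical policy document for the vault used in this post.
# Assumptions: the action list, the us-east-1 region, and the file name.
cat > glacier-policy.json <<'EOF'
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glacier:UploadArchive",
                "glacier:InitiateJob",
                "glacier:DescribeJob",
                "glacier:GetJobOutput"
            ],
            "Resource": [
                "arn:aws:glacier:us-east-1:112233445566:vaults/my-vault"
            ]
        }
    ]
}
EOF

# To attach it to a user (user and policy names are placeholders):
#   aws iam put-user-policy --user-name my-user \
#       --policy-name glacier-my-vault \
#       --policy-document file://glacier-policy.json
```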

3) On your machine, make sure the AWS CLI (awscli) is installed and configured with that user’s credentials
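
A quick way to check is sketched below. The helper name check_awscli is made up. It only confirms that the aws command is on your PATH, not that your credentials work.

```shell
#!/bin/sh
# check_awscli: hypothetical helper that reports whether the aws
# command is available, and prints its version if so.
check_awscli() {
    if command -v aws >/dev/null 2>&1; then
        aws --version
    else
        echo "aws CLI not found; install awscli first" >&2
        return 1
    fi
}

# Usage (list-vaults also confirms that your credentials work):
#   check_awscli && aws glacier list-vaults --account-id -
```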

4) Upload your files to the vault using the awscli

aws glacier upload-archive --vault-name my-vault --account-id - --body [MY-FILE]

Output will look like this:

    {
        "checksum": "e5d002bf40...",
        "location": "/112233445566/vaults/my-vault/archives/KYKdL...",
        "archiveId": "KYKdL..."
    }
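
Hang on to that archiveId, because getting a listing out of Glacier later is slow. Below is a sketch of one way to record it at upload time. The helper name upload_archive and the log file glacier-archives.tsv are made up for illustration.

```shell
#!/bin/sh
# upload_archive FILE [VAULT]
# Hypothetical helper: uploads FILE, appends "archiveId<TAB>file" to a
# local log so the ID is not lost, and prints the archive ID.
# Note: upload-archive sends the file in a single request; very large
# files need Glacier's multipart upload commands instead.
upload_archive() {
    file=$1
    vault=${2:-my-vault}
    archive_id=$(aws glacier upload-archive \
        --vault-name "$vault" --account-id - \
        --archive-description "$file" \
        --body "$file" \
        --query archiveId --output text) || return 1
    printf '%s\t%s\n' "$archive_id" "$file" >> glacier-archives.tsv
    printf '%s\n' "$archive_id"
}

# Usage:
#   upload_archive backup.tar.gz my-vault
```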

5) To download, you have to make a retrieval request. Depending on the retrieval tier you choose, it could take minutes to hours before your request is fulfilled.

First, create a request.json file like the following:

{
  "Type": "archive-retrieval",
  "ArchiveId": "KYKdL...",
  "Description": "Retrieve archive on 2015-07-17",
  "Tier": "Standard"
}

Type defines the type of job. In this case, you want “archive-retrieval” to retrieve the archived file.

ArchiveId is the archiveId returned in the output when you uploaded the file.

Tier determines how quickly your request is fulfilled. There are several choices that vary in speed and price.
For the most up-to-date tier pricing and speeds, check the Data Retrievals section of the Glacier FAQ.
Here are the pricing and speeds as of this writing:

Tier                  Price                         Fulfillment
Standard (default)    $0.01/GB + $0.05/retrieval    3 – 5 hours
Bulk                  $0.01/GB + $0.05/retrieval    5 – 12 hours
Expedited             $0.03/GB + $0.01/retrieval    1 – 5 minutes

SNSTopic is optional. It lets you receive a notification when the request is done.
You can create SNS topics in the Simple Notification Service section of the AWS Console and then copy the topic ARN here.
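
If you would rather not hand-edit the file for every retrieval, here is a sketch that generates request.json from the fields described above. The helper name write_request is made up, and it does no escaping, so it assumes the values contain no double quotes or backslashes.

```shell
#!/bin/sh
# write_request ARCHIVE_ID [TIER] [SNS_TOPIC_ARN]
# Hypothetical helper: writes request.json for an archive-retrieval job.
# TIER defaults to Standard; SNSTopic is only written when an ARN is given.
write_request() {
    archive_id=$1
    tier=${2:-Standard}
    sns_topic=${3:-}
    {
        printf '{\n'
        printf '  "Type": "archive-retrieval",\n'
        printf '  "ArchiveId": "%s",\n' "$archive_id"
        printf '  "Description": "Retrieve archive on %s",\n' "$(date +%Y-%m-%d)"
        if [ -n "$sns_topic" ]; then
            printf '  "SNSTopic": "%s",\n' "$sns_topic"
        fi
        printf '  "Tier": "%s"\n' "$tier"
        printf '}\n'
    } > request.json
}

# Example: request the archive from the upload step using the Bulk tier.
write_request "KYKdL..." Bulk
cat request.json
```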

Next, run this command to initiate the job request

aws glacier initiate-job --vault-name my-vault --account-id - --job-parameters file://request.json

The output includes a jobId, which you’ll need for the next steps.
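
To avoid copying the ID by hand, you can have the CLI print only that field. Here is a minimal sketch; the helper name start_retrieval is made up, and it assumes request.json is in the current directory.

```shell
#!/bin/sh
# start_retrieval [VAULT]
# Hypothetical helper: initiates the job described in request.json and
# prints only the job ID (the jobId field of the command's output).
start_retrieval() {
    vault=${1:-my-vault}
    aws glacier initiate-job \
        --vault-name "$vault" --account-id - \
        --job-parameters file://request.json \
        --query jobId --output text
}

# Usage:
#   job_id=$(start_retrieval my-vault)
#   echo "$job_id"
```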

6) You can also check the status of the job like this:

aws glacier describe-job --vault-name my-vault --account-id - --job-id WI6sdXS...
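
Since jobs can take hours, you may want to poll instead of re-running that by hand. The sketch below assumes the StatusCode field of the describe-job output reads InProgress, Succeeded, or Failed. The helper name wait_for_job and the 15-minute default interval are my own choices.

```shell
#!/bin/sh
# wait_for_job JOB_ID [VAULT] [INTERVAL_SECONDS]
# Hypothetical helper: polls describe-job until the job succeeds
# (returns 0) or fails (returns 1).
wait_for_job() {
    job_id=$1
    vault=${2:-my-vault}
    interval=${3:-900}
    while :; do
        status=$(aws glacier describe-job \
            --vault-name "$vault" --account-id - \
            --job-id "$job_id" \
            --query StatusCode --output text) || return 1
        case $status in
            Succeeded) return 0 ;;
            Failed)    echo "job $job_id failed" >&2; return 1 ;;
        esac
        sleep "$interval"
    done
}

# Usage:
#   wait_for_job WI6sdXS... my-vault && echo "ready to download"
```

If you set SNSTopic in the request, the notification makes polling unnecessary.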

7) Once the job is complete, you can get the output of the job

aws glacier get-job-output --vault-name my-vault --account-id - --job-id WI6sdXS... [MY-OUTPUT-FILE]

4.5) Yes, 4.5 because this should have gone between steps 4 and 5. You can list the files in your vault like so:

aws glacier initiate-job --account-id - --vault-name my-vault --job-parameters '{"Type": "inventory-retrieval"}'

So why didn’t I just tell you about this earlier? Well, because you don’t get back a list of archives after calling this. Notice you “initiate-job” again. That means you have to wait for the job to complete (step 6) and then get the output of the job (step 7). So you have to learn steps 5 through 7 before you can do step 4.5.
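
Once you have downloaded the inventory job’s output, it is a JSON document whose ArchiveList entries each carry an ArchiveId. Here is a crude sketch for pulling out the IDs with grep and cut. The helper name list_archive_ids and the sample file are made up, and a real JSON parser is the safer choice if you have one.

```shell
#!/bin/sh
# list_archive_ids INVENTORY_FILE
# Hypothetical helper: prints one ArchiveId per line. This is plain
# pattern matching, not JSON parsing, so treat it as a quick look only.
list_archive_ids() {
    grep -o '"ArchiveId" *: *"[^"]*"' "$1" | cut -d'"' -f4
}

# Made-up sample standing in for real inventory output.
cat > sample-inventory.json <<'EOF'
{"VaultARN":"arn:aws:glacier:us-east-1:112233445566:vaults/my-vault","ArchiveList":[{"ArchiveId":"KYKdL-example-1","ArchiveDescription":"first","Size":1024},{"ArchiveId":"KYKdL-example-2","ArchiveDescription":"second","Size":2048}]}
EOF

list_archive_ids sample-inventory.json
```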

Clear as mud? Good.
