AWS upload folder to S3 as tar.gz without compressing locally

10

2

In AWS CLI, how do I upload a folder as a tar.gz file without creating a tar.gz locally?

For example, I have a folder at /var/test and I want to upload it to /tests/test1.tar.gz

How do I do that without turning it into a tar.gz locally? (I want to save local space, as I don't have much space on my HDD.)

Michael Samsung

Posted 2017-09-07T15:00:34.450

Reputation: 101

Answers

13

What you're really trying to avoid is saving a local file. You can use pipes to send the data from tar through gzip to S3 without writing anything to disk.

tar c /var/test | gzip | aws s3 cp - "s3://tests/test1.tar.gz"

Breaking this down (where stdin and stdout refer to the standard input/output streams via the pipeline):

  • tar c /var/test creates a tar archive out of /var/test and outputs it to stdout...
  • ...which is read by gzip from stdin, and the gzipped file (.tar.gz) is output to stdout...
  • ...which is read by aws s3 cp - "s3://tests/test1.tar.gz" from stdin and sent to S3. The - tells the AWS CLI to copy from stdin.

This still performs the gzip operation locally, but does not require the creation of a temporary file, since the entire stream is sent straight over the network.
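
As a rough sketch, the same pipeline can be wrapped in a small script. The function name stream_tgz is made up for illustration, and the pipefail guard is an addition that only takes effect in shells that support it (such as bash). The sketch passes -f - to tar explicitly, because the default archive target when -f is omitted varies between tar builds.

```shell
# Sketch only: stream_tgz is a hypothetical helper name, not an AWS CLI feature.

# Where the shell supports it, make a failure in tar or gzip fail the whole pipeline.
(set -o pipefail) 2>/dev/null && set -o pipefail

# Write a gzipped tar archive of directory $1 to stdout; nothing is saved to disk.
stream_tgz() {
    tar cf - "$1" | gzip
}

# Usage (bucket and key taken from the example above; requires the AWS CLI):
#   stream_tgz /var/test | aws s3 cp - "s3://tests/test1.tar.gz"
```

Keeping the upload outside the function means the stream can be checked locally first, for example with stream_tgz /var/test | tar tzf - to list what would be sent.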

Bob

Posted 2017-09-07T15:00:34.450

Reputation: 51 526

Bob, this answer looks like it's correct for SSHing files to other servers, but doesn't seem to address the question of how to upload to S3. It's probably a reasonably simple extension for someone who understands the S3 command line tools to apply this technique. – Tim – 2017-09-08T00:59:18.707

@Tim ...somehow, I completely missed that. I'll update. – Bob – 2017-09-08T01:04:42.670

@Tim Fixed. Probably only looked at the AWS bit and assumed EC2 while half asleep last night. – Bob – 2018-09-08T01:11:02.640

A few questions about this solution:

  • will it work with directories too?
  • will the entire contents of the files be loaded in memory? Doesn't this give problems with large files?
  • is there any way to see progress?
  • < – murze – 2018-06-10T09:51:40.677

@murze (1) of course, that's the whole point of packaging, (2) no, (3) no, (4) no. – Ekevoo – 2018-09-12T17:50:09.733

Will tar -cz /var/test | aws s3 cp - "s3://tests/test1.tar.gz" also work? I'm passing in -z to gzip during the tar command rather than piping to it. – Keven – 2018-10-29T15:53:48.660

@Kevin I don't see why not. – Bob – 2018-10-29T22:30:27.967

You can get an estimate of progress using pv. If you're transferring a directory, you'll need to estimate its size and pass that to pv with -s; otherwise you'll only see the transfer rate and the total transferred so far. – Attie – 2018-12-17T18:33:26.830
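
A minimal sketch of the pv approach, assuming pv is installed. The function name stream_with_progress is hypothetical, and the size comes from du, which reports disk usage rather than exact archive size, so the percentage is only approximate.

```shell
# Sketch only: assumes pv is installed; stream_with_progress is a hypothetical name.
stream_with_progress() {
    dir=$1
    # du -sk reports usage in 1024-byte blocks; convert to bytes for pv -s.
    kb=$(du -sk "$dir" | cut -f1)
    size=$((kb * 1024))
    # pv sits before gzip, so the estimate is compared against uncompressed bytes.
    # pv draws its progress display on stderr and passes the data through on stdout.
    tar cf - "$dir" | pv -s "$size" | gzip
}

# Usage (requires the AWS CLI):
#   stream_with_progress /var/test | aws s3 cp - "s3://tests/test1.tar.gz"
```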

If you see a message like [Errno 2] No such file or directory: /path/to/your/dir/- it means your version of the AWS CLI doesn't support reading content from stdin, and you need to upgrade it. This happens with the stock apt version of awscli on Ubuntu 14.04. The AWS CLI bundled version works well on older systems.

– Dale Anderson – 2019-06-25T18:52:16.510

3

tar cvfz - /var/test | aws s3 cp - s3://tests/test1.tar.gz

You don't have to gzip separately; tar does that for you with the z flag.

This works in both directions; I use it almost daily.
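
For the download direction, a sketch along the same lines: aws s3 cp writes the object to stdout when the destination is -, and tar unpacks it from stdin. The helper name extract_tgz_stream and the destination /var/restore are placeholders chosen for illustration.

```shell
# Sketch only: extract_tgz_stream is a hypothetical helper name.
# Read a gzipped tar archive from stdin and unpack it under directory $1.
extract_tgz_stream() {
    mkdir -p "$1" && tar xzf - -C "$1"
}

# Usage (requires the AWS CLI; /var/restore is a placeholder destination):
#   aws s3 cp "s3://tests/test1.tar.gz" - | extract_tgz_stream /var/restore
```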

Robv

Posted 2017-09-07T15:00:34.450

Reputation: 31