How can I remove the leading path from a tar and retar it in memory?

7

1

I was thinking I could do something like this:

wget -O- http://example.com/funky.tar.gz | \
  tar --strip-components 1 -Ox | tar -cf fixed.tar.gz

to remove the leading path from all items in the downloaded tar, but it appears that there is no way to create a tar from stdin. Please prove me wrong.

Michael Hale

Posted 2010-03-15T05:32:14.837

Reputation: 171

Answers

2

You most certainly can create a tar from stdin. Use - as the source, and pipe whatever you want to tar into it.

http://ss64.com/bash/tar.html

http://www.google.com/search?q=tar+stdin

Alex

Posted 2010-03-15T05:32:14.837

Reputation: 2 094

I think I've gotten a bit further. I tried testing with this intermediate command:

cat cookbooks.tar.gz | tar -xO --strip-components 1 | tar -cf recipe.tar @-

But I get this error: tar: Error reading archive (null): Unrecognized archive format: Inappropriate file type or format tar: Error exit delayed from previous errors. – Michael Hale – 2010-03-15T06:01:00.323

You don't use @-, rather you use - as the input, as in tar -cf recipe.tar - – Alex – 2010-03-15T06:22:33.593

Thanks for the link. I'm still stuck after reading it though. Perhaps an example? I tried tar -Oxf cookbooks.tar.gz | tar -cf recipe.tar - but I get this error: "tar: no files or directories specified". When I look at the output of tar -Oxf cookbooks.tar.gz it does not have file and directory names, just the content of the files. – Michael Hale – 2010-03-15T15:07:29.570

@Michael: What you want is basically a tarpipe, minus the SSH: http://www.google.com/search?q=tar+pipe+ssh

– Alex – 2010-03-15T23:57:08.157

2

The Python tarfile module supports both stream reading and writing. You can take the result of TarFile.extractfile() from one tar file and feed it right into TarFile.addfile() of a second file. Obviously this would require a bit of programming, but it would do as you ask.
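As a rough illustration of that idea, here is a minimal sketch using only the standard-library tarfile module. The function name strip_components and its count parameter are made up for this example, and it assumes member names use plain "/" separators with no leading "./". It reads a (possibly compressed) tar stream, drops the first path component of each member, and writes a gzip-compressed tar stream, without unpacking anything to disk.

```python
import tarfile


def strip_components(src, dst, count=1):
    """Copy the tar stream `src` to a gzip-compressed tar stream `dst`,
    dropping the first `count` path components of every member name.

    `src` and `dst` are binary file objects (hypothetical helper, not part
    of the tarfile module itself).
    """
    # "r|*" reads a non-seekable stream with transparent decompression;
    # "w|gz" writes a gzip-compressed stream.
    with tarfile.open(fileobj=src, mode="r|*") as tin, \
         tarfile.open(fileobj=dst, mode="w|gz") as tout:
        for member in tin:
            parts = member.name.split("/")[count:]
            if not parts:
                # The entry is one of the stripped leading directories.
                continue
            member.name = "/".join(parts)
            # A pax "path"/"linkpath" record from the source archive would
            # take priority over the edited name on output, so drop them.
            member.pax_headers.pop("path", None)
            member.pax_headers.pop("linkpath", None)
            if member.islnk():
                # Hard-link targets are paths inside the archive as well.
                member.linkname = "/".join(member.linkname.split("/")[count:])
            # Only regular files carry data; everything else is header-only.
            data = tin.extractfile(member) if member.isreg() else None
            tout.addfile(member, data)
```

Calling strip_components(sys.stdin.buffer, sys.stdout.buffer) from a small script should let it sit in the middle of a pipeline, for example between wget -O- and a redirect to fixed.tar.gz. Sparse files and other unusual member types are not handled in this sketch.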

Ignacio Vazquez-Abrams

Posted 2010-03-15T05:32:14.837

Reputation: 100 516

0

It seems to me that one might describe what you want as a “stream editor” for tar files that allows you to apply the --strip-components pathname translation. The idea is to take a tar file as input and write a modified tar file as output.

None of the tar implementations whose documentation I checked (GNU tar, star, bsdtar) seems to support your exact operation.

bsdtar is interesting though. Its @archive syntax seems like it would come close to letting you read a tarfile and write a modified one, but the manpage entry for --strip-components says that it only works in x and t modes. If it did work in c mode, you could use something like this:

wget -O - <url> | bsdtar -c --strip-components 1 -zf new.tar.gz @-

I do not have bsdtar on my machine, but looking at the code, I am sure that invocation will just cause an “Option --strip-components is not permitted in mode -c” error. (bsdtar is tar in FreeBSD and in Mac OS X 10.6; older Mac OS X releases use GNU tar. It is also available as bsdtar on some Linux distributions: Debian GNU/Linux, Ubuntu, and some RPM-based distributions.)

If you want this, you will probably have to roll your own program (or get someone to do it for you). Fortunately, this may not be as hard as it sounds. bsdtar is based on the very nice libarchive library. It looks like it would be fairly straightforward to make a program that does what you want. Since bsdtar already has most of the code you would need to copy one archive to another (through its @archive handling), you could probably even do it by adding some functionality to bsdtar. A simple “damn the architecture, just get it done” approach might be to enable --strip-components for c mode and add a call to edit_pathname inside append_archive. The problem with this approach is that all the edit_pathname transformations would be applied both to entries from @archives and to the pathnames of actual files specified through other means (command line args, -T pathname lists, etc.). This behavior may or may not be in the best interest of official bsdtar (there is probably some reason --strip-components is not already enabled for c mode).

Chris Johnsen

Posted 2010-03-15T05:32:14.837

Reputation: 31 786