piping curl to tar crashes, while wget to tar is ok

0

When I pipe curl to tar for large files I occasionally encounter errors:

jeremyr@w6:~/pelias_metal/whosonfirst$ curl -s https://dist.whosonfirst.org/bundles/whosonfirst-data-postalcode-jp-latest.tar.bz2 | tar -xj --strip-components=1 --exclude=README.txt -C /mnt/data_science/pelias_w6/data/whosonfirst 

bzip2: Compressed file ends unexpectedly;
perhaps it is corrupted?  *Possible* reason follows.
bzip2: Inappropriate ioctl for device
        Input file = (stdin), output file = (stdout)

However, this seems to be ok:

jeremyr@w6:~/pelias_metal/whosonfirst$ wget -q -O - https://dist.whosonfirst.org/bundles/whosonfirst-data-postalcode-jp-latest.tar.bz2|tar -xj --strip-components=1 --exclude=README.txt -C /mnt/data_science/pelias_w6/data/whosonfirst
jeremyr@w6:~/pelias_metal/whosonfirst$

Does someone have a handle on why this might be?

jeremy_rutman

Posted 2019-04-12T16:07:25.387

Reputation: 101

It's giving you an error message, so technically it does not crash. – davidbaumann – 2019-04-12T16:17:36.560

Answers

0

Wget and curl are similar; however, wget can download recursively (with -r), mirroring a directory-like structure, whereas curl cannot. This MAY be the reason why.

Curl is more versatile in that it can run on more platforms, but I think it may be the structure of the download that is giving it trouble.

whiskywoof

Posted 2019-04-12T16:07:25.387

Reputation: 9

The download is just a single .tar.bz2 file, and the weird thing is that curl | tar works most of the time, hinting that maybe the problem is a timeout. – jeremy_rutman – 2019-04-13T09:07:12.813

Well, curl is saying that it is possibly corrupted. Sure, wget downloaded it, but did you check the integrity of the download? Is it all there, and is it what you needed? – whiskywoof – 2019-04-13T13:24:05.643

I think it's the tar -xj step that is extracting the archive and complaining about the format, so if wget worked then the file is OK. But I will check in any case. – jeremy_rutman – 2019-04-13T17:33:16.197
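
One way to test the timeout idea raised in the comments is to separate the download from the extraction. "Compressed file ends unexpectedly" is what bzip2 prints when its input stops early, so if the transfer is cut off, checking curl's exit status before running tar shows which side failed. The following is only a sketch under that assumption: the function name fetch_and_extract and the retry count of 3 are made up for illustration, while the tar options are the ones from the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original thread): download an archive to a
# temporary file first, and only extract it if the download succeeded.
fetch_and_extract() {
    local url="$1" dest="$2" tmp status

    tmp="$(mktemp)" || return 1

    # --fail       exit non-zero on HTTP error responses
    # --show-error print an error message even though --silent is set
    # --location   follow redirects
    # --retry 3    retry transient failures (the count is an arbitrary choice)
    curl --fail --silent --show-error --location --retry 3 --output "$tmp" "$url"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "download failed (curl exit code $status)" >&2
        rm -f "$tmp"
        return "$status"
    fi

    # Same tar options as in the question, reading from the file instead of stdin.
    tar -xjf "$tmp" --strip-components=1 --exclude=README.txt -C "$dest"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

It would be called with the URL and target directory from the question, for example: fetch_and_extract https://dist.whosonfirst.org/bundles/whosonfirst-data-postalcode-jp-latest.tar.bz2 /mnt/data_science/pelias_w6/data/whosonfirst. If the one-line pipe is preferred, bash's "${PIPESTATUS[0]}" read immediately after the pipeline reports curl's exit code and answers the same question.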