2
1
I'm downloading a large file that's not an archive, and I want to combine the tasks of downloading and decompressing.
How can I do them simultaneously?
2
This:
wget -O - -o /dev/null http://download.freebase.com/datadumps/latest/freebase-simple-topic-dump.tsv.bz2 | bunzip2 > freebase-simple-topic-dump.tsv
Here bunzip2 stands for the decompression command for your compression format of choice. It must accept piped input, and the download must be a single compressed file, not an archive.
wget writes the download to standard output (-O -) and sends its own log to /dev/null (-o /dev/null). The pipe feeds that data to the decompressor as it arrives, and the result is redirected to the specified filename.
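To make "your compression format of choice" concrete, here is a minimal sketch that picks the decompressor from the file extension and wraps the same wget pipeline. The function names decompressor_for and fetch_decompressed are made up for illustration, and it assumes bunzip2, gunzip and unxz are installed; all three read standard input when given no file argument.

```shell
#!/bin/sh
# Hypothetical helper: choose a stream-capable decompressor from the extension.
decompressor_for() {
    case "$1" in
        *.bz2) echo "bunzip2" ;;
        *.gz)  echo "gunzip" ;;
        *.xz)  echo "unxz" ;;
        *)     echo "cat" ;;   # unknown extension: pass the data through unchanged
    esac
}

# Hypothetical wrapper: download URL and decompress it on the fly into OUT.
# Usage: fetch_decompressed URL OUT
fetch_decompressed() {
    url=$1
    out=$2
    tool=$(decompressor_for "$url")
    # -O - sends the body to stdout; -o /dev/null discards wget's log.
    wget -O - -o /dev/null "$url" | "$tool" > "$out"
}
```

The extension match is deliberately naive: a URL with a query string after the filename would fall through to cat.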
4
The question is tagged with curl but the answer only uses wget.
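With curl it's a little easier than wget, because --compressed makes it request a compressed response and decode it without piping (URL truncated for clarity). Note that this covers HTTP content encoding such as gzip; a file that is itself stored as .bz2 is delivered as-is and still has to be piped through bunzip2.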
curl --compressed http://freebase.com/topic.bz2
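For a file that is itself stored as .bz2, the pipe approach works with curl as well. A minimal sketch, assuming curl and bunzip2 are installed; the function name curl_bunzip2 is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: stream URL through bunzip2 into OUT using curl.
# Usage: curl_bunzip2 URL OUT
curl_bunzip2() {
    # -f: fail on HTTP errors instead of piping an error page into bunzip2
    # -s: no progress meter, -S: still print errors, -L: follow redirects
    curl -fsSL "$1" | bunzip2 > "$2"
}
```

curl writes the body to standard output by default, so no equivalent of wget's -O - is needed.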
Looks like it does the trick. Great answer @user23337 – Max – 2013-09-30T15:50:13.610
I agree with @Alec, as it is also faster – Karussell – 2015-11-03T16:57:50.800
Technically, it doesn't do them simultaneously; it just performs the download via wget first and then pipes the result as a whole into bunzip2. If you attempted to unzip a file that wasn't completely written, you'd get an error indicating that the end of file was reached too soon. – MaQleod – 2012-10-07T04:20:52.167
@MaQleod Sorry, I'm pretty sure that's incorrect. Try running it without forwarding the output to a file; it starts printing straight away. – Max – 2012-10-07T04:32:24.960