How can I use wget to download files from a URL where they are stored with bz2 compression? I only want to fetch the files that are new or updated relative to my destination, where the files are stored uncompressed.
Typically I would use the following command:
wget http://sourcedirectory/ --mirror
As you can imagine, since the source files are compressed, and the destination files are uncompressed, this approach downloads the entire set of files instead of just the new and updated files.
A similar but different question is here: http://stackoverflow.com/questions/4944295/wget-skip-if-files-exist
My case adds a level of complexity: a file should be skipped if it already exists at the destination, even when its compression state differs.
My current guess is that I'll need to create a list of files at the source, another list of files at the destination, and amend those lists to look for differences while ignoring compression states. I don't really know how to do that, though, except in theory. – bsuttonq – 2015-03-25T19:48:46.197
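The list comparison described in this comment could be sketched roughly as follows. This is only an illustration, not a tested solution: list_to_fetch is a hypothetical helper name, and it assumes both file lists already exist as plain text files with one name per line, and that every remote archive ends in .bz2.

```shell
#!/bin/sh
# Sketch only: list_to_fetch is a hypothetical helper, not part of wget.
# Usage: list_to_fetch REMOTE_LIST LOCAL_LIST
#   REMOTE_LIST: text file, one remote name per line (e.g. data1.csv.bz2)
#   LOCAL_LIST:  text file, one local name per line  (e.g. data1.csv)
# Prints the remote names whose uncompressed counterpart is missing locally.
list_to_fetch() {
    remote_list=$1
    local_list=$2
    work=$(mktemp -d) || return 1

    # Keep only names ending in .bz2 and drop that suffix,
    # so both lists use the same naming.
    sed -n 's/\.bz2$//p' "$remote_list" | LC_ALL=C sort -u > "$work/remote"
    LC_ALL=C sort -u "$local_list" > "$work/local"

    # comm -23 keeps lines found only in the first (remote) list;
    # the suffix is then restored so the names match the server again.
    LC_ALL=C comm -23 "$work/remote" "$work/local" | sed 's/$/.bz2/'

    rm -rf "$work"
}
```

Each printed name could then be prefixed with the source URL and passed to wget -i, followed by bunzip2 on the downloaded files. Note that this only detects new files; updated files need a timestamp or size check on top.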
Do the archives contain a single file each or many? If they're single files, you could compare the timestamps of the remote foo.gz and the local foo. – terdon – 2015-03-26T01:17:18.880
Each archive will contain only a single file. Thanks for the suggestion, I'll explore comparing foo.gz and local foo. – bsuttonq – 2015-04-25T00:43:27.907
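A minimal sketch of the timestamp comparison suggested above might look like this. remote_is_newer is a hypothetical helper name, and the sketch assumes GNU date and GNU stat, that the server sends a Last-Modified header, and that the local uncompressed file kept the modification time of the archive it came from.

```shell
#!/bin/sh
# Sketch only: remote_is_newer is a hypothetical helper; assumes GNU date and stat.
# Usage: remote_is_newer LAST_MODIFIED LOCAL_FILE
#   LAST_MODIFIED: value of the HTTP Last-Modified header of the remote archive
#   LOCAL_FILE:    path to the uncompressed local copy
# Succeeds (exit status 0) when the archive should be downloaded.
remote_is_newer() {
    last_modified=$1
    local_file=$2

    # No local copy yet: the file is new and must be fetched.
    [ -e "$local_file" ] || return 0

    # Convert both timestamps to seconds since the epoch.
    remote_epoch=$(date -d "$last_modified" +%s) || return 2
    local_epoch=$(stat -c %Y "$local_file") || return 2

    # Download only if the remote archive is strictly newer.
    [ "$remote_epoch" -gt "$local_epoch" ]
}
```

The header value could be obtained with something like wget --spider --server-response on the archive URL, taking the Last-Modified line from the output. Whether the local timestamps are reliable depends on how the files were downloaded and decompressed, so this is worth checking on a few files first.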