wget only new or updated files when the source files are compressed but the destination files are uncompressed


I would like to know how to use wget to download files from a URL where they are stored with bz2 compression. However, I only want to download the files that are new or updated relative to my destination, where the files are stored uncompressed.

Typically I would use the following command:

wget http://sourcedirectory/ --mirror

As you can imagine, since the source files are compressed, and the destination files are uncompressed, this approach downloads the entire set of files instead of just the new and updated files.


Posted 2015-03-25T18:56:46.530


A similar but different question can be seen here: http://stackoverflow.com/questions/4944295/wget-skip-if-files-exist

I am adding a level of complexity: I want to skip files that already exist even when the local copy has a different compression state.

– bsuttonq – 2015-03-25T19:44:51.120

My current guess is that I'll need to create a list of files at the source, another list of files at the destination, and then compare those lists for differences while ignoring compression state. I don't really know how to do that in practice, though, only in theory. – bsuttonq – 2015-03-25T19:48:46.197
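The list comparison described in that comment might look roughly like the sketch below. It is an illustration only, not a tested solution for the real server: the function name missing_remote_files is hypothetical, and it assumes the remote file names have already been collected, one per line, into a text file by some other means.

```shell
#!/bin/sh
# Hypothetical helper: print each remote name whose uncompressed
# counterpart does not exist in the local directory.
#   $1: text file listing remote names, one per line (e.g. foo.txt.bz2)
#   $2: local directory holding the uncompressed files (e.g. foo.txt)
missing_remote_files() {
    list_file=$1
    local_dir=$2
    while IFS= read -r name; do
        # Skip blank lines in the list.
        [ -n "$name" ] || continue
        # Strip a trailing .bz2 so the compression state is ignored.
        base=${name%.bz2}
        # Report the remote name only if no local copy exists.
        [ -e "$local_dir/$base" ] || printf '%s\n' "$name"
    done < "$list_file"
}
```

Each name it prints could then be fetched with wget and unpacked with bunzip2. This only detects new files; detecting updated files would still need a timestamp or size comparison.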

Do the archives contain a single file each or many? If they're single files, you could compare the timestamps of the remote foo.gz and the local foo. – terdon – 2015-03-26T01:17:18.880

Each archive will contain only a single file. Thanks for the suggestion, I'll explore comparing foo.gz and local foo. – bsuttonq – 2015-04-25T00:43:27.907
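A minimal sketch of the timestamp comparison suggested above, under several assumptions: GNU stat and GNU date are available, the server sends a Last-Modified header, and the local file's modification time reflects the remote archive's (wget and bunzip2 normally preserve it). The function names remote_mtime and is_remote_newer are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the remote Last-Modified time as epoch seconds.
# wget writes the response headers to stderr, hence the 2>&1.
remote_mtime() {
    stamp=$(wget --spider --server-response "$1" 2>&1 |
        tr -d '\r' |
        awk 'tolower($1) == "last-modified:" { $1 = ""; print }' |
        tail -n 1)
    # GNU date parses the HTTP date string (assumption: GNU coreutils).
    date -d "$stamp" +%s
}

# Hypothetical helper: succeed (exit 0) when a download is needed.
#   $1: remote modification time in epoch seconds
#   $2: path of the local uncompressed file
is_remote_newer() {
    remote_epoch=$1
    local_file=$2
    # A missing local file always needs downloading.
    [ -e "$local_file" ] || return 0
    # stat -c %Y is the GNU form; BSD stat uses different options.
    local_epoch=$(stat -c %Y "$local_file")
    [ "$remote_epoch" -gt "$local_epoch" ]
}
```

For a remote foo.bz2 and a local foo, the check would be something like is_remote_newer "$(remote_mtime http://sourcedirectory/foo.bz2)" foo, followed by wget and bunzip2 only when it succeeds.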

No answers