How can I download a file the moment it becomes available on Linux?

How can I set up a Linux machine so that it downloads a file from a web server the moment the server returns something other than a 404 error for it? I would like it to stop checking as soon as the download has started successfully, but to keep retrying if it fails.

I have tried using a cron job with wget, but I can't seem to figure out how to make it not create an empty file on a 404 error.

Is this possible, and if so, how?

geek1011

Posted 2016-04-20T23:23:35.513

Reputation: 1 180

Answers

You can either check (with test -s) that the file has contents, or simply use the exit status of wget, which is non-zero when the request fails. Download to a temporary file, and only if the test passes, move it to the real output:

$ wget -q -O /tmp/a http://localhost/nonexistent && mv -v /tmp/a /tmp/b
$ wget -q -O /tmp/a http://localhost && mv -v /tmp/a /tmp/b
‘/tmp/a’ -> ‘/tmp/b’
$
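
To cover the "keep retrying until it works" part of the question, the same idea can be wrapped in a loop. This is only a rough sketch: the function names fetch_once and poll, the 60-second default interval, and the temporary-file naming are my own choices for illustration, not anything wget provides.

```shell
#!/bin/sh
# Sketch: poll a URL until wget succeeds, then move the download into place.
# fetch_once and poll are hypothetical helper names chosen for this example.

# Try a single download of URL $1 to destination $2.
fetch_once() {
    # Temporary file next to the destination, so the final mv is a plain rename.
    tmp=$(mktemp "$2.XXXXXX") || return 1
    # wget exits non-zero on a 404 (or any other failure); -s also rejects an empty file.
    if wget -q -O "$tmp" "$1" && [ -s "$tmp" ]; then
        mv "$tmp" "$2"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Retry fetch_once every $3 seconds until it succeeds, then stop.
poll() {
    until fetch_once "$1" "$2"; do
        sleep "$3"
    done
}

# Usage: script URL DEST [INTERVAL_SECONDS]
if [ "$#" -ge 2 ]; then
    poll "$1" "$2" "${3:-60}"
fi
```

Run this way, no cron job is needed: the loop ends as soon as one download succeeds, and a failed attempt never leaves an empty file at the destination.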

Toby Speight

Posted 2016-04-20T23:23:35.513

Reputation: 4 090

I have almost solved it using a variation of your answer: wget -q -O /tmp/a http://localhost/nonexistent && (mv /tmp/a /tmp/b; echo Success) – geek1011 – 2016-04-23T15:35:20.613

You can write a bash script that fetches the URL with wget and then checks the output. If the output contains, let's say, "404", delete it. Otherwise you have your file, and you can remove the script from cron (from within the script itself). You could also use a scripting language to solve this.
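
A possible shape for such a cron-driven script is sketched below. It is written under a few assumptions: it checks wget's exit status rather than searching the output for "404", the cron line is assumed to carry a unique marker string (for example a trailing comment) so it can be found again, and try_download and remove_cron_entry are made-up names. Rewriting the crontab through a pipe is unforgiving, so keep a copy of `crontab -l` before trying it.

```shell
#!/bin/sh
# Sketch of a cron-driven download that removes its own cron entry on success.
# try_download and remove_cron_entry are hypothetical names for this example.

# Remove every crontab line that contains the marker string $1.
remove_cron_entry() {
    crontab -l | grep -v -F -e "$1" | crontab -
}

# Try one download of URL $1 to file $2. On success, drop the cron entry
# identified by marker $3. Uses wget's exit status, not the body text.
try_download() {
    if wget -q -O "$2.part" "$1" && [ -s "$2.part" ]; then
        mv "$2.part" "$2" && remove_cron_entry "$3"
    else
        # Failed or empty download: clean up and let the next cron run retry.
        rm -f "$2.part"
        return 1
    fi
}

# Usage: script URL DEST MARKER
if [ "$#" -eq 3 ]; then
    try_download "$1" "$2" "$3"
fi
```

A matching crontab line might look like `* * * * * /path/to/script URL DEST fetch-job # fetch-job`, where the path and the fetch-job marker are placeholders.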

Liran

Posted 2016-04-20T23:23:35.513

Reputation: 9

The string 404 could legitimately occur in the file; besides which, on my system, I get a completely empty file when I give wget a non-existent URI. – Toby Speight – 2016-04-21T10:03:19.527

@TobySpeight I gave a general solution. Of course a more precise parsing is needed. And you can check if a file is empty easily. – Liran – 2016-04-23T12:50:01.230