wget -o writes empty files on failure

14

1

If I run wget "no such address" -o "test.html", it first creates test.html and, if the download fails, leaves it empty. Without -o, however, it waits to see whether the download succeeds and only then writes the file.

I'd like -o to behave the same way. Is that possible?

akurtser

Posted 2010-07-22T09:00:24.653

Reputation: 635

Answers

17

wget returns a non-zero exit status when the URL is not found, so you can append a remove command on failure:

wget "url" -O file || rm -f file

Or create a temporary file and only move it where you want on success:

wget "url" -O /tmp/wget && mv /tmp/wget file

The second has the benefit of not deleting an existing file on failure, but be sure to use unique temporary names (see man mktemp, or man tempfile on older Debian systems) if you're running multiple instances in parallel.
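The temporary-file approach can be sketched with a unique name from mktemp so that parallel runs don't collide. This is only a minimal sketch, assuming a POSIX-style shell: the function name fetch_atomically, the placeholder URL, and the --tries/--timeout values are illustrative choices, not part of the answer above.

```shell
# Hypothetical helper: download URL to DEST, leaving DEST untouched on failure.
fetch_atomically() {
    url=$1
    dest=$2
    # Create a unique temp file next to DEST so the final mv stays on one filesystem.
    tmp=$(mktemp "${dest}.XXXXXX") || return 1
    if wget -q --tries=1 --timeout=10 -O "$tmp" "$url"; then
        # Success: move the finished download into place.
        mv -- "$tmp" "$dest"
    else
        # Failure: drop the empty or partial temp file and report the error.
        rm -f -- "$tmp"
        return 1
    fi
}

# Usage (the URL is a placeholder):
# fetch_atomically "http://example.com/page.html" test.html || echo "download failed" >&2
```

Note that mktemp creates the file readable only by its owner, and mv keeps those permissions, so a chmod may be needed afterwards if other users should read the result.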

Ian Mackinnon

Posted 2010-07-22T09:00:24.653

Reputation: 3 919

Also, adding --retry-connrefused can help in preventing the empty file in the first place. – akom – 2017-12-29T15:40:01.213

If this is happening in an exec in a puppet manifest, changing creates => file to unless => "[ -s file ]" can make it self-healing. – akom – 2017-12-29T15:41:42.323

13

As noted in the comments, wget -O behaves like a shell redirection: it always writes to the file, regardless of errors.

You can use curl -f instead:

curl -f http://nonexistent/file.jpg -o localfile.jpg

It will not touch the local file if there is an error fetching the file.
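In a script, the same idea might look like the sketch below, which checks curl's exit status. The function name fetch_with_curl and the timeout value are assumptions made for illustration; -f, -sS, --connect-timeout and -o are standard curl options.

```shell
# Hypothetical helper: fetch URL to DEST with curl, failing on HTTP errors.
fetch_with_curl() {
    url=$1
    dest=$2
    # -f: treat HTTP errors (4xx/5xx) as failures; -sS: quiet, but still print errors.
    if curl -f -sS --connect-timeout 10 -o "$dest" "$url"; then
        echo "saved $url to $dest"
    else
        # In the else branch, $? still holds curl's exit status.
        status=$?
        echo "download failed (curl exit status $status)" >&2
        return 1
    fi
}

# Usage (the URL is the placeholder from the answer above):
# fetch_with_curl "http://nonexistent/file.jpg" localfile.jpg
```

This covers errors that occur before any data arrives. A transfer that breaks off partway can still leave a partial file, so the temporary-file approach may be safer when that matters.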

cweiske

Posted 2010-07-22T09:00:24.653

Reputation: 1 010

4

The correct syntax is

wget "url" -O file

Notice the UPPERCASE O. The -o option tells wget to write a log file, which is why the file is always written, even on failure.

Mr Shunz

Posted 2010-07-22T09:00:24.653

Reputation: 2 037

At first I thought it was working, but then I found it wasn't. Try wget "http://host.does.not.exist" -O "emptyFile"

An error is returned, yet the emptyFile is created.

– akurtser – 2010-07-22T09:38:53.967

1

@akurtser You're right. I don't think there's a way to tell wget not to create the file. I found this thread: http://www.mail-archive.com/wget@sunsite.dk/msg08586.html in which they discuss the matter. The bottom line is that you can have MULTIPLE downloads to the same file, so it gets created because wget cannot be sure that ALL URLs will fail.

– Mr Shunz – 2010-07-22T11:41:49.677

Well, thanks. It's part of a bash script I'm writing, so I'll just try to save it to a temp file first, which will be renamed if the download succeeds. Not very elegant, but I can't think of anything better. – akurtser – 2010-07-22T12:19:30.757

1

@akurtser Surely you can check the return code from wget then... it should tell you if you can delete the file "if not found". So no need for temp/renaming. – Mr Shunz – 2010-07-22T12:42:13.427

1

The -O option is a redirection, which redirects the downloaded content to a file, even in cases when there are no contents. Therefore, a file is always created, even if the download failed. – Quan To – 2017-09-15T09:13:05.530

0

According to the help output (wget -h), you can use the --spider option to skip the download (version 1.14).

Download:
  -S,  --server-response         print server response.
       --spider                  don't download anything.
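
One way this option might be applied to the original problem is to probe the URL first and only download if the probe succeeds. This is a rough sketch under assumptions: the function name download_if_exists and the --tries/--timeout values are made up for illustration, and it costs two requests per file.

```shell
# Hypothetical helper: only start the download if a --spider probe succeeds.
download_if_exists() {
    url=$1
    dest=$2
    # --spider checks that the URL is reachable without saving anything.
    if wget -q --tries=1 --timeout=10 --spider "$url"; then
        wget -q -O "$dest" "$url"
    else
        echo "not downloading: $url is unreachable" >&2
        return 1
    fi
}

# Usage (the URL is a placeholder):
# download_if_exists "http://example.com/page.html" test.html
```

The second wget call can still fail after a successful probe and leave an empty file, so this narrows the problem rather than eliminating it.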

rocky qi

Posted 2010-07-22T09:00:24.653

Reputation: 101