Why does `wget` download index.html instead of a direct file?


I'm trying to download the file below, but wget always gets redirected to the main page and ends up saving index.html instead of the file I asked for:

http://www.tweaking.com/files/setups/tweaking.com_windows_repair_aio.zip


Does anyone know how to download it correctly? I tried --user-agent with Firefox on Linux, IE on Windows, and anything else I could think of, but it doesn't work.

This is the output; it is the same with --user-agent set:

jaheaga@jaheaga:~$ wget http://www.tweaking.com/files/setups/tweaking.com_windows_repair_aio.zip
--2012-04-13 19:40:07--  http://www.tweaking.com/files/setups/tweaking.com_windows_repair_aio.zip
Resolviendo www.tweaking.com...
Conectando con www.tweaking.com[]:80... conectado.
Petición HTTP enviada, esperando respuesta... 302 Found
Ubicación: http://tweaking.com [siguiente]
--2012-04-13 19:40:08--  http://tweaking.com/
Resolviendo tweaking.com...
Reutilizando la conexión con www.tweaking.com:80.
Petición HTTP enviada, esperando respuesta... 302 Moved Temporarily
Ubicación: http://www.tweaking.com [siguiente]
--2012-04-13 19:40:08--  http://www.tweaking.com/
Reutilizando la conexión con www.tweaking.com:80.
Petición HTTP enviada, esperando respuesta... 200 OK
Longitud: no especificado [text/html]
Grabando a: “tweaking.com_windows_repair_aio.zip.1”

    [ <=>                                                                            ]     46.913       234K/s   en 0,2s    

2012-04-13 19:40:09 (234 KB/s) - “tweaking.com_windows_repair_aio.zip.1” guardado [46913]


Posted 2012-04-13T23:25:49.873

Reputation: 51

What errors do you get? – Nifle – 2012-04-13T23:31:35.377

The link is not working at all. At least, for me. How about uploading it to somewhere? And use the direct link from there? – Apache – 2012-04-14T00:00:10.743

It gives me the main page, but go to http://tweaking.com/files/setups/ and you can check it. That link behaves weirdly. – Jaheaga – 2012-04-14T00:18:47.450

BTW: I am curious. What's the reason for downloading the file with wget instead of inside the browser? I mean you definitely used a browser to find the download url :) – zpea – 2012-04-14T00:30:55.657

Duly noted. It's for a batch script I use to fix really broken Windows computers. – Jaheaga – 2012-04-14T00:38:33.993

Ah ok, makes sense. And the question looks much better now, thanks. (Removed my comment). – zpea – 2012-04-15T02:26:55.257

If you would like English output next time, run `export LANGUAGE=en_US:en LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8` once before the other commands; everything is then printed in (US) English with US number and date formats. The setting applies only to the current shell and its subshells, so everything is back to normal once you close it or open another one. – zpea – 2012-04-15T02:39:44.210



The user agent is a good start, but it is not sufficient in this case. Another HTTP header that servers often check is 'Referer' [sic!]. See Wikipedia: HTTP Referer.

wget has a --referer=url option to specify the referring page. Capturing a successful browser download in Wireshark on a test system of mine shows that the browser sent the following request:

GET /files/setups/tweaking.com_windows_repair_aio.zip HTTP/1.1
Host: www.tweaking.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:11.0) Gecko/20100101 Firefox/11.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Referer: http://www.tweaking.com/content/page/windows_repair_all_in_one.html
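If you don't have Wireshark at hand, wget itself can show what it sends, so you can compare it with the browser request above. This is only a rough sketch: it assumes your wget build includes debug support and prints the request between `---request begin---` and `---request end---` markers. The helper name `extract_request` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read wget --debug output on stdin and print
# only the request block, i.e. the headers wget actually sent.
extract_request() {
    sed -n '/^---request begin---$/,/^---request end---$/p'
}

# Possible usage (wget writes its debug log to stderr, hence 2>&1):
#   wget --debug --output-document=/dev/null \
#        --referer=http://www.tweaking.com/content/page/windows_repair_all_in_one.html \
#        http://www.tweaking.com/files/setups/tweaking.com_windows_repair_aio.zip 2>&1 \
#     | extract_request
```

If the Referer line is missing from that block, the server will most likely answer with the same 302 redirect as in the question.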

In this case it seems you don't even need to fake a User-Agent.

wget --referer=http://www.tweaking.com/content/page/windows_repair_all_in_one.html  http://www.tweaking.com/files/setups/tweaking.com_windows_repair_aio.zip

Does the trick.
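Since this is for a batch script, it may be worth checking that what was saved is really a zip and not the HTML main page under a .zip name, as happened in the question. Below is a minimal sketch, assuming a POSIX-like shell and a `head` that supports `-c`. The function names `is_zip` and `fetch_with_referer` are made up for this example, and the check only looks at the first two bytes ("PK"), so it is a quick sanity test, not a full validation of the archive.

```shell
#!/bin/sh
# Succeeds if the file starts with the ZIP magic bytes "PK".
is_zip() {
    [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

# Hypothetical wrapper around wget:
#   $1 = referring page, $2 = file URL, $3 = local file name
fetch_with_referer() {
    wget --referer="$1" --output-document="$3" "$2" || return 1
    if ! is_zip "$3"; then
        echo "warning: $3 does not look like a zip (probably an HTML page)" >&2
        return 2
    fi
}

# Possible usage:
#   fetch_with_referer \
#       http://www.tweaking.com/content/page/windows_repair_all_in_one.html \
#       http://www.tweaking.com/files/setups/tweaking.com_windows_repair_aio.zip \
#       tweaking.com_windows_repair_aio.zip
```

A return code of 2 then tells the calling script that the server handed back something other than the archive, for example because the site changed its referring page.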


Posted 2012-04-13T23:25:49.873

Reputation: 1 363