This is the output of the tree command in a directory:
.
|-- asdf.txt
|-- asd.txt
|-- fabc
|   |-- fbca
|   `-- file1.txt
|-- fldr1
|-- fldr2
|   `-- index.html
|-- fldr3
|   |-- cap.txt
|   `-- f01
`-- out.txt

6 directories, 6 files
I start a local HTTP server in this directory. Next I run the following command:
wget -r -nv --spider --no-parent http://localhost:3000 -o -
...and get the following output:
2017-01-02 20:07:24 URL:http://localhost:3000/ [1580] -> "localhost:3000/index.html" [1]
http://localhost:3000/robots.txt:
2017-01-02 20:07:24 ERROR 404: Not Found.
2017-01-02 20:07:24 URL:http://localhost:3000/fabc/ [897] -> "localhost:3000/fabc/index.html" [1]
2017-01-02 20:07:24 URL:http://localhost:3000/fldr1/ [536] -> "localhost:3000/fldr1/index.html" [1]
2017-01-02 20:07:24 URL:http://localhost:3000/fldr2/ [0/0] -> "localhost:3000/fldr2/index.html" [1]
2017-01-02 20:07:24 URL:http://localhost:3000/fldr3/ [896] -> "localhost:3000/fldr3/index.html" [1]
2017-01-02 20:07:24 URL: http://localhost:3000/asd.txt 200 OK
unlink: No such file or directory
2017-01-02 20:07:24 URL: http://localhost:3000/asdf.txt 200 OK
unlink: No such file or directory
2017-01-02 20:07:24 URL: http://localhost:3000/out.txt 200 OK
unlink: No such file or directory
2017-01-02 20:07:24 URL:http://localhost:3000/fabc/fbca/ [548] -> "localhost:3000/fabc/fbca/index.html" [1]
2017-01-02 20:07:24 URL: http://localhost:3000/fabc/file1.txt 200 OK
unlink: No such file or directory
2017-01-02 20:07:24 URL:http://localhost:3000/fldr3/f01/ [548] -> "localhost:3000/fldr3/f01/index.html" [1]
2017-01-02 20:07:24 URL: http://localhost:3000/fldr3/cap.txt 200 OK
unlink: No such file or directory
Found no broken links.
FINISHED --2017-01-02 20:07:24--
Total wall clock time: 0.3s
Downloaded: 7 files, 4.9K in 0s (43.4 MB/s)
- Is wget written to always seek index.html? Can we disable this?
- What are those numbers, such as 1580, 536, 0/0, etc.?
- Why does it say "unlink: No such file or directory"?
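For anyone trying to reproduce this: the question doesn't say which HTTP server was used, so as one possible setup, here is a minimal sketch using Python's standard-library http.server (an assumption, not necessarily the author's server; the question uses port 3000, while this sketch lets the OS pick a free port):

```python
# A minimal sketch of serving a directory over HTTP, as done in the question.
# The actual server the author used is not stated; this uses Python's stdlib
# http.server. Port 0 asks the OS for a free port (the question uses 3000).
import threading
import urllib.request
from http.server import HTTPServer, SimpleHTTPRequestHandler

server = HTTPServer(("localhost", 0), SimpleHTTPRequestHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# With no index.html in the served directory, SimpleHTTPRequestHandler
# responds to "/" with an auto-generated directory-listing page.
body = urllib.request.urlopen(f"http://localhost:{port}/").read()
server.shutdown()
```

Pages like this generated listing are what wget then parses for further links during the recursive crawl.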
Okay, so what is 0/0 then? (In response to answer 2) – deostroll – 2017-01-02T19:27:55.170

Looks like an error when downloading the file - for example, receiving HTTP 200 OK from the web server while no file is provided (due to incorrect permissions, misconfiguration, etc.). Did wget download the file contents, or is the file empty? I'm afraid no one can tell you that from the zero file size alone. Here is someone facing a similar issue: http://unix.stackexchange.com/q/91785 (the answers suggest enabling wget's debugging option). – Marek Rost – 2017-01-03T23:17:16.217

I ran it with the --spider option... now does that specifically mean anything? – deostroll – 2017-01-04T05:18:18.493

spider just means "do not download files". With recursion this changes to "temporarily download files that can contain links to other resources". As mentioned in the updated answer, whether a file should be downloaded is determined by its content type. – Marek Rost – 2017-01-06T22:36:15.983