Recursively wget files in a specific folder


I'm trying to download files over HTTP using the wget command.

I need every file in:

http://9.9.9.9/a/b/c/d/

(that is, only the files themselves, not the subdirectories).

I tried to do this with:

-A log,html,tgz,zip,txt,css,js

In addition, starting from:

http://9.9.9.9/a/b/c/d/needed_folder/

I need all the files there recursively (it contains a few subfolders). I tried this with:

-I /needed_folder

I was using the following command:

wget -r -A log,html,tgz,zip,txt,css,js -I /needed_folder -np -nH --cut-dirs=4 -R index.html http://9.9.9.9/a/b/c/d/needed_folder/some_files_needed/

This retrieves only an index.html.1 file. What is wrong?

raptor0102

Posted 2019-01-13T19:45:13.730

Reputation: 3

Answers


This is one of the ways in which the web is not like a filesystem: URLs are not paths, even though they are often mapped to paths. And, in the general case, even when they are, wget has no way of knowing which ones they are. To explain why your command in particular doesn't work:

  • -r or --recursive means that wget will download the URLs you specify, parse the markup to find links to other files, and then download those in turn, recursing to a default depth of five levels (which could end up being billions of links if it follows links outside the site).
  • -A/--accept and -R/--reject filter the files to keep by filename suffix or pattern, and -I/--include-directories= limits recursion to the listed remote directories. Note that -I entries are matched against the full remote path, so -I /needed_folder does not match /a/b/c/d/needed_folder/ (see the example after this list).
  • -np/--no-parent ensures that wget never ascends to the parent directory, so only URLs at or below the ones you've given are downloaded.
  • -nH/--no-host-directories stops wget from creating a local top-level directory named after the host, so files from all hosts end up in the same directory.
  • --cut-dirs=number generalizes the above by also skipping the given number of remote directory components when creating local directories; with --cut-dirs=4, a file at /a/b/c/d/needed_folder/file is saved under needed_folder/file.
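Putting those together: assuming the server actually produces HTML directory listings for these URLs (if it doesn't, recursion has nothing to parse and cannot discover the files), a sketch along these lines should come closer, with the -I pattern covering the full remote path:

wget -r -l inf -np -nH --cut-dirs=4 \
     -A log,html,tgz,zip,txt,css,js \
     -I /a/b/c/d/needed_folder \
     -R 'index.html*' \
     http://9.9.9.9/a/b/c/d/needed_folder/

-R 'index.html*' also rejects the index.html.1-style duplicates you saw; wget still fetches the listing pages in order to parse them for links, but deletes them afterwards because they match -R.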

l0b0

Posted 2019-01-13T19:45:13.730

Reputation: 6 306