How to recursively download an entire web directory?

I have a web directory with many folders and many subfolders containing files.

I need to download everything using wget or bash.

Answers

Try: wget -r and see if that works.
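For instance, a minimal sketch (the host and path are placeholders; adding --no-parent keeps wget from climbing above the starting directory):

 wget -r --no-parent http://example.com/files/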

AJ.

The best way is:

wget -m <url>

Which is short for wget "mirror":

  -m,  --mirror             shortcut for -N -r -l inf --no-remove-listing.
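For example (the URL is a placeholder), wget saves the mirrored tree under a local directory named after the host:

 wget -m http://example.com/tutorials/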

dlamotte

$ wget \
 --recursive \
 --no-clobber \
 --page-requisites \
 --html-extension \
 --convert-links \
 --restrict-file-names=windows \
 --domains website.org \
 --no-parent \
     www.website.org/tutorials/html/

This command downloads the Web site www.website.org/tutorials/html/.

The options are:

  • --recursive: download the entire Web site.
  • --domains website.org: don't follow links outside website.org.
  • --no-parent: don't follow links outside the directory tutorials/html/.
  • --page-requisites: get all the elements that compose the page (images, CSS and so on).
  • --html-extension: save files with the .html extension.
  • --convert-links: convert links so that they work locally, off-line.
  • --restrict-file-names=windows: modify filenames so that they will work in Windows as well.
  • --no-clobber: don't overwrite any existing files (used in case the download is interrupted and resumed).
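For reference, here is the same command written with wget's short flags (a sketch; --restrict-file-names has no short equivalent):

 wget -r -nc -p -E -k --restrict-file-names=windows -D website.org -np www.website.org/tutorials/html/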

Link to source

Or try this solution from Ask Ubuntu.

boucekv

wget --recursive (or whatever) didn't work for me (I'm on CentOS). lftp did it:

 lftp -c "open http://your.server/path/to/directory/; mirror"
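If your lftp version supports them, mirror's resume and parallel-transfer options can help with large directories (a sketch using the same placeholder URL):

 lftp -c "open http://your.server/path/to/directory/; mirror --continue --parallel=4"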

Keshav

lftp solved the "invalid character encoding" problem I faced with wget's recursive downloading when file names contain European characters like äöå. – ajaaskel – 2019-06-09T13:24:48.433

The "whatever" flags are quite important... yes, wget's flags are a bit over the top, but what do you expect from a Swiss army knife. – vonbrand – 2013-02-24T19:50:19.207

See Wget Recursive Retrieval.

wget -r -l 5 http://example.com/

Here -l 5 limits the recursion depth to five levels. (Don't combine -r with -O: that concatenates every downloaded page into a single file, which defeats the purpose of a recursive download.)

glomad

You have a web directory? Is it on a remote machine that you can only access through HTTP, or do you have shell access? Your mention of bash implies shell access, unless you mean using wget from the bash prompt.

Wget is not always very efficient, so if you have shell access to the machine where the web directory is located and you want to download it, you could do this:

$ tar cjf webdir.tar.bz2 webdir 

and then transfer the archive with ftp or scp.
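For example, fetching the archive from your local machine and unpacking it (the user and hostname are placeholders):

$ scp user@remote.host:webdir.tar.bz2 .
$ tar xjf webdir.tar.bz2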

Duncan

I don't know why all the confusing questions were necessary here, but this is what I ended up doing because the server configuration would not let me wget the directory. – Stack Underflow – 2019-05-23T19:16:01.440

You could also try the following if you have an FTP account:

lftp USER:PASSWORD@FTPSERVER -e "mirror&&exit"
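A sketch of a variant that mirrors one specific remote directory into a named local directory (all names are placeholders):

lftp -u USER,PASSWORD FTPSERVER -e "mirror /remote/dir local_dir; exit"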

Janez
