Is there a way to use wget to download a site and ALL of its requisite files, including remote ones?


I want to do something similar to the following:

wget -e robots=off --no-clobber --no-parent --page-requisites -r --convert-links --restrict-file-names=windows somedomain.com/s/8/7b_arbor_day_foundation_program.html

However, the page I'm downloading pulls in remote content from a domain other than somedomain.com, and I've been asked to download that content too. Is this possible with wget?

General Redneck


Answers


Add -H or --span-hosts (they are the same option):

-H, --span-hosts go to foreign hosts when recursive.
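Applied to the command from the question, that would look something like the sketch below. The --domains list is an assumption: wget's -D/--domains option limits which foreign hosts are spanned, and without it -H combined with -r can recurse far beyond the original site, so substitute the actual remote domains the page loads from.

wget -e robots=off --no-clobber --no-parent --page-requisites -r --convert-links --restrict-file-names=windows --span-hosts --domains=somedomain.com,cdn.example.com somedomain.com/s/8/7b_arbor_day_foundation_program.html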

barlop


You are a genius! For some reason I looked right over that one when I was reading through the man page. – General Redneck – 2011-08-12T13:39:10.163

@General Redneck I keep a file of various commands, and when I find a new switch I jot it down, because so much in the man pages isn't relevant to me, so man doesn't refresh my memory very well. You can use -k (convert links), -p (images/page requisites), etc., which are easier to remember (see the short-flag version of the command after these comments). There is also a wget mailing list if you're really stuck, though the questions there are really tough and I can't answer those; it was a fluke that I could answer yours! A good way to learn wget, which I haven't done yet, is on a local web server with simple pages. I did once hear that curl is better than wget. – barlop – 2011-08-12T14:33:12.010

Actually this worked perfectly and did exactly what I needed it to. I personally like the verbose options because, in my mind, they're easier to remember; that's probably why both styles exist. I'll definitely take your advice if/when I have more problems. I agree, I've heard curl is better; however, I started the problem with wget and was apparently just one option off :P, so finishing it with wget wasn't a big deal. – General Redneck – 2011-08-12T18:25:16.737
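For reference, here is the same invocation using the short switches mentioned above, against the original URL. -nc is --no-clobber and -np is --no-parent; --restrict-file-names and -e robots=off have no short forms.

wget -e robots=off -nc -np -p -r -k -H --restrict-file-names=windows somedomain.com/s/8/7b_arbor_day_foundation_program.html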