Hi, I'd like to download all PDFs from http://www.allitebooks.com/ and would like to use wget. My command is: wget "http://www.allitebooks.com/" -P "C:\dummydir" -c -A pdf -r
but I believe it cannot follow the links to the subdomain for now. How can I fix it so it also downloads, for example, http://file.allitebooks.com/20170105/Internet%20of%20Things%20and%20Big%20Data%20Technologies%20for%20Next%20Generation%20Healthcare.pdf?
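For example, I was thinking of something along these lines with wget's host-spanning options added (untested; the -l 5 depth is just a guess, and -H together with -D allitebooks.com should let it cross over to hosts like file.allitebooks.com):

wget -r -l 5 -c -A pdf -H -D allitebooks.com -P "C:\dummydir" "http://www.allitebooks.com/"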
But is there no tool in the world that visits all links to a certain depth and downloads all files that end with the .pdf extension? I believe there should be one, right? – Thomas – 2017-01-06T11:41:42.563

There definitely are ways to do it. In fact, I wrote a blog post about Recursively Downloading a Website.
The problem here is not that the tool does not exist but that the website you want to download PDFs from is secure enough that it prevents any sort of recursive download of the site.
OK, I will write my own crawler then, if there are no out-of-the-box tools. I'd like to fill an e-reader with those ebooks so I have something to read on the go. – Thomas – 2017-01-06T11:54:22.117
HTTrack or ScrapBook may be able to do what you're looking for (a rough HTTrack command is sketched after this comment), but as far as that specific site goes, you won't be able to download all the PDFs non-interactively. I would suggest that you find a few eBooks you'd like to read from the site and just download them manually. Best of luck with your crawler program :)
If you find my answer helped provide a solution of some kind then please remember to accept it as a solution! – stuts – 2017-01-06T12:20:31.803
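In case it's useful, a rough HTTrack invocation for a site that does permit mirroring might look like the following (a sketch only: the -r6 depth is arbitrary, the filter assumes the files live somewhere under *.allitebooks.com, and you would then pick the PDFs out of the mirrored folder):

httrack "http://www.allitebooks.com/" -O "C:\dummydir" "+*.allitebooks.com/*" -r6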
Yo stuts, I upvoted it, but it's not an answer that helps me achieve my goal, so no accept, man – Thomas – 2017-01-06T12:36:02.670
That's totally understandable dude. Still trying to get to grips with the answering system! – stuts – 2017-01-06T12:59:10.567