HTTrack doing a partial download


We are using HTTrack to download an entire website for offline viewing.

The problem is that even after downloading the whole site with depth level 4 (-r4), some links fail to work.

For example, if you use HTTrack to capture the site:

http://advaitasharada.sringeri.net/display/bhashya/Gita

It captures only a portion of the site, and the links on the right side are left broken. Those links, which lead to the other chapters of the Gita, are marked with #fragments.

http://advaitasharada.sringeri.net/display/bhashya/Gita#BG_C02 (the link only works when clicked in the browser)

  1. Why doesn't HTTrack download all the links? What can be done?
  2. Also, the search is not working: it leads to the site's original domain (which needs an internet connection).
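The #fragment behaviour above can be seen with a short script: a fragment is resolved by the browser and is never sent to the server, so a crawler sees every chapter link as one and the same URL (a minimal sketch using Python's standard urllib; the second chapter link is an assumed example following the same pattern):

```python
from urllib.parse import urldefrag

# Chapter links from the page differ only in their #fragment part.
# BG_C02 is taken from the question; BG_C03 is an assumed sibling link.
links = [
    "http://advaitasharada.sringeri.net/display/bhashya/Gita#BG_C02",
    "http://advaitasharada.sringeri.net/display/bhashya/Gita#BG_C03",
]

# urldefrag() splits off the fragment, mimicking what an HTTP client does
# before making the request: the fragment never reaches the server.
request_urls = {urldefrag(link).url for link in links}
print(request_urls)
# Only one distinct URL remains, so a crawler like HTTrack fetches the page
# once; the per-chapter content behind each fragment is produced in-page.
```

This is why HTTrack cannot "follow" those right-hand links: from its point of view there is nothing new to download.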

Br. Sayan

Posted 2017-12-09T10:54:24.423

Reputation: 145

Answer to the first part: The website uses server-side scripting; that is, the server generates webpages on the fly when a request is made. HTTrack can only download static webpages and thus misses the portions that are generated on the fly. – Karan Karan – 2017-12-09T11:30:40.707

It is downloadable, though rather large (2.6 GB); you could try wget. – Junme – 2017-12-10T11:42:42.847

Thanks @KaranKaran and @Junme! This is what I suspected: it involves server-side scripting! Junme, I tried wget, but it was not working. Size is not a problem; I gave -r6 and HTTrack downloaded 6.6 GB! God knows how it got so much data! Could you please post the wget command you used as an answer? Have you checked that the right-hand side links of the site are working? – Br. Sayan – 2017-12-10T11:49:05.097

Answers


The website uses server-side scripting; that is, the server generates webpages on the fly when a request is made. HTTrack can only download static webpages and thus misses the portions that are generated on the fly.
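The point can be illustrated offline: a crawler only sees the HTML the server returns, not what scripts later insert into the page. A minimal sketch with Python's standard html.parser (the HTML below is an invented stand-in for a dynamically filled page, not the real site's markup):

```python
from html.parser import HTMLParser

# Invented stand-in: the served HTML contains only an empty container
# that client-side script would populate with the chapter text later.
SERVED_HTML = """
<html><body>
  <div id="chapter-content"></div>
  <script src="load-chapter.js"></script>
</body></html>
"""

class TextCollector(HTMLParser):
    """Collect the visible text a crawler would find in the raw HTML."""
    def __init__(self):
        super().__init__()
        self.text = []

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())

parser = TextCollector()
parser.feed(SERVED_HTML)
print(parser.text)  # empty list: no chapter text exists in the served HTML
```

A browser runs the script and fills the container; HTTrack saves only the empty shell, which is why the mirrored pages look incomplete.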

Karan Karan

Posted 2017-12-09T10:54:24.423

Reputation: 160

Is there any other way to download the website? – Br. Sayan – 2018-01-13T12:22:27.467