Why can't we save 100% of a site?

2

1

We all know that we can't really hide HTML or JavaScript from the user. We need to remember that if the browser can read it, the user can read it too.

So why is it that when we download a site locally, it sometimes does not work properly? This is especially true with JavaScript. If the browser can display the site properly, why does it break when it is stored locally? In short, what causes it to behave differently locally than it does online?
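For example (a made-up snippet, not code taken from any particular site), an AJAX call like the one below works when the page is served over HTTP, but usually fails when the same page is opened from disk, because most browsers refuse to let scripts request file:// URLs:

    // Hypothetical example of a relative AJAX request.
    // Served from https://example.com/page.html it resolves to
    // https://example.com/data.json and succeeds; opened as
    // file:///home/user/saved/page.html it resolves to a file:// URL,
    // which most browsers block from script, so onerror fires instead.
    var xhr = new XMLHttpRequest();
    xhr.open('GET', 'data.json');
    xhr.onload = function () { console.log('loaded:', xhr.responseText); };
    xhr.onerror = function () { console.error('request blocked or failed'); };
    xhr.send();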

Thank you!

EDIT

To give an example, I was trying to download this site:

https://www.mashape.com/howitworks

I wanted to see how they made the animation; however, the animation is not working locally.

Vilarix

Posted 2013-08-02T18:58:53.737

Reputation: 123

That's using the HTML5 canvas element with JavaScript. (The site you mentioned.) – None – 2013-08-02T19:41:30.017

Well, for example, AJAX doesn't work when loading a site from disk. – None – 2013-08-02T20:02:39.447

Answers

1

Reasons:

  • Dynamic pages - sometimes a URL doesn't give you a file, but the output of a program that runs on the server. This program may return different results each time.

  • JavaScript - the exact URLs a page may access won't be known unless all of its JavaScript is executed, and, depending on the events that trigger it, some JavaScript on a page may never get executed, or may only get executed at certain times, etc. (see the sketch after this list).

  • Filtering based on IP, etc. - sometimes a webserver may give you different data for the same URL based on some attribute such as your IP address.
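A minimal sketch of the JavaScript point above (the host names and file name are invented): the asset URL is assembled at runtime, so it never appears literally in the HTML that a downloader saves, and it may even change from one visit to the next.

    // Hypothetical sketch: the image URL is built at runtime and the
    // shard prefix is random, so the URL is not present in the saved HTML
    // and may differ on every page load.
    var shards = ['a', 'b', 'c'];
    var shard = shards[Math.floor(Math.random() * shards.length)];
    var img = new Image();
    img.src = 'https://' + shard + '.static.example.com/sprites/animation.png';
    document.body.appendChild(img);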

Purely static sites with no server-side processing should always be possible to download with a tool such as wget, but those types of sites are increasingly rare.

LawrenceC

Posted 2013-08-02T18:58:53.737

Reputation: 63 487

Another point: JavaScript may use randomization, either for load balancing or for obfuscation. It might have called "a.stalitemaps.example.com" when the page was loaded, but might now call "b.stalitemaps.example.com", as the prefix is chosen at random. – Ferrybig – 2019-01-24T08:46:44.543

8

A website is not meant to be downloaded. When you do so, the dependencies very often break: links to script files stop working and images go missing.

Even though some browsers have a function to download a whole website, it is not perfect, and it is bound to fail on dynamic websites that use server-side scripts such as PHP.

MightyPork

Posted 2013-08-02T18:58:53.737

Reputation: 309

2

Frequently it's because the links in the HTML and/or JavaScript code, including CSS, etc., are still referencing the online site, not the locally downloaded files. You would have to go through all of the files and update any links so that they reference the local files rather than the original site. In addition, some parts of the site may be dynamically loaded - i.e., the markup that is actually displayed is different from what the browser originally downloaded. When you "save" the files out of the browser, it likely saves the response to the original request, not the current document object model that reflects what is displayed in the browser.
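As a made-up illustration of that last point: content like the banner below exists only in the live document object model, so a plain "save" of the original response will not include it.

    // Hypothetical: this element is created after the page loads, so it is
    // visible in the browser but absent from the HTML the server originally
    // sent, which is typically what gets written to disk when you save the page.
    document.addEventListener('DOMContentLoaded', function () {
      var banner = document.createElement('div');
      banner.className = 'promo-banner';
      banner.textContent = 'Built by script at runtime';
      document.body.appendChild(banner);
    });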

tvanfosson

Posted 2013-08-02T18:58:53.737

Reputation: 501

The Scrapbook Firefox extension downloads every stylesheet and script from a site to keep a copy locally, yet the copy still doesn't work 100%. What I am wondering is what causes these little things to go wrong - a navbar sitting 2px lower, a CSS animation not appearing, etc. – None – 2013-08-02T19:16:39.780

1

If you download a webpage, you haven't necessarily downloaded the files it links to, e.g. images, CSS and JavaScript. Even if you do have them, you'd have to change the references to them, or place them in the proper locations, for them to work: "/mycss.css" would have to be in the root folder, while "mycss.css" would have to be in the same folder as the current page, etc.
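To see the difference concretely (the paths below are invented), here is how a browser resolves each kind of reference against an online base URL versus a saved local file:

    // Hypothetical base URLs, just to show how each reference resolves.
    var onlineBase = 'https://example.com/articles/page.html';
    var localBase  = 'file:///home/user/saved/page.html';
    console.log(new URL('/mycss.css', onlineBase).href); // https://example.com/mycss.css          (site root)
    console.log(new URL('mycss.css',  onlineBase).href); // https://example.com/articles/mycss.css (same folder)
    console.log(new URL('/mycss.css', localBase).href);  // file:///mycss.css                      (filesystem root!)
    console.log(new URL('mycss.css',  localBase).href);  // file:///home/user/saved/mycss.css      (next to the saved page)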

Matt Healey

Posted 2013-08-02T18:58:53.737

Reputation: 11

1

If you'd like to download a whole website (or portion of one), check out HTTrack. It will download the site or part of the site and change all the dependencies to work locally as needed.

http://www.httrack.com/

Samuel Reid

Posted 2013-08-02T18:58:53.737

Reputation: 111

I tried it, but I prefer the Firefox extension Scrapbook. Neither of them makes the site work 100% like the original, though, which leads me back to my question: why, even after editing the links, does it still not display the site as it appeared in the browser? – None – 2013-08-02T19:18:06.427

0

Right click -> View Source -> click each of the JS/CSS files and save those as well. Then make sure the page references the copies on your computer.

Jacques ジャック

Posted 2013-08-02T18:58:53.737

Reputation: 101