What is loading a website?
Loading a web page is more or less like downloading a file. What you get from the server is, in most cases, just an HTML file transferred over HTTP. First, you make an HTTP request to the URL of the site, like GET http://superuser.com.
As William Jackson said, HTTP uses the Content-Length header field to tell you the size of that file in advance. The browser can use this to estimate how much of the file it has downloaded so far.
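To make the idea concrete, here is a minimal Python sketch of how a client could turn Content-Length into a progress fraction for a single file. The function names (content_length, download_progress) are made up for this illustration and are not taken from any browser; the sketch also assumes the headers have already been parsed into a mapping.

```python
from typing import Mapping, Optional


def content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return the declared body size in bytes, or None if absent or invalid."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "content-length":
            try:
                size = int(value)
            except ValueError:
                return None
            return size if size >= 0 else None
    return None


def download_progress(bytes_received: int, headers: Mapping[str, str]) -> Optional[float]:
    """Fraction of the body received so far (0.0 to 1.0), or None if the size is unknown."""
    total = content_length(headers)
    if total is None:
        return None  # e.g. a chunked response: no size is announced in advance
    if total == 0:
        return 1.0  # an empty body is complete immediately
    return min(bytes_received / total, 1.0)
```

For example, download_progress(256, {"Content-Length": "1024"}) gives 0.25, while a response without the header gives None, in which case a progress bar can only show that something is happening, not how much is left.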
However, the Content-Length of the HTML file alone does not cover the resources that the file loads by reference. These might include:
- External images
- External stylesheets
- External scripts
- Frames
- AJAX loads
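As a rough illustration of how such references can be spotted in the markup, the following Python sketch collects the URLs of images, stylesheets, scripts and frames using the standard library's HTMLParser. The class and function names are hypothetical, and a real browser does far more than this. AJAX loads in particular cannot be found this way, because they only appear once scripts run.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ReferenceCollector(HTMLParser):
    """Collects URLs of external resources referenced by an HTML document."""

    # Tag name -> attribute holding the URL of the external resource.
    URL_ATTRIBUTES = {
        "img": "src",
        "script": "src",
        "iframe": "src",
        "frame": "src",
        "link": "href",
    }

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.references = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRIBUTES.get(tag)
        if wanted is None:
            return
        attributes = dict(attrs)
        # Only <link rel="stylesheet"> is treated as a resource to load here.
        if tag == "link":
            rel = (attributes.get("rel") or "").lower().split()
            if "stylesheet" not in rel:
                return
        url = attributes.get(wanted)
        if url:
            # Resolve relative URLs against the page's own URL.
            self.references.append(urljoin(self.base_url, url))


def find_references(html, base_url):
    collector = ReferenceCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.references
```

Ordinary links (a href) are deliberately ignored, since following them is the user's decision and not part of loading the page.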
How does the browser know how much to load?
It is now the task of the browser to find these references and request them too. So, for each external reference, the browser will either consult its cache, or send a new HTTP request. For Super User, this would be the following files hosted on content distribution networks for faster performance:
- GET http://ajax.googleapis.com/ajax/libs/jquery/1.5.2/jquery.min.js (the main jQuery file)
- GET http://cdn.sstatic.net/js/stub.js (some JS functions)
- GET http://cdn.sstatic.net/superuser/all.css (the stylesheet)
- ...
You can actually see this using Firebug or Chrome's debugger, when you enable timeline tracking. This is the timeline of loading Super User, filtered so that only requests are shown.
As the timeline shows, the main Super User page takes the longest time to load, but cascading from it, there are other loads (i.e. HTTP requests or cache lookups) involved. Most of those responses also expose their Content-Length, so the browser can make a good guess at how much data is still left to load across all these files.
And since all of this is happening within a very short time frame, you won't notice the small irregularities in the progress bar. Sometimes you will see the progress bar hang at two thirds – this might be when the browser fails to load an external resource as fast as the others.
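Here is a small Python sketch of how the sizes of several resources could be combined into one number. It is an assumption about the general approach, not any browser's actual code, and overall_progress is a name invented for this example. Resources whose size is unknown are simply left out of the estimate.

```python
from typing import Iterable, Optional, Tuple


def overall_progress(resources: Iterable[Tuple[int, Optional[int]]]) -> float:
    """Combine per-resource progress into one fraction between 0.0 and 1.0.

    Each item is (bytes_received, content_length), where content_length is
    None when the server did not announce a size.
    """
    received_total = 0
    expected_total = 0
    for received, expected in resources:
        if expected is None:
            continue  # unknown size: cannot be weighed, so it is skipped
        received_total += min(received, expected)
        expected_total += expected
    if expected_total == 0:
        return 0.0
    return received_total / expected_total
```

With overall_progress([(2000, 2000), (0, 1000)]) the result is two thirds: the first file is complete, and the bar stays put until the slower second resource starts delivering bytes.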
How do browsers implement this?
Google Chrome
I've looked into the sources of Google Chrome (a.k.a. Chromium) and found a class called ProgressTracker, defined in ProgressTracker.cpp. It was written by Apple, so it most probably stems from the WebKit rendering engine. Its constructor initializes the following fields:
ProgressTracker::ProgressTracker()
: m_totalPageAndResourceBytesToLoad(0)
, m_totalBytesReceived(0)
Thus, as I said, the total number of page and resource bytes to load is tracked, and the progress is updated as bytes are received. There's an interesting comment that shows how the first loaded page is given extra weight:
// For documents that use WebCore's layout system, treat first layout as the half-way point.
Therefore, once the first page has been laid out for the first time (while its external resources are still loading), the progress is treated as being at 50%.
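To show how byte counting and the half-way rule could fit together, here is a simplified Python sketch. It is not WebKit's actual algorithm. The class name ProgressEstimator, its methods, and the choice to never let the bar move backwards are all assumptions made for this illustration; only the two counters and the "first layout is the half-way point" idea come from the source quoted above.

```python
class ProgressEstimator:
    """Toy progress estimate: byte counting plus a first-layout milestone."""

    FIRST_LAYOUT_PROGRESS = 0.5  # the "half-way point" from the quoted comment

    def __init__(self):
        self.total_bytes_to_load = 0   # sum of announced Content-Length values
        self.total_bytes_received = 0
        self.first_layout_done = False
        self.progress = 0.0

    def expect(self, content_length):
        """A new resource with a known size has been requested."""
        self.total_bytes_to_load += content_length

    def receive(self, byte_count):
        """Some bytes of any resource have arrived."""
        self.total_bytes_received += byte_count
        self._update()

    def first_layout(self):
        """The main document has been laid out for the first time."""
        self.first_layout_done = True
        self._update()

    def _update(self):
        if self.total_bytes_to_load > 0:
            estimate = min(self.total_bytes_received / self.total_bytes_to_load, 1.0)
        else:
            estimate = 0.0
        if self.first_layout_done:
            estimate = max(estimate, self.FIRST_LAYOUT_PROGRESS)
        # Assumption: the bar never moves backwards, even when newly
        # discovered resources increase the total.
        self.progress = max(self.progress, estimate)
```

In this sketch, discovering a large resource after the first layout does not pull the bar below 50%, which matches the observation that the main document counts for more than its byte size alone would suggest.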
Firefox (Fission add-on)
Now, there's also a slightly easier metric. I've looked into Fission, the progress bar extension for Firefox. If I'm reading it correctly, it takes a fairly obvious approach.
Every web page consists of a number of DOM elements. By parsing the initial HTML document, the total number of DOM elements to be loaded can be estimated.
For every DOM element that has loaded, a counter is increased, and the progress bar is simply displayed according to it.
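The element-counting idea can be sketched in a few lines of Python. This is not Fission's code (the add-on is written in JavaScript and I have not reproduced it here); the names ElementCounter, count_elements and element_progress are invented, and counting start tags is only an approximation of the number of DOM elements.

```python
from html.parser import HTMLParser


class ElementCounter(HTMLParser):
    """Counts start tags as an estimate of the number of DOM elements."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <br/>.
        self.count += 1


def count_elements(html):
    counter = ElementCounter()
    counter.feed(html)
    counter.close()
    return counter.count


def element_progress(loaded_elements, total_elements):
    """Fraction of elements loaded so far, between 0.0 and 1.0."""
    if total_elements <= 0:
        return 0.0
    return min(loaded_elements / total_elements, 1.0)
```

The appeal of this metric is that it needs no Content-Length at all; the drawback is that a tiny element and a multi-megabyte image count the same.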
Brings back memories of Netscape 1.1 and its "interesting" take on this... – James – 2011-09-09T14:12:54.847
Firefox has a progress bar? – William Jackson – 2011-09-09T15:12:31.443