How does a browser know how much of a page has been loaded?

8

4

Looking at a browser's progress bar, which sometimes slows down near the end of a page load, I was wondering whether the browser shows progress based on the size of the elements on the page, the number of elements, or something else.

Maybe someone who has checked the source of Firefox or some other browser knows about this in a little more detail?

Atul Goyal

Posted 2011-09-09T11:13:53.673

Reputation: 307

Brings back memories of Netscape 1.1 and its "interesting" take on this... – James – 2011-09-09T14:12:54.847

Firefox has a progress bar? – William Jackson – 2011-09-09T15:12:31.443

Answers

15

What is loading a website?

Loading a web page is more or less like downloading a file. What you get from the server is, in most cases, just an HTML file transferred over HTTP. First, you make an HTTP request to the URL of the site, like GET http://superuser.com.

As William Jackson said, HTTP has the Content-Length header field, which a server can send to announce the size of that file in advance. The browser can use it to estimate how much of the download is already done.
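For a single file, the estimate boils down to a division. Here is a minimal sketch in Python, not taken from any browser; the function name `download_progress` is made up for illustration. It assumes the response headers are available as a plain mapping and returns None when no usable Content-Length was sent.

```python
from typing import Mapping, Optional


def download_progress(headers: Mapping[str, str], bytes_received: int) -> Optional[float]:
    """Return the fraction downloaded (0.0 to 1.0), or None if the size is unknown."""
    # Header names are case-insensitive in HTTP, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    raw_length = normalised.get("content-length")
    if raw_length is None or not raw_length.strip().isdigit():
        return None  # the server did not announce a usable total size
    total = int(raw_length)
    if total == 0:
        return 1.0  # an empty body is complete as soon as the headers arrive
    # Clamp in case more bytes arrive than were announced.
    return min(bytes_received / total, 1.0)


print(download_progress({"Content-Length": "2000"}, 500))
print(download_progress({}, 500))
```

When the function returns None, a progress bar has nothing to divide by and can only show that something is happening, not how much is left.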

However, this fails to cover all the resources an HTML file can load by referencing them. These might include:

  • External images
  • External stylesheets
  • External scripts
  • Frames
  • AJAX loads

How does the browser know how much to load?

It is now the task of the browser to find these references and request them too. So, for each external reference, the browser will either consult its cache, or send a new HTTP request. For Super User, this would be the following files hosted on content distribution networks for faster performance:

  • GET http://ajax.googleapis.com/ajax/libs/jquery/1.5.2/jquery.min.js – the main jQuery file
  • GET http://cdn.sstatic.net/js/stub.js – some JS functions
  • GET http://cdn.sstatic.net/superuser/all.css – the stylesheet
  • ...
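Finding those references can be sketched with Python's standard html.parser module. This is only an illustration of the idea, not how a real browser engine does it; the class `ReferenceFinder`, the function `find_references`, and the `/logo.png` image in the sample markup are made up. AJAX loads cannot be found this way, because they only appear once scripts run.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ReferenceFinder(HTMLParser):
    # Tag name -> attribute that points at an external resource.
    URL_ATTRIBUTES = {"img": "src", "script": "src", "iframe": "src", "frame": "src"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.references = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and "stylesheet" in (attributes.get("rel") or "").lower().split():
            target = attributes.get("href")
        else:
            target = attributes.get(self.URL_ATTRIBUTES.get(tag, ""))
        if target:
            # Resolve relative references against the URL of the page itself.
            self.references.append(urljoin(self.base_url, target))


def find_references(html, base_url):
    finder = ReferenceFinder(base_url)
    finder.feed(html)
    return finder.references


sample = (
    '<html><head>'
    '<link rel="stylesheet" href="http://cdn.sstatic.net/superuser/all.css">'
    '<script src="http://cdn.sstatic.net/js/stub.js"></script>'
    '</head><body><img src="/logo.png"></body></html>'
)
for url in find_references(sample, "http://superuser.com"):
    print("GET", url)
```

Each URL in the result is something the browser would then look up in its cache or request over HTTP.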

You can actually see this using Firebug or Chrome's developer tools when you enable timeline tracking. This is the timeline of loading Super User, filtered so that only requests are shown:

[Image: timeline of requests made while loading Super User]

As we can see, the main Super User page takes the longest to load, but cascading from it, there are other loads (i.e. HTTP requests or cache requests) involved. Those that expose their Content-Length tell the browser how many bytes to expect, so it can make a reasonable guess at how much of the whole page is still to be loaded.
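Adding up several resources can be sketched as follows. This is an illustration only; the class `PageProgress`, the URLs, and the byte counts are invented. It assumes every resource announces a Content-Length when it starts.

```python
class PageProgress:
    def __init__(self):
        self.expected = {}  # url -> announced Content-Length in bytes
        self.received = {}  # url -> bytes received so far

    def resource_started(self, url, content_length):
        self.expected[url] = content_length
        self.received[url] = 0

    def bytes_arrived(self, url, count):
        # Never count more than the resource announced.
        self.received[url] = min(self.received[url] + count, self.expected[url])

    def fraction(self):
        total = sum(self.expected.values())
        return sum(self.received.values()) / total if total else 0.0


page = PageProgress()
page.resource_started("http://example.invalid/", 1000)
page.bytes_arrived("http://example.invalid/", 1000)
before = page.fraction()  # only the HTML is known, and it is complete

# Parsing the HTML reveals a stylesheet that still has to be fetched.
page.resource_started("http://example.invalid/all.css", 3000)
after = page.fraction()
print(before, after)
```

Note how the fraction drops once a new resource is discovered: the total keeps growing while the page is parsed, which is one reason a naive byte-based bar needs some smoothing.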

And since all of this is happening within a very short time frame, you won't notice the small irregularities in the progress bar. Sometimes you will see the progress bar hang at two thirds – this might be when the browser fails to load an external resource as fast as the others.

How do browsers implement this?

Google Chrome

I've looked into the sources of Google Chrome (a.k.a. Chromium) and found a class called ProgressTracker, in the file ProgressTracker.cpp. Actually, it's written by Apple, so it most probably stems from the WebKit rendering engine. Its constructor initializes the following fields, among others:

ProgressTracker::ProgressTracker()
    : m_totalPageAndResourceBytesToLoad(0)
    , m_totalBytesReceived(0)

Thus, as I said, the tracker keeps a running total of the bytes it expects to load for the page and its resources, along with the bytes received so far, and the progress is updated accordingly. There's an interesting comment that shows how the main document is given extra weight:

// For documents that use WebCore's layout system, treat first layout as the half-way point.

In other words, the first layout of the main document is treated as the 50% mark, and the remaining half of the bar is left for whatever is still loading, such as external resources.
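One possible reading of that comment, and it is only an assumption here, is that the byte-based fraction may not pass the half-way point until the first layout has happened. The sketch below illustrates that reading in Python; it is not WebKit's code, and the function name `displayed_progress` and its parameters are made up.

```python
def displayed_progress(bytes_received, bytes_expected, first_layout_done):
    """Byte-based progress with first layout treated as the half-way point."""
    if bytes_expected <= 0:
        return 0.0
    fraction = min(bytes_received / bytes_expected, 1.0)
    # Assumption: before first layout the bar is capped at 50%.
    ceiling = 1.0 if first_layout_done else 0.5
    return min(fraction, ceiling)


print(displayed_progress(900, 1000, first_layout_done=False))
print(displayed_progress(900, 1000, first_layout_done=True))
```

Under this reading, a page whose bytes arrive quickly but which has not been laid out yet would sit at 50% until layout happens.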

Firefox (Fission add-on)

Now, there's also a slightly easier metric. I've looked into Fission, the progress bar extension for Firefox. If I'm not reading it the wrong way, it does something one could easily think of.

Every web page consists of a number of DOM elements. By parsing the main HTML document, the total number of DOM elements to be loaded can be estimated.

For every DOM element that has loaded, a counter is increased, and the progress bar is simply drawn according to that counter.
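The element-counting idea can be sketched like this. It is not Fission's code, just an illustration of the metric as described; `ElementCounter`, `count_elements`, `element_progress`, and the sample markup are invented names and data.

```python
from html.parser import HTMLParser


class ElementCounter(HTMLParser):
    """Counts opening tags as a rough estimate of the number of DOM elements."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1


def count_elements(html):
    counter = ElementCounter()
    counter.feed(html)
    return counter.count


def element_progress(loaded_elements, total_elements):
    if total_elements == 0:
        return 1.0  # nothing to load
    return min(loaded_elements / total_elements, 1.0)


document = "<html><body><p>Hi</p><img src='a.png'></body></html>"
total = count_elements(document)
print(total, element_progress(2, total))
```

Unlike the byte-based metric, this one does not need Content-Length at all, but it treats a tiny paragraph and a large image as equal amounts of work.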

slhck

Posted 2011-09-09T11:13:53.673

Reputation: 182 472

Sometimes one finds oneself longing for simplicity. IBM WebExplorer showed a percentage progress indicator for the primary page, and additional individual percentage progress indicators for each worker that was loading images and whatnot. – JdeBP – 2011-09-09T17:48:56.907

1

When a browser requests a file from a server, the server has the option to tell the browser up-front how large the file is. The server does this by sending a Content-Length header.

There is some other information about how a browser can determine the size of a file it is downloading.

William Jackson

Posted 2011-09-09T11:13:53.673

Reputation: 7 646

Is it just me, or does this answer appear to be talking about a "file download" instead of the loading of a web page? You can say that loading a web page is basically downloading several files, but I think the progress of a web page load is a little more complex than a simple file download. – Atul Goyal – 2011-09-09T14:57:29.833

The HTML is a file download. The CSS is a file download. The images are file downloads. The JavaScript files are file downloads. No, loading a page is not "a simple file download"; the browser accounts for the Content-Length of every single resource on the page when it shows you that fancy progress bar. – William Jackson – 2011-09-09T15:11:50.477