A web server doesn't send the "entire website", but only the documents that browsers request.
For example, when you access https://www.google.com/, the browser asks the server for the document https://www.google.com/. The server processes the request and sends back some HTML code.
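That first exchange can be sketched as the raw HTTP request the browser sends for that single document (a simplified sketch; real browsers send many more headers):

```python
# A browser's first request asks for exactly one document ("/"),
# nothing more. The referenced resources are not requested yet.
request = (
    "GET / HTTP/1.1\r\n"
    "Host: www.google.com\r\n"
    "Accept: text/html\r\n"
    "\r\n"
)
print(request)
```

The server answers this single request with a single document (the HTML), and the cycle repeats for every resource the browser later decides to fetch.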
Then the browser checks what the server has sent. In this case it's an HTML webpage, so it parses the document and looks for referenced scripts, stylesheets, images, fonts, etc.
At this stage the browser has finished downloading that document, but it hasn't downloaded the referenced documents yet. It can choose to do so or to skip them. Regular browsers will try to download all referenced documents for the best viewing experience. If you have an ad blocker (like Adblock) or a privacy plugin (Ghostery, NoScript), it may block some resources, too.
Then the browser downloads the referenced documents one by one, each time explicitly asking the server for a single resource. In our Google example the browser will find the following references, to name just a few:
(the actual files may differ between users, browsers and sessions, and may change over time)
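The "parse and look for references" step can be sketched with Python's built-in HTML parser. The sample document below is made up for illustration; real pages reference many more resource types:

```python
from html.parser import HTMLParser

class ResourceFinder(HTMLParser):
    """Collects URLs of sub-resources referenced by an HTML document."""
    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Scripts and images point at their resource via "src" ...
        if tag in ("script", "img") and "src" in attrs:
            self.resources.append(attrs["src"])
        # ... while stylesheets are referenced via <link href="...">.
        elif tag == "link" and attrs.get("rel") == "stylesheet":
            self.resources.append(attrs["href"])

html = """<html><head>
<link rel="stylesheet" href="/style.css">
<script src="/app.js"></script>
</head><body><img src="/logo.png"></body></html>"""

finder = ResourceFinder()
finder.feed(html)
print(finder.resources)  # ['/style.css', '/app.js', '/logo.png']
```

Each URL found this way becomes another single-resource request, which is exactly why one page view produces many requests.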
Text-based browsers don't download images, Flash files, HTML5 video, etc., so they download less data.
@NathanOsman makes a good point in the comments: sometimes small images are embedded directly in the HTML document, and in those cases downloading them cannot be avoided. This is another trick used to reduce the number of requests. They are very small, though; otherwise the overhead of encoding a binary file in base64 would be too big. There are a few such images on Google.com (base64-encoded size / decoded size):
- 19×11 keyboard icon (106 B / 76 B)
- 28×38 microphone icon (334 B / 248 B)
- 1×1 px transparent GIF (62 B / 43 B) which shows up in Chrome Dev Tools Resources tab, but I couldn't find it in the source - probably added later with JavaScript
- 1×1 px corrupted GIF file that appears twice (34 B / 23 B). Its purpose is a mystery to me.
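The base64 overhead mentioned above is easy to quantify: encoding grows data by roughly 4/3, since every 3 bytes become 4 characters. A sketch using a placeholder payload (not the actual Google image bytes) of the same 43-byte decoded size as the transparent GIF:

```python
import base64

# Placeholder 43-byte payload, matching the decoded size of the
# 1x1 transparent GIF listed above (not the real image data).
raw = bytes(43)
encoded = base64.b64encode(raw)
data_uri = b"data:image/gif;base64," + encoded
print(len(raw), len(encoded))  # 43 60
```

43 bytes become 60 base64 characters (ceil(43/3) × 4), which is why this trick only pays off for tiny images where saving a round-trip outweighs the ~33% size increase.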
Or actually download images… – Journeyman Geek – 2014-09-20T13:26:20.703
Think in terms of expecting a three or more orders of magnitude reduction. – user2338816 – 2014-09-21T00:36:34.703
@user2338816 Yes, there may be a difference of three orders of magnitude. Try YouTube! [Adding later:] Oops, that's another three! – Volker Siegel – 2014-09-21T09:52:16.593
@user2338816 Three orders of magnitude would be unlikely. For example, for this particular page the original HTML document is some 10% of all downloadable sources, disregarding caching, so just a single order of magnitude; and many heavyweight items (JavaScript libraries, large images, etc.) are successfully cached, often reused across many pages, and thus get downloaded very rarely, so their size doesn't really represent their impact on total network traffic. – Peteris – 2014-09-21T10:09:10.800
@Peteris 3 might be a little high, but 2 certainly is not. Let's say the 10% you notice here is the same across most regular sites. Then take into account that video amounts to 78% of all traffic. This means for the remaining 22% of traffic, we can expect 2.2% to be text. Now, this is napkin math, but 2 orders of magnitude seems to be where it is. – corsiKa – 2014-09-22T16:13:34.070
@Peteris Yeah, you're probably right. Two is a better guide. I was actually slightly surprised at the size of this page, which is close to an order of magnitude larger than reported sizes of, say, a small number of random pages from a few microsoft.com and ibm.com tech forum sites that I compared. The physical sizes of many sites' pages have gotten so high that they've apparently reduced the ratio. (Not by reducing non-text content, perhaps unfortunately.) – user2338816 – 2014-09-23T05:35:52.657
Did we factor in that HTML is highly compressible in-stream, so the transmitted source is much smaller when sent via gzip, while many images cannot be compressed further? – Sun – 2014-09-25T04:58:38.137