What goes on when requesting a URL in a browser?

5

3

Not sure if this is the right forum to ask, but I am wondering if there are any resources that explain how requests from the browser are passed on to the server and how the required information is then passed back to the browser. Specifically, I would like to know more about the innards: the protocols used, the entire works. Cheers!

Tereno

Posted 2011-01-13T16:51:39.933

Reputation: 151

Comments deleted since they distracted from the question. – BinaryMisfit – 2011-01-14T07:41:43.180

Answers

9

  1. The browser pulls the domain name from the URL (e.g. superuser.com from https://superuser.com/posts/232820) and asks the operating system to turn that into an IP address.

  2. The operating system consults whatever name resolution methods are configured. Normally it will be the in-memory cache, local hosts file, and finally DNS. (Some browsers have their own caches, and some operating systems support more protocols than just DNS.)

    1. If the name is not found locally, the OS sends a DNS query to the configured DNS server (on Unix-like systems, its address is in /etc/resolv.conf), usually over UDP port 53.

    2. The DNS server responds with one or more IP addresses for the browser to try connecting to.

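  3. The browser makes a TCP connection to the provided IP address, on port 80 for http:// URLs or port 443 for https:// URLs (such as the example above). For HTTPS, a TLS handshake follows before any HTTP data is sent.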

  4. The browser sends an HTTP request that names the resource to retrieve, with headers that carry other information: the capabilities of the browser, any cookies for this domain, and other metadata.

  5. The server (using software like Apache) looks for the file and reads it.

  6. The server sends the content (HTML, images, JavaScript code, etc.) to the web browser. On the first request, this will usually be just a single block of HTML.

  7. The browser parses the returned HTML for references to additional assets -- e.g. JavaScript, CSS, images, etc.

  8. The browser issues subsequent requests for the additional assets. Requests made to the same server do not need to look up the IP address. Usually the existing TCP connection is reused, too.

  9. The browser processes the content and displays it to the user.
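
For what it's worth, steps 1, 2, 4 and 7 can be sketched with Python's standard library. This is only an illustration under some assumptions, not what any real browser does: the function names, the User-Agent string and the choice of headers are made up for the example, and the request is only built as bytes, never sent (for an https:// URL a TLS handshake would have to come first).

```python
import socket
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def host_and_port(url):
    """Step 1: pull the host name (and the port to connect to) out of a URL."""
    parts = urlsplit(url)
    return parts.hostname, parts.port or DEFAULT_PORTS[parts.scheme]


def resolve(host, port):
    """Steps 1-2: ask the operating system's resolver for IP addresses.

    getaddrinfo() uses whatever the OS has configured (cache, hosts file,
    DNS), so for most names this call needs a working network.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def build_request(url, user_agent="example-sketch/0.1"):
    """Step 4: the bytes of a minimal HTTP/1.1 GET request (hypothetical headers)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    host = parts.hostname
    if parts.port:  # only an explicitly given port goes into the Host header
        host += f":{parts.port}"
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Accept: text/html",
        "Connection: keep-alive",
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


class AssetCollector(HTMLParser):
    """Step 7: collect absolute URLs of scripts, images and stylesheets."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag in ("script", "img") and attrs.get("src"):
            self.assets.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and "stylesheet" in rel and attrs.get("href"):
            self.assets.append(urljoin(self.base_url, attrs["href"]))
```

Calling host_and_port("https://superuser.com/posts/232820") gives ("superuser.com", 443), and build_request() on the same URL starts with the line "GET /posts/232820 HTTP/1.1". What resolve() returns depends on the local resolver configuration and on DNS at the time, so no output is shown for it.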


Here is a quick diagram of the whole process (note, however, that its numbering does not match the explanation above). I think it provides a decent overview.

A diagram of the whole process

new123456

Posted 2011-01-13T16:51:39.933

Reputation: 3 707

It's a good enough start to get my vote. Minor nit-picks... In steps 1-3, the browser talks to the operating system to get the IP address. The operating system talks to the DNS server(s) (or hosts file). DNS lookup is done by the domain name within the URL, not the whole URL. I would modify step 7 to indicate that just the main HTML is returned and then add a step where the browser parses the HTML for subsequent assets to request (e.g. for js, css, images, etc). Steps 4-6 are repeated for each of those assets, usually with a connection that is kept alive from the initial request. – Doug Harris – 2011-01-13T17:11:26.223

@Doug I'll CW it - I don't know enough and the comment section will fill up really quickly – new123456 – 2011-01-13T17:12:24.137

You forgot about ARP and MAC address resolution. – Jeff F. – 2011-01-13T17:17:37.907

@Jeff what about conversion of bits to electrical pulses? – Doug Harris – 2011-01-13T17:20:17.993

@Doug Harris Hey, if IP resolution is important then MAC resolution should be as well :) – Jeff F. – 2011-01-13T17:30:52.680

@Doug I'm all for adding what happens at the atomic level, if need be ;) – new123456 – 2011-01-13T17:44:00.790

This looks like a good start. Doug's answer addresses some of the questions that I have as well. – Tereno – 2011-01-13T21:01:56.647