Why the difference in the URL formatting?

6

  • Why is it that some URLs end with .html while others do not, even though most of them are HTML pages?

  • Why is it that some URLs begin with www and some do not, even though all of them are on the World Wide Web?

Lazer

Posted 2010-03-16T08:03:07.197

Reputation: 13 841

Not all URLs are on the World Wide Web. ftp://example.com doesn't point at an HTTP server, even if http://example.com does. – quack quixote – 2010-03-16T16:21:18.443

Answers

4

Because nowadays HTML pages are frequently generated dynamically.

Most of the time, the extension describes what produced the HTML page. For instance, .asp means the page was generated by ASP code (programming code embedded in a page). The same goes for .jsp (JavaServer Pages), which are server-side pages containing a mix of HTML and Java code. Plenty of other extensions work the same way (.do, .aspx, .cfm, ...).

In the end, all the browser receives is HTML; the compilation and the logic have already run on the server.

As for www.mydomain.com, it actually means you contact a server (or router) called "www" in the domain mydomain.com. It's a convention, but you're not forced to follow it. Domains (in their DNS entries) can be configured to say "if no explicit server name is specified, send requests to the web server".

You can also give any other name to the Web server and have it known externally, like http://mywebserver.mydomain.com.

Note that the external name (www, mywebserver) usually does not correspond to the physical name of the web server. In fact, on big sites, several servers process the requests sent to a single name.
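To make the naming point concrete, here is a minimal sketch of how a domain's DNS data could map several external names to the same machines. The `ZONE` table and `resolve` function are made up for illustration (a real lookup goes through a DNS resolver), and the addresses come from the 192.0.2.0/24 range reserved for documentation.

```python
# Hypothetical zone data for mydomain.com. The addresses are taken from the
# 192.0.2.0/24 documentation range and do not point at real servers.
ZONE = {
    # Bare domain: "no explicit server name, send requests to the web server".
    "mydomain.com": ["192.0.2.10", "192.0.2.11"],
    # The conventional "www" name, served by the same two machines.
    "www.mydomain.com": ["192.0.2.10", "192.0.2.11"],
    # Any other external name works just as well.
    "mywebserver.mydomain.com": ["192.0.2.10"],
}


def resolve(hostname):
    """Return the addresses registered for a hostname (empty list if none)."""
    # Host names are case-insensitive and may carry a trailing dot.
    return ZONE.get(hostname.lower().rstrip("."), [])


print(resolve("www.mydomain.com"))
print(resolve("mydomain.com"))
```

Both names resolve to the same list here, and that list has two entries, which is one simple way several servers can sit behind a single name.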

Snark

Posted 2010-03-16T08:03:07.197

Reputation: 30 147

What is the webserver name for http://stackoverflow.com? – Lazer – 2010-07-22T14:10:47.243

2

A URL consists of several parts:

  • a protocol part
  • a server part
  • a file/resource part

    protocol://server/file_or_resource

The protocol part is the http:// or ftp:// or ssh:// or whatever you can think of. The server part is everything between the protocol part and the file/resource part.

http://google.com/index.html

In this case it's "google.com"; in other cases it's "user@machine:port". So this is the answer to your 2nd question: some machines are called "www.hostname.com" and others are called "hostname.com".
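If you want to see the split for yourself, here is a small sketch using Python's standard `urllib.parse`. The `url_parts` helper and its dictionary keys are my own naming, chosen to match the three parts above, and the ftp URL is a made-up example.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the protocol, server and resource parts."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # e.g. "http"
        "server": parts.netloc,     # everything between "://" and the path
        "resource": parts.path,     # what you ask the server for
    }


print(url_parts("http://google.com/index.html"))

# The server part can also carry a user name and a port.
detailed = urlsplit("ftp://user@machine.example:2121/pub/file.txt")
print(detailed.username, detailed.hostname, detailed.port)
```

Note that the standard library calls the protocol part the "scheme" and the server part the "netloc".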

As soon as your browser / protocol handler connects to the server named in the server part, using the protocol named in the protocol part of the URL, it asks the server for the resource given in the resource part. And that's the answer to your first question: you ask the server for a file/resource and the server answers.

http://google.com/index.html <- you ask it for "index.html"

If the server has it, fine. If the name is "foo.bar" and the file exists, fine. If the server knows what to do when you ask it for "more.money" .. cool.
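Roughly, you can picture the server side like this. It is only a sketch: the `ROUTES` table, the handler functions and `answer` are invented for illustration, and real web servers are configured in their own ways. The point is that the name after the slash is just a key the server looks up, with or without ".html".

```python
from datetime import date


def front_page():
    """A fixed page, the kind that could just as well be a file on disk."""
    return "<html><body>Welcome</body></html>"


def money_report():
    """Built on each request; no file called "more.money" exists anywhere."""
    return f"<html><body>Report for {date.today().isoformat()}</body></html>"


# Hypothetical routing table: resource name -> function that produces HTML.
ROUTES = {
    "/index.html": front_page,
    "/foo.bar": front_page,
    "/more.money": money_report,
}


def answer(resource):
    """Return (status, body) for a requested resource."""
    handler = ROUTES.get(resource)
    if handler is None:
        # The server does not know what to do with this name.
        return 404, "<html><body>Not found</body></html>"
    return 200, handler()


print(answer("/more.money"))
print(answer("/no/such/thing"))
```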

Read more about it on Wikipedia.

akira

Posted 2010-03-16T08:03:07.197

Reputation: 52 754