How do web servers know whether you're using direct IP address access?

65

10

Some web servers, when accessed using their IP address, return an error that direct IP address access is not allowed.

I've been wondering for some time how this works. I mean, doesn't the browser always resolve the IP address and connect to it? Isn't "Direct IP address access" just skipping DNS? How does the remote server even know you skipped DNS?

Joseph A.

Posted 2016-03-13T14:35:54.903

Reputation: 1 844

2As I recall, what he really asked for was added to the http protocol very early, in order to provide for virtual servers on the same real host. – JDługosz – 2016-03-13T22:55:45.290

3It’s basically the same process that allows a single server to differentiate between different virtual hosts. The real server maps a URL to one of its virtual hosts. Many servers do not have a fallback for an unmapped URL, either by design or default. – Manngo – 2016-03-14T10:55:24.480

You can skip DNS but avoid this error if you create an entry in your hosts file for the domain name in question. Your browser will be looking for the domain name, and will include it in the Host: header, but no DNS query will be made due to the hosts file entry. – Monty Harder – 2016-03-15T14:31:28.643

The answer to these kinds of questions usually is, because you told them. – Thomas – 2016-03-16T06:42:28.370

Answers

91

To answer your question of how it knows, it has to do with what your browser sends the server.

You're right that the system always resolves the name to an IP address, but the browser also sends the host name you typed, in the Host header of the HTTP request.

Here is a sample header that I found online, modified to look as though you used Firefox on Windows and typed apple.com into the address bar:

GET / HTTP/1.1
Host: apple.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache

Here's what the header would look like if you used its IP address:

GET / HTTP/1.1
Host: 17.142.160.59
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache

Both of these would be sent to the same IP address over a socket; the only difference is the Host line, where the browser tells the server what you typed.

Why? Because web servers with the same IP address may host multiple sites and give different pages for each. It cannot distinguish who wants which page by IP address because they all have the same one - but it can distinguish them by the HTTP header.
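As a rough illustration of the check such a server might run, here is a minimal Python sketch. The function names and the response strings are made up for this example, and real servers do this through their virtual host configuration rather than code like this; the only point is that the decision is based on the Host value the browser sent.

```python
import ipaddress


def host_is_ip_literal(host_header: str) -> bool:
    """Return True if the Host header value is an IP address, not a domain name."""
    host = host_header.strip()
    if host.startswith("["):
        # Bracketed IPv6 literal, optionally followed by a port: "[::1]:8080".
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # Exactly one colon means "host:port"; drop the port.
        host = host.rsplit(":", 1)[0]
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def respond(host_header: str) -> str:
    """Hypothetical policy: refuse requests whose Host header is an IP address."""
    if host_is_ip_literal(host_header):
        return "403 Direct IP address access is not allowed"
    return "200 OK"
```

With the two sample requests above, respond("apple.com") takes the normal path and respond("17.142.160.59") takes the refusal path, even though both requests arrive at the same address.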

iAdjunct

Posted 2016-03-13T14:35:54.903

Reputation: 1 570

7Ahh, makes much more sense now! So basically, the browser sends TO the IP the header with either the IP or the domain, and the site makes its assumption on that. So really, these restrictions are easy to bypass? – Joseph A. – 2016-03-13T15:24:37.293

7It's not that it's a restriction that you're bypassing, it's just that you're not playing ball and you're going to get some strange results. – iAdjunct – 2016-03-13T15:29:09.857

These HTTP requests are what you'd get if you are using a proxy. Without a proxy, the information comes in the host header. See this example. – 0xFE – 2016-03-13T15:55:37.280

Ahhhh that explains why there was that field there. I didn't pay too much attention to the example I grabbed. It at least served its purpose. – iAdjunct – 2016-03-13T15:59:36.900

2bytec0de: The other piece of this is that web server configurations will often be set up based on host name. The IP packet specifies the IP address, the TCP segment specifies the port number, and the HTTP header specifies the hostname. So commonly servers are configured to say "if client/browser asks for example.com, then give them this." They can be set up to also respond to IP addresses or wildcards (respond to anything), but many people just copy examples, and many pre-existing examples are based on the domain name supplied by the browser. – TOOGAM – 2016-03-13T17:58:00.687

14@bytec0de It's not a restriction. It's more like using the correct phone number, but the wrong extension - you called the right building, but not the right person. And the reason for its introduction is also pretty much the same as with phones - it allows you to host multiple separate sites on the same IP address (and TCP port). For example, our development server hosted hundreds of separate web sites at the same time, and plenty of web hosting solutions use the same approach ("register a domain, point it at our IP address, we'll take care of the rest"). – Luaan – 2016-03-14T09:14:06.680

21

HTTP 1.1 made the Host header mandatory (the older HTTP 1.0 has been obsolete for quite some time, so it is unlikely to be used by any recent browser). Every HTTP 1.1 request from a browser must include that header line, and the browser puts the domain name in it, e.g. Host: example.com. That line tells the web server which website the browser wants to access. Since a web server may be supporting dozens of websites, it needs that line to determine which website the requested page resides on. Suppose the browser wants to access the home page for a site on example.com. It issues the following line to the server when it connects:

GET / HTTP/1.1

That line says the browser wants the root document, "/", of the website. To request /somedir/testpage.html instead, the line would be GET /somedir/testpage.html HTTP/1.1. It is followed by the line below:

Host: example.com

So if the web server is supporting the websites example.com, someothersite.com, yetanothersite.org, etc., it knows that it should return the main page for example.com. If it doesn't get that line, or the Host line contains something other than a domain name it serves (an IP address, for instance), it doesn't know which website's home page should be returned. So it may return an error message instead, or return the home page for a "default" site for the server.

You can issue the same commands a browser issues by hand using a telnet client, e.g., telnet example.com 80 from a Linux shell prompt or an Apple OS X Terminal window, to connect to the default HTTP port, port 80. Type the request lines followed by an empty line to send the request. See Testing access to a website using PuTTY for steps to do so with PuTTY on a Windows system.
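The same manual test can be scripted. The following is a minimal sketch using Python's standard socket module; build_request and fetch_status_line are names invented for this example, and the Connection: close line is an addition so the server hangs up after one response. It lets you connect to one address while sending a different Host value, which is the experiment this whole question is about.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the empty line that ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_status_line(connect_to: str, host_header: str, port: int = 80) -> str:
    """Connect to connect_to, send host_header as the Host value, return the status line."""
    with socket.create_connection((connect_to, port), timeout=10) as sock:
        sock.sendall(build_request(host_header))
        with sock.makefile("rb") as reply:
            status = reply.readline()
    return status.decode("iso-8859-1").strip()
```

For instance, fetch_status_line("example.com", "example.com") and fetch_status_line("example.com", "some-other-name.invalid") both connect to the same server, and any difference in the replies comes only from the Host line. What a given server answers depends on its configuration.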

moonpoint

Posted 2016-03-13T14:35:54.903

Reputation: 4 432

3Just a note: the host header was also used in HTTP 1.0, it just wasn't required. HTTP 1.1 made the field mandatory. In practice, many HTTP 1.0 servers simply didn't work if the browser didn't send the host header (for all the reasons outlined above), so most browsers sent it anyway. – Luaan – 2016-03-14T09:16:45.943

6

This is due to the Host: HTTP header. This is quite useful for hosting multiple sites on the same IP address. For example, http://www.k7dxs.net/ and http://www.philipgrimes.com/ are both on the same IP address. However, because of the Host: header, they can show two different sites.

For HTTPS, as @Toothbrush pointed out, they use TLS Server Name Indication because the Host header is part of the encrypted request, and the server doesn't know which cert to offer without this.
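To see that the server name travels in the TLS handshake itself, before any HTTP is sent, here is a small sketch using Python's standard ssl module. It never touches the network: it only generates the first handshake message (the ClientHello) in memory. The function name is made up for this example, and it assumes a default Python build, where the name in the ClientHello is not encrypted.

```python
import ssl


def client_hello_bytes(server_name: str) -> bytes:
    """Return the first TLS handshake message a client would send for server_name."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # bytes "from the server" (none in this sketch)
    outgoing = ssl.MemoryBIO()  # bytes the client wants to put on the wire
    tls = context.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been written and the client now waits for a reply.
        pass
    return outgoing.read()
```

Searching the returned bytes for the name, e.g. b"example.com" in client_hello_bytes("example.com"), shows it sitting in the handshake in readable form, which is how the server can pick a certificate before it has seen the Host header.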

Fun experiment: Get Tamper Data for Firefox (I haven't been able to find an equivalent for Chrome) and start tampering. Open http://slipstation.com/ and edit the Host: header in the request to be www.zombo.com (just the host name, without http:// or a trailing slash). You'll see a possibly familiar website where anything is possible.

Duncan X Simpson

Posted 2016-03-13T14:35:54.903

Reputation: 1 171

Actually, those sites use Server Name Indication. There is no way to tell what site to display if both sites are hosted on the same server over HTTPS without SNI since the server does not know which certificate to use. – Toothbrush – 2016-03-15T16:17:19.877

Oh, interesting. Will my experiment still work? – Duncan X Simpson – 2016-03-15T20:24:51.280

Yes, if you find two sites that are hosted on the same IP address over HTTP. – Toothbrush – 2016-03-15T20:29:06.723

But not HTTPS is what I was asking. – Duncan X Simpson – 2016-03-15T20:31:02.850

No, it shouldn't work over HTTPS. If it does, there is a security vulnerability in the web server. – Toothbrush – 2016-03-15T20:32:01.297

@Toothbrush, It does work, if you bypass the security warning of the client. A web server does not request identification from the client normally, though it is part of the HTTPS that it may do so, but then each client needs their own certificate. – Motes – 2016-03-16T06:17:44.153

@Motes You must have misunderstood my comments. The web server (i.e. IIS, Apache) should not send data from a different server than has been negotiated with SNI. – Toothbrush – 2016-03-16T11:02:56.360

@Motes SNI identifies the server before the HTTP handshake happens. If you host two different sites on the same IP and they use 2 different certs, without SNI it has no idea which cert to identify with. And at that point SNI functionally replaces the Host: header. – Duncan X Simpson – 2016-03-16T17:37:22.950

5

The web server can be configured to only accept connections to a particular domain or subdomain. It could be hosting multiple domains.

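What the web server does when a direct IP address is used is configurable. In the case of Apache, a request whose Host value matches no ServerName or ServerAlias is by default served by the first listed virtual host for that IP address and port. With the usual sites-enabled layout, the site files are loaded in alphanumeric order, so the first file decides which vhost is the default.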

This is the most relevant part of the Apache documentation that I have found, after a quick search:

https://httpd.apache.org/docs/current/vhosts/name-based.html
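The fallback behaviour can be sketched in a few lines of Python. This is only a model of the idea, not Apache's code: select_vhost, the host names, and the directory paths are all invented for the example, and real matching also covers aliases, wildcards, and ports.

```python
def select_vhost(vhosts: list[tuple[str, str]], host_header: str) -> str:
    """Pick a document root by Host value, falling back to the first listed vhost.

    vhosts is an ordered list of (server_name, document_root) pairs, in the
    order the configuration was loaded.
    """
    requested = host_header.split(":", 1)[0].lower()  # ignore any ":port" suffix
    for server_name, document_root in vhosts:
        if server_name.lower() == requested:
            return document_root
    # No name matched (e.g. the Host value was an IP address): use the first vhost.
    return vhosts[0][1]


# Hypothetical configuration, listed in load order.
VHOSTS = [
    ("default.example", "/var/www/default"),
    ("example.com", "/var/www/example"),
]
```

Here select_vhost(VHOSTS, "example.com") picks /var/www/example, while a request with Host: 17.142.160.59 falls through to /var/www/default. A site that wants to show a "direct IP address access is not allowed" page can simply make that first vhost an error page.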

paradroid

Posted 2016-03-13T14:35:54.903

Reputation: 20 970