Why doesn't a simple HTTP request to display a remote web page violate the same-origin policy?

Question

On a W3Schools page, I found that HTTP requests work like this:

A client (a browser) sends an HTTP request to the web
A web server receives the request, and runs an application to process it
The server returns an HTTP response (output) to the browser
The client (the browser) receives the response.

On the same page I found that an XMLHttpRequest works like this:

A browser creates an XMLHttpRequest object and sends it to the server
The server processes the request, creates a response and sends data back to the browser
The browser processes the returned data using JavaScript and updates the page content.

The above two processes appear pretty much the same to me. However, the latter one violates the same-origin policy (SOP) if the server runs on a remote domain. This question on Stack Overflow about the URL in the open() method says that

As we can only send requests to our own web server, I assume that we don't have to rewrite the website's name in the URL.

Applying the same logic to the first case (HTTP requests) would mean that I couldn't open a web page if it is not on my own computer. Luckily, this is not the case.

So, why doesn't an HTTP request to display a remote web page violate the SOP? What is the key point/difference here?

I assume it's about the fact that the second process (XMLHttpRequest) is initiated from a script, while the first one is triggered by the user. However, isn't the HTTP request sent from a script when I click a hyperlink on a web page? And how can a web server distinguish between requests coming from a script and coming from a user?

The "As we can only send requests..." premise in that question is simply false. You can absolutely send requests to other domains. If they're "simple" requests (no weird headers or methods) then the server will even receive the request as intended and most likely respond, although the browser won't let your script *see that response* unless the server trusts your domain for cross-origin requests (see "CORS", which can also allow non-simple requests and authenticated requests, if configured correctly). — CBHacking, Oct 18 '20 at 21:18
The same-origin policy is about a script on one page accessing data on a different website (e.g. a script on Stack Exchange accessing your online banking). If you just open the page, scripts from other pages can't access it - there is no problem. — user253751, Oct 19 '20 at 08:59
You should read the CORS protocol. https://fetch.spec.whatwg.org/#cors-protocol — Joaquin Brandan, Oct 19 '20 at 22:16
@JoaquinBrandan, thank you for your advice. Could you be a bit more specific? — K. Gabor, Oct 20 '20 at 20:03
The mechanics of how CORS works are in that link. CORS is the protocol used by browsers and servers to agree on when it is OK to make a cross-origin HTTP request. By default some requests are allowed; others need what is called a "preflight request", which the browser automatically sends before your request and which the server must answer in order to tell the browser "it's OK, I as a server expect this". If the server does not respond according to CORS, then the browser will refuse to send your request. This is true for all requests triggered by JS. — Joaquin Brandan, Oct 20 '20 at 21:09
To clarify: you violate the SOP if the URL in your browser says "mydomain.com" and the JS or HTML of that site makes a request to "notMyDomain.com". If you click on a link that loads a different site, you change the domain (the URL you are standing on), and the SOP now applies to that domain. The browser is the one that tells apart a "JS request to another domain, violates SOP" from a "click on a link to change to another site, does not violate SOP". In all cases where a cross-origin request is required, CORS must be implemented. I hope that helps. — Joaquin Brandan, Oct 20 '20 at 21:21
@JoaquinBrandan, yes, it definitely helps, thank you very much for the comprehensive answer. Also, thank you all for your detailed answers. If it was possible, I would mark each answer as accepted but so far as I know this is only possible with one single reply. — K. Gabor, Oct 21 '20 at 18:13
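
The distinction the comments above draw between "simple" requests and requests that need a preflight can be sketched roughly as follows. This is a simplified illustration, not the browser's actual algorithm: the function name `needsPreflight` is hypothetical, and the real Fetch rules include further conditions (for example limits on header values) that are left out here.

```javascript
// Simplified sketch of the "simple request" check (assumption: only the
// method, header names, and Content-Type value are considered).
const SAFE_METHODS = new Set(["GET", "HEAD", "POST"]);
const SAFE_HEADERS = new Set([
  "accept",
  "accept-language",
  "content-language",
  "content-type",
]);
const SAFE_CONTENT_TYPES = new Set([
  "application/x-www-form-urlencoded",
  "multipart/form-data",
  "text/plain",
]);

// Hypothetical helper: returns true if a cross-origin request with this
// method and these script-set headers would trigger a preflight.
function needsPreflight(method, headers = {}) {
  if (!SAFE_METHODS.has(method.toUpperCase())) return true;
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (!SAFE_HEADERS.has(lower)) return true;
    if (lower === "content-type") {
      // Compare only the media type, ignoring parameters such as charset.
      const essence = value.split(";")[0].trim().toLowerCase();
      if (!SAFE_CONTENT_TYPES.has(essence)) return true;
    }
  }
  return false;
}
```

Under these assumptions, a plain `GET` is sent directly, while a `POST` with `Content-Type: application/json` or any `DELETE` would first cause the browser to send an `OPTIONS` preflight.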

bdsl · Answer 1 · 2020-10-23T12:23:52.373

And how can a web server distinguish between requests coming from a script and coming from a user?

It doesn't. The same origin policy is enforced by the browser, not the server.

The purpose of the Same Origin Policy (SOP) isn't to protect the server itself. Instead it's to protect confidential information which the server wishes to share with the user, but not to share with other parties. This information may be protected by checking the user's cookie, authentication header, or IP address when they send a request, but those checks can be bypassed by an attacker getting the legitimate user to open the attacker's website with a script to request the information.

This is when the SOP provides protection. The request may still be sent, but the browser can refuse to allow the script to see the information in the response.
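
As a rough model of that browser-side decision, the sketch below decides whether a script may read a response. It is an illustration under stated assumptions, not a real browser API: `scriptMayReadResponse` is a hypothetical name, header names are assumed to be lower-cased, and credentialed requests (where `*` is not accepted) are ignored.

```javascript
// Hypothetical model of the check a browser performs after the response
// has already arrived. The server is not involved in this decision beyond
// the headers it chose to send.
function scriptMayReadResponse(pageUrl, requestUrl, responseHeaders) {
  const pageOrigin = new URL(pageUrl).origin;
  const targetOrigin = new URL(requestUrl).origin;

  // Same origin: the script may always read the response.
  if (pageOrigin === targetOrigin) return true;

  // Cross origin: readable only if the server opted in via CORS.
  const allowed = responseHeaders["access-control-allow-origin"];
  return allowed === "*" || allowed === pageOrigin;
}
```

In this model a page at `https://attacker.example` that requests `https://bank.example/balance` still causes the request to reach the bank, but the function returns `false` unless the bank's response explicitly allows that origin.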

If there is a need to protect the server against potentially harmful requests that it could be tricked into carrying out based on its trust in the user, the SOP is not enough. At that point the server needs other techniques to defend against CSRF.

score 14 · Accepted Answer · edited Oct 21 '20 at 19:59

If one enters a URL in the browser one starts with a new empty origin, i.e. no domain and port belong to the origin initially. Everything can be put into a window/tab with an empty origin and once it is put there the origin changes depending on where the data came from.

If one instead issues an HTTP request from inside a loaded web page, one starts with a non-empty origin. In this case the same-origin policy comes into play and restricts what can be done from inside this non-empty origin.

Note that if one already has a loaded page in the browser and now replaces the URL in the URL bar, the same-origin policy does not apply, since this new URL is not requested from inside the window/tab but from outside. Thus it will again start with an empty origin.
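
For reference, an origin is the combination of scheme, host, and port, so two URLs can be compared as in the minimal sketch below. The helper name `sameOrigin` and the example URLs are assumptions made for illustration; the standard `URL` class does the parsing.

```javascript
// Two URLs share an origin only if scheme, host, and port all match.
// The URL class reports a default port (80 for http, 443 for https)
// as an empty string, so explicit and implicit default ports compare equal.
function sameOrigin(a, b) {
  const ua = new URL(a);
  const ub = new URL(b);
  return (
    ua.protocol === ub.protocol &&
    ua.hostname === ub.hostname &&
    ua.port === ub.port
  );
}

console.log(sameOrigin("https://example.com/a", "https://example.com:443/b")); // true
console.log(sameOrigin("http://example.com/", "https://example.com/"));        // false (scheme differs)
console.log(sameOrigin("https://example.com/", "https://api.example.com/"));   // false (host differs)
```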

score 14 · Answer 3 · answered Oct 18 '20 at 21:27

The simple answer to your question is that "requests to display a web page" are what set the origin, so obviously they cannot violate same-origin policy. Things that happen within a page (such as JS execution and notably XHR/Fetch) are subject to various restrictions due to same-origin policy, but top-level navigation is always allowed*.

* Iframes in general, and sandboxed ones in particular, get a bit weird here. An iframe is part of the parent page's origin, but its content is part of the origin of the iframe's src (which could be totally different!). Cross-origin parent/iframe relationships have severely limited interaction, similar to any two cross-origin pages, with the notable exception that by default either can navigate the other. That is, pages can set the src of an iframe they contain, and iframes can set the location of their parent (although the iframe cannot set the location to a javascript: or data: URI, as that would be injecting content into the parent's origin). It is possible to sandbox iframes such that they cannot perform parent navigation... or indeed such that they can't execute JS at all!

score 3 · Answer 4 · answered Oct 19 '20 at 18:37

The difference is simple: the computer user chooses which website address to type into their browser. They do not choose which website addresses that site then tries to exchange information with.

That key distinction means we need protections on the latter, but not the former.

Of course, an equivalent of that protection for the former would mean you could not actually navigate to any websites at all. At some point, you need an "original" origin, and so we define that to be the one the user typed into their browser window.

It's about trust, and about choice, and about control, and about protection from malicious website programmers (including those "in the middle" who may have modified otherwise-good website code). A user generally shouldn't expect protection from themselves, nor is such a thing generally feasible without some mind-reading mechanism.

Of course, even the original HTTP request is potentially subject to attacks (interception & modification or silent redirection), but that's why we have HTTPS.

rexkogitans · Answer 5 · 2020-10-21T11:58:40.513

However, the latter one violates the Same Origin Policy (SOP) if the server runs on a remote domain.

No, not necessarily. Otherwise, XMLHttpRequest would be useless, as you observed.

The point is that the HTTP request triggered by the JS program by means of XMLHttpRequest has to point to the server which delivered the website, otherwise it violates the SOP.

A short example: the host mydomain.org delivers a website containing a JS program with this snippet:

let hr = new XMLHttpRequest();
hr.open("GET", "http://mydomain.org/path");

is fine,

hr.open("GET", "/");

is fine (because the origin server is implied), whereas

hr.open("GET", "http://differentdomain.org");

violates the SOP.

Addendum: If the website is loaded from a web server on the local host, then any request to a server on the Internet is cross-origin and therefore violates the SOP. This pitfall matters even more if the website is opened as a local file rather than delivered by a local web server.

would `"differentdomain.org"` imply a file called `differentdomain.org` in the current directory? — user253751, Oct 21 '20 at 10:21
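
How the URLs passed to open() resolve, including the scheme-less case raised in the comment above, can be checked with the standard `URL` class. This is a small sketch: the page address `http://mydomain.org/dir/page.html` and the helper name `resolve` are assumptions made for illustration.

```javascript
// Hypothetical address of the page that runs the script.
const pageUrl = "http://mydomain.org/dir/page.html";

// Resolve a request target against the page URL, as a browser does for
// relative URLs, and report whether the result stays on the page's origin.
function resolve(target) {
  const url = new URL(target, pageUrl);
  return {
    href: url.href,
    sameOrigin: url.origin === new URL(pageUrl).origin,
  };
}

console.log(resolve("http://mydomain.org/path"));
// { href: "http://mydomain.org/path", sameOrigin: true }
console.log(resolve("/"));
// { href: "http://mydomain.org/", sameOrigin: true }
console.log(resolve("http://differentdomain.org"));
// { href: "http://differentdomain.org/", sameOrigin: false }
console.log(resolve("differentdomain.org"));
// { href: "http://mydomain.org/dir/differentdomain.org", sameOrigin: true }
```

So without a scheme, `"differentdomain.org"` is indeed treated as a relative path on the current server, not as another host.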
