28

Suppose I am using a web browser to look at example.com. Now, from the same web browser tab, I enter example.org in the address bar and go to that completely different website operated by another entity.

Does example.org know that the previous website I used was example.com?

I understand that example.org can look at the HTTP Referer header to know that I came from example.com if I clicked on a link on example.com to reach example.org. What if I manually entered the address in the address bar instead? Will example.org know the previous website I came from?

Flux
  • 21
    There is no built-in mechanism for websites to determine previously visited sites. However, there are plenty of techniques to track people across websites, which may or may not work depending on whether your browser blocks trackers or not, you have third party cookies blocked or not and several other factors. See, for example: [How to fight browser fingerprinting?](https://security.stackexchange.com/q/23053/235964) – nobody Nov 16 '21 at 03:31
  • 1
    Related, but not a duplicate: https://security.stackexchange.com/questions/105142/websites-know-which-other-websites-i-visited – schroeder Nov 16 '21 at 08:24
  • 2
    Side note: You can make a clickable link from example.com to example.org, and tell the browser not to disclose information in the HTTP Referer header, using the _rel_ attribute of the HTML anchor tag: [MDN link](https://developer.mozilla.org/en-US/docs/Web/HTML/Link_types/noreferrer) – sudoqux Nov 16 '21 at 11:58
  • If both sites load an ad or image from doubleclick.net, or google ads, etc., all bets are off. – Lee Daniel Crocker Nov 16 '21 at 21:06

4 Answers

43

Do websites know which previous website I visited?

There is no direct cross-site access to the browser's history. But there are ways to "probe" the history and thus detect previous access to a specific page or site. Techniques for this kind of cross-site detection of the user's browser history are known as "history sniffing". Apart from that, cross-site trackers and advertisement networks (Google Analytics and others) can build a cross-site profile of a user based on the user's history.

History sniffing basically works by observing side effects (usually timing differences) when including well-known resources from other sites. This way one can detect whether the user has visited a site or a specific page before, because the time to load the resource might differ slightly if the resource was loaded from the browser cache (i.e. page already visited) or if the server-side processing differed depending on whether the browser sent a cookie (i.e. site visited or not). Similar differences can be observed by including a resource from an HSTS-enabled site over plain HTTP and checking whether the browser already knew that HSTS was enabled and therefore accessed the site directly over HTTPS.
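The decision step of such a cache-timing probe can be sketched offline. This is only an illustration under stated assumptions: the function names, the 0.5 ratio, and the sample timings are all made up, and the idea of comparing a normal load against a cache-busted load of the same resource is one possible way to get a baseline, not something taken from the papers below. In a real probe the timings would be measured by script in the browser.

```python
def looks_cached(first_load_ms: float, cache_busted_ms: float, ratio: float = 0.5) -> bool:
    """Guess whether a resource was served from the browser cache.

    first_load_ms:   time to load the well-known URL as-is
    cache_busted_ms: time to load the same URL with a random query string,
                     which forces a network fetch (assumed baseline)
    ratio:           hypothetical threshold; a load much faster than the
                     baseline is treated as a cache hit
    """
    if cache_busted_ms <= 0:
        raise ValueError("baseline timing must be positive")
    return first_load_ms < ratio * cache_busted_ms


def site_probably_visited(samples, min_hits: int = 2) -> bool:
    """samples: (first_load_ms, cache_busted_ms) pairs, one per probed resource."""
    hits = sum(1 for first, busted in samples if looks_cached(first, busted))
    return hits >= min_hits


# Made-up timings for three well-known resources of one site (logo, CSS, script).
samples = [(4.0, 90.0), (6.5, 120.0), (85.0, 88.0)]
print(site_probably_visited(samples))
```

Requiring several hits rather than one is a design choice to reduce false positives from network jitter; the right threshold would have to be found experimentally.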

History sniffing has become harder in recent years, with at least some browsers focusing more on preserving privacy and limiting cross-site interactions with history-related stored data (cache, cookies, ...), even at the cost of some performance (i.e. not loading data from the cache cross-site). But it is still possible.

For links to older techniques, see Browser cache information disclosure or Workarounds for :visited CSS History reconnaissance on this site. Some newer papers on this topic are Cookies from the Past: Timing Server-side Request Processing Code for History Sniffing from 2020 and Browser history re:visited from 2018.

Steffen Ullrich
  • 20
    *"Doing history sniffing got harder in the last years "* - indeed, in the good old times it was enough to have a link on your site pointing to the other site, in 1px font (or hidden under an image) and check what color the link was painted by the browser, as visited links are displayed in a different color. I don't know if there are browsers around which still allow this, but I doubt it. – vsz Nov 16 '21 at 12:34
  • @vsz Seems like that isn't very useful. You can't have links pointing to every possible website. And the color is specific to a URL (including all the query parameters), not just the domain. So it will only work if you want to know if they recently visited a specific URL, you can't use it for general history enumeration. – Barmar Nov 16 '21 at 15:45
  • 12
    @barmar most organizations didn't care about every possible website, only the ones they had arrangements with, or ones who were competitors, etc. And there weren't that many web sites. Things were a lot simpler back then. – barbecue Nov 16 '21 at 20:20
  • @barbecue Good point. I was thinking more of an organization like Google, which wants to track all your activity, not just a few specific sites. – Barmar Nov 16 '21 at 20:22
  • 3
    @Barmar: Many sites have well known URL like logos, which are embedded in each page. Some sites have well-known URL which are visible only for logged in users. Such URLs can be used to create user profiles, i.e check which from a list of well-known sites the user might have visited and for some even check if the user is logged in there. – Steffen Ullrich Nov 16 '21 at 20:23
  • 2
    @Barmar: *" I was thinking more of an organization like Google"* - these organizations have other mechanisms, like having trackers (Google Analytics) or Ads (Doubleclick) embedded in a significant part of all websites. It's integral part of their business model to make profiles based on your history. – Steffen Ullrich Nov 16 '21 at 20:28
  • “There is no direct cross-site access to the browsers history.” - way back when I was young, it used to be possible to read the recent history for the current session because the history object presented an array of the most recent URLs visited before the current one. With JS enabled it was trivial to collect this and post it back in an iframe (my experience here predates xmlhttprequest) or by including it in a hidden element of whatever forms are on the page, so it can be read server side. Oh, the naïveté of earlier times! – David Spillett Nov 18 '21 at 14:28
12

Do sites have a simple mechanism to do this? No. Is it possible to do this? Absolutely yes.

Most advertising networks use this type of functionality. Their advert scripts are running on many sites around the world, so they know where you've been and can show you advertisements based on previous visits. Google Analytics is probably the worst offender in this area as of 2016. Google adverts and analytics are found on the majority of English-language websites, and Google uses this information to track your path across the internet.
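To make the mechanism concrete, here is a minimal sketch of the server side of such a network. It assumes the embedded script or image request arrives with a visitor ID (for example from a third-party cookie) and a Referer header naming the embedding page. `TrackerLog` and its methods are hypothetical names for illustration, not any real network's API.

```python
from collections import defaultdict
from urllib.parse import urlsplit


class TrackerLog:
    """Toy model of what a third-party tracker could remember per visitor."""

    def __init__(self):
        # visitor ID -> list of hostnames where the tracker was loaded
        self._seen = defaultdict(list)

    def record(self, visitor_id: str, referer: str) -> None:
        """Store the embedding site's hostname, taken from the Referer header."""
        host = urlsplit(referer).hostname
        if host and host not in self._seen[visitor_id]:
            self._seen[visitor_id].append(host)

    def history(self, visitor_id: str) -> list:
        """Return the sites on which this visitor ID has been seen so far."""
        return list(self._seen.get(visitor_id, []))


log = TrackerLog()
log.record("abc123", "https://example.com/shoes")  # tracker loaded on example.com
log.record("abc123", "https://example.org/")       # same cookie, different site
print(log.history("abc123"))
```

Note that the history lives with the tracker, not with example.com or example.org; the visited sites only learn it if the network chooses to share it.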

N-ate
  • _"Their advert scripts are running on many sites around the world, so they know where you've been"_ - how? Scripts themselves don't "know" such things. All they could do is phone home with _"I'm now being loaded on site XYZ with cookie/device ID ABC123"_, where that information can be remembered. – CodeCaster Nov 18 '21 at 13:30
  • 1
    @CodeCaster: well, yes, they do exactly that. Then on a different site ABC, they phone home, and get a response giving a list of previous sites for this ID (including XYZ). Now this history is available to the script. Of course this is predicated on the device ID being constant, but that's a separate topic; even fingerprinting aside, often just a very basic cookie storing a UUID is "good enough", if the user isn't privacy-conscious enough to clear their cookies (which the vast, vast majority of people aren't). – Boris Nov 18 '21 at 14:34
  • @Boris I know (though I highly doubt advertising platforms to actually expose a list of previously visited sites to the client), my point was that the answer doesn't explain that, and affixes "knowing" things to scripts, which they don't. – CodeCaster Nov 18 '21 at 14:41
  • 2
    @CodeCaster On second reading I think the answer states that the ad networks "know" it, not the scripts. The ad networks have scripts running on many websites, and that lets them - the ad networks - know people's browsing histories. (Personally, I find this answer entirely clear - if e.g. GA "knows" your history, and you include a GA script on your site, then technically the website is running a script that has access to your browsing history. It may or may not share that data with first-party scripts - it may not even fetch it client-side - but the data is available to it.) – Boris Nov 18 '21 at 14:45
  • @Boris alright, I have misinterpreted that, but the _advertising network_ knowing on which other sites it has seen a user does not spill to the visited sites. So there is _someone_ who knows which sites you've visited, but it's not the sites you've visited that do. It's the same as the Facebook Like button "knowing" where it's being included and who is viewing it (again, by logging that serverside), but different sites (apart from the originator, being Facebook) won't get to know on which other sites the user has been on. – CodeCaster Nov 18 '21 at 14:46
9

I think Steffen Ullrich's answer is technically correct. However, if the OP's question were asked by a non-savvy user, I think I would answer differently.

Do all websites know which previous websites were visited? Probably not without some technical gymnastics related to testing for specific previous sites.

Do some websites know which previous websites were visited? Sadly, a qualified yes. For example, Facebook knows every time you visit a site that contains a Facebook "like" button. So by the time you get to Facebook, they can show you an advertisement for the Herkimer Battle Jitney you were just looking at on a completely different site. The same is true of any company sufficiently influential to have its ads or links or whatever on millions of sites. N-ate's example of Google is a good one, too.

bmb
  • "For example, Facebook knows every time you visit a site that contains a Facebook "like" button." - there is facebook container plugin in Firefox that tries to prevent this. – akostadinov Nov 18 '21 at 10:40
  • bmb, first, welcome to [security.se] and second, thank you (!) for providing me with "What New Thing I Learned About Today®" - I think one of those Jitney's is just perfect for taking the family to Walmart... – CGCampbell Nov 18 '21 at 16:01
3

No

not without being creatively clever

Here's an example. I have a simple static webserver that logs headers. I have index.html and test.html. Here is the browser request when I manually type them in one-after-the-other:

index.html:

GET /index.html HTTP/1.1
Host: localhost
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="96", "Google Chrome";v="96"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36      
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9

test.html:

GET /test.html HTTP/1.1
Host: localhost
Connection: keep-alive
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="96", "Google Chrome";v="96"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36      
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9

And here is the data when I click a link on index.html -> test.html

GET /test.html HTTP/1.1
Host: localhost
Connection: keep-alive
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="96", "Google Chrome";v="96"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36      
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Referer: http://localhost/index.html
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
If-None-Match: W/"14d-17d2a478c97"
If-Modified-Since: Tue, 16 Nov 2021 19:44:31 GMT

Most importantly, the Referer [sic] header is only sent when you navigate by following a link. Manually typing a URL into the address bar (Chrome in this case), on the other hand, creates a fresh request: there is no Referer header, and Sec-Fetch-Site is none.
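The difference between the captures above can also be checked programmatically. The following is a small sketch, not the server I used; `parse_headers` and `describe_arrival` are hypothetical helper names, and the logic only looks at the two headers discussed here.

```python
def parse_headers(raw_request: str) -> dict:
    """Turn a raw HTTP request dump into a dict with lower-cased header names."""
    headers = {}
    for line in raw_request.strip().splitlines()[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def describe_arrival(headers: dict) -> str:
    """Say what the request itself reveals about the previous page."""
    referer = headers.get("referer")
    if referer:
        return f"followed a link from {referer}"
    if headers.get("sec-fetch-site") == "none":
        # "none" means the navigation was not initiated by a web page,
        # e.g. a typed URL or a bookmark.
        return "user-initiated navigation; previous page not disclosed"
    return "no Referer header; previous page not disclosed"


typed = """GET /test.html HTTP/1.1
Host: localhost
Sec-Fetch-Site: none
"""
clicked = """GET /test.html HTTP/1.1
Host: localhost
Sec-Fetch-Site: same-origin
Referer: http://localhost/index.html
"""
print(describe_arrival(parse_headers(typed)))
print(describe_arrival(parse_headers(clicked)))
```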

Finally, I was going to try inspecting JavaScript objects to identify history etc. in same-site and cross-site examples, but there don't appear to be any built-in ways to query it other than Referer and the properties that depend on it.

While this is the expected result, not all browsers are made the same. If you are expecting privacy, use a modern browser that is up to date.

Finally, if example.com and example.org are coordinating and sharing data, they can, with a high degree of confidence, identify that you visited both locations. Additionally, if they both embed a 3rd party component, then the 3rd party can identify you browsing between both locations. Even if they embed separate 3rd party components, but the 3rd parties share data with each other, there is tracking they can perform.
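As a rough illustration of the first case, two cooperating sites could compare their access logs. The sketch below assumes a hypothetical log format (dicts with `ip` and `user_agent` keys) and matches on that pair only; real matching would likely use more signals, and this is just one way it might be done.

```python
def shared_visitors(log_a, log_b):
    """Return the (ip, user_agent) pairs that appear in both access logs.

    Each log entry is a dict with 'ip' and 'user_agent' keys
    (a made-up format for this example).
    """
    def fingerprints(log):
        return {(entry["ip"], entry["user_agent"]) for entry in log}

    return fingerprints(log_a) & fingerprints(log_b)


# Made-up entries using documentation-range IP addresses.
com_log = [
    {"ip": "203.0.113.7", "user_agent": "Chrome/96"},
    {"ip": "198.51.100.2", "user_agent": "Firefox/94"},
]
org_log = [
    {"ip": "203.0.113.7", "user_agent": "Chrome/96"},
]
print(shared_visitors(com_log, org_log))
```

An IP address plus User-Agent is a weak identifier (shared networks, common browsers), which is why confidence rises only when more attributes or timestamps are compared.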

Nathan Goings