
Issue:

Oftentimes people enter google.com directly in the browser's address bar without including either the http:// or https:// prefixes.

Using Chrome DevTools on a fresh incognito session, I ran the following experiment:

STEPS:
-----------------
Enter "google.com" (or equivalently "http://google.com") directly in the 
browser's address bar.

 1. Request: http://google.com;
    Response: Status Code: 301 Moved Permanently
              Location: http://www.google.com/
              Cache-Control: public, max-age=2592000

 2. Request: http://www.google.com;
    Response: Status Code: 302 Found
              Location: https://www.google.com/?gws_rd=ssl

 3. Request: https://www.google.com/?gws_rd=ssl;
    Response: Status Code: 200
              strict-transport-security: max-age=31536000
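
The same chain can also be walked outside the browser. Below is a minimal sketch that uses only Python's standard library; `fetch_once` and `trace_redirects` are hypothetical helper names of my own, not anything DevTools uses. Bear in mind that a non-browser client is not guaranteed to receive the same responses as Chrome does.

```python
import http.client
from urllib.parse import urlsplit, urljoin

def fetch_once(url):
    """Send one GET without following redirects; return (status, headers)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        # Lower-case the header names so lookups are case-insensitive.
        headers = {k.lower(): v for k, v in resp.getheaders()}
        return resp.status, headers
    finally:
        conn.close()

def trace_redirects(url, fetch=fetch_once, max_hops=10):
    """Follow Location headers by hand, recording (url, status, STS header)."""
    hops = []
    for _ in range(max_hops):
        status, headers = fetch(url)
        hops.append((url, status, headers.get("strict-transport-security")))
        if status not in (301, 302, 303, 307, 308) or "location" not in headers:
            break
        url = urljoin(url, headers["location"])
    return hops

# Example (performs real network requests):
#   for url, status, sts in trace_redirects("http://google.com/"):
#       print(status, url, "| STS:", sts)
```

The `fetch` parameter exists only so that the chain logic can be exercised with canned responses instead of live traffic.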

NOTES: 
-----------------
* To get the same results, start a fresh incognito session (close all incognito 
  windows and open a new one). If you already have an incognito window open you 
  might not get the same results. Having "disable cache" checked won't help either.

* If you repeat the experiment from the same incognito session, you'll notice the 
  following differences from the first time around:

    * Request 1: If "disable cache" is unchecked (which mimics the browser's 
                 behavior during normal usage), the response will be from cache 
                 due to the "Cache-Control: public, max-age=2592000" response 
                 header returned the first time around. This means that the http 
                 request is not sent out (even though it still shows a 301 
                 response) which is probably a good thing.

    * Request 2: The response will be a 307 instead of 302. This is due to the 
                 "strict-transport-security: max-age=31536000" returned by the 
                 third request the first time around. This is the case regardless
                 of whether "disable cache" is checked or not.

* Once the browser becomes aware that a domain is HSTS protected (either via the 
  HSTS preload list or the STS response header), the browser will "internally" 
  redirect all http requests to https for that domain. These redirects are 
  displayed in the Network tab as "Status Code: 307 Internal Redirect", which is 
  kind of misleading since it looks like the response is coming from a server 
  when in reality it's all happening within the browser. Notice that there is no 
  "Remote Address" in the "General" section for these requests.

* Another (perhaps easier) way to check if a domain is HSTS protected is by 
  entering the domain in https://hstspreload.org/ but there are caveats! 
  https://hstspreload.org/ reports the following for "www.google.com":
     - "Response error: No HSTS header is present on the response."
     - "`http://www.google.com` does not redirect to `https://www.google.com`"
  Neither of these findings is consistent with what is observed in the network 
  tab in the above experiment! I emailed the hstspreload mailing list and 
  received the following interesting response: "The server for 
  http://www.google.com doesn't always redirect http to https, which is why this 
  error appears. E.g. if I use curl, I don't get the redirect."
-----------------
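
To make the "307 Internal Redirect" behaviour from the notes concrete, here is a toy model of a browser's dynamic HSTS list. It is only a sketch under simplifying assumptions: `HstsStore` is a hypothetical class, and it ignores the preload list, `includeSubDomains`, and port rewriting. It is not how Chrome is actually implemented.

```python
import time
from urllib.parse import urlsplit, urlunsplit

class HstsStore:
    """Toy model of a browser's dynamic HSTS list."""

    def __init__(self):
        self._expiry = {}  # host -> absolute expiry time in seconds

    def note_header(self, url, sts_value, now=None):
        """Record an STS header, but only if it arrived over https."""
        now = time.time() if now is None else now
        parts = urlsplit(url)
        if parts.scheme != "https":
            return  # STS headers on plain-http responses are ignored
        for directive in sts_value.split(";"):
            name, _, value = directive.strip().partition("=")
            if name.strip().lower() == "max-age":
                # Every https response carrying the header refreshes the expiry.
                self._expiry[parts.hostname] = now + int(value.strip().strip('"'))

    def rewrite(self, url, now=None):
        """Return (307, https URL) for a known HSTS host, else (None, url)."""
        now = time.time() if now is None else now
        parts = urlsplit(url)
        if parts.scheme == "http" and self._expiry.get(parts.hostname, 0) > now:
            return 307, urlunsplit(("https",) + tuple(parts[1:]))
        return None, url
```

In this model, http://www.google.com is upgraded without touching the network once the header from request 3 has been recorded, while http://google.com never is, because no https response for that exact host is ever seen.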

Privacy/security concerns:

  • The initial request to google.com is made over http since google.com is not included in the HSTS preload list. This request is vulnerable to MITM attacks.

  • At no point is the browser redirected to https://google.com, therefore the STS header is never set for this domain. This means that even future requests to google.com will not be HSTS protected and may therefore be vulnerable to MITM attacks!

    It's worth noting that the cache-control max-age=2592000 (30 days) response header included in the initial 301 redirect does seem to provide a level of protection similar to what HSTS provides since it causes future requests to http://google.com to be handled "internally" by the cache (and importantly redirected to the HSTS protected "www.google.com" domain). On the other hand, the cache-control max-age is set to expire after 30 days (much shorter than what the HSTS max-age is usually set to) and, most importantly, unlike the HSTS max-age which is refreshed on every https request made to an HSTS enabled domain, the cache-control max age is not refreshed until a new insecure http request is made! This means that your requests to google.com can be intercepted as often as once every 30 days.

  • The first request to www.google.com is made over http and is vulnerable to MITM attacks. At least in this case the response is a 302 redirect to https://www.google.com, whose response does include the STS header. This means that any subsequent requests to http://www.google.com will be "internally" redirected to https by the browser. As noted above, the HSTS max-age is refreshed on every https request to the domain, so as long as your browser makes a request to https://www.google.com at least once a year (which is what the STS max-age expiry is set to), all requests to that domain will be HSTS protected.

TL;DR - "google.com" is not HSTS protected and it seems as though requests can potentially be subject to MITM attacks as often as once every 30 days (or more often if cache is cleared or incognito mode is used).
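
The "once every 30 days" figure follows directly from the two max-age values. A small sketch of the arithmetic, plus a toy model of the non-refreshing cache entry (`plaintext_requests` is a hypothetical helper, and it assumes the cache is never cleared or evicted early):

```python
from datetime import timedelta

CACHE_MAX_AGE = 2592000   # Cache-Control max-age on the 301 for http://google.com
HSTS_MAX_AGE = 31536000   # STS max-age on https://www.google.com

cache_days = timedelta(seconds=CACHE_MAX_AGE).days
hsts_days = timedelta(seconds=HSTS_MAX_AGE).days

def plaintext_requests(visit_days, max_age_days=cache_days):
    """Count visits to http://google.com that actually go out over the network.

    The cached 301 is only renewed by a fresh plaintext request, so its
    lifetime is not extended by visits that were served from cache.
    """
    sent = 0
    cached_until = None
    for day in sorted(visit_days):
        if cached_until is None or day >= cached_until:
            sent += 1
            cached_until = day + max_age_days
    return sent
```

Under these assumptions, someone who types google.com once a day for a year would still send 13 plaintext requests (on days 0, 30, ..., 360).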

This may not be as bad as it sounds for the following reasons:

  • All important cookies for .google.com and www.google.com most definitely have the secure flag set.
  • google.com seems to do nothing more than redirect to www.google.com so any request to google.com would realistically only be to the root path (Therefore the URL itself wouldn't be interesting to an eavesdropper).
  • Google subdomains that send/receive more sensitive data (e.g. gmail.com, accounts.google.com...) are on the HSTS preload list. So even if an attacker sets up something like sslsplit and a user ends up on an attacker-controlled http://www.google.com (which is already hard enough to pull off, as it requires the user not to notice the missing padlock icon), the HSTS preloaded domains would still be protected. An attacker would therefore need to prevent a user from navigating to any of those subdomains.

Questions

  1. What could be the reasons why Google hasn't enabled HSTS for google.com?
  2. What could be the reasons why Google has only enabled the STS header for www.google.com but hasn't added that domain to the HSTS preload list?
el_tigro
  • I can confirm this behavior on a Win10 machine with version 85.0.4183.121 of Google Chrome, with one difference: the second request results in a 307 (Internal Redirect), not a 302. Still, one would likely expect to have that even on google.com in the first place. – Marcel Oct 07 '20 at 06:39
  • What device/OS is that on? Is it only in Google Chrome or also on other browsers? – Marcel Oct 07 '20 at 06:42
  • @Marcel: It's not browser specific, but Google has actually implemented it this way, server-side. It's also probably not an error, but intentional. See my answer for details. – Esa Jokinen Oct 07 '20 at 06:50
  • @Marcel if you start a new Incognito session (close all your Incognito windows and open a new one), you should get a 302. However if you repeat the experiment in the same window, you'll get a 307 instead of a 302 (even if disable cache is enabled). This is because the first time you run the experiment, on the third request (`https://www.google.com`), an HSTS header is sent in the response. So any subsequent request to `http://www.google.com` will result in a (307) "Internal Redirect" until the expiry of the HSTS header, which is set to _max-age=31536000_ (1 year). – el_tigro Oct 07 '20 at 07:04
  • Yes, using an incognito window, I can reproduce exactly as you describe! – Marcel Oct 07 '20 at 07:58
  • _Google subdomains [...] are on the HSTS preload list_. No: only domains up to eTLD+1 are eligible for HSTS preload; subdomains such as `accounts.google.com` are not. – jub0bs Mar 02 '21 at 09:37

1 Answer


Current situation

It is true that, as of Oct 2020, Google does not have HSTS on google.com, but only on www.google.com, and performs redirection first to www and then to https://. Even if there were an HSTS header on https://google.com, the browser would never see it or be able to cache it, because the redirect chain never passes through https://google.com. Only www.google.com is protected by HSTS.

Best practices

Best-practice guidance, e.g. from the Federal CIO Council, states:

In its strongest and recommended form, the HSTS policy includes all subdomains, and indicates a willingness to be “preloaded” into browsers:

Strict-Transport-Security: max-age=31536000; includeSubDomains; preload 

When using this form, bear in mind:

  • The policy should be deployed at https://domain.gov, not https://www.domain.gov.
  • All subdomains associated with the parent domain must support HTTPS. (They do not have to each have their own HSTS policy.)
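
For illustration, the header above breaks down into three directives. The following is a minimal sketch of reading them out of a header value; `parse_sts` is a hypothetical helper and not a complete RFC 6797 parser (for instance, it does not reject duplicate directives).

```python
def parse_sts(value):
    """Split a Strict-Transport-Security value into its directives."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in value.split(";"):
        name, _, arg = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(arg.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy
```

Applied to the header that www.google.com returned in the question (`max-age=31536000` only), this yields a policy with neither `includeSubDomains` nor `preload` set.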

OWASP HTTP Strict Transport Security Cheat Sheet adds (also noted in the RFC 6797, 14.4):

Cookies can be manipulated from sub-domains, so omitting the includeSubDomains option permits a broad range of cookie-related attacks that HSTS would otherwise prevent by requiring a valid certificate for a subdomain. Ensuring the secure flag is set on all cookies will also prevent, some, but not all, of the same attacks.

This can only be achieved by first redirecting to HTTPS.

Why?

However, we can only tell what would be better, but we cannot answer why some are not following these guidelines. Only Google knows why they have implemented it this way. It is not lack of knowledge and ability, as they have already done it for e.g. gmail.com, which currently is on the HSTS preload list.

You can get closest to your answer by reading Jay Brown's "Bringing HSTS to www.google.com" from the Google Security Blog. This article from July 2016 indicates that the approach is intentional, due to the complexity of the huge site and the need for backwards compatibility with legacy services.

Ordinarily, implementing HSTS is a relatively basic process. However, due to Google's particular complexities, we needed to do some extra prep work that most other domains wouldn't have needed to do. For example, we had to address mixed content, bad HREFs, redirects to HTTP, and other issues like updating legacy services which could cause problems for users as they try to access our core domain.

This process wasn’t without its pitfalls. Perhaps most memorably, we accidentally broke Google’s Santa Tracker just before Christmas last year (don’t worry — we fixed it before Santa and his reindeer made their trip).

Esa Jokinen
  • Given that `google.com` is a very common start page and WiFi networks often rely on capturing requests to serve a login page (which doesn't work with HSTS), I wonder if it's perhaps to reduce that specific user hassle. I had trouble finding something to load the other day to force a WiFi login page to come up. – Chris H Oct 07 '20 at 09:48
  • @ChrisH You might find [neverssl.com](http://neverssl.com) a useful option in that case. – ydaetskcoR Oct 07 '20 at 10:47
  • @ydaetskcoR thanks, I knew it existed but forgot the URL (and of course couldn't search!) – Chris H Oct 07 '20 at 10:51
  • @ChrisH looks like it, [Android (and other Google's products) uses Google's domain when checking for the captive portal](https://android.stackexchange.com/q/123129/44325). – Andrew T. Oct 07 '20 at 13:19
  • @AndrewT.: I can't come up with a reason why the captive portal check can't ignore HSTS. – Joshua Oct 07 '20 at 20:04
  • @Joshua if the OS does not support it explicitly, it's up to the user to try to browse some page and get redirected. – Antzi Oct 08 '20 at 06:31
  • @Antzi also if the user dismissed the OS notification because it interrupted something else, then they remembered that the 4G reception is terrible and it's actually worth signing in to the WiFi (me arriving at the gym the other day). – Chris H Oct 08 '20 at 07:29
  • @Joshua that would completely stop the very purpose of HSTS. A captive portal check *is* a MITM, where a user is given a different page than the one requested. Of course, you could have browsers make an exception for captive portal checks, but it would be a dangerous precedent, and a very real potential attack vector. – Emil Bode Oct 08 '20 at 09:07
  • @Joshua Even if there were an exception, it would be difficult to retroactively apply that to the hundreds of millions of Android devices out there that no longer receive updates. – jpa Oct 08 '20 at 12:12
  • @EmilBode: Why does the captive portal check use the browser cache at all? Seems like it should be stateless. – Joshua Oct 08 '20 at 15:16
  • [detectportal.firefox.com](http://detectportal.firefox.com/) is another one of these WiFi login pages. – SE - stop firing the good guys Oct 08 '20 at 20:57
  • @Joshua It's not really a cache. It's just that once you visited "www.google.com", your computer stores that it should *never* accept anything for that domain that isn't from google. It has some similarities to a cache, but is a separate list. And it's a list that is important to security, so you don't want to bypass it just for convenience. You seem to be under the impression a captive portal check is some completely separate process; it's not. If it doesn't appear, you can just open a browser yourself and type "google.com" for it to appear. – Emil Bode Oct 09 '20 at 07:59
  • The *Captive-Portal Identification in DHCP and Router Advertisements (RAs)* ([RFC 8910](https://tools.ietf.org/html/rfc8910)) aims at removing the requirement for this procedure which is essentially a legitimate use of MitM. As usual, it takes quite some time for new practices to spread. – Esa Jokinen Oct 09 '20 at 08:27
  • @EmilBode: I'm used to the android phone autodetecting the captive portal in the background with an isolated HTTP call (with no context whatsoever) and checking if /generate_204 returned a redirect. – Joshua Oct 09 '20 at 15:44
  • @Joshua but to the phone, this is nothing else than just a way of requesting some page. And HSTS is also very simple: *no* requests are accepted if not from the trusted source. Not "no, except when captive portal check", just "No". I'm sure if you were to allow your proposal, malware would find a way to set a flag that says "this is a portal check, just ignore that it is not HSTS-compliant". – Emil Bode Oct 09 '20 at 16:32
  • @EmilBode: It's not that there's a flag on the request that says this is the captive portal check. It's that the wifi connection app (a builtin phone app) checks whether the wifi it just connected to is a captive portal by opening a socket to google.com port 80, sending the canned `HEAD /generate_204 HTTP/1.0`, and checking whether it gets a `Redirect:` header back or not. – Joshua Oct 09 '20 at 21:28
  • @Joshua But how does it "open a socket"? Simple: by asking the OS to do so for it. And the OS will check whether it can do so, or that it is prohibited because of HSTS. And that process is completely outside the control of the wifi-connection-app. So we'd either need a new process, with the associated rights, for opening a socket (a process that can be abused by malware), or we need the wifi-app to tell the OS that "this time, you don't need to check HSTS compliance": a flag, that can also be abused. – Emil Bode Oct 10 '20 at 11:08
  • @EmilBode: Open a socket: https://stackoverflow.com/questions/7384678/how-to-create-socket-connection-in-android ; it's lower level than HSTS. Were you thinking of websocket? – Joshua Oct 10 '20 at 15:49