So, my question is how can Firefox know that login page address?
It doesn't, actually.
Your ISP uses a technology known as a Captive Portal. Captive portals work by "somehow" hijacking the browser's HTTP request and re-directing it to the login portal.
This "somehow" can be achieved in different ways, for example
- HTTP Redirect
- ICMP Redirect
- DNS Hijacking
Your browser, in turn, tries to detect this "hijacking" by trying to retrieve a well-known web page and checking whether the response they get back is the response they are expecting or something else. Here are some examples of pages that such "hijacking detection systems" use:
The Google one gives a hint as to how it works: The webserver will respond with an HTTP 204 No Content status code. A captive portal, however, will return content (otherwise it would be useless) and therefore never answer with a 204 status code. Most likely, it will use a 307 Temporary Redirect to tell the browser to fetch a different URI (the URI of the captive portal login page).
The other ones use a small document with well-known content instead (e.g. Apple's simply contains the word "Success").
The hijacking detection doesn't have to be performed by the browser, actually. Most modern devices will automatically run this "captive portal hijacking detection" automatically whenever they connect to an open WiFi and automatically pop up a dialog allowing you to go to the captive portal, without you having to explicitly open up your browser and visit some web page.
The reason for this is that in the modern Internet world, a browser is not necessarily the first app a user would be trying to use the Internet with. It could be the Facebook client, WhatsApp, or email client, for example.
Note that I used the term "hijacking" deliberately. These techniques are actually basically performing a Man-in-the-Middle attack. (The difference being that a "real" attacker would try to redirect you to a website that looks exactly like the one you wanted to visit and trick you into entering your username and password on the fake website.) Hence, these techniques only work as long as you are trying to visit an "insecure" website, i.e. a website that does not use SSL/TLS (i.e. no https://
), does not use HTTP Strict Transport Security (HSTS), and the likes.
This is starting to become a problem, because more and more websites are only accessible through HTTPS (TLS). Modern browsers remember whether a website supports HTTPS and will use the HTTPS version, regardless of what you enter in the URI bar. Techniques such as HSTS ensure that browsers will always use the encrypted version of a website. Newer versions of the HTTP protocol such as HTTP/2 and HTTP/3 don't strictly require encryption, but all major browser vendors have decided to only implement them for HTTPS connections.
If you try to visit Facebook or SuperUser, for example, your browser will automatically use an encrypted, authenticated connection, and when the captive portal tries to redirect the browser to the login page, the browser will detect this manipulation and throw an error. Normally, this is exactly what you want, but in this case, it will prevent you from logging into the captive portal and thus from using the Internet.
If you ever run into problems where you are connected to the WiFi, but your apps show errors or load indefinitely, the reason is almost certainly that for some reason you are not logged into the captive portal. Maybe you didn't see the notification popup, maybe the detection failed, there can be many reasons.
In this case, you can solve the problem by visiting a website that you know is "insecure", i.e. doesn't use HSTS, SSL/TLS, or HTTP/2 (the standard specifies both HTTP and HTTPS, but browser vendors have decided that they will support only HTTPS for HTTP/2 going forward). The above-mentioned URIs should do the trick, but there is actually a website which some nice people have put up that serves exactly this purpose, and whose URI is easy to memorize: http://neverssl.com/.
NeverSSL does exactly what its name suggests: it is simply a completely useless website whose sole purpose is to never use SSL/TLS, HSTS, HTTP/2, QUIC or anything other than un-encrypted, un-authenticated, insecure, plain HTTP/1.1, so that the captive portal can intercept the request and redirect to its login page.
DNS Hijacking seems to become more lenient/less enforced lately compared to the past, or maybe it is a local thing here, due to many people introducing 1.1.1.1/8.8.8.8 in their TCP/IP configurations to escape nationwide DNS blacklists promoted by the media cartels. e.g. relying more on refresh/WISPr. Good job on this answer btw. +1 – Rui F Ribeiro – 2019-10-06T09:04:40.890
I am using HTTP 302 instead of 307 in my captive portal "implementation" (Apache link in my answer) Maybe it is more general saying it is a 30x error. – Rui F Ribeiro – 2019-10-06T09:22:25.897
6@RuiFRibeiro: there usually is a DNS hijacking being performed by captive portals, with all queries returning the captive portal IP address until passed. Note that you using 1.1.1.1/8.8.8.8 doesn't protect you from a MITM that goes after UDP packets to port 53. – Ángel – 2019-10-07T01:00:36.870
@Angel It does not indeed, I was forgetting the mitm, thanks – Rui F Ribeiro – 2019-10-07T04:24:40.160
1On all devices I have, the captive portal is opened using, I guess, a "browser view" rather than the regular browser. I some cases where I have to enter login details to get access, I would prefer to use a regular browser that remembers passwords but, especially on phones, it is hard or impossible to copy the URL (or even see the full address) for use in a regular browser. Is there a way around this? – d-b – 2019-10-07T07:17:55.300
1What is ICMP Redirect? I’ve never heard about that before in the context of a captive portal. – jornane – 2019-10-07T08:19:41.547
1@jornane: REDIRECT is one of the ICMP message types. (The most well-known message types are probably ECHO REQUEST and ECHO REPLY, used by
ping
.) It I used by routers to tell a host that a more direct route than the one they are taking is available. Or, in this case, it is used to redirect the traffic to the router itself. – Jörg W Mittag – 2019-10-07T09:06:38.3072For a long time, I have estabilished a practice for myself to enter
http://www.example.com
after connecting to public wifi exactly because the Captive portal wouldn't work for HTTPS pages, like superuser.com. – Tomáš Zato - Reinstate Monica – 2019-10-07T10:24:20.0136@TomášZato: NeverSSL employs another neat trick that example.com doesn't: it uses ECMAScript to redirect to a randomly generated subdomain, that way, when you see the website in your browser, you know that web surfing is working, as it is highly unlikely to be a cached copy. – Jörg W Mittag – 2019-10-07T10:54:56.740
Will browsers ever stop supporting HTTP/1, and if so, what will captive portals do? – Tim – 2019-10-07T20:16:06.203
@Tim, HTTP/1.1 is a plain text protocol, where HTTP/2 is binary encoded -- The difference is, an HTTP/2 can fit more information into less space, but it is no longer easily read by humans. Being human readable, HTTP/1.1 will continue to make up a large portion of transfers where developers write scripts to manage the requests and responses, where HTTP/2 will continue to be seen in places where speed and bandwidth matter, such as serving content to browsers. It will take a VERY long time for HTTP/1.1 to go away; plenty of time for captive portals to adapt or die. – Ghedipunk – 2019-10-07T21:14:01.650
@Tim: HTTP/2-over-TCP will probably go away before HTTP/1.1-over-TCP does, since HTTP/3-over-QUIC is already on the horizon and will probably replace it fully in a short period of time. The advantages of HTTP/2 and HTTP/3 don't really apply in highly-reliable, low-latency networks with up-to-date middle-boxes, so HTTP/1.1 is likely to stay with us inside clouds, data centers, and local networks. HTTP/2 and HTTP/3 really only shine in high-latency, unreliable networks, and HTTP/3 in particular is able to circumvent old middle-boxes. – Jörg W Mittag – 2019-10-07T22:27:58.780
@d-b Exit out of the "browser view" login page, tell your device that you want to stay connected even though you didn't log in yet, then just use your browser to go to
http://neverssl.com
or one of the other options discussed here. Then your browser will handle the redirect and you can log in there. – stephenwade – 2019-10-08T17:55:04.3931@Ángel I have never seen a captive portal hijack DNS requests in the wild. I mean, if the device cached the DNS request that would lead to terrible results (potentially even a largely unusable internet connection), so I can't imagine (for-profit) hotspot providers going down that route. – balu – 2019-10-08T21:56:57.000