56

In Poland, it is common for mobile ISPs to offer plans with a limited amount of data per month, with some popular apps excluded from the limit. For example, all traffic from YouTube is not counted towards the data cap.

Aside from net neutrality issues, I am wondering how this is achieved in the HTTPS age. How does the ISP know which packets to count towards the data cap?

I know it could be done with just looking at the IP address, but YouTube has a ton of IPs, and I suspect they change all the time. Plus, I wouldn't be surprised if some of YouTube's IPs are shared with other Google services, which are not uncapped by the ISPs...

TylerH
  • 109
  • 6
Kuba Orlik
  • 651
  • 1
  • 5
  • 9
  • 6
    YouTube has whatever IP address the DNS (of your provider) returns. Easy to change that to local IP port proxies and not account for them. – TomTom Jan 05 '20 at 11:16
  • I think it's strange for a mobile ISP to offer unlimited YouTube. Isn't the bandwidth of the air waves more limited than the bandwidth of the ISP's upstream provider? – Nayuki Jan 05 '20 at 18:02
  • 7
    @TomTom Given things like certificate pinning and DNS over HTTPS, they may be whitelisting IP ranges instead. – ceejayoz Jan 05 '20 at 18:44
  • https://en.wikipedia.org/wiki/Deep_packet_inspection#Tiered_services – Thomas Jan 06 '20 at 13:21
  • 4
    What's the advantage to the ISP of doing this? Is it actually cheaper for them to transfer data from Youtube than Serverfault? – user3490 Jan 06 '20 at 16:14
  • 9
    @user3490 Large companies frequently pay to have their sites exempted, which offsets the bandwidth costs. Sometimes, popular sites are exempted because otherwise a significant portion of users would exceed the caps. Exempting them allows the caps to stay low enough to scare most users into voluntarily limiting usage while avoiding having too many people exceed them. Too many people exceeding the caps leads to lots of complaints, which leads to pressure to eliminate the caps (and all the money ISPs earn from them). – bta Jan 06 '20 at 23:17
  • 3
    @user3490: Also, as several answers below have noted, it's quite common for large site operators like Google or Netflix to offer their own caching front-end servers for ISPs to host on their premises. So when e.g. 10,000 users of an ISP suddenly all start watching the same viral video on YouTube, that video gets served by the [Google Global Cache](https://support.google.com/interconnect/answer/9058809?hl=en) server in the ISP's server room, which only needs to download the original video from Google's data center once. – Ilmari Karonen Jan 07 '20 at 02:25
  • 9
    @user3490 This is the kind of crap you get when net neutrality goes out the window. – Mast Jan 07 '20 at 06:53
  • Some ISPs offer free data for *any* video streaming. How do they do that when a lot of streams are encrypted? It's actually relatively easy to determine if a connection is streaming video vs downloading a file or doing something else because video streams have a peculiar traffic shape. – Giacomo Alzetta Jan 07 '20 at 10:53
  • 2
    @bta and is a selling point - "sign up to our network and get all your social media data free" – Baldrickk Jan 07 '20 at 13:25
  • 3
    @user3490 As a side note: This is a perfect example of what net neutrality is about, to prevent market/network access distortions by providers treating some content providers special and thus putting all others at a disadvantage. – Frank Hopkins Jan 07 '20 at 15:28
  • 1
    This practice is called [zero-rating](https://en.wikipedia.org/wiki/Zero-rating). They tend to whitelist ranges of IPs. – David Ehrmann Jan 07 '20 at 19:14
  • 1
    @Nayuki AFAIK, 4G/5G network architecture is actually *designed* to let everyone streaming video. As in, that is one of their design goals. Moving to smaller cells, for example, allows more bandwidth in total because each user is competing with fewer other users. And it's designed to allow the ISP to have cache nodes near the user, so all that streaming data *doesn't* have to come from upstream. – user253751 Jan 08 '20 at 12:05
  • 1
    Interestingly; I suspect that this is actually illegal. The EU has laws around net neutrality; meaning "ISPs are prohibited from blocking or slowing down of Internet traffic, except where necessary". – UKMonkey Jan 08 '20 at 12:57
  • @GiacomoAlzetta, I'm not aware of ISPs that offer free data for *any* video streaming (not just from specific named list of sites, e.g. Twitter, Vimeo, Youku...). Can you name a few? in which countries? – smci Jan 08 '20 at 13:59
  • @UKMonkey If that's truly the clause then there's nothing illegal (or relevant to the clause) about what the ISPs are doing in OPs question. They are not blocking or slowing down traffic, but tracking it and selectively charging zero cost for that traffic. – TylerH Jan 08 '20 at 14:21
  • @smci Vodafone in Italy, at least some time ago. My father had a 100GB monthly plan and they suddenly changed it so that you had unlimited video streaming and 100GB data download, *but* the data download was not only capped in volume per month, it was also restricted in speed (it topped out at 100kB/s or so for downloads but not for streaming video). – Giacomo Alzetta Jan 08 '20 at 15:27
  • @GiacomoAlzetta: Do you remember roughly when? New carriers or new plans might occasionally run limited-time introductory offers to attract customers, but I've never heard of it being done permanently. – smci Jan 08 '20 at 15:44
  • Partly this is due to internet links to the US being very costly, while these services connect over their own fiber to the given ISP. Hence YouTube does not go over the US link the ISP has to pay for. – Ian Ringrose Jan 08 '20 at 15:48

6 Answers

64

HTTPS obscures the content of the traffic, but not the endpoints. So, for instance, my ISP does not know that I'm responding to this particular question, because I'm using HTTPS, but they do still know that I'm accessing content on serverfault.com port 443.

In specific cases such as the ISP/Netflix partnerships you describe, it's also common for Netflix to co-locate one of their endpoints in the ISP's data center, which then operates similarly to a CDN. When you connect to Netflix, you get the Netflix server on the ISP's own network, which makes it even easier for them to track it for purposes of the deal, since your traffic never leaves their own network. (The co-located server still needs to get the video streams from another Netflix server outside of the ISP's network, of course, but funneling everything through the co-located server allows them to cache data, use dedicated connections, aggregate streams, etc. to reduce their costs and/or pass some of those costs back to Netflix as part of the partnership deal.)

Dave Sherohman
  • 1,661
  • 1
  • 11
  • 16
  • 9
    +1. Let me emphasize: they don’t just know the target IP, but they also know the destination hostname (`serverfault.com`) because it’s sent unencrypted — and the server can use it to choose how to decrypt the traffic. – Blaisorblade Jan 05 '20 at 17:48
  • 1
    @Blaisorblade There are means to hide hostname (like DNSSEC). Hiding endpoint is much harder and need some kind of proxy. – val is still with Monica Jan 05 '20 at 18:07
  • 12
    @valsaysReinstateMonica I am talking about SNI, a feature of HTTPS. It sounds confusing that DNSSEC would affect HTTPS? – Blaisorblade Jan 05 '20 at 18:12
  • 6
    @valsaysReinstateMonica "DNSSEC does not encrypt DNS data. An observer can still look at DNS activity," Source: https://blog.apnic.net/2018/08/20/dnssec-and-dns-over-tls/ for details. – Nick ODell Jan 05 '20 at 19:30
  • 6
    @Blaisorblade - With ESNI and dns-over-https, the hostname is never sent unencrypted. That combination isn't common now, but likely to be in future. – paj28 Jan 05 '20 at 23:31
  • I am surprised to learn that QUIC still uses SNI. – user253751 Jan 06 '20 at 10:59
46

Firstly, they know the YouTube IP addresses.

ISPs have an IP database. For example, YouTube traffic originates from Google's ASN, AS15169. On their side, the ISP assigns each service's prefixes to a group. One of the groups is the default group, and this is the billing group: when your traffic falls into the default group, that usage is recorded in the system.

For example, a few of the addresses announced by that ASN are listed below.

root@server ~>whois -h whois.radb.net -- '-i origin AS15169' | grep ^route
route:      192.179.147.0/24
route:      192.179.148.0/23
route:      192.179.148.0/24
route:      192.179.149.0/24
route:      192.179.150.0/23
route:      192.179.150.0/
...
route6:     2607:f8b0:4016::/48
route6:     2604:31C0::/32
route6:     2620:33:c000::/48
route6:     2607:f8b0:4000::/48
route6:     2404:f340::/32

When you try to reach YouTube or related services (Google video storage), your phone connects to these IP addresses.

The ISP checks the destination IP address, and if it falls inside the YouTube group, the traffic is not charged.
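The grouping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the prefixes are copied from the whois listing above, while the group names, the `ZERO_RATED` table and the `billing_group` function are made up for the example and do not describe any particular ISP's system.

```python
import ipaddress

# Prefixes copied from the whois listing above. A real system would load the
# full, regularly refreshed prefix list. The group name is a hypothetical label.
ZERO_RATED = {
    "youtube": [
        "192.179.147.0/24",
        "192.179.148.0/23",
        "2607:f8b0:4016::/48",
        "2607:f8b0:4000::/48",
    ],
}

# Parse every prefix once so that lookups only do containment checks.
NETWORKS = [
    (ipaddress.ip_network(prefix), group)
    for group, prefixes in ZERO_RATED.items()
    for prefix in prefixes
]


def billing_group(dst_ip: str) -> str:
    """Return the group a destination address belongs to, or "default"."""
    addr = ipaddress.ip_address(dst_ip)
    for net, group in NETWORKS:
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return group
    return "default"  # the default group is the one that gets billed


if __name__ == "__main__":
    for ip in ("192.179.149.10", "2607:f8b0:4016::1", "203.0.113.5"):
        print(ip, "->", billing_group(ip))
```

A linear scan is fine for a handful of prefixes; with thousands of routes a longest-prefix-match structure such as a radix tree would be the more usual choice.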

Another option is checking the SNI (Server Name Indication) field that the client sends at the start of the TLS connection. When you connect to an HTTPS site, not all of the data is encrypted.

For example, when you make a search on Google, you can see the URL in your browser like this: https://www.google.com/search?q=hello+world.

The encrypted part is /search?q=hello+world and all of the page content. The ISP can see that you are reaching a site like www.google.com, but not which page you request or what that page contains.

Some ISPs use SNI for this. For example, in Turkey this method is used to build specific internet packages such as 5GB internet + 4GB Spotify, or 7GB internet with unlimited WhatsApp. They also use SNI for blocking websites. Some websites share the same IP addresses, like wikimedia.org and wikipedia.org. If the ISP tried to block Wikipedia by IP address, it would block all Wikimedia services.
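To make the SNI idea concrete, here is a minimal sketch of how a hostname can be read from the unencrypted TLS ClientHello and matched against a list. It assumes the whole ClientHello arrives in a single TLS record and that SNI is not encrypted. The function names, the `ZERO_RATED_SUFFIXES` list and the `build_client_hello` helper (which only fabricates a tiny test message) are all hypothetical, and real DPI equipment is certainly more involved.

```python
import struct

# Illustrative suffix list, not taken from any real ISP configuration.
ZERO_RATED_SUFFIXES = ("youtube.com", "googlevideo.com")


def extract_sni(record: bytes):
    """Return the SNI hostname from one raw TLS ClientHello record, or None."""
    try:
        if record[0] != 0x16 or record[5] != 0x01:
            return None                      # not a handshake / not a ClientHello
        pos = 5 + 4                          # record header + handshake header
        pos += 2 + 32                        # client version + random
        pos += 1 + record[pos]               # session id
        (cipher_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + cipher_len                # cipher suites
        pos += 1 + record[pos]               # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:                # server_name extension
                # Layout: list length (2), name type (1), name length (2), name
                if record[pos + 2] != 0:     # 0 means "host_name"
                    return None
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                name = record[pos + 5:pos + 5 + name_len]
                if len(name) != name_len:
                    return None              # truncated record
                return name.decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None


def is_zero_rated(hostname) -> bool:
    """True if the hostname equals or is a subdomain of a listed suffix."""
    if not hostname:
        return False
    host = hostname.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in ZERO_RATED_SUFFIXES)


def build_client_hello(hostname: str) -> bytes:
    """Build a minimal ClientHello carrying only an SNI extension (demo only)."""
    name = hostname.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni = struct.pack("!H", len(entry)) + entry
    ext = struct.pack("!HH", 0, len(sni)) + sni
    body = (b"\x03\x03" + bytes(32) + b"\x00"        # version, random, no session id
            + struct.pack("!H", 2) + b"\x13\x01"     # a single cipher suite
            + b"\x01\x00"                            # null compression only
            + struct.pack("!H", len(ext)) + ext)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


if __name__ == "__main__":
    hello = build_client_hello("www.youtube.com")
    sni = extract_sni(hello)
    print(sni, is_zero_rated(sni))
```

Matching on a suffix rather than the exact name covers subdomains, while the leading dot check keeps unrelated names that merely end in the same letters from matching.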

TylerH
  • 109
  • 6
Ahmet Özer
  • 554
  • 5
  • 9
  • 1
    Why would they query DNS if they already have the SNI directly from the traffic? – Esa Jokinen Jan 05 '20 at 13:07
  • Yet, your answer suggests they are looking it from DNS. – Esa Jokinen Jan 05 '20 at 14:50
  • 3
    OK, I made an edit for a better explanation. Hopefully it is better. – Ahmet Özer Jan 05 '20 at 17:25
  • 1
    "When you make a connection with https sites not all the data is encrypted"... except if it's using preloaded HSTS, like YouTube, in which case all HTTP traffic is encrypted. – ArtOfCode Jan 06 '20 at 19:04
  • 3
    HSTS will not protect the HTTPS handshake, nor the DH exchange, so you are incorrect in your assertion that "all" HTTP traffic is encrypted when HSTS is active, @ArtOfCode – Kamilion Jan 07 '20 at 00:18
  • 1
    That's not HTTP traffic, @Kamilion. All traffic using the HTTP protocol is encrypted. Not all traffic using the TLS protocol is encrypted. – ArtOfCode Jan 07 '20 at 00:58
  • Don't forget protocols such as QUIC or HTTP2, in which case they will know the IP but may not know the domain, in QUIC/HTTP3 the SSL handshake is a part of the TCP handshake so that's been negotiated before the browser can even mention the domain – Tom J Nowell Jan 08 '20 at 00:41
  • Food for thought: If a service that is exempted from your limit has any kind of private messaging in it, you could theoretically create a VPN that transfers data over that messaging for free, of course the other side of that VPN would need to have unlimited connection. – Tomáš Zato - Reinstate Monica Jan 08 '20 at 17:00
  • HSTS forces TLS, so nothing is occurring over HTTP; *all* resources must be loaded over HTTPS, or browsers balk. Preloaded HSTS is even more strict. I have my site listed with hstspreload.org. Regardless, the TLS handshake and DH exchange is not encrypted, and the dhparams used can uniquely fingerprint connections. (see Defcon27 presentation: The Tor Censorship Arms Race) – Kamilion Jan 14 '20 at 01:48
19

Even simpler: THEIR DNS will answer with the IP addresses of local proxies. This may not even break HTTPS, since they can port forward from there. It lets them serve the free traffic from specific IP addresses and exclude those from the accounting.

TomTom
  • 50,857
  • 7
  • 52
  • 134
3

In addition to knowing IP addresses as explained in other answers, some CDNs offer their nodes for ISPs to host. For example: Edge nodes / Google Global Cache. Data served from those nodes never leaves the ISP's perimeter and never hits the "outbound" traffic counter. This, incidentally, is also exactly why ISPs are able to offer this traffic for free or at a significantly cheaper price.

2

It's a mix of looking at IP endpoints (i.e. IP addresses) and DNS. This is usually done at the so-called PGW, which stands for Packet Data Network Gateway. The PGW is part of every mobile network.

newduino
  • 29
  • 3
2

A small addition to the otherwise excellent answers so far. I'm aware that many ISPs, certainly all the ones I've dealt with anyway, handle this via Class of Service groups. Essentially, traffic from and to specific ASNs is tagged with a specific CoS that is then considered unmetered by their stats engines. This makes the actual stat tracking easier on the infrastructure and makes billing easier. It also allows for throttling, but not many use that today, though that might change when it comes to mobile tariffs in the future, and VBR will handle the playback fluctuations this will cause.
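As a rough sketch of that idea, assuming flows have already been attributed to an origin ASN, the tagging and metering could look something like this. The `ASN_TO_COS` mapping, the class labels and both function names are hypothetical; 15169 is simply the ASN quoted in another answer here.

```python
from collections import defaultdict

# Hypothetical mapping from origin ASN to a Class of Service label.
ASN_TO_COS = {15169: "unmetered"}
DEFAULT_COS = "metered"


def tag_flow(asn: int) -> str:
    """Return the CoS label for traffic to or from the given ASN."""
    return ASN_TO_COS.get(asn, DEFAULT_COS)


def billable_bytes(flows):
    """Sum bytes per CoS and return (bytes counted against the cap, totals).

    `flows` is an iterable of (asn, byte_count) pairs.
    """
    totals = defaultdict(int)
    for asn, nbytes in flows:
        totals[tag_flow(asn)] += nbytes
    return totals[DEFAULT_COS], dict(totals)


if __name__ == "__main__":
    # 64500 is an ASN from the range reserved for documentation.
    flows = [(15169, 1000), (64500, 200), (15169, 50)]
    print(billable_bytes(flows))
```

Because the tag is applied once per flow, the stats engine only has to sum per class instead of re-inspecting every packet, which is the simplification the answer describes.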

Chopper3
  • 100,240
  • 9
  • 106
  • 238