How are full URLs exposed when they are encrypted by HTTPS?

Question

As far as I know, HTTPS URLs are encrypted (correct me if I'm wrong). There was a data leak recently and in one article about the leak I saw this picture:

If HTTPS URLs are encrypted then how did the ISP log the full URL (notice "fw-url")?

@Arminius [link](https://www.hackread.com/massive-leaks-exposes-browsing-history-of-users/) — Alexander, Dec 30 '19 at 14:32
Tangential clarification: HTTPS encrypts the _path_ portion of a URL. The hostname is sent in cleartext, which is necessary for proper routing on servers that host multiple websites. — user2752467, Dec 30 '19 at 23:02
The DNS lookup is plaintext (converting www.xvideos.com to an IP address). Other than that, the TLS connection is encrypted. There should be no way to see URLs... EXCEPT that your browser trusts a ton of certs. You literally trust the US government, Turkish intel agencies, and their enemies. It's possible that you trusted a cert presented by something sitting on the path to that site. Or this is a log FROM that site. TLS can't stop these scenarios. Or plugins collecting info, then leaking it... — Rob, Dec 31 '19 at 03:29
"Voluntarily installed middleware named Conor Web Filtering is logging browsing history of users and was hacked, releasing browsing history of users passing through it." The part about "ISPs" is nothing more than fodder. People use an ISP to get on the internet?!?! NO KIDDING. — MonkeyZeus, Dec 31 '19 at 16:04
That's some garbage "redaction" of that image. The blacked-out text is easily readable. — OrangeDog, Jan 01 '20 at 14:27

gowenfawr · Accepted Answer · 2019-12-30T15:17:35.903

The article states that:

a connection was discovered to a web filter app built by Conor [Solutions]

Given that it was a web filter, and given that it was able to log URLs, we can infer that this was a Man-in-the-Middle (MITM) proxy which decrypted the requests, filtered based on the unencrypted request, and then re-encrypted and forwarded the request to the actual destination. And unfortunately, it logged these requests, and that log got compromised, thus the leak.

This sort of MITM would require a CA certificate be installed on the client so that the proxy could present certificates for each web site visited. Presumably Conor Solutions had some way to roll this change out to customers; perhaps there was "filtering software" for customers opting into having web filtering as a package.
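
One way a client could notice this kind of interception is to look at who issued the certificate it actually receives: behind a re-signing proxy, the issuer would be the proxy's CA rather than a public one. Below is a minimal Python sketch of that check, not a description of how the Conor product works. The function names `issuer_organization` and `fetch_cert` are hypothetical, and Python's trust store may differ from the browser's, so results are only indicative.

```python
import socket
import ssl
from typing import Optional


def issuer_organization(cert: dict) -> Optional[str]:
    """Return the issuer's organizationName from a getpeercert()-style dict."""
    # The issuer is a tuple of RDNs, each a tuple of (key, value) pairs.
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None


def fetch_cert(host: str, port: int = 443) -> dict:
    """Connect to host over TLS and return the validated peer certificate."""
    ctx = ssl.create_default_context()  # verifies against the system trust store
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# Example usage (needs network access):
#   print(issuer_organization(fetch_cert("example.com")))
```

If the printed organization is a filtering vendor or an unfamiliar local CA instead of a public certificate authority, something between the client and the site is terminating TLS.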

score 4 · Answer 2 · edited Jan 01 '20 at 13:33

Below is a screenshot of an image search at the time of this discussion. The source image from the OP is referenced in numerous websites, and appears to be the subject of discussion due to the image content.

The original image appears to be from a vpnMentor blog post: https://www.vpnmentor.com/blog/report-conor-leak/

Perform an image search

Looking through https://crt.sh/?q=xvideos.com, it doesn't seem that any government has issued a certificate for xvideos.com.

Considering the JSON log source (see image for log location), though redacted, my bet is a user agent plugin/extension logging all activity, e.g., a parental control or marketing solution. (Why would there be a "_score" data element?!?)

A sophisticated break/inspect proxy is less likely, due to the original reporting from vpnMentor indicating TLS was not used to protect the "database" of user info. A MITM (break/inspect) proxy would be observed via the user agent (browser), and poor hygiene of the solution would likely result in widespread detection by the users.

Relative URLs (the path and query string) are not exposed in DNS lookups or in the TLS SNI field, regardless of encryption.
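
To make the split concrete, here is a small Python sketch that separates a URL into the part a passive network observer can typically see (the hostname, via DNS and SNI) and the part that travels only inside the encrypted HTTP request. The function name `split_visibility` and the example URL are hypothetical, and the sketch assumes plain DNS and no Encrypted Client Hello.

```python
from urllib.parse import urlsplit


def split_visibility(url: str) -> dict:
    """Split an HTTPS URL into observer-visible and TLS-protected parts."""
    parts = urlsplit(url)
    hidden = parts.path or "/"
    if parts.query:
        hidden += "?" + parts.query
    return {
        # The hostname appears in the DNS lookup and the TLS SNI extension.
        "visible_to_observer": parts.hostname,
        # The path and query are sent only inside the encrypted request.
        "encrypted_in_tls": hidden,
    }


print(split_visibility("https://www.example.com/watch?v=123"))
```

A log entry that contains the full path, like the "fw-url" field in the image, therefore has to come from something that sees the request before encryption or after decryption.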

answered Dec 31 '19 at 05:43 by Todd Johnson · edited Jan 01 '20 at 13:33 by Matthew Read

I don't see what the screenshot adds to the answer. – Paŭlo Ebermann Dec 31 '19 at 05:46

There are numerous posts with that image, lacking context of the original source. The image is the source of the question, due to the assumed breach of privacy. – Todd Johnson Dec 31 '19 at 05:50

Ah, thanks. Maybe add this to your answer? I didn't get it just from looking at the image. – Paŭlo Ebermann Dec 31 '19 at 05:53

`_score` is a default output field of every Elasticsearch query, whether there is search criteria provided or not. In the case of the image in the OP, search criteria were provided, as the number is not `1.0`. – Skrrp Jan 01 '20 at 20:26
