I came across an interesting case today that, with my limited knowledge, I can't explain.

I was trying to access a bit.ly link, but it is blocked by my University.

Knowing a bit about HTTP requests and basic networking, I initially suspected a DNS block. But even after switching to Google DNS, I still got the blocked page error.

I ran nslookup, both from my command line and from a web utility, and one of the addresses returned was 67.199.248.10.

I used Chrome to inspect the request, and indeed, the domain was being resolved to that IP.

I even used curl on my local machine, and got this output:

$ curl bit.ly -v
* Rebuilt URL to: bit.ly/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
*   Trying 67.199.248.10...
* TCP_NODELAY set
* Connected to bit.ly (67.199.248.10) port 80 (#0)
> GET / HTTP/1.1
> Host: bit.ly
> User-Agent: curl/7.56.1
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Content-Type: text/html; charset="utf-8"
< Content-Length: 1272
< Connection: Close
<
{ [1272 bytes data]
100  1272  100  1272    0     0   1272      0  0:00:01 --:--:--  0:00:01  5412
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><style type="text/css">html,body{height:100%;padding:0;margin:0;}.oc{display:table;width:100%;height:100%;}.ic{display:table-cell;vertical-align:middle;height:100%;}div.msg{display:block;border:1px solid #30c;padding:0;width:500px;font-family:helvetica,sans-serif;margin:10px auto;}h1{font-weight:bold;color:#fff;font-size:14px;margin:0;padding:2px;text-align:center;background: #30c;}p{font-size:12px;margin:15px auto;width:75%;font-family:helvetica,sans-serif;text-align:left;}
</style>

<title>The URL you requested has been blocked</title></head>
<body>

<div class="oc">
<div class="ic">

<div class="msg" style="text-align: center"><img src=https://www.hku.hk/f/page/7561/basic_logo_20.jpg alt="The University of Hong Kong"></img><h1>The URL you requested has been blocked </h1><p>The webpage you have requested has been blocked, because the page was reported as containing phishing material.  <br>Please refer to <a href="http://www.its.hku.hk/spam-report">HKU Spam report </a> or contact <a href="mailto:ithelp@hku.hk">ithelp@hku.hk</a> if you have further enquiry.<br /><br />URL = bit.ly/<br /></p></div></div></div></body></html>

* Closing connection 0

Now, I am very curious about one thing: if I am able to establish a connection (curl says I am connected to the IP at the given port), how is the University getting in between? Is it some kind of firewall that applies a rule like:

If (coming from bit.ly IP) => {Replace HTML with our block page}?

This is my best guess, since over HTTPS I get an invalid certificate error. I'm just wondering how this works; answers or even guesses would be appreciated.

Edit: A traceroute as well:

Tracing route to bit.ly [67.199.248.10]
over a maximum of 30 hops:

  1     1 ms    <1 ms    <1 ms  147.8.121.200
  2    11 ms     3 ms     2 ms  147.8.240.57
  3     6 ms     1 ms     2 ms  147.8.240.65
  4    <1 ms    <1 ms    <1 ms  147.8.240.121
  5    <1 ms    <1 ms    <1 ms  147.8.239.8
  6     1 ms     1 ms     1 ms  203.188.117.1
  7     2 ms     2 ms     2 ms  165084185137.ctinets.com [165.84.185.137]
  8     2 ms     2 ms     2 ms  202.4.163.3
  9     3 ms     2 ms     2 ms  014136142018.ctinets.com [14.136.142.18]
 10     3 ms     3 ms     3 ms  ix-ge-9-0-0.core1.undefined.as6453.net [180.87.160.33]
 11     7 ms     4 ms     3 ms  if-ge-4-1-0.hcore1.h71-hong-kong.as6453.net [180.87.160.102]
 12     4 ms     3 ms     3 ms  if-ae-38-2.tcore1.hk2-hong-kong.as6453.net [116.0.67.86]
 13     4 ms     4 ms     3 ms  116.0.67.194
 14     3 ms     3 ms     3 ms  po110.bs-b.sech-hkg2.netarch.akamai.com [72.52.2.184]
 15     4 ms     3 ms     3 ms  ae121.access-a.sech-hkg2.netarch.akamai.com [72.52.2.189]
 16   307 ms   326 ms   349 ms  93.191.173.93
 17   152 ms   152 ms   152 ms  a72-52-42-132.deploy.static.akamaitechnologies.com [72.52.42.132]
 18     *        *      151 ms  ae5.cbs01.eq01.sjc02.networklayer.com [50.97.17.72]
 19   150 ms   150 ms   150 ms  e1.11.6132.ip4.static.sl-reverse.com [50.97.17.225]
 20   150 ms   151 ms   150 ms  po1.fcr01b.sjc03.networklayer.com [169.45.118.135]
 21   150 ms   150 ms   150 ms  67.199.248.10

Trace complete.
poiasd

2 Answers

I suspect there is a firewall doing deep packet inspection. The TCP handshake is forwarded to the original target, but the firewall keeps watching the connection. It then sees the HTTP request with its Host header and checks the value of that header against the policy. If the site is forbidden according to the policy, the firewall itself replies to the request with the error message you see, i.e. it essentially hijacks the TCP connection and inserts its own packet.
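The decision I'm describing can be sketched roughly as follows. This is only an illustration of the theory, not how any real firewall product is implemented; `BLOCKED_HOSTS`, `BLOCK_RESPONSE`, `extract_host` and `inspect` are hypothetical names, and a real device would work on reassembled TCP streams rather than on a complete request handed to it.

```python
from typing import Optional

# Hypothetical policy list; the real policy is unknown.
BLOCKED_HOSTS = {"bit.ly"}

# Hypothetical stand-in for the block page the firewall injects.
BLOCK_RESPONSE = (
    b"HTTP/1.1 403 Forbidden\r\n"
    b"Connection: Close\r\n"
    b"\r\n"
    b"The URL you requested has been blocked"
)


def extract_host(request: bytes) -> Optional[str]:
    """Return the lower-cased Host header value of a raw HTTP/1.x request."""
    head = request.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            return value.strip().decode("ascii", "replace").lower()
    return None


def inspect(request: bytes) -> Optional[bytes]:
    """Return an injected block response, or None to let the request pass."""
    host = extract_host(request)
    # Ignore an optional ":port" suffix when matching against the policy.
    if host is not None and host.split(":")[0] in BLOCKED_HOSTS:
        return BLOCK_RESPONSE
    return None
```

Note that the destination IP address plays no role in this sketch: only the Host header is looked at, which is exactly the assumption the test below is meant to check.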

If I'm right and the firewall is only checking the Host header of the request, you could try setting your own header. For example, if google.com is allowed by the firewall, you could try:

 curl -v -H 'Host: google.com' bit.ly

If my theory is right, the request will not be rejected but will be passed on to the IP address of bit.ly. For a request with the wrong Host header, bit.ly currently replies with status code 302 (redirect), a cookie (i.e. a Set-Cookie header) and a Location pointing to https://bitly.com/pages/landing/branded-short-domains-powered-by-bitly?bsd=google.com.
Note that some malware uses exactly this method of faking the Host header to bypass firewall policies and to make any log data look innocent.
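The same test can be done without curl. Here is a minimal sketch using only Python's standard `socket` module; `build_request` and `probe` are hypothetical helper names, and the `User-Agent` value is arbitrary. It connects to the IP directly, so DNS plays no part, and sends whatever Host header you choose.

```python
import socket


def build_request(host_header: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request with an arbitrary Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "User-Agent: host-probe\r\n"
        "Accept: */*\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def probe(ip: str, host_header: str, port: int = 80, timeout: float = 5.0) -> str:
    """Send the request to ip:port and return the status line of the reply."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(build_request(host_header))
        data = b""
        # Read until the first line of the response is complete.
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Calling `probe("67.199.248.10", "bit.ly")` and `probe("67.199.248.10", "google.com")` from the filtered network and comparing the two status lines would show whether the Host header alone triggers the block.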

Steffen Ullrich
  • I missed that about setting the header with a custom domain. I like the idea, I was tempted to suggest Open Observatory of Network Interference, but depending upon the country, some may not take kindly. – safesploit Aug 18 '18 at 17:17
  • @safesploit: setting a different `Host` header works only for sites which ignore the `Host` header. In case of bit.ly the response is different with the correct and the wrong `Host` header, which means they don't ignore it. It was only intended as a way to verify that my theory how the inspection is done is correct. – Steffen Ullrich Aug 18 '18 at 17:19
  • I think you're absolutely spot on. When I set host to `google.com` or even `bitly.com`, I get status 302. But if I explicitly set the host to `bit.ly`, I get the blocked page error. So ultimately, they are monitoring actual packets and, as you said, hijacking the connection (which would also explain the certificate error with HTTPS). The `Host` header was a good find, thanks a lot for the detailed info! – poiasd Aug 18 '18 at 18:50

If (coming from bit.ly IP) => {Replace HTML with our block page}?

This is a good guess, and it is how most network filtering is done: on a domain basis rather than an IP address basis. Web servers are also often configured to redirect. So a request to "67.199.248.10" is redirected to "http://bitly.com", and quite often "http://bitly.com" is in turn redirected to "https://bitly.com".

safesploit$ curl bit.ly -v
* Rebuilt URL to: bit.ly/
*   Trying 67.199.248.10...
* TCP_NODELAY set
* Connected to bit.ly (67.199.248.10) port 80 (#0)
> GET / HTTP/1.1
> Host: bit.ly
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 302 Moved Temporarily
< Server: nginx
< Date: Sat, 18 Aug 2018 16:51:43 GMT
< Content-Type: text/html
< Content-Length: 154
< Connection: keep-alive
< Location: https://bitly.com/
< 
<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
* Connection #0 to host bit.ly left intact

I also ran $ curl bit.ly -v over the Tor network and got the same result. So it seems likely that the university is performing a MITM. I should be careful with that wording though, as it is their network and the traffic leaves onto the Internet via their routers. Still, it is possible for traffic to be redirected, either on ingress (inbound) or egress (outbound). For example, just as a web server can send you to a 404 error page, the university can apply a rule such as: if the domain is youtube.com:443, then redirect to 192.168.1.100/websiteNotAllow.html.
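If you want to compare what two vantage points receive (for instance, the campus network and Tor), something like the following rough sketch would do. It assumes you have saved the raw response text from each side; `parse_head` and `same_answer` are hypothetical helper names, and comparing only the status code and `Location` header is a simplification of my own.

```python
from typing import Dict, Tuple


def parse_head(raw: str) -> Tuple[int, Dict[str, str]]:
    """Return (status code, headers) parsed from a raw HTTP response."""
    head = raw.replace("\r\n", "\n").split("\n\n", 1)[0]
    lines = head.split("\n")
    status = int(lines[0].split()[1])  # e.g. "HTTP/1.1 302 Moved Temporarily"
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return status, headers


def same_answer(a: str, b: str) -> bool:
    """True if both responses share the status code and Location header."""
    status_a, headers_a = parse_head(a)
    status_b, headers_b = parse_head(b)
    return status_a == status_b and headers_a.get("location") == headers_b.get("location")
```

A mismatch between the two sides, such as a 403 on campus and a 302 elsewhere, suggests that something on the path is answering instead of the real server.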

For an idea of how you could "block" access to a website yourself on your local machine, see Blocking Websites with /etc/hosts. As for circumventing this censorship, look at Avoiding Dark Web monitoring detection, in which I outline points on circumventing censorship using Tor, and Revealing IP of an email recipient using remote PHP script or pixel tracking, where I mention the different types of SOCKS proxies.

safesploit
  • Yes, it does seem that the University is directly intercepting the connection. I have in fact used dns blocks before with some students who were messing around in class, and I suspected they were doing something similar. But then I saw it make the connection, which is where I didn't know what was going on. Thanks for the resources, I'll definitely give them a read! – poiasd Aug 18 '18 at 18:52