10

Disclaimer:

This is NOT intended to educate myself or others on carrying out illegal activities. I am interested in security topics and as a craftsman I pride myself on continued education. THAT SAID:

Question:

It is often said that things such as VPNs, proxy servers etc. mask your identity on the internets. The more I research how the TCP/IP framework actually works though, I question this theory more and more.

Conventional knowledge says that if you punch up a VPN, then you can torrent files or look at embarrassing porn without worry of being identified. I believe this to be true if there is someone sitting between me and my VPN because that's an encrypted tunnel (hopefully). I don't believe this could be said though for systems beyond my VPN tunnel. Based on my limited research, datagrams contain not only the target IP address but also the source IP address. Even though networks such as tor bounce your traffic around the world, in the end, it has to find its way back to you.

I guess in short I am asking: what is stopping the FBI, the NSA, or anyone for that matter from setting up a script that logs the headers of datagrams along with the targeted content of these requests? It seems as if there is truly no way to mask one's identity unless you submit your request from a public location with a spoofed MAC address.

To be clear, I am specifically asking: is it reasonable to say that any server I request data from, regardless of whether I am behind Tor, a VPN, etc., can in fact find my source IP in the header of the datagram, and thus know the true origin of the request? That could then be logged along with whatever content is requested and handed over to whoever may request it.

Resources:

Datagrams

How TCP/IP works

(and just because I know someone will argue it's not conventional knowledge that VPN providers assert they provide anonymity http://btguard.com/ )

DotNetRussell
  • 1
    The relevant question is: "mask your identity from whom?" Governments can simply ask the VPN provider for their logs, for instance. – schroeder May 26 '15 at 23:09
  • @schroeder Okay, well let's assume I use Nord VPN and I route traffic through a server in Singapore or Sweden or Chile. The US government wouldn't get logs from them. Let's focus in on what question I really asked please. It's hypothetical so "whom" isn't relevant. – DotNetRussell May 26 '15 at 23:13
  • 1
    ok - so your focus is on networking only, that's helpful to address your question – schroeder May 26 '15 at 23:15
  • I'm refining now – DotNetRussell May 26 '15 at 23:16
  • I'm not sure that a proxy has the same problem as a VPN (re: datagrams) - the traffic sent from a proxy would not contain data about the source (attribution is handled internally, often by using port translation). Is your focus VPN, or proxy, too? – schroeder May 26 '15 at 23:25
  • @schroeder honestly I thought that the packets were handled the same. I assumed that a proxy was just another node to jump from in order to bypass local firewalls but in the end it still returned it to you, so it must have the source in there somewhere. – DotNetRussell May 26 '15 at 23:27
  • "datagrams contain ... the source IP address" - but when you use VPN you are assigned an IP address by the VPN provider so it can only be traced to you with the VPN provider's help. – Neil Smithline May 26 '15 at 23:47
  • @NeilSmithline then how does the packet know to return to your machine and not just stop at the VPN exit portal? – DotNetRussell May 26 '15 at 23:53
  • The VPN provider handles that. I assume it's in the VPN protocol but I don't know the details. – Neil Smithline May 26 '15 at 23:55
  • @NeilSmithline Thanks for the additional info. There is so much top layer info out there but not a lot of the nitty gritty details. Everyone knows something but not many know how things really work. – DotNetRussell May 26 '15 at 23:57
  • 2
    You may wish to read about browser fingerprinting and other tricks to trace you http://spectrum.ieee.org/computing/software/browser-fingerprinting-and-the-onlinetracking-arms-race – Neil Smithline May 26 '15 at 23:59
  • 1
    @AnthonyRussell: Actually, Neil's answer of "the VPN provider/server handles that" is also high level info that doesn't really tell you anything. How they handle it is that each packet contains 3 things: source IP, destination IP and ID. The VPN server keeps track of which ID is mapped to which machine (private or secret IP) on the internal network and rewrites the source/destination IP before sending the packet in or out. – slebetman May 27 '15 at 02:39

5 Answers

15

Is it reasonable to say that any server I request data from, regardless of whether I am behind Tor, a VPN, etc., can in fact find my source IP in the header of the datagram, and thus know the true origin of the request?

No, not on IPv4. A virtual private network works by setting up a tunnel between you and the VPN provider so that you can route through a private network, using one of the 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16 address spaces. This is why it is a "private network". The virtual bit comes from the fact that you are actually routing over the public internet.
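As a minimal sketch, the private IPv4 blocks (defined in RFC 1918) can be checked with Python's standard `ipaddress` module. The helper name `is_rfc1918` is made up for illustration; the sample address 10.0.9.14 is the tunnel address from the `ifconfig` output further down.

```python
import ipaddress

# The three RFC 1918 private IPv4 blocks.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr: str) -> bool:
    """Return True if addr falls inside one of the RFC 1918 blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in block for block in RFC1918_BLOCKS)


if __name__ == "__main__":
    for sample in ("10.0.9.14", "172.31.255.255", "172.32.0.1", "8.8.8.8"):
        print(sample, is_rfc1918(sample))
```

Addresses inside these blocks are never routed on the public internet, which is why something at the edge of the private network has to translate them.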

Your access to the internet is then "proxied" in the same way a home router does it, i.e. via network address translation (NAT). Since there is no route to a private IP address, unlike a public one, the public-facing router must take responsibility both for forwarding packets out to the internet and for tracking where the replies should go.

One way to do this is via source ports, but it is not the only way. In this case the source port of the packet and hence the port on which replies are received tells the router which private network host to send the packet to.
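To make the source-port idea concrete, here is a toy translation table. It is only a sketch under simplifying assumptions: the class name `PortNat`, the starting port 40000 and the public address 198.51.100.7 (a documentation address) are all invented for the example, and a real NAT also tracks protocol, remote endpoint and timeouts.

```python
import itertools


class PortNat:
    """Toy source-port NAT: maps (private_ip, private_port) <-> public port."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._out = {}  # (private_ip, private_port) -> public_port
        self._in = {}   # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port):
        """Rewrite the source of an outgoing packet; allocate a port if new."""
        key = (src_ip, src_port)
        if key not in self._out:
            port = next(self._next_port)
            self._out[key] = port
            self._in[port] = key
        return self.public_ip, self._out[key]

    def inbound(self, dst_port):
        """Find which private host a reply belongs to, or None if unknown."""
        return self._in.get(dst_port)


if __name__ == "__main__":
    nat = PortNat("198.51.100.7")
    print(nat.outbound("10.0.9.14", 51000))  # what the remote server sees
    print(nat.inbound(40000))                # where the reply is sent back
```

The remote server only ever sees the public address and the allocated port; the mapping back to the private host lives solely in the gateway's table.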

As such, you will "appear" to have the IP of the exit gateway of the VPN to servers that see your requests.

Tor circuits, simplified, are like the gateway process times 3. Each step knows who to send on to and who to return to, but not anything else, with each taking responsibility for routing. So, between you and the internet you have a tor entry server, a relay server and a tor exit node. The relay server for example knows to relay packets from the entry server to the exit node, but no more - it does not know the eventual destination of the packet which only the exit will know, or the source, which only the entry knows.
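A rough way to picture the "each step only knows the next step" property is nested envelopes. The sketch below is purely illustrative: the functions `wrap` and `peel` and the node names are invented, and plain dictionaries stand in for the layered encryption that real Tor uses, so nothing here is actually hidden from anyone.

```python
def wrap(payload, destination, path):
    """Build nested envelopes for a circuit; path is [entry, relay, exit].

    The returned envelope is the one handed to path[0].
    """
    # Innermost layer, opened by the exit node.
    message = {"next": destination, "data": payload}
    # Wrap outwards so each earlier hop only learns the hop after it.
    for hop in reversed(path[1:]):
        message = {"next": hop, "data": message}
    return message


def peel(envelope):
    """What a node does: read where to forward, and what to forward."""
    return envelope["next"], envelope["data"]


if __name__ == "__main__":
    onion = wrap("GET / HTTP/1.1", "example.org", ["entry", "relay", "exit"])
    hop, inner = peel(onion)   # entry learns only "relay"
    print(hop)
    hop, inner = peel(inner)   # relay learns only "exit"
    print(hop)
    hop, inner = peel(inner)   # exit learns the real destination
    print(hop, inner)
```

In the real network each layer is encrypted to a different node, so the entry cannot look inside the envelope it forwards.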

This is completely opposed to conventional routing, where a packet on the public internet explicitly carries a source and destination IP. Routers do not rewrite these; instead they simply try to calculate the most efficient "next hop" (directly connected hardware to which they can send the packet). Multiple such steps happen in a typical internet communication (see traceroute), and since traffic between Tor nodes is itself ordinary internet communication, those packets will also hop over multiple routers, taking varied routes through the internet with a known source and destination (which, crucially, are only the relays).

I've written a lot of text, so perhaps it's worth recapping, step by step, what happens to a packet in the VPN scenario. Let's say I'm sending an HTTP GET request.

  1. I connect to VPN. The process of doing this preserves my route for VPN traffic but otherwise creates a default route via the VPN gateway and makes me part of their private network. Examples from my VPN provider:

    $ ifconfig
    tun0: flags=4305<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 10.0.9.14  netmask 255.255.255.255  destination 10.0.9.13
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 100  (UNSPEC)
        RX packets 48  bytes 23132 (22.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 55  bytes 12215 (11.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    $ ip -4 route
    0.0.0.0/1 via 10.0.9.13 dev tun0 
    default via 10.0.0.1 dev wlp2s0  proto static  metric 1024 
    10.0.0.0/16 dev wlp2s0  proto kernel  scope link  src 10.0.0.109 
    10.0.9.1 via 10.0.9.13 dev tun0 
    10.0.9.13 dev tun0  proto kernel  scope link  src 10.0.9.14 
    notareal.ip.200.165 via 10.0.0.1 dev wlp2s0 
    128.0.0.0/1 via 10.0.9.13 dev tun0 
    

    If you look at this carefully you'll notice some strangeness. The default route remains in place, but 0.0.0.0/1 and 128.0.0.0/1, which together cover all of IPv4, are more specific than the default route and therefore take precedence, so that traffic is sent via tun0. The exception is the notareal.ip line: that host route is more specific still, and it exists so that the encrypted VPN traffic itself can still go via the original interface.

  2. I craft a packet and send it. According to the scheme above, when I say I want to send this packet to stackexchange.com, my routing table instructs me to forward it to the VPN gateway because that's the most specific matching route.

  3. Internally, the VPN software does its crypto thing and writes a packet to the VPN gateway. This too is looked up in the routing table and goes via the actual network. It is unwrapped at the other end.
  4. At the other end, the packet hits the gateway and is NAT'd, or proxied, or handled by whatever scheme your provider uses. It then leaves for the public internet.
  5. When a reply comes in over an established TCP connection, the reverse process happens.
  6. UDP is connectionless anyway; unsolicited inbound UDP and outside-initiated TCP connections are not possible because there are no public routes to me.

In essence, this is quite like your home network NAT, except virtual.

I am using OpenVPN in this case; exactly what you see if you try these steps depends on your provider, VPN software etc., but the steps are essentially the same.
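The route selection in the steps above can be sketched as a longest-prefix match over the routing table shown earlier. This is a simplified model, not how the kernel implements it: metrics are ignored, `pick_interface` is a made-up helper, and 203.0.113.165 (a documentation address) is a placeholder for the elided VPN server address on the notareal.ip line.

```python
import ipaddress

# (prefix, interface) pairs modelled on the `ip -4 route` output above.
# 203.0.113.165 is a placeholder for the elided VPN server address.
ROUTES = [
    ("0.0.0.0/0", "wlp2s0"),         # original default route
    ("0.0.0.0/1", "tun0"),           # lower half of IPv4 via the tunnel
    ("128.0.0.0/1", "tun0"),         # upper half of IPv4 via the tunnel
    ("10.0.0.0/16", "wlp2s0"),       # local network
    ("10.0.9.1/32", "tun0"),
    ("10.0.9.13/32", "tun0"),
    ("203.0.113.165/32", "wlp2s0"),  # VPN server itself, outside the tunnel
]


def pick_interface(dest):
    """Return the interface of the most specific route matching dest."""
    ip = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), dev)
        for prefix, dev in ROUTES
        if ip in ipaddress.ip_network(prefix)
    ]
    # Longest prefix wins; the /0 default matches everything as a fallback.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


if __name__ == "__main__":
    for dest in ("151.101.1.69", "203.0.113.165", "10.0.0.1"):
        print(dest, "->", pick_interface(dest))
```

General internet destinations resolve to tun0, while the VPN server's own address and the local network still resolve to the physical interface.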


Now to answer your question "do VPNs really mask your identity?" - well, there are a lot of ifs and buts and ummms to that, but here's the simplest breakdown I can think of:

  1. If you are a server operator and someone is using a VPN, it's unlikely you'll be able to identify them. You'll see traffic from the VPN provider easily enough, but you won't know who it came from originally unless:
  2. You are a server operator with some kind of authentication mechanism or the ability to store persistent "tracking" cookies on the target. In the former case, you might even know who they are; in the latter case you know they are a unique individual and you won't know exactly who. However:
  3. The VPN provider sees all: the DNS requests you make, where your packets are going etc. In this sense they are now your ISP. Even if you paid them anonymously and gave them details of a completely segregated persona, they still know where your packets come from. The upshot of this is that there is a link to your true identity, or at least to whoever pays the internet bill.

Thus, if you give law enforcement sufficient motive, they'll find you easily enough.


IPv6 is an interesting area. I don't know many people who routinely route ipv6 over VPN mostly because the only person I can name with IPv6 connectivity is me... but, it can be done. IPv6 does not have the concept of private address spaces (slight lie, they do exist) as everything is supposed to be world-routable. Indeed some people believe that NAT breaks the original conceptual design of the internet as a single globally routable network.

Anyway, you can read more about it here - you get IPv6 routable blocks assigned to you instead of private ranges.

If you're wondering, though, publicly routable doesn't necessarily mean packets will be accepted, it just means they could be directed there. There can still be firewalls blocking all incoming connections.
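For the curious, the "slight lie" about IPv6 private space refers to unique local addresses (fc00::/7). A small sketch with Python's `ipaddress` module shows the distinction; the helper name `is_unique_local` and the sample addresses are chosen only for illustration.

```python
import ipaddress

# Unique local addresses: the IPv6 counterpart of private ranges.
ULA = ipaddress.ip_network("fc00::/7")


def is_unique_local(addr):
    """Return True if addr is an IPv6 unique local address."""
    return ipaddress.ip_address(addr) in ULA


if __name__ == "__main__":
    for sample in ("fd12:3456:789a::1", "2606:4700:4700::1111"):
        ip = ipaddress.ip_address(sample)
        print(sample, "unique local:", is_unique_local(sample),
              "global:", ip.is_global)
```

Being globally routable says nothing about whether a firewall in front of the host will accept the packet.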


Edit: as an interesting educational exercise, try listening for traffic with wireshark on the VPN interface (in my case tun0) versus the actual hardware interface :)

6

VPNs do not mask your identity and even BTGuard doesn't suggest that it does. VPNs encrypt your traffic to them. This hides the content of your traffic along that link only. This is good and it is one brick in the wall.

A proxy is the other important factor. BTGuard, for instance, offers a proxy service for anonymity. A proxy takes your traffic and re-transmits it to the destination. One of the ways it can do this is to resend the traffic with itself as the source and with a changed source port. It maintains an internal table of your connection along with the new source port. This way, anyone on the other side of the proxy can only conclude that the proxy is the originator. The proxy receives traffic from the destination and retransmits it to you on the original source port.

Both technologies are important and can be effective out of the box. What you have not asked (and what has been tackled on this site in various ways) is how this effective one-two punch can be defeated by tracking mechanisms, misconfiguration, and errors.

schroeder
  • If I understand this correctly the proxy functions to anonymize the user from the VPN provider? What other benefits are there to use proxy with a non-logging vpn provider? – Manumit May 27 '15 at 01:15
1

You are correct saying you can be identified.

Please have a look at the FREAK vulnerability: http://en.wikipedia.org/wiki/FREAK

I would say it's possible to get caught doing something bad or confidential even with a VPN. Other backdoors may still be present in, or be introduced into, encryption technologies.

Martin
1

If the VPN doesn't alter the headers of your browser requests, then it's possible to identify your browser because of its fingerprint (you can test it with Panopticlick) or supercookies. Your personal identity is not revealed, but it's theoretically possible to find which pages you (more precisely your computer) visited by looking at the server logs. It can be a way to differentiate 2 users connecting from the same VPN with the same IP address. It's possible to modify headers sent by the browser in order to mimic more frequent fingerprints.
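A very reduced sketch of the idea: hash a handful of request headers into an identifier that stays the same regardless of the source IP. The header selection and the function name `header_fingerprint` are assumptions made for this example; real fingerprinting (as tested by Panopticlick) combines many more signals such as fonts, plugins and screen size.

```python
import hashlib

# Headers used for this toy fingerprint (an arbitrary illustrative choice).
FINGERPRINT_HEADERS = ("User-Agent", "Accept", "Accept-Language", "Accept-Encoding")


def header_fingerprint(headers):
    """Return a short, stable identifier derived from selected headers."""
    parts = [f"{name}={headers.get(name, '')}" for name in FINGERPRINT_HEADERS]
    digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]


if __name__ == "__main__":
    user_a = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)", "Accept-Language": "en-GB"}
    user_b = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1)", "Accept-Language": "fr-FR"}
    # Both may arrive from the same VPN exit IP, yet the values differ.
    print(header_fingerprint(user_a))
    print(header_fingerprint(user_b))
```

A server that logs this value next to each request can tell two users apart even when both share one VPN exit address.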

A.L
0

You are right, with enough resources and tracking capabilities you could track someone using a VPN to the user's location.

This does require a government level of resources, since you would need to tap all packets and then re-match the packets coming out of the VPN. This does not mean they can read the contents yet, but that's not always needed for a government's purposes.

The main reason they are probably not doing this for everyone is the sheer computing power it would require; using it on a limited set of targets is much more effective. Then again, they could be doing it and we simply do not know about it.
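As a toy illustration of the re-matching idea, packet timestamps observed leaving the VPN can be lined up against timestamps observed entering it from each subscriber. Everything here is hypothetical: the function `correlate`, the 50 ms delay window and the sample data are invented, and real traffic analysis is far noisier and statistical in nature.

```python
def correlate(exit_times, candidates, max_delay=0.05):
    """Return the candidate whose packet times best line up with exit_times.

    exit_times: timestamps (seconds) of packets leaving the VPN.
    candidates: dict of name -> timestamps of packets entering the VPN.
    """
    def score(entry_times):
        # Count exit packets that follow some entry packet within max_delay.
        return sum(
            any(0 <= out - t <= max_delay for t in entry_times)
            for out in exit_times
        )

    return max(candidates, key=lambda name: score(candidates[name]))


if __name__ == "__main__":
    seen_leaving = [1.02, 2.03, 5.01]
    seen_entering = {
        "alice": [1.00, 2.00, 5.00],
        "bob": [1.50, 3.20, 4.10],
    }
    print(correlate(seen_leaving, seen_entering))
```

Note that this needs no access to packet contents, only timing on both sides of the VPN, which is why encryption alone does not prevent it.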

proteus
LvB
  • 1
    it looks like the OP isn't concerned with the content, but the attribution - if so, the computing power reduces significantly – schroeder May 26 '15 at 23:22
  • True. But to use this level of packet inspection you need massive bandwidth. More than a casual researcher normally has. – LvB May 26 '15 at 23:30
  • If you've got government-level resources, you don't need to do packet tracking. You simply get a court order telling the VPN company to turn over the appropriate subscriber records. – Mark May 26 '15 at 23:46
  • That is if you want the Vpn to know. You use tracking when you don't want them to know. – LvB May 26 '15 at 23:48
  • 2
    Where are court orders going to lead with non-logging VPN companies? – Manumit May 27 '15 at 01:00