Can government track VPN traffic if it has control on both ISP and website server?

Question

Let's say I access a website called example.com with VPN.

Here's the flow: me → ISP → VPN → example.com

Do typical VPN providers use the same IP address to receive data from ISP and send data to the destination server (example.com)?

If that is the case, can the government identify me, by matching VPN server address and timestamp in ISP log and server log of example.com?

"typical VPN provider" is hard to assess. There are techniques to mix up the traffic of many clients so that correlation is almost infeasible, but I don't know of a market study testing many VPN providers for this feature. — Falco, May 21 '20 at 16:06

Steffen Ullrich · Accepted Answer · 2020-05-21T17:06:50.913

Do typical VPN providers use the same IP address to receive data from ISP and send data to the destination server (example.com)?

This does not matter much. VPNs have a severely limited number of entry and exit points. The protocols used are usually different from normal web traffic (e.g. OpenVPN, WireGuard, IPsec, ... have different communication patterns), and from the perspective of a server one can usually easily find out if the traffic comes from a VPN. This is actually often used to block such traffic when access should be restricted to specific geographic locations.

If that is the case, can the government identify me, by matching VPN server address and timestamp in ISP log and server log of example.com?

If somebody has access to both the VPN-protected traffic sent using the ISP and to the traffic leaving the VPN then it is likely possible to correlate incoming and outgoing traffic and based on this associate specific server traffic with a specific customer of the ISP.

But this is likely not possible based on the log files of the VPN and server alone, assuming that the VPN logs contain only the information needed for billing. If the VPN logs instead contain a detailed record of all outgoing connections and the associated VPN customer IP address, then these can be correlated with the server log. The VPN customer IP address can then be associated by the ISP with a specific ISP customer.
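To make the detailed-log case concrete, here is a minimal sketch of such a join. Everything in it is an assumption for illustration: the record layouts, the field names, the `match_requests` helper, and in particular the idea that both the VPN and the web server record the source port of each connection (many servers do not log it by default).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VpnLogEntry:          # hypothetical detailed VPN log record
    time: float             # seconds since some common epoch
    customer_ip: str        # address the customer connected from
    exit_port: int          # source port the VPN exit used for this connection

@dataclass(frozen=True)
class ServerLogEntry:       # hypothetical web server log record
    time: float
    src_ip: str
    src_port: int
    path: str

def match_requests(vpn_log, server_log, vpn_exit_ip, max_skew=2.0):
    """Pair server requests with VPN customers via exit port and timestamp."""
    matches = []
    for s in server_log:
        if s.src_ip != vpn_exit_ip:
            continue  # request did not come through this VPN exit
        for v in vpn_log:
            # Same exit port and close enough in time (clocks may differ a bit)
            if v.exit_port == s.src_port and abs(v.time - s.time) <= max_skew:
                matches.append((v.customer_ip, s.path))
    return matches
```

With billing-only logs there is no `exit_port` or per-connection timestamp to join on, which is why the log-based approach fails in that case.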

Note that it is likely not possible to identify the client based on a single request. But the more requests are made, the more likely it is that traffic patterns can be associated, even if some "noise" (e.g. small delays, a bit of fake traffic) was added by the VPN.
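A rough sketch of why the number of requests matters, under simplifying assumptions: the observer only has timestamps of each ISP customer's packets to the VPN server and timestamps of requests arriving at the website, and the VPN adds less than `window` seconds of delay. The function names, the data layout, and the scoring rule are all hypothetical; real traffic analysis would also use sizes and volumes.

```python
from bisect import bisect_left

def count_hits(customer_times, request_times, window=0.5):
    """Count server requests preceded by a customer packet within `window` seconds."""
    ts = sorted(customer_times)
    hits = 0
    for r in request_times:
        # First customer packet at or after (r - window)
        i = bisect_left(ts, r - window)
        if i < len(ts) and ts[i] <= r:
            hits += 1
    return hits

def rank_customers(isp_log, request_times, window=0.5):
    """isp_log maps a customer id to timestamps of packets sent to the VPN server."""
    scores = {c: count_hits(t, request_times, window) for c, t in isp_log.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

isp_log = {
    "alice": [10.0, 20.0, 30.0, 40.0],
    "bob": [10.1, 55.0, 70.0],
    "carol": [5.0, 20.1, 90.0],
}
print(rank_customers(isp_log, [10.3]))                    # one request: alice and bob tie
print(rank_customers(isp_log, [10.3, 20.3, 30.2, 40.1]))  # four requests: alice stands out
```

With one request, two customers fit equally well. With four, only one customer matches every time, and small random delays below `window` do not change that.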

I think it will be very hard to correlate typical web traffic, especially if the VPN introduces small random delays and compression, so size and timing are hard to correlate. If one does not stream or download a big file, it should be quite hard with a VPN server with enough traffic. — Falco, May 21 '20 at 16:09
@Falco: There are techniques which can make analysis harder but this was not the point of the question. It is about typical VPN, not about atypical ones. Tiny delays will not help, helpful delays are not tiny and will significantly impact the usability and therefore are not used in typical VPN. — Steffen Ullrich, May 21 '20 at 16:36
The size of the delays depends simply on the number of concurrent requests by different users. If there are not enough users, the VPN can also create fake requests to increase the number. As long as a big enough number of requests match the time bucket of my request, the delay can be small. — Falco, May 21 '20 at 16:48
@Falco: I think this depends on the actual problem description: if each request to the server from the VPN should be tracked back this is likely not possible. But if an agency suspects a specific user or small group of users, tracks down its traffic at the ISP level and has access to the server logs and does a correlation over many requests (i.e. not a single one) then the "noise" added by the VPN will not help much if enough requests are done. — Steffen Ullrich, May 21 '20 at 17:04

score 4 · Answer 2 · answered May 21 '20 at 15:08

When you say government, you are implying a certain level of privileged access. I am not sure how much of that you have in mind: your computer, your local ISP, the VPN provider, or the hosting provider for the website.

Here are some variations, assuming that the gov always has control over your local ISP (or can request sufficiently detailed logs and traffic captures):

gov has control over the VPN provider:

can the government identify me(...)?

Yes. They will be able to match your connection to the VPN, look into the VPN logs or traffic and match to the website.

gov has control over your computer:

can the government identify me(...)?

Yes, if they can also pull logs from the website owner, including session identifiers.

gov has control over the hosting provider of the website (sufficiently to analyse traffic):

can the government identify me(...)?

Not in a very deterministic way. If they can pull logs from the website, they could try to match slight peaks of traffic (a handful of KBs/MBs) to page hits. But without checking your computer and maybe matching session identifiers, this seems very circumstantial.
