Detecting Tor proxy by reading request headers

Question

I am a newbie to the field of security. As I was trying to explore more about HTTP request and response headers, I came across this website which provides a detailed analysis of the request header that is sent by our system to its server. According to the website, by examining the Via or X-Forwarded-For header, one can conclude whether the client is using a proxy or not. When I visited the website without using any proxy, it showed both of these headers as not present (which was expected). But, when I visited this website again, this time using the Tor Browser, still the two headers were not present! No sign of proxy at all.

I have a very vague idea about the working of Tor, but I am not sure why the headers are unable to detect the proxy when using the Tor Browser! Can anyone explain the reason for the same in layman's terms? Also, if Tor offers this level of anonymity, is there any other way to detect a Tor proxy?

for a discussion on how tor works you may want to see http://security.stackexchange.com/q/36571/21234 — Shurmajee, Jul 06 '13 at 11:35
@qbi - I appreciate the edit to change "TOR" with "Tor", but to be fair it used to be written as an acronym TOR of The Onion Router, however, according to Wiki: _"...the current project no longer considers the name to be an acronym, and therefore does not use capital letters"_. Just thought to mention this, as I couldn't comment on your suggested edit I was approving (only possible on rejection). We see the old acronym still being used by many. ;) — TildalWave, Jul 07 '13 at 20:21
Well, the [FAQ entry](https://www.torproject.org/docs/faq#WhyCalledTor) makes it perfectly clear: »Tor is not spelled "TOR".« — qbi, Jul 07 '13 at 22:18
Yeah. It's clear now. Thanks! Man, there's a lot more out there to learn about Tor. :P — Rahil Arora, Jul 07 '13 at 23:09
Just for your future reference, this is a much more complete proxy check website: http://whatismyipaddress.com/proxy-check — Gaia, Oct 03 '13 at 03:18
@Gaia Tried the above website using TOR. It was able to detect the proxy using wimia test, but failed the Tor test! — Rahil Arora, Oct 03 '13 at 03:51
Other people used the same test while using the same Tor exit node you are. WIMIA works by "using a non-cookie, non-javascript method to attempt to detect multiple users of the same IP address. Consequently it can give a false positive for people in a multi-user environment. We're working to find the correct threshold." WIMIA is proprietary to Whatismyipaddress.com — Gaia, Oct 03 '13 at 03:59
related: http://stackoverflow.com/questions/9780038/is-it-possible-to-block-tor-users — Ciro Santilli OurBigBook.com, Dec 25 '15 at 10:27

TildalWave · Accepted Answer · 2013-07-06T23:23:59.610

Tor relays your traffic as TCP streams, so from the web server's point of view it behaves like an anonymous, transparent proxy: it does not attach typical proxy headers (such as Via or X-Forwarded-For), nor does it modify HTTP requests or responses in any other way (besides their being "onion routed", i.e. encrypted and decrypted hop by hop, through the Tor network).
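To make the point concrete, here is a minimal sketch of the kind of header check a proxy-detection page might perform. The function name and the sample header values are hypothetical and only illustrate why such a check finds nothing for a request that arrives from a Tor exit node.

```python
# Headers that conventional forwarding proxies commonly add.
PROXY_HEADERS = ("Via", "X-Forwarded-For", "Forwarded")


def proxy_headers_present(headers):
    """Return the proxy-related headers found in a request's header mapping.

    Header names are compared case-insensitively, as HTTP requires.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        name: lowered[name.lower()]
        for name in PROXY_HEADERS
        if name.lower() in lowered
    }


# Illustrative request relayed by a conventional forwarding proxy.
via_proxy = {
    "Host": "example.com",
    "Via": "1.1 proxy.example",
    "X-Forwarded-For": "203.0.113.7",
}

# Illustrative request arriving from a Tor exit node: it looks like a
# direct request, because nothing on the path added proxy headers.
via_tor = {"Host": "example.com", "User-Agent": "Mozilla/5.0"}

print(proxy_headers_present(via_proxy))
print(proxy_headers_present(via_tor))
```

The second call returns an empty mapping, which is exactly what the header-analysis website reported.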

As for identifying clients connecting through the Tor network, the easiest way to detect such clients on the web server end is to query the public TorDNSEL service, which publishes Tor exit nodes:

TorDNSEL is an active testing, DNS-based list of Tor exit nodes. Since Tor supports exit policies, a network service's Tor exit list is a function of its IP address and port. Unlike with traditional DNSxLs, services need to provide that information in their queries.

Previous DNSELs scraped Tor's network directory for exit node IP addresses, but this method fails to list nodes that don't advertise their exit address in the directory. TorDNSEL actively tests through these nodes to provide a more accurate list.

This TorDNSEL querying can be automated e.g. in your web application, and example code in many programming languages can be found on the Internet. For example, here is some sample code demonstrating how to do that in PHP.
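As a rough Python sketch of such a lookup: DNS-based lists are usually queried by reversing the octets of the address and appending the list's zone. The zone name `dnsel.torproject.org`, the simplified query form (client address only, without the port and server address the description above mentions), and the convention that a `127.0.0.x` answer means "listed" are assumptions here; check the current TorDNSEL documentation before relying on them. The function names are hypothetical.

```python
import ipaddress
import socket

# Assumption: the zone under which the exit list is published.
DNSEL_ZONE = "dnsel.torproject.org"


def dnsel_query_name(client_ip, zone=DNSEL_ZONE):
    """Build the DNS name to query for an IPv4 client address."""
    # Validates the address and normalises it before reversing the octets.
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_tor_exit(client_ip, resolve=socket.gethostbyname):
    """Return True if the list appears to contain client_ip.

    `resolve` is injectable so the logic can be exercised without network
    access.
    """
    try:
        answer = resolve(dnsel_query_name(client_ip))
    except socket.gaierror:
        # No such record. Other resolution failures land here too, and
        # this sketch treats them all as "not listed".
        return False
    # Assumption: any 127.0.0.x answer signals a listed exit node.
    return answer.startswith("127.0.0.")
```

For example, `dnsel_query_name("1.2.3.4")` yields `4.3.2.1.dnsel.torproject.org`, and `is_tor_exit(remote_addr)` could then be called with the address your web server sees for the client.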

If you're going to implement this Tor check in your web application, I recommend caching query results locally for as long as it's reasonable to expect that the exit nodes haven't changed, so that you don't constantly repeat the same queries and add extra lag to your responses.
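One possible shape for such a cache, as a minimal in-memory sketch: each result is kept for a fixed time-to-live and recomputed only after it expires. The class name and the choice of TTL are hypothetical; a real deployment might prefer a shared cache so that all worker processes benefit.

```python
import time


class TtlCache:
    """Cache lookup results per key for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute(key) on a miss."""
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute(key)
        self._entries[key] = (now + self.ttl, value)
        return value
```

Usage would look like `cache = TtlCache(3600)` followed by `cache.get_or_compute(remote_addr, lookup)`, where `lookup` is whatever function performs the actual exit-node query. Negative results are cached too, which keeps ordinary visitors from triggering a DNS query on every request.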

Edit to add: One more way to optimize this Tor exit node querying and avoid using TorDNSEL all the time is to do a reverse DNS lookup beforehand and try to match the result against a list of major known Tor exit node hosts. This can actually be quite effective, as a lot of major exit node hosts never change, and they can operate a large number of exit nodes that all use the same or similar rDNS names. For example, you could try matching rDNS names against your list using regular expressions, the SQL LIKE operator, or similar. Some of the known Tor exit node hosts (real examples) will match these names:

tor[0-9].*
tor-exit*
*.torservers.*
*.torland.is

This is the list that I'm using. As you see, it's far from being complete, but it is a start and you can always add more entries as you detect them to follow an easily matched pattern. As it is meant to merely optimize querying, it doesn't really need to be complete, but each match will most certainly speed things up. Hope this helps!
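A minimal Python sketch of this pre-check, assuming the four patterns above are read as shell-style wildcards (which `fnmatch` understands, including the `[0-9]` character class). The function names are hypothetical, and a miss only means "no cheap match", so the caller would still fall back to TorDNSEL.

```python
import fnmatch
import socket

# The patterns listed above, interpreted as shell-style wildcards.
EXIT_HOST_PATTERNS = ["tor[0-9].*", "tor-exit*", "*.torservers.*", "*.torland.is"]


def matches_known_exit_host(hostname, patterns=EXIT_HOST_PATTERNS):
    """Return True if an rDNS name matches one of the known-host patterns."""
    # rDNS names may come back with a trailing dot or mixed case.
    name = hostname.rstrip(".").lower()
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


def looks_like_tor_exit(ip, reverse_lookup=socket.gethostbyaddr):
    """Cheap pre-check: reverse-resolve ip and match it against the list.

    A False result is inconclusive; query TorDNSEL in that case.
    """
    try:
        hostname = reverse_lookup(ip)[0]
    except (socket.herror, socket.gaierror):
        return False  # no PTR record or lookup failure
    return matches_known_exit_host(hostname)
```

Keep in mind that rDNS names are controlled by whoever runs the address block, so a match is a strong hint rather than proof.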

If the exit nodes don't change and are well known, how long until web services which choose to block proxies (for legal or whatever reasons) start blocking Tor traffic? I was under the impression Tor usage didn't advertise itself so blatantly... — Gaia, Oct 03 '13 at 03:16
@Gaia - Tough to judge, since they do change, but the majority of traffic actually goes through large hosts that might use clearly identifiable DNS names, like in those few examples listed. It would depend greatly on the ability of individual services to query TorDNSEL, but my guess would be you can filter out roughly 3/4 of Tor clients _cheaply_ by doing faster rDNS lookups, and the rest then through querying TorDNSEL. Implementations will vary in effectiveness and reaction time, though. It is however possible to do it real-time, with some smart local caching in place. — TildalWave, Oct 03 '13 at 03:31

score 8 · Answer 2 · answered Jul 06 '13 at 01:45

The X-Forwarded-For header is sent optionally (and purposefully) by proxies. If a proxy tries to hide the identity of its user, it won't send this header.

The Tor (onion routing) network is specifically designed to keep the user's identity hidden. It is built not to reveal the client's IP address, or any other data that could show the user's identity, to the destination server.

score 1 · Answer 3 · answered Oct 03 '13 at 03:07

A small note here:

While going through the Wikipedia article on proxy servers, I found out that even if a proxy server does not send header lines such as HTTP_VIA, HTTP_X_FORWARDED_FOR, or HTTP_FORWARDED, it is still possible for a website to suspect a proxy if the request sent by the client includes a cookie from a previous visit that did not use the high-anonymity proxy server.
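A rough sketch of how a site might act on that observation: remember the address first seen for each cookie, and flag later requests that present the same cookie from a different address. The function name and the in-memory dictionary are hypothetical, and a mismatch is only a hint, since a user who simply changed networks would trigger it as well.

```python
# Maps a cookie identifier to the first client address seen with it.
seen_ip_by_cookie = {}


def ip_changed_for_cookie(cookie_id, remote_ip, seen=seen_ip_by_cookie):
    """Return True if cookie_id was first seen from a different address."""
    # setdefault stores remote_ip on the first visit and returns the
    # previously stored address on later visits.
    previous = seen.setdefault(cookie_id, remote_ip)
    return previous != remote_ip
```

A first visit records the address and returns `False`; a later visit with the same cookie through a proxy or Tor exit node returns `True`.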

score 0 · Answer 4 · edited Jul 12 '16 at 11:24

You can detect it with the Firefox resource bundle. You just need some familiarity with JavaScript programming.

The resource:// URI scheme is used by Firefox to call on-disk resources from internal modules and extensions.

But some of these resources can also be included in any web page and executed via a script tag. Mozilla developers do not consider these resources a fingerprinting vector, despite the fact that some of them can reveal things that the user does not wish to reveal. For example, differences in the built-in preferences files clearly indicate whether you are using Windows, Linux or Mac, even if you're behind Tor.

Try this URL in the Tor browser: https://www.browserleaks.com/firefox

Although this was correct, `resource://` is no longer exposed in Tor browser. — forest, Jul 05 '18 at 22:08
