To bypass censorship in a DIY fashion, I plan to set up private proxies for my personal use. One of the possible methods I would try is to use an HTTPS proxy (not a web proxy, to be explicit) hosted on an overseas server which is not censored by my ISP.

However, what I have read about the differences between HTTP and HTTPS proxies did not resolve my concerns about the encryption of traffic between the proxy client and the proxy server. This is critical to successfully mitigating the censorship by [c_____ed].

According to this diagram, fetching an HTTP URI over an HTTPS proxy is no different from doing so over an ordinary HTTP proxy, except that an SSL/TLS handshake is performed before the HTTP GET request, which is itself unencrypted. If this is the case, communication with the proxy would almost certainly be reset, because URL filtering and DPI have long been implemented where I live. The use of HTTPS proxies would then be highly limited: if a site served over HTTP has a "Share to ..." button containing a URL pointing to, say, YouTube or Facebook, this can be detected and communication with the proxy server is then disrupted.

In addition, I have seen that the proxy settings page of the mainstream browsers lets users specify different proxies for HTTP and HTTPS. Does this imply that HTTP requests over HTTP and/or HTTPS proxies can never be encrypted, at least in defined, conventional ways?

I will only set out to build an HTTPS proxy if all traffic between the proxy client and the proxy server, whether HTTP or HTTPS, is definitely encrypted, except for the initial TCP connection and handshakes.

I look forward to clarification on this issue.

Gilles 'SO- stop being evil'
Yann Ren
  • What's "c_____ship" ? – Joel L Jun 18 '14 at 10:47
  • I don't know whether my government would prosecute me if I typed every character of the word explicitly, so I have to type it in an implicit manner. It is a common practice carried out by the government of the economic powerhouse in East Asia on its Internet connections. – Yann Ren Jun 18 '14 at 10:52
  • I edited your question (pending review) to actually include the word "censorship". (You also include it in your profile at http://security.stackexchange.com/users/49537/yann-ren) – Joel L Jun 18 '14 at 11:00
  • The worst thing about censorship is ██████ █████ ██ █████. By the way, when you want to avoid eavesdropping, you might want to use [the HTTPS version of this site](https://security.stackexchange.com). – Philipp Sep 16 '14 at 15:06

1 Answer

It may be simpler to see it in stages. First, in a whole-HTTP world (no SSL whatsoever), an HTTP request is a collection of headers, indicating the target URL, and sent over a TCP connection (usually on port 80). The request headers begin with a "verb" which is usually GET or POST.

When there is a proxy, the request is sent to the proxy; the proxy then opens the connection to the target server (or another proxy) and forwards the request over that new connection. The response follows the same path, backwards.
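To make the difference concrete, here is a minimal sketch of the raw text a client would write on the TCP connection in each case. The function names and `www.example.com` are placeholders made up for this illustration, and real browsers send many more headers.

```python
def direct_request(host, path="/"):
    """Request sent straight to the target server: only the path appears on
    the request line, and the server name goes in the Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def proxied_request(host, path="/"):
    """Request sent to a plain HTTP proxy: the request line carries the full
    target URL, so the proxy knows where to forward the request."""
    return (
        f"GET http://{host}{path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
```

Calling `proxied_request("www.example.com")` and printing the decoded result shows that, without any SSL, the complete target URL travels in clear text between client and proxy.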

Now enter SSL. SSL is coupled with HTTP in the following way: once the TCP connection is established, an SSL handshake is performed, establishing an encrypted tunnel between client and server. The HTTP request is then sent within this tunnel.

When there is a proxy, a client wishing to talk to an SSL-powered server will send a CONNECT request. The request identifies the target server name and port. The proxy then connects to that server (TCP) and forwards bytes back and forth. The SSL handshake occurs between the client and the server; the proxy is kept "on the outside". The proxy (and eavesdroppers on the line between client and proxy, and between proxy and server) can still see the handshake messages and thus know which server is being contacted (in particular through the Server Name Indication, and also the certificate sent back by the server).
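For illustration, a CONNECT exchange can be sketched as below. This is only a minimal sketch: the helper names are invented for this example, `www.example.com` is a placeholder, and a real client would also handle proxy authentication and extra response headers.

```python
def connect_request(host, port=443):
    """Text the client sends to the proxy to ask for a raw byte tunnel
    to host:port. Note that the target name is right there on the first line."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(response_head):
    """Return True if the proxy's status line carries a 2xx status code,
    meaning the proxy will now forward bytes in both directions."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)
    return len(parts) >= 2 and len(parts[1]) == 3 and parts[1][:1] == b"2"
```

After a 2xx reply, everything the client writes on that connection is relayed untouched, so the SSL handshake with the target server starts right there. If the client-to-proxy connection is not itself encrypted, the `CONNECT www.example.com:443` line is readable by anyone on that link.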

It so happens that the proxy itself may also be an SSL server. In that case, the communications between the client and the proxy are themselves encapsulated in an SSL tunnel. When the client-to-proxy connection is SSL-protected, eavesdroppers on the line between client and proxy can know that the client is talking to the proxy, but cannot see the requests and responses. In particular, they cannot know the identity of the ultimate target server.

The two SSL layers are independent of each other. This causes considerable confusion, because when people say "HTTPS proxy", they may mean two different things:

  • a proxy which knows the "CONNECT" verb and is able to forward connections to an ultimate SSL-powered target server;
  • a proxy which is itself an SSL server and will engage in SSL with the client, to protect requests and responses when they transit between client and proxy.

The two characteristics are orthogonal to each other, meaning that you may get one, the other, or both. If you have both then the client actually manages two SSL connections, one nested into the other.
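The combinations can be summarized as a small lookup. This is just a restatement of the paragraphs above in code form, under the same assumptions (in particular, that the target's handshake exposes the Server Name Indication and certificate); the function name and the wording of the labels are made up for this sketch.

```python
def seen_between_client_and_proxy(proxy_uses_ssl, target_uses_ssl):
    """What an eavesdropper on the client-to-proxy link can observe,
    depending on which of the two independent SSL layers are in use."""
    if proxy_uses_ssl:
        # Everything, including CONNECT lines and plain HTTP requests,
        # travels inside the SSL tunnel to the proxy.
        return ["proxy address"]
    if target_uses_ssl:
        # CONNECT is sent in clear text, then the handshake with the
        # target server passes through unprotected.
        return [
            "proxy address",
            "target host and port (CONNECT line)",
            "target handshake (SNI, certificate)",
        ]
    # Plain HTTP through a plain proxy: nothing is hidden.
    return ["proxy address", "full target URL", "request and response contents"]
```

For example, `seen_between_client_and_proxy(True, False)` returns only the proxy address, which is the case the question cares about: plain HTTP sites fetched through a proxy that is itself an SSL server.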


If you want to defeat eavesdroppers on the line between client and proxy, and the goal of the attackers is to guess the identity of the sites that you are talking to, then you need an SSL-powered proxy, i.e. one that supports SSL by itself; and you will also want a proxy which supports CONNECT, so that you may browse both HTTP and HTTPS Web sites under the protection of your SSL-powered proxy.
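As a rough sketch of what "SSL with the proxy itself" means on the client side, the following uses only Python's standard `socket` and `ssl` modules. The proxy host, port, and target name passed in are whatever you supply (nothing here refers to a real server), and the helper names are invented for this example. It assumes the proxy presents a certificate that the system trust store accepts.

```python
import socket
import ssl


def build_forward_request(target_host, path="/"):
    """Plain-HTTP request in the form a proxy expects (full URL on line 1)."""
    return (
        f"GET http://{target_host}{path} HTTP/1.1\r\n"
        f"Host: {target_host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_http_via_ssl_proxy(proxy_host, proxy_port, target_host, path="/", timeout=10.0):
    """Fetch a plain HTTP page through a proxy that is itself an SSL server.
    The request is written into the SSL tunnel, so it is encrypted on the
    client-to-proxy link even though the target site does not use SSL."""
    context = ssl.create_default_context()  # verifies the proxy's certificate
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as raw:
        # The SSL handshake here is with the proxy, not with the target server.
        with context.wrap_socket(raw, server_hostname=proxy_host) as tunnel:
            tunnel.sendall(build_forward_request(target_host, path))
            chunks = []
            while True:
                data = tunnel.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

An observer between you and the proxy would see a TCP connection and an SSL handshake with the proxy, followed by encrypted records; the `GET http://...` line never appears in clear text on that link. Between the proxy and the target site the request is of course plain HTTP again.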

Since a given proxy may or may not support a specific protocol (at least conceptually, a proxy may refuse to understand a "CONNECT" request), Web browsers can be configured to use different proxies based on the kind of protocol they want to use. In your case, this only adds to the confusion.

An alternative is to use a SOCKS proxy. This is easy to set up with SSH. With an SSH-powered SOCKS proxy, all the communications emanating from your browser will go through an SSH tunnel between your client machine and the proxy server.

Tom Leek
  • Does the proxy decrypt the requests, or does it always just pass them through? – DanMan Nov 17 '14 at 09:45
  • +1 for a clear answer. Though, one point: When using an HTTPS proxy (not SOCKS) will the browser do the DNS lookup through its ISP, or will it delegate that to the proxy? If it does it through the ISP then if `the goal of the attackers is to guess the identity of the sites that you are talking to` - even an HTTPS proxy won't help. – ispiro May 14 '15 at 21:22
  • The browser only resolves the DNS name of the proxy. It then just sends the FQDN to the proxy and lets the proxy resolve it: "GET https ://www.example.com HTTP/1.1" or "CONNECT www.example.com:443 HTTP/1.1" (had to use extra space in URI to prevent SE's mangling) – Van Jone Apr 03 '17 at 19:38
  • More detailed description is here: http://serverfault.com/questions/169816/how-dns-lookups-work-when-using-an-http-proxy-or-not-in-ie – Van Jone Apr 03 '17 at 19:44