What is the difference between "CONNECT" and "GET HTTPS"?

4

2

Before getting to the real question, let me explain how my project works: for the sake of simplicity, my proxy runs on my laptop, where the client (my browser) also runs; the remote server will be, for example, YouTube.

The client is connected to a specific port of the proxy thanks to the SwitchOmega plugin. When the client wants to connect to www.youtube.com, the proxy gets the following request:

CONNECT www.youtube.com:443 HTTP/1.1
Host: www.youtube.com:443
Proxy-Connection: keep-alive
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36

I was told that when a proxy gets a CONNECT request, it should open a TCP connection to the requested host:port, return a 200 response to the client, and then relay data in both directions until one side closes the connection.
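To make that description concrete, here is a minimal sketch of how a proxy might handle one CONNECT request in Python. The function names (parse_connect_target, pipe, handle_client) are hypothetical and not part of my actual project; the sketch assumes one thread per direction and does no authentication, access control, or idle timeouts.

```python
import socket
import threading


def parse_connect_target(request_line: str) -> tuple[str, int]:
    """Split 'CONNECT host:port HTTP/1.1' into (host, port)."""
    method, target, _version = request_line.split()
    if method.upper() != "CONNECT":
        raise ValueError(f"not a CONNECT request: {method}")
    # The port is whatever follows the last ':' (this also covers [IPv6]:port).
    host, _, port = target.rpartition(":")
    return host.strip("[]"), int(port)


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches EOF or a socket error occurs."""
    try:
        while chunk := src.recv(65536):
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream on to the peer
        except OSError:
            pass


def handle_client(client: socket.socket) -> None:
    """Serve a single CONNECT request on an already accepted client socket."""
    buf = b""
    while b"\r\n\r\n" not in buf:  # read until the end of the request headers
        data = client.recv(4096)
        if not data:
            client.close()
            return
        buf += data
    head, _, rest = buf.partition(b"\r\n\r\n")
    request_line = head.split(b"\r\n", 1)[0].decode("iso-8859-1")
    try:
        host, port = parse_connect_target(request_line)
        upstream = socket.create_connection((host, port), timeout=10)
    except (ValueError, OSError):
        client.sendall(b"HTTP/1.1 502 Bad Gateway\r\n\r\n")
        client.close()
        return
    upstream.settimeout(None)  # the timeout above was only for connecting
    client.sendall(b"HTTP/1.1 200 Connection Established\r\n\r\n")
    if rest:
        upstream.sendall(rest)  # bytes the client sent right after the headers
    # From here on the proxy only relays opaque bytes in both directions.
    back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
    back.start()
    pipe(client, upstream)
    back.join()
    client.close()
    upstream.close()
```

A listening loop would accept connections and call handle_client(conn) for each one, typically in its own thread. Note that after the 200 reply the proxy never parses the traffic again, so it never sees any GET line for an HTTPS site.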

With another plugin that tracks HTTP requests, HTTP Trace, I see a different request on my browser:

GET https://www.youtube.com/
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
and other data...

So why does my proxy receive CONNECT www.youtube.com:443 HTTP/1.1 while HTTP Trace shows GET https://www.youtube.com/? Do they mean the same thing?

elmazzun

Posted 2016-03-23T17:44:33.710

Reputation: 187

So... CONNECT is used to establish a tunnel connection, and GET https is sent by the client to ask the remote server for resources after the successful connection created by the CONNECT request? – elmazzun – 2016-03-23T17:53:27.583

CONNECT is used to establish a TCP/IP tunnel, but not necessarily an SSL connection, and GET simply retrieves resources. – elmazzun – 2016-03-23T19:06:48.230

So GET https means a previous CONNECT to port 443? – elmazzun – 2016-03-23T22:44:44.243

I parse CONNECT www.youtube.com:443 HTTP/1.1 and get the port number after :, that's why I said 443. – elmazzun – 2016-03-24T09:00:43.260

Answers

4

CONNECT sets up the connection.

CONNECT

The CONNECT method converts the request connection to a transparent TCP/IP tunnel, usually to facilitate SSL-encrypted communication (HTTPS) through an unencrypted HTTP proxy.

GET, by contrast, retrieves the data.

GET

The GET method requests a representation of the specified resource. Requests using GET should only retrieve data and should have no other effect. (This is also true of some other HTTP methods.) The W3C has published guidance principles on this distinction, saying, "Web application design should be informed by the above principles, but also by the relevant limitations."

Source - Hypertext Transfer Protocol

Ramhound

Posted 2016-03-23T17:44:33.710

Reputation: 28 517

0

I think you are dealing with a cosmetic issue.

GET https://www.youtube.com/ is most likely just what is logged to indicate that the fetching is done with GET, and the target is https://www.youtube.com.

There is no standardised way for a proxy to support GET https:// URIs. It was mooted a couple of years back at the IETF HTTP WG but discarded for various reasons (mainly trust issues with proxies, if I recall).

It is very unlikely to be the request sent to the proxy. As others have said, CONNECT is used to connect to www.youtube.com:443; the browser then sends a separate GET request through that tunnel (inside the TLS session), and that request does not contain the scheme (protocol) or authority (server:port etc) parts of the URI.

In your example it would be:

GET / HTTP/1.1
host: www.youtube.com:443
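The sequence above can be sketched from the client's side in Python. This is only an illustration under stated assumptions: the helper names (build_connect_request, build_tunneled_get, fetch_via_proxy) are hypothetical, the proxy address is whatever your proxy listens on, and a real browser sends many more headers.

```python
import socket
import ssl


def build_connect_request(host: str, port: int = 443) -> bytes:
    """What the proxy sees in plain text: only the host and port."""
    authority = f"{host}:{port}"
    return f"CONNECT {authority} HTTP/1.1\r\nHost: {authority}\r\n\r\n".encode("ascii")


def build_tunneled_get(host: str, path: str = "/") -> bytes:
    """What travels inside TLS: origin-form target, no scheme or authority."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode("ascii")


def fetch_via_proxy(proxy_addr: tuple[str, int], host: str, path: str = "/") -> bytes:
    """CONNECT through the proxy, start TLS end to end, then send the GET."""
    sock = socket.create_connection(proxy_addr, timeout=10)
    sock.sendall(build_connect_request(host))
    reply = b""
    while b"\r\n\r\n" not in reply:  # read the proxy's response headers
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("proxy closed the connection")
        reply += chunk
    status = reply.split(b"\r\n", 1)[0].split()
    if len(status) < 2 or not status[1].startswith(b"2"):
        raise ConnectionError(f"CONNECT refused: {reply[:80]!r}")
    # The TLS handshake is with the origin server; the proxy only relays bytes.
    tls = ssl.create_default_context().wrap_socket(sock, server_hostname=host)
    tls.sendall(build_tunneled_get(host, path))
    response = b""
    while chunk := tls.recv(65536):
        response += chunk
    tls.close()
    return response
```

For example, fetch_via_proxy(("127.0.0.1", 8080), "www.youtube.com") would cause the proxy to log only the CONNECT line, where 127.0.0.1:8080 is an assumed proxy address. A tool running inside the browser, on the other hand, can combine the scheme, host, and path and display the request as GET https://www.youtube.com/.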

Adrien

Posted 2016-03-23T17:44:33.710

Reputation: 1 107