How are ports and URI schemes handled by a TCP/IP server?

1

I am working with the lwIP TCP/IP stack on an embedded device, and I'm trying to understand how it all works. I've been looking through the documentation and code, but I'm confused about how the stack handles ports and URI schemes.

The first confusing thing is that they both seem to define a protocol. Is this redundant?

In lwIP, to set up a TCP connection, one creates a "Protocol Control Block" (PCB). This is defined by the local IP address and a port. That seems to make sense: the PCB listens on the specified port. How does the URI scheme play into this, then? Does the PCB ever receive the URI scheme? I also don't see the URI scheme being passed to the callback function for receiving packets.

How does this work if I want to change protocols, for example to upgrade an HTTP connection to a WebSocket connection? If the initial handshake is done over HTTP on port 80, then how are further communications done over WS on port X?

As an example, here is the function signature for binding a PCB in lwip (in C code):

err_t tcp_bind(struct tcp_pcb *pcb, struct ip_addr *ipaddr, u16_t port)

This binds the PCB to an IP address and port number. However, the URI scheme is not specified. Therefore, I would assume the PCB is agnostic to the URI scheme. If we look at the callback prototype for accepting a new connection:

err_t (* accept)(void *arg, struct tcp_pcb *newpcb, err_t err)

Again, the URI scheme does not appear. I also have source code for an implementation of an HTTP server using lwip. Nowhere does the URI scheme appear. So then how are different URI schemes handled by the IP stack? I cannot find where it is even passed as an argument into callbacks for handling IP traffic. I think I must be missing something fundamental then.

Any help is appreciated!

mhe

Posted 2014-04-23T02:59:05.583

Reputation: 13

Answers

0

TCP is a generic stream protocol for in-order reliable data delivery between two endpoints in an IP network like the Internet.

HTTP is a protocol that runs on top of TCP. Other protocols that use TCP are FTP, SSH, SSL etc.

The functions you described are for handling TCP connections in general.

You should read http://www.w3.org/Protocols/rfc2616/rfc2616.html to learn the HTTP protocol.

Here is a short overview of how an HTTP request is made. This example is based on HTTP 1.0, since it is simpler.

When you tell a browser to connect to http://superuser.com, this is what happens in the background:

  1. The browser makes a DNS lookup for superuser.com to find the IP address of the server.
  2. The browser opens a TCP connection to that IP address on port 80, the default port for the http scheme.
  3. The browser sends a GET / HTTP request to the server.
  4. The server sends back the resource corresponding to the / location.
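The client-side decision behind steps 1 and 2 can be sketched as follows. This is only an illustration, not how any particular browser is implemented: the function name default_port_for_url and the small lookup table are hypothetical, and the sketch assumes a lowercase scheme followed by "://". It shows that the scheme is consumed by the client to pick a port before any TCP connection exists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of lwIP or any browser): map the scheme of a
 * URL to its conventional default port. Returns 0 if the URL has no "://"
 * separator or the scheme is not in this small table. The comparison is
 * case-sensitive, which is a simplification. */
unsigned short default_port_for_url(const char *url)
{
    static const struct {
        const char *scheme;
        unsigned short port;
    } table[] = {
        { "http", 80 }, { "https", 443 }, { "ftp", 21 },
        { "ws", 80 },   { "wss", 443 },
    };
    const char *sep;
    size_t len, i;

    assert(url != NULL);

    /* Everything before "://" is treated as the scheme. */
    sep = strstr(url, "://");
    if (sep == NULL)
        return 0;
    len = (size_t)(sep - url);

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        /* Require equal length so "https" does not match "http". */
        if (strlen(table[i].scheme) == len &&
            strncmp(url, table[i].scheme, len) == 0)
            return table[i].port;
    }
    return 0;
}
```

With this table, default_port_for_url("http://superuser.com") yields 80 and default_port_for_url("ftp://example.com") yields 21. The scheme string itself is never sent over the TCP connection; only the chosen port number is used when connecting.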

So the server does not need to know anything about the URI scheme here. It only needs to understand the HTTP methods (GET, POST, HEAD, etc.) and return the corresponding resources to the client over the TCP connection.
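To make that concrete, here is a minimal sketch of what the server side sees, assuming the request bytes have already been collected from the TCP receive callback into a NUL-terminated buffer. The helper name parse_request_line and the buffer sizes are hypothetical, and real servers need more robust parsing. Note that the data contains a method, a path, and an HTTP version, but no scheme.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split the first line of an HTTP request, e.g.
 * "GET / HTTP/1.0", into method and path. Returns 1 on success and 0 if the
 * line does not have three fields or the third does not start with "HTTP/". */
int parse_request_line(const char *data, char method[8], char path[64])
{
    char version[16];

    assert(data != NULL);

    /* Field widths leave one byte in each buffer for the terminating NUL. */
    if (sscanf(data, "%7s %63s %15s", method, path, version) != 3)
        return 0;

    return strncmp(version, "HTTP/", 5) == 0;
}
```

For the request in step 3, the buffer would hold "GET / HTTP/1.0\r\n\r\n", and the sketch extracts the method "GET" and the path "/". Nothing in those bytes says "http://", which is why the scheme never shows up in the stack's callbacks.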

Tero Kilkanen

Posted 2014-04-23T02:59:05.583

Reputation: 1 405

Thanks. I have read through the http protocol. This is also a great tutorial that is easier to read than the spec: http://www.jmarshall.com/easy/http/ – mhe – 2014-04-23T13:45:35.507

How is a different URI scheme handled though? For example, what if I use ws: or ftp:? Is the scheme entirely for the benefit of the client side? The server doesn't care? – mhe – 2014-04-23T13:46:32.033

The browser uses the scheme (protocol) field of the resource URL to select the appropriate method for connecting to the server. For example, with ftp:// URLs, the browser connects to port 21 of the remote server and speaks the FTP protocol. The server then hopefully runs an FTP server on that particular port, so that the request is successful. – Tero Kilkanen – 2014-04-27T18:21:10.607

Ok, so the protocol/scheme part of the URL is entirely on the client side then. Server doesn't even need to know what it is. – mhe – 2014-04-28T18:41:29.153