Designing a persistent asynchronous TCP protocol

Question

I have a collection of web sites that need to send time-sensitive messages to host machines all over my metro area, each on its own (generally dynamic) IP address. Until now, I have been doing this the script-kiddie way:

Each host machine runs an (s)FTP server, or an HTTP(s) server, and correspondingly has a certain port opened up by its gateway.
Each host machine runs a program that watches a certain folder and automatically opens, prints, or exec()s any new file of a given extension that shows up. Dynamic IP addresses are accommodated using a dynamic DNS service.
Each web site does cURL or fsockopen or whatever and communicates directly with its recipient as-needed.

This approach has been surprisingly reliable; however, obvious issues have come up and the situation needs to be addressed.

As stated, these messages are time-sensitive and failures need to be detected within minutes of submission by end-users. What I'm doing is building a messaging protocol. It will run on a machine and connection in my control. As far as the service is concerned, there is no distinction between web site and host machine -- there is only one device sending a message to another device.

So that's where I'm at right now. I've got a skeleton server and a skeleton client. They can negotiate high-quality authentication and encryption. The (TCP) connection is persistent and asynchronous, and can handle delimited (i.e., read until \r\n or whatever) as well as length-prefixed (i.e., read exactly n bytes) messages. Unless somebody gives me a better idea, I think I'll handle messages as byte arrays.
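For reference, the two framing styles mentioned above can be sketched roughly as follows. This is only an illustration, not the asker's code: the 4-byte big-endian length header and the names `pack_frame`, `read_frame`, and `read_line` are assumptions, and the stream is any file-like byte source (for a socket, something like `sock.makefile("rb")`).

```python
import io
import struct

# Assumed header layout: 4-byte big-endian unsigned payload length.
HEADER = struct.Struct("!I")


def pack_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length so the reader knows how much to read."""
    return HEADER.pack(len(payload)) + payload


def read_exactly(stream, n: int) -> bytes:
    """Read exactly n bytes, looping because a single read may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("connection closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(stream) -> bytes:
    """Read one length-prefixed frame from a file-like byte stream."""
    (length,) = HEADER.unpack(read_exactly(stream, HEADER.size))
    return read_exactly(stream, length)


def read_line(stream, delimiter: bytes = b"\r\n") -> bytes:
    """Read one delimited message and return it without the delimiter."""
    buf = bytearray()
    while not buf.endswith(delimiter):
        byte = stream.read(1)
        if not byte:
            raise EOFError("connection closed before delimiter")
        buf += byte
    return bytes(buf[: -len(delimiter)])
```

Length-prefixed frames are usually the safer choice for file payloads, since XML or delimited files may themselves contain the delimiter bytes.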

So I'm looking for suggestions on how to model the protocol itself -- at the application level. I'll mostly be transferring XML and DLM type files, as well as control messages for things like "handshake" and "is so-and-so online?" and so forth. Is there anything really stupid in my train of thought? Or anything I should read about before I get started? Stuff like that -- please and thanks.
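One common way to model this at the application level is a small fixed envelope in front of every payload, so that file transfers and control messages share one format. The sketch below is hypothetical: the type codes, the 1-byte type plus 4-byte message id layout, and the names `MsgType`, `encode`, and `decode` are all made up for illustration.

```python
import struct
from enum import IntEnum


class MsgType(IntEnum):
    # Hypothetical type codes; the question does not define any.
    HANDSHAKE = 1
    PRESENCE_QUERY = 2  # "is so-and-so online?"
    FILE_XML = 3
    FILE_DLM = 4


# Assumed envelope layout: 1-byte message type, 4-byte message id.
ENVELOPE = struct.Struct("!BI")


def encode(msg_type: MsgType, msg_id: int, body: bytes = b"") -> bytes:
    """Prepend the envelope to the raw payload bytes."""
    return ENVELOPE.pack(msg_type, msg_id) + body


def decode(data: bytes):
    """Split a message into (type, id, body); raises ValueError on an unknown type."""
    raw_type, msg_id = ENVELOPE.unpack_from(data)
    return MsgType(raw_type), msg_id, data[ENVELOPE.size:]
```

A message id of this kind lets the receiver acknowledge individual messages, which is one way the sender could notice a failed delivery within the required time window.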

Update:

@mrdenny's is the approach I have ultimately gone with, so he gets the answer. @Henrik's ZeroMQ suggestion applied as well, but I basically had that coded already, and swapping my code for a third-party framework didn't really help with designing the application layer. In the end, I have discovered just how incredibly versatile HTTP can be, and there is really no need for a roll-your-own protocol. Simply let the web sites present content-type application/json (or xml if necessary) in addition to the text/html they were already serving, and let recipients make outbound web requests instead of listening for and responding to filesystem updates. This removes all of the "script kiddie" overhead described above, works much more reliably, enables much better error handling, and is easy to build.
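A recipient making outbound requests might look roughly like the sketch below. It assumes the web site answers with a JSON list of pending messages; the response shape and the names `fetch_json` and `poll_once` are illustrative assumptions, not the actual implementation.

```python
import json
import urllib.request


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL, asking for JSON instead of the HTML page."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def poll_once(url: str, handle, fetch=fetch_json) -> int:
    """Fetch pending messages and pass each one to handle().

    Assumes the site returns a JSON list of message objects.
    Returns the number of messages seen.
    """
    messages = fetch(url)
    for message in messages:
        handle(message)
    return len(messages)
```

Because the recipient initiates every connection, no inbound port or dynamic DNS entry is needed on the host side, and a failed request surfaces as an ordinary exception or HTTP status code.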

score 3 · Answer 1 · Henrik · 2012-06-24T20:56:20.130

ZeroMQ has been designed as an asynchronous transport/message protocol.

If one of your nodes goes down, it will re-establish the ZMQ socket and continue sending its messages when the route to the target endpoint comes back up. Performance is good, and according to its IRC channel it is well tested enough nowadays to use over a WAN.

edited Jun 24 '12 at 20:56

answered Jun 24 '12 at 16:43

Henrik


Right! Don't mess with that deep layer if someone did the work for you! – Michuelnik Jun 24 '12 at 17:19

score 1 · Accepted Answer · answered Jun 24 '12 at 16:21

Any reason that you can't just use web methods and HTTP (or HTTPS) calls to transfer the data between machines?

mrdenny


HTTP(s) is a little bloated for this sort of thing, though that is not really a big deal. The problem is that you'd either have to make clients issue regular GETs, which is sloppy, or make the server "push" as needed on a keep-alive connection, which seems even worse. How might this be implemented without making a disgusting hack of your web server? – dogglebones Jun 24 '12 at 17:51
See also [WebSocket](http://tools.ietf.org/html/rfc6455). – dogglebones Jul 29 '12 at 11:38

score 0 · Answer 3 · answered Jun 24 '12 at 16:34

Check out FIX (as in the FIX Protocol) for an example of what such a protocol can look like. You could even just use FIX with an open-source library and define all the fields yourself. FIX is used for financial trading.

Either way, it should give you a decent idea. FIX can also handle things such as persisting messages in case a connection goes down, if you want that.
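To give a feel for the format, here is a minimal sketch of FIX-style tag=value framing with the standard BeginString (8), BodyLength (9), and CheckSum (10) fields. It is not a FIX engine: there is no session layer, sequencing, or resend logic, and `build_message` and `parse_message` are illustrative names rather than any library's API.

```python
SOH = "\x01"  # FIX field delimiter


def checksum(data: bytes) -> str:
    """FIX CheckSum: sum of all bytes modulo 256, written as three digits."""
    return f"{sum(data) % 256:03d}"


def build_message(fields, begin_string: str = "FIX.4.2") -> bytes:
    """Build a tag=value message from (tag, value) pairs.

    BodyLength (tag 9) counts the bytes after its own field up to and
    including the delimiter before CheckSum (tag 10).
    """
    body = "".join(f"{tag}={value}{SOH}" for tag, value in fields).encode("ascii")
    head = f"8={begin_string}{SOH}9={len(body)}{SOH}".encode("ascii")
    partial = head + body
    return partial + f"10={checksum(partial)}{SOH}".encode("ascii")


def parse_message(raw: bytes):
    """Split a message into (tag, value) pairs and verify the trailing CheckSum."""
    pairs = [field.split("=", 1) for field in raw.decode("ascii").split(SOH) if field]
    tag, value = pairs[-1]
    trailer_len = len(f"10={value}{SOH}")
    if tag != "10" or value != checksum(raw[:-trailer_len]):
        raise ValueError("bad checksum")
    return [(int(t), v) for t, v in pairs]
```

For example, `build_message([(35, "0"), (34, 1)])` produces a heartbeat-style message (MsgType 35=0) with sequence number 1 (MsgSeqNum, tag 34). A real session would also carry sender and target identifiers and a sending time.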
