
I am designing a network service in which clients connect and stay connected -- the model is not far off from IRC, minus the server-to-server (s2s) connections.

I could use some help understanding how to do capacity planning, in particular the system resource costs associated with handling messages to and from clients.

There's an article about an attempt to get 1 million clients connected to a single server [1]. Of course, most of those clients were completely idle during the test; if each client sent a message every 5 seconds or so, the system would surely be brought to its knees.

But... how do you do less hand-waving and, you know, actually measure such a breaking point?

We're talking about messages being sent by a client over a TCP socket, into the kernel, and read by an application. The data is shuffled around in memory from one buffer to another. Do I need to consider memory throughput ("5 GT/s" [2], etc.)?

I'm pretty sure I can measure the basic memory requirements due to TCP/IP buffers, the expected bandwidth, and the CPU resources required to process messages. I'm a little dim on what I'm calling "throughput".
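
To make that concrete, here's the rough sort of load generator I imagine using to find the breaking point empirically. Everything specific below -- the host, port 9000, the connection count, the 5-second interval -- is a placeholder value I made up for illustration:

    /* Minimal load-generator sketch: open NCONN TCP connections and
     * send one short message per connection every INTERVAL seconds.
     * The first resource limit you hit (fd limit, backpressure on
     * write, etc.) marks the start of your breaking point. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define NCONN    1000   /* placeholder connection count */
    #define INTERVAL 5      /* placeholder seconds between messages */

    int main(void)
    {
        int fds[NCONN];
        struct sockaddr_in sa = {0};
        sa.sin_family = AF_INET;
        sa.sin_port = htons(9000);             /* placeholder port */
        inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);

        for (int i = 0; i < NCONN; i++) {
            fds[i] = socket(AF_INET, SOCK_STREAM, 0);
            if (connect(fds[i], (struct sockaddr *)&sa, sizeof sa) < 0) {
                perror("connect");   /* which limit was hit, at which i? */
                return 1;
            }
        }

        const char msg[] = "PING\r\n";
        for (;;) {
            for (int i = 0; i < NCONN; i++)
                if (write(fds[i], msg, sizeof msg - 1) < 0)
                    perror("write"); /* backpressure shows up here */
            sleep(INTERVAL);
        }
    }

The idea would be to ramp NCONN up and shrink INTERVAL (across several client machines, since one client box saturates before the server does) while watching the server's CPU, memory, and latency.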

Help!

Also, does anyone really do this? Or, do most people sort of hand-wave and see what the real world offers, and then react appropriately?

[1] http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3/

[2] http://en.wikipedia.org/wiki/GT/s

z8000
  • Measure everything. Try alternatives and measure them to pick the ones which significantly increase throughput. Be willing to abandon all of your architectural decisions. And scrupulously document what you have tested. – Michael Dillon Feb 14 '11 at 03:03

1 Answer


We're talking about messages being sent by a client over a TCP socket, into the kernel, and read by an application. The data is shuffled around in memory from one buffer to another.

No, it isn't -- not if you do it properly, anyway. On Linux, look up the sendfile(2) and splice(2) system calls. Other kernels probably have similar zero-copy facilities, but AFAIK they haven't been standardized.
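
As a sketch of the idea, here's roughly what a zero-copy relay between two connected sockets looks like with splice(2). Note that splice(2) requires a pipe as the intermediary, and real code would reuse one pipe per connection rather than creating one per call:

    /* Sketch: forward bytes from one socket to another without
     * copying the payload through user space (socket -> pipe ->
     * socket). Error handling trimmed for brevity. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    ssize_t relay(int in_fd, int out_fd)
    {
        int p[2];
        if (pipe(p) < 0)
            return -1;

        /* Pull up to 64 KiB from the source socket into the pipe. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, 65536,
                           SPLICE_F_MOVE | SPLICE_F_MORE);
        if (n > 0)
            /* Push the same pages out to the destination socket. */
            n = splice(p[0], NULL, out_fd, NULL, (size_t)n,
                       SPLICE_F_MOVE | SPLICE_F_MORE);

        close(p[0]);
        close(p[1]);
        return n;
    }

That said, for small chat-style messages the copy itself is rarely the bottleneck; per-message syscall and wakeup overhead usually dominates, which is another reason to measure first.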

In practice, it's better to write the program as simply as possible, measure where the bottleneck is, improve, measure, improve, and so on. Predicting bottlenecks is hard, and premature optimization is the root of all evil (as Knuth said).
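
For the measure step, even something as crude as timing the message handler in a tight loop gives you a first number to plan around. handle_message() below is a hypothetical stand-in for your real parsing/dispatch code:

    /* Sketch: estimate per-message CPU cost via clock_gettime(2). */
    #include <stdio.h>
    #include <time.h>

    static volatile unsigned sink;

    /* Placeholder for the application's real per-message work; the
     * write to 'sink' keeps the optimizer from deleting the loop. */
    static void handle_message(const char *buf, size_t len)
    {
        sink += (unsigned)buf[0] + (unsigned)len;
    }

    int main(void)
    {
        const char msg[] = "PRIVMSG #chan :hello\r\n";
        const long iters = 1000000;
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < iters; i++)
            handle_message(msg, sizeof msg - 1);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9
                  + (t1.tv_nsec - t0.tv_nsec);
        printf("%.1f ns/msg, ~%.0f msgs/sec/core\n",
               ns / iters, 1e9 * iters / ns);
        return 0;
    }

Divide that rate by your expected per-client message rate (one message per 5 seconds = 0.2 msgs/sec per client) and you have a first-order estimate of clients per core -- then verify it with a real load test, because syscalls, scheduling, and cache effects will eat into it.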