49

I have a minimal 64-bit CentOS 6.3 machine acting as a gateway. It has four 1 Gbps NICs bonded into two groups, one for public traffic and the other for private, and it performs NAT. It has 6 GB of RAM and 4 logical cores. We have been using it for the past two years without any problems.

I don't have any experience with hardware routers, but I have heard that they have less RAM and CPU and use flash disks. How can a box with a lower hardware specification perform better (as in, handle more concurrent connections) than a machine with more RAM and CPU?

What are the limiting factors, other than IOS using different methods to handle this?

TRiG
Blue Gene
  • 5
    Implementations in hardware are faster than firmware, which is faster than software, for the same solution. – mdpc Mar 28 '13 at 05:25
  • 3
    The "better specs" you mention are irrelevant for the job at hand. There are other properties that actually matter. – ndim Mar 29 '13 at 14:21
  • Ok, so... from what you wrote, should I assume that your CentOS is a virtual machine? Aside from special network optimizations for virtual machines, on both the hypervisor and guest sides, virtual machines are generally known not to be good in this kind of role... for use as a router/firewall, bare metal is **always** recommended! – stoned Apr 23 '14 at 12:17
  • Are you actually routing 1Gbps of traffic as suggested by your configuration? You may be using a dumptruck to move a shovel full of sand. 6GB of RAM and 4 cores are basically never going to be touched for a router, you could save electricity and rack space by using a small Intel Atom machine to do this job. – Bert Apr 23 '14 at 13:15

6 Answers

67

ASICs.

Instead of using a general purpose CPU and task-specific software, you can skip the software and just make the silicon handle the task directly.

High performance networking hardware uses ASICs instead of software for the computationally heavy (but relatively logically simple) tasks of something like comparing an IP address to an enormous internet routing table, checking a CAM table for a switching decision, or checking a packet against an ACL. This makes an enormous difference in the speed of those time-sensitive operations, providing a significant advantage over a general-purpose CPU.
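To make the "compare an IP address to a routing table" step concrete, here is a minimal software sketch of longest-prefix matching. The route entries and next hops are made up for illustration, and the linear scan is deliberately naive: software routers normally use a trie instead, and dedicated hardware typically compares against all entries in parallel. The point is only to show the work that has to happen for every single packet.

```python
import ipaddress

# Hypothetical routing table: (prefix, next hop). These entries are invented
# for illustration; a full internet table holds far more prefixes.
ROUTES = [
    ("0.0.0.0/0", "203.0.113.1"),
    ("10.0.0.0/8", "10.255.0.1"),
    ("10.1.0.0/16", "10.1.255.1"),
    ("10.1.2.0/24", "10.1.2.254"),
]

# Parse the prefixes once, up front.
TABLE = [(ipaddress.ip_network(prefix), hop) for prefix, hop in ROUTES]


def lookup(dst):
    """Return the next hop of the most specific prefix containing dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for net, hop in TABLE:  # naive linear scan: one comparison per route
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None


if __name__ == "__main__":
    for dst in ("10.1.2.3", "10.1.9.9", "10.9.9.9", "8.8.8.8"):
        print(dst, "->", lookup(dst))
```

In this sketch the cost grows with the number of routes, which is exactly the kind of per-packet work that benefits from being moved into silicon.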

Shane Madden
  • 1
    While I agree with Shane, I would choose iptables running on a generic server over an appliance any day. In my company, iptables running on Fedora is faster, more flexible, simpler to configure, less taxing on CPU and memory, quicker to boot, and way cheaper than our Cisco ASA. – rvh Mar 28 '13 at 20:45
  • @rvh: for firewalling or just routing ? – mveroone Apr 23 '14 at 11:51
  • 2
    And overall, 'speed' is a pretty loose term here. I'm sure there are many situations where a PC can still get an operation done 'as fast' as a HW one, or maybe takes some constant duration longer than HW. But it really does come down to stability and 'bang for your buck': you're still buying less RAM/CPU with an appliance, but it gets more out of it, and it only does what it needs to, so you get stability. – Colin Godsey Nov 24 '15 at 16:31
  • This answer is wrong by omission. Just shouting "ASICs" is also a bit naive, as most routers that do NAT use general purpose CPUs, unless we're going into extreme performance. Also, hardware NAT routers actually have stronger limits on the number of concurrent connections. What the OP experiences is misconfiguration and the use of an operating system that does not suit the task. – dualed Dec 01 '16 at 22:59
  • @dualed `most routers that do NAT use general purpose CPUs` Yes, that's a true statement, but ignorant of the fact that many of those devices offload *specific* operations to dedicated chips (as a reading of my full answer above makes clear). Don't think NAT and connection tracking (which, I agree, don't happen in ASICs in modern devices), think routing table and switching offloading. – Shane Madden Dec 02 '16 at 00:19
  • This. In my company, iptables running on Fedora is faster. Moreover, it is simpler to configure and manage. – Net Runner Jul 16 '17 at 07:51
13

A high-end, dedicated router can outperform a PC with a faster CPU and more RAM because it can do more of the routing in hardware.

It's the same reason a $60 Gigabit Ethernet switch can outperform a $2,000 PC with 4 two-port GigE cards acting as an Ethernet switch. The switch is built from the ground up to be a switch.
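As a rough illustration of what "acting as an Ethernet switch" demands, the sketch below computes the worst-case packet rate of a Gigabit link and the aggregate rate if all eight ports of the PC in this example receive minimum-size frames at line rate at once. The function names are my own; the frame overheads are the standard Ethernet preamble and inter-frame gap.

```python
# Per-frame overhead on the wire, in bytes, beyond the frame itself.
PREAMBLE_AND_SFD = 8
INTERFRAME_GAP = 12
MIN_FRAME = 64  # minimum Ethernet frame size, including the FCS


def max_packets_per_second(link_bps, frame_bytes=MIN_FRAME):
    """Upper bound on frames per second for one direction of a link."""
    bits_on_wire = (frame_bytes + PREAMBLE_AND_SFD + INTERFRAME_GAP) * 8
    return link_bps // bits_on_wire


def aggregate_pps(ports, link_bps, frame_bytes=MIN_FRAME):
    """Frames per second if every port receives at line rate simultaneously."""
    return ports * max_packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    gige = 10**9
    print("one GigE port, 64-byte frames:", max_packets_per_second(gige))
    print("eight GigE ports, 64-byte frames:", aggregate_pps(8, gige))
    print("one GigE port, 1518-byte frames:", max_packets_per_second(gige, 1518))
```

A switch chip is designed to sustain the aggregate figure on every port at once, whereas the PC has to push each of those packets through its bus, its interrupt handling, and its software forwarding path.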

David Schwartz
  • 3
    And since the dedicated router is running on a flash disk, there are fewer moving parts to fail at inopportune times. – cpt_fink Mar 28 '13 at 04:16
  • 4
    I'm not really sure you are right. I created a very basic anti-DDoS protection for a company I worked at, based on netmap (http://info.iet.unipi.it/~luigi/netmap/), and it performed very well even on normal hardware (really full 1Gb/s or 11M packets/sec). There is also a version of openswitch for netmap on that site that can forward 3M packets/sec on "normal" hardware. I don't know of any cheap ($500) switch that can do that, but I may be wrong. – XzKto Mar 28 '13 at 07:58
  • 2
    @XzKto A typical cheap ($60) 5 port GigE switch can forward packets between all combinations of ports at full wire speed. They typically use a full, non-blocking crossbar. – David Schwartz Mar 28 '13 at 13:29
  • @XzKto What David meant is that, unlike the PC, which will handle "up to its own interfaces' throughput", the switch does it for every port-to-port combination. You can have host1 talking to host2, and 3 to 4, with each connection at *full duplex & maximum throughput*. You can't beat that. (And with more money and more ports, it is even easier to see why it's faster.) For a router it is similar: hardware will be able to reroute faster than software, as it can reroute as soon as it has enough information (your Linux router will do it at each layer, but getting there will already be slower). – Olivier Dulac Mar 28 '13 at 18:50
11

"Other than IOS" ?

IOS makes almost all the difference. CentOS is a general-purpose operating system. It's designed to perform well enough under a very wide range of scenarios, using a vast array of different hardware configurations. IOS on the other hand is extremely fine tuned to handle only the kind of workloads you would expect from a piece of network equipment, using the very specific types of hardware you would find in Cisco gear.

Knowing exactly what pieces of hardware you're programming for will take you a very long way in terms of performance vs. compatibility.

Ryan Ries
  • 3
    +1. Add to that the fact that Cisco has the engineering resources and know-how to *replace software with hardware* when needed. Meaning, if a particular operation is slower than they'd like in software, they can put effort into engineering an ASIC, adding instructions to their existing processing units, or build a hardware module to help speed it up. The tables tip overwhelmingly in favor of performance when you have the luxury of controlling both the software **and** the hardware. – Justin ᚅᚔᚈᚄᚒᚔ Mar 28 '13 at 14:54
  • It works both ways, too. The hardware doesn't need to deal with rare, complex cases either. It just throws those to software. Routing loops? Don't try to figure out in hardware how to deal with those. – MSalters Mar 29 '13 at 10:35
4

Both software and hardware play a part. I have compared Intel and TP-Link NICs (the latter use a Realtek chip at their heart) on generic server hardware, as well as purpose-built and general-purpose software for routing.

On the hardware side, if the ASIC on board can do some of the handling of IP traffic, the processor load can be lower and forwarding thus faster. I have noticed the two onboard Intel NIC chips communicating directly by DMA, bypassing the main CPU when forwarding packets; meanwhile, the Realtek chip raises an interrupt whenever a packet arrives.

On the software side, if the software is designed to be used for routing, it can be made more efficient. I have used both pfSense+PF (a modified FreeBSD intended to be used as a router) and general-purpose Ubuntu 12.04+iptables as routing software, and the former clearly switches traffic a lot faster. (Ubuntu 14.04 is now almost as fast, thanks to the new nftables in the Linux 3.13 kernel.)

However, dedicated routers do have one major drawback: they cannot do much other than switch traffic, and they cannot be virtualized. My current edge router is a virtual machine inside my ESXi cluster running Ubuntu 14.04, and it also acts as an intrusion detection system and load balancer.

3

AFAIK, it's the overhead of a general-purpose operating system: regardless of how fast your connections are, packets are dealt with on a packet-by-packet basis within the kernel's context, increasing latency and strain on the system. I believe it has already been explained in the other answers better than I could.
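To put the packet-by-packet cost in perspective, this small sketch works out how much time, and how many CPU cycles, a single core has per packet if it must keep up with a link at line rate. The 3 GHz clock is purely an assumed figure for illustration, and the helper names are my own.

```python
# Preamble/SFD (8 bytes) plus inter-frame gap (12 bytes) added to each frame.
WIRE_OVERHEAD_BYTES = 20


def per_packet_budget_ns(link_bps, frame_bytes=64):
    """Nanoseconds between back-to-back frames arriving at line rate."""
    wire_bits = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return wire_bits * 1e9 / link_bps


def cycles_per_packet(link_bps, cpu_hz, frame_bytes=64):
    """CPU cycles available per packet on one core (assumed clock rate)."""
    return per_packet_budget_ns(link_bps, frame_bytes) * cpu_hz / 1e9


if __name__ == "__main__":
    gige = 10**9
    assumed_cpu_hz = 3 * 10**9  # assumption: a 3 GHz core
    print("budget per 64-byte packet (ns):", per_packet_budget_ns(gige))
    print("cycles per packet at 3 GHz:", cycles_per_packet(gige, assumed_cpu_hz))
```

Everything the kernel does for a packet (interrupt handling, buffer allocation, connection tracking, the forwarding decision) has to fit within that budget, which is why per-packet overhead matters more than raw RAM or core count.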

Having said that, there are promising new"ish" technologies, increasing in popularity and feasibility, that might make Linux systems a more formidable competitor in this as well as in other regards, e.g. InfiniBand.

Take a look at the following Q&A on StackOverflow: How is TCP Kernel-bypass Implemented

ILMostro_7
3

It's usually because the network stack and devices are not configured well out of the box in Linux. In almost 90% of cases, your network traffic is processed by CPU0 while the other CPUs sit idle. If you solve this problem, the difference from hardware routers will not be as drastic as you may think. You should set up at least RSS or RPS (driver- or stack-based distribution of packet processing among the CPUs).
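As a minimal sketch of what enabling RPS involves: the kernel takes a hexadecimal CPU bitmap per receive queue, written to that queue's `rps_cpus` file under sysfs. The helper below only computes the mask and prints the command; it writes nothing. The interface name `eth0` and the choice of CPUs 0-3 are placeholders, and the simple formatting assumes a machine with at most 32 CPUs.

```python
def cpu_mask(cpus):
    """Hex bitmap with one bit set per CPU index (assumes <= 32 CPUs)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def rps_path(dev, rx_queue=0):
    """sysfs file that holds the RPS CPU mask for one receive queue."""
    return f"/sys/class/net/{dev}/queues/rx-{rx_queue}/rps_cpus"


if __name__ == "__main__":
    # "eth0" and CPUs 0-3 are placeholders; adjust them for the real machine.
    print(f"echo {cpu_mask(range(4))} > {rps_path('eth0')}")
```

The printed command has to be run as root, once per receive queue, and the setting does not survive a reboot unless it is reapplied by a boot script.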

If you really care about your Linux router's performance and have enough time, I recommend reading this article on the packagecloud blog (there is also an article about transmitting packets).

If you need to look at how the load is distributed, and you think that watching `while sleep 1; do cat $some_file_in_procfs; done`, evaluating CPU masks, and writing smp_affinity by hand is boring, you will probably find my pet project netutils-linux extremely useful.
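For a quick manual check of the distribution, without any extra tooling, a sketch like the following reads the `NET_RX` row of `/proc/softirqs` and reports each CPU's share of receive processing. The sample text is invented and is only used when the file cannot be read (for example, on a non-Linux machine). The counters are cumulative since boot, so comparing two readings taken a few seconds apart gives a better picture of current load.

```python
# Invented sample in the layout of /proc/softirqs, used as a fallback only.
SAMPLE = """\
                    CPU0       CPU1       CPU2       CPU3
          HI:          0          0          0          0
       TIMER:     512345     498211     501122     499876
      NET_TX:       1203          4          7          2
      NET_RX:    9876543       1021        987       1100
"""


def net_rx_counts(softirqs_text):
    """Return the per-CPU NET_RX softirq counters as a list of ints."""
    for line in softirqs_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NET_RX:":
            return [int(value) for value in fields[1:]]
    raise ValueError("no NET_RX row found")


def shares(counts):
    """Fraction of the total handled by each CPU."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


if __name__ == "__main__":
    try:
        with open("/proc/softirqs") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the invented sample
    for cpu, share in enumerate(shares(net_rx_counts(text))):
        print(f"CPU{cpu}: {share:.1%} of NET_RX softirqs")
```

If one CPU carries nearly all of the `NET_RX` work, as in the invented sample, the receive path is effectively single-threaded and is a candidate for RSS or RPS.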