How many parallel drives, configured with RAID 0 striping (nX sequential read/write benefit) and sitting behind a hardware RAID controller (PCI-X at 133MHz), does it take to saturate a 1Gbps network connection? What about 2Gbps?
When setting up a small NAS, a big variable is the network connection. One can go with a single gigabit Ethernet connection, bond several together (with managed network switches), or go with more expensive Fibre Channel options. Sticking to CAT5e/6 cables keeps costs within what I can manage, but I still want to get the most out of it without introducing wasteful bottlenecks.
Referring to Seagate's example of an external transfer rate (burst) of around 300MB/s (2.4Gbps) and one of storagereview.com's 3TB reviews showing sustained transfer rates of around 110MB/s (0.9Gbps), I have to conclude that there is no noticeable performance benefit to setting up RAID striping among multiple parallel drives when they are accessed through a 1Gbps line. A single drive already uses nearly 1Gbps of bandwidth.
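To make that arithmetic explicit, here is a minimal sketch of the conversion and the resulting drive count. It assumes decimal units (1MB = 10^6 bytes), ideal linear scaling from striping, and the raw line rate of the link with no protocol or controller overhead; the function names are my own, for illustration only.

```python
import math

def mb_per_s_to_gbps(mb_per_s: float) -> float:
    """Convert decimal megabytes per second to gigabits per second."""
    return mb_per_s * 8 / 1000

def drives_to_saturate(link_gbps: float, drive_mb_per_s: float) -> int:
    """Smallest number of striped drives whose combined sequential rate
    meets or exceeds the link rate, assuming ideal nX scaling."""
    return math.ceil(link_gbps / mb_per_s_to_gbps(drive_mb_per_s))

sustained_mb_per_s = 110  # sustained rate quoted from the storagereview.com figure

print(mb_per_s_to_gbps(sustained_mb_per_s))         # 0.88
print(drives_to_saturate(1.0, sustained_mb_per_s))  # 2
print(drives_to_saturate(2.0, sustained_mb_per_s))  # 3
```

Under these assumptions one drive falls just short of a 1Gbps line, so two drives are the theoretical minimum to fill it and three to fill 2Gbps. Real numbers will shift with overhead and with where on the platter the data sits.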
Of course, the data has to jump through a few hoops to get from the drive controller to the cable, reducing the effective server transfer rate. A RAID controller introduces even more overhead, though the parallel drives more than make up for it. But by what margin? Hence my question.
Note that a 133MHz PCI-X connector can support up to 8.5Gbps [Wikipedia]. Try to ignore the differences between stripe sizes, protocol overhead, etc., and look at the issue from a hardware perspective alone.
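As a rough check on the bus figure, the peak PCI-X rate is just clock times bus width. The sketch below assumes a 64-bit slot (the usual PCI-X width, not something stated above), decimal units, and a purely theoretical peak with no arbitration or protocol overhead; the names are hypothetical.

```python
import math

def pcix_peak_gbps(clock_mhz: float = 133.0, bus_width_bits: int = 64) -> float:
    """Theoretical peak of a parallel bus: one transfer of bus_width_bits per clock."""
    return clock_mhz * 1e6 * bus_width_bits / 1e9

def max_drives_before_bus_limit(bus_gbps: float, drive_mb_per_s: float) -> int:
    """How many drives can stream at full sequential speed before the bus is the bottleneck."""
    drive_gbps = drive_mb_per_s * 8 / 1000
    return math.floor(bus_gbps / drive_gbps)

bus = pcix_peak_gbps()  # assumed 64-bit, 133MHz slot
print(round(bus, 3))                          # 8.512
print(max_drives_before_bus_limit(bus, 110))  # 9
```

So, on paper, the controller's bus could carry about nine 110MB/s drives, far more than a 1Gbps or 2Gbps network link can use.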
Example answer:
My NAS motherboard 'A' (with 'B' GB of RAM) and [133MHz PCI-X] RAID controller card 'C' have a maximum throughput of about 7Gbps. The limiting factor is the Ethernet controller. I have observed that a 1Gbps/2Gbps connection becomes saturated with as few as 2 drives/3 drives, at around 100MB/s/200MB/s, measured using iperf.
This holds true for sequential I/O; for random I/O, however, more spindles would be needed to reach the maximum transfer rate for a connection. – Lamar B – 2011-12-07T04:10:12.273
Perfect! So any more than 4 parallel disks wouldn't do much on a 2Gbps line. Thanks. – tyblu – 2011-12-08T22:40:54.093
@Lamar B : I totally agree. Every instance is a bit different. Random data adds a whole level of complexity that would be difficult for the controller to compensate for. The only saving grace in that instance is to have a larger array and more paths from the controller. – MikeAWood – 2011-12-28T03:13:53.767