4

I'm contemplating the next restructuring of my medium-size storage. It's currently about 30TB, shared via AoE. My main options are:

  1. Keep as is. It can still grow for a while.
  2. Go iSCSI. Currently it's a little slower, but there are more options.
  3. Fibre Channel.
  4. InfiniBand.

Personally, I like the price/performance of InfiniBand host adapters, and most of the offerings at Supermicro (my preferred hardware brand) have IB as an option.

Linux has had IPoIB drivers for a while, but I don't know if there's a well-established way to use them for storage. Most comments about iSCSI over IB talk about iSER, and how it's not supported by some iSCSI stacks.

So, does anybody have some pointers about how to use IB for shared storage for Linux servers? Is there any initiator/target project out there? Can I simply use iSCSI over IPoIB?

Javier

9 Answers

4

Although it is possible to run iSCSI over InfiniBand via IPoIB, the iSER and SRP protocols yield significantly better performance on an InfiniBand network. An iSER implementation for Linux is available via the tgt project and an SRP implementation for Linux is available via the SCST project. Regarding Windows support: at this time there is no iSER initiator driver available for Windows. But an SRP initiator driver for Windows is available in the winOFED software package (see also the openfabrics.org website).

user251384
  • thanks for the pointer. do you (or anybody here) have first hand experience with those packages? what are the pros/cons of SRP relative to iSER? (besides windows compatibility, which is a total non-issue for me) – Javier Dec 03 '11 at 18:46
  • An advantage of iSER is that it is possible to define multiple targets on one iSER server. An iSER initiator can choose which iSER targets to log in to. SRP, on the other hand, is a host-to-host protocol: all LUNs defined on the target become available to each initiator, unless LUN masking has been configured on the target. Another advantage of iSER is that it is possible to configure password-based authentication. And a big advantage of SRP is significantly lower latency, because the SRP target implementation runs in the kernel while the iSER implementation runs in user space. – user251384 Dec 05 '11 at 20:17
4

So... the thing that most people don't really think about is how Ethernet and IB deliver packets. On one hand, Ethernet is really easy, and it's everywhere. But packet management is not auto-magic nor is it guaranteed-delivery. Granted, modern switching is excellent! Packet loss is no longer the problem that it was way-back-when. However, if you really push the Ethernet, you will start to see packets looping around in there. It's like they don't really know where to go. Eventually, the packets get to where they are supposed to go, but the latency caused by looping has already happened. There IS NO WAY to coax packets to go where they are supposed to.

InfiniBand uses guaranteed delivery. Packets and packet delivery are actively managed. What you will see is that IB will peak in performance and then occasionally drop like a square-sine. The drop is over in milliseconds. Then the performance peaks again.

Ethernet peaks out as well, but struggles when use is high. Instead of a square-sine, it drops off and then takes a while to step back up to peak performance. It looks like a stair on the left side and a straight drop on the right.

That's a problem in large data centers where engineers choose Ethernet over IB because it's easy. Then, the database admins and storage engineers fight back and forth, blaming each other for performance problems. And, when they turn to the network team for answers, the problem gets skirted because most tools see that the "average" network use isn't at peak performance. You have to be watching the packets in order to see this behavior.

Oh! There is one other reason to pick IB over Ethernet. Each IB (FDR) port can go 56 Gb/s. You would have to bond six 10GbE ports to match one IB port. That means A-LOT-LESS cabling.
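The port count above is simple arithmetic. Here is a quick sketch, assuming the nominal line rates (56 Gb/s for FDR, 10 Gb/s for 10-Gigabit Ethernet) and ignoring encoding, protocol, and bonding overhead on both sides. The function name is only illustrative.

```python
import math

def ports_needed(target_gbps: float, port_gbps: float) -> int:
    """Smallest number of bonded ports whose nominal rate reaches the target."""
    return math.ceil(target_gbps / port_gbps)

# Nominal line rates taken from the paragraph above (assumption: no overhead
# is counted; real throughput is lower for both technologies).
FDR_GBPS = 56
TEN_GBE_GBPS = 10

print(ports_needed(FDR_GBPS, TEN_GBE_GBPS))  # 6
```

In practice a bonded link rarely scales perfectly, so six ports should be read as a lower bound.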

By the way... when you're building financial, data warehouse, bio-logic, or large data systems, you need a lot of IOPS + Bandwidth + Low Latency + Memory + CPU. You can't take any of them out or your performance will suffer. I've been able to push as much as 7Gbytes/second from Oracle to all-flash storage. My fastest full-table-scan was 6 billion rows in 13 seconds.
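To put those two figures on the same scale as the link speeds discussed here, a rough conversion sketch follows. It assumes decimal units (1 GB = 8 Gb) and treats the two numbers as independent measurements, since the text does not say they came from the same run.

```python
# Figures quoted in the paragraph above.
THROUGHPUT_GBYTES_PER_S = 7      # "7Gbytes/second"
ROWS_SCANNED = 6_000_000_000     # "6 billion rows"
SCAN_SECONDS = 13                # "in 13 seconds"

# Assumption: decimal gigabytes, so 1 GB/s equals 8 Gb/s on the wire.
throughput_gbits_per_s = THROUGHPUT_GBYTES_PER_S * 8
rows_per_second = ROWS_SCANNED / SCAN_SECONDS

print(f"{throughput_gbits_per_s} Gb/s")                      # 56 Gb/s
print(f"{rows_per_second / 1e6:.0f} million rows/s")         # 462 million rows/s
```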

Transactional systems can scale back on total bandwidth, but they still need all of the other components mentioned in the previous paragraph. Ideally, you would use 10GbE for public networks and IB for storage and interconnects.

Just my thoughts... John

  • I also love IB but do bear in mind that FCoE over DCB/CEE at 40Gbps exists and is surprisingly close to IB in many ways, plus it doesn't make most IT people get scared :) – Chopper3 Mar 08 '15 at 21:02
3

I've just had to deal with an IB SAN using Mellanox NICs. It works out of the box on RHEL.

dyasny
2

Do you need IB's latency benefits, or are you just looking for some way to combine networking and storage? If the former, then you have no choice. IB is great but can be hard to manage; FC works great and is nice and fast but feels a bit 'old hat' sometimes; and iSCSI can be a great solution if you consider all the implications. If I were you I'd go for FC storage over FCoE via Cisco Nexus LAN switches and a converged network adapter.

Chopper3
  • I want it just for storage; latency is not as critical as bandwidth. FC is the usual answer (besides iSCSI, of course), but I haven't really tried either, so I'd like to know if FC has any real advantage. What I like about IB is the cost/bandwidth ratio (better than any other option), and that it's included on several mainboards (unlike FC or 10GbE). What kind of troubles make it "hard to manage"? I've found that even Ethernet can be a headache if you don't have really good switches. Is IB worse than that? – Javier Aug 10 '09 at 19:07
  • If you just want the bandwidth then IB is great, but most IB storage is actually just FC storage dressed as IB, and iSCSI is slow for most things without 10GbE and some form of QoS. – Chopper3 Aug 10 '09 at 19:14
  • And what about software storage targets? IET works great for iSCSI, but I don't know if IB would need anything else, or if it's simply a matter of getting IPoIB and running iSCSI over that. – Javier Aug 10 '09 at 19:51
1

NFS over RDMA works GREAT in Fedora Linux.

It's very easy to set up. Install the right rpms and tweak a couple of files. Just google to find instructions.

I used Mellanox MT25208 PCIe x8 InfiniBand cards, flashed to the latest firmware. Total cost for two cards and a 15 m cable: $150. Who cares about "market adoption" at that price?

Smokin' bandwidth, well over 400 MBytes/sec, with very little CPU usage on client or server. The bottleneck is the RAID controller.

As a bonus, X over IPoIB is also smokin'; you'd swear the app is local.

  • Also qpid messaging server sees ping times of 20 us. Yes that is 20 us. Unreal. –  Nov 08 '10 at 19:27
  • So it's NFS over IPoIB. Glad to hear that works so well, but I'm not interested in NFS, as I do like sharing block devices for this. What about iSCSI on IPoIB? – Javier Nov 09 '10 at 04:01
1

What about 10Gb Ethernet? The more exotic the interface, the harder a time you're going to have finding drivers and chasing away bugs, and the more expensive everything is going to be.

Okay -- here is a cheap rundown, given that everything's within CX4 cable distance (15 meters):

(I'm using US dollars and list prices found on web pages. I'm assuming the vendor prices are in USD as well.)

Is InfiniBand that much cheaper?

(Please note -- I've never actually used any of this gear; I'm only going by whatever pops up on Google after 30 seconds of googling. I'm certainly not endorsing it or making recommendations that it will do anything good or bad.)

chris
  • Yeah, it's the simplest answer, but still far more expensive than IB, both for host adapters and switches. Giving one to every server and storage box is out of the question for a while. – Javier Aug 10 '09 at 17:14
1

I approached the same problem by using 10-Gigabit iSCSI with a dedicated 6-port switch (HP 6400cl-6XG - $2200) and Intel dual-port CX4 NICs (Intel EXPX9502CX4 - $650). The cost per server came down to the NIC and a $100 CX4 cable. In this case, very little was needed to get drivers, etc. to work in a mixed Linux, Windows and OpenSolaris environment.
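For a rough sense of the totals, here is a small sketch built only from the list prices quoted above. It assumes one NIC, one cable, and one switch port per attached host (so only one of the NIC's two ports is used), and that the storage box counts as a host; the function name is illustrative.

```python
# List prices quoted in the answer above (USD).
SWITCH_USD = 2200     # HP 6400cl-6XG
SWITCH_PORTS = 6
NIC_USD = 650         # Intel EXPX9502CX4 (dual-port)
CABLE_USD = 100       # one CX4 cable

def total_cost(hosts: int) -> int:
    """Total hardware cost for `hosts` machines on a single switch.

    Assumption: each host uses one NIC, one cable and one switch port.
    """
    if not 1 <= hosts <= SWITCH_PORTS:
        raise ValueError(f"a single switch supports 1 to {SWITCH_PORTS} hosts")
    return SWITCH_USD + hosts * (NIC_USD + CABLE_USD)

print(total_cost(6))  # 6700
```

With all six ports in use, the switch adds roughly $367 per host on top of the $750 for the NIC and cable.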

ewwhite
0

The difficulty with IB when building a SAN is managing the SRP target. There are very few pre-built solutions available and most are expensive. If products like Open-E introduced native IB support into their software (specifically SRP), you would have an easy hardware solution.

The client side is very simple to set up on RHEL and it works perfectly. We have a test system up and running now which is performing at 600MB/s consistently and under high load. The performance is amazing, and the large amount of available bandwidth gives you great peace of mind and flexibility.

Yes, you are still limited to the speed of your array, but with IB you can connect multiple arrays without losing performance. Use one for backups, one for main storage, and so on, and use them simultaneously without any loss of performance.

In my experience, as a pure RDMA storage network, without IP, there is nothing that can beat IB, and if you shop around you can set something up for a very reasonable price. If someone were to introduce storage appliance software similar to Open-E with full SCST SRP target support, it would open up the mainstream market to IB, and I for one would be very happy.

Chris
-1

I have not implemented an IB storage solution myself, but the main problem as I understand it is that the host drivers are not in wide use in your average environment. They are in wider use in the Windows world than in the Linux world. Where they are in use in the Linux world, it's usually in "packaged" InfiniBand hardware appliances or supercomputing applications with tuned/customized drivers.

10G Ethernet or 10G fiber is in much broader use.

Karl Katzke