We have a small production Cloudera Distribution of Hadoop (CDH) cluster (14 nodes, but growing). As we have expanded our use of this cluster, we have found that disk storage is our biggest constraint and requirement. RAM and CPU usage are minimal with our workloads, and our devs have already significantly reduced the amount of data being stored.
The hardware we are using is relatively low end, and we have therefore maxed out the number of drives we can install in each node. We are not out of space yet, but we have a new data source that will accelerate our data growth, and we would like to simply add storage to the system.
The systems have only one expansion card slot, which currently holds the SAS HBA running the internal drives. I believe we can replace that with an HBA that has both internal and external SAS ports, allowing us to keep the internal drives and connect to external ones as well. Where I run into the limits of my understanding and Googling powers is that I cannot find the right setup to hold the external hard drives and give each server direct 6 Gbit/s SAS access to them.
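For reference, this is how I am checking the negotiated link rate on the current internal drives, and I would want to see the same 6.0 Gbit readings from any external enclosure. This is just a sketch and assumes a Linux host whose HBA driver registers with the standard SAS transport class under /sys/class/sas_phy:

```shell
#!/bin/sh
# Print the negotiated SAS link rate for each phy the HBA exposes.
# Assumes the Linux SAS transport class sysfs layout (/sys/class/sas_phy).
for phy in /sys/class/sas_phy/*/; do
    # Skip if the glob matched nothing or the attribute is absent.
    [ -e "${phy}negotiated_linkrate" ] || continue
    printf '%s: %s\n' "$(basename "$phy")" "$(cat "${phy}negotiated_linkrate")"
done
```

A phy that trained below "6.0 Gbit" (for example through a slow expander port) would show up immediately in this listing.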
Hadoop HDFS prefers not to have any other technology between it and the hard drives, and I would like to keep it that way. If I were using SATA, I would pick up an external rack-mount drive enclosure that connects its external eSATA ports directly to the drives inside, with no expanders or RAID controllers. I cannot find the equivalent in SAS hardware.
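To make the goal concrete: once the external drives are mounted, I would just list each mount point as another HDFS data directory in hdfs-site.xml, so the DataNode treats every disk as a plain JBOD volume with nothing layered in between. The mount-point paths below are illustrative placeholders, not our real layout:

```xml
<!-- hdfs-site.xml: one entry per physical disk, JBOD style, no RAID layer. -->
<!-- The /data/... paths are placeholders for this example. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/disk1/dfs/dn,/data/disk2/dfs/dn,/data/external1/dfs/dn,/data/external2/dfs/dn</value>
</property>
```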
What I'm trying to find are suggestions for SAS DAS, preferably a single chassis that can serve multiple servers and does nothing creative beyond that. Failing that, what options do I have for supplying the equivalent storage and speed to the SAS drives we use now?
Jared