
We have a small production Cloudera distribution Hadoop cluster (14 nodes, but growing). As we have expanded our usage of this cluster, we have found that disk storage is our biggest constraint. RAM and CPU usage are minimal with our workloads, and our devs have already significantly reduced the amount of data being stored.

The hardware we are using is relatively low-end, and we have already maxed out the number of drives we can install in each node. We are not out of space yet, but we have a new data source that will accelerate our data growth, and we would like to simply add storage to the existing systems.

The systems only have one expansion card slot, which currently holds the SAS HBA running the internal drives. I believe we can replace that with an HBA that has both internal and external SAS ports, allowing us to keep the internal drives and connect to external ones. Where I am running into the limits of my understanding and Googling powers is that I cannot find the optimal setup to hold the external hard drives and give each server direct 6 Gbit/s SAS access to them.

Hadoop HDFS prefers not to have any other technology between it and the hard drives, and I would like to keep it that way. If I were using SATA, I would pick up an external rack-mount drive enclosure that connects its external eSATA ports directly to the drives inside, with no expanders or RAID controllers. I cannot find the equivalent in SAS hardware.

What I'm trying to find are suggestions for DAS SAS, preferably with a single chassis that can service multiple servers and does not do anything creative beyond that. Failing that, what options do I have for supplying the equivalent storage and speeds to the SAS drives we use now?

Jared

Geek42

1 Answer


You're looking for an external JBOD enclosure that can accommodate SAS disks and has the ability to be zoned or accommodate multiple servers...

The only example I can think of is the HP MDS600 (older) or the D6000 (current).

These can be used safely with a standard SAS HBA (LSI) and provide direct disk access without a RAID layer.

See: HP MDS 600 compatibility questions

[Image: mds600 (source: olx.co.ke)]

Glorfindel
ewwhite
  • With a device like that I would be able to connect multiple servers and get direct access to the drives? I was getting the impression that the JBOD enclosures were where I needed to go, but generally only saw ones with 2-8 connectors on them. Is there any way to use a connector for multiple servers to increase access? – Geek42 Jan 16 '14 at 20:24
  • Zoning and a SAS switch... Or you view this unit as something that can present 35 disks each to two servers. I'm not really sure what you're asking for. – ewwhite Jan 16 '14 at 20:35
  • So, to connect multiple servers as fast as is reasonable, I would connect them to a SAS switch, then connect the switch to this (or a similar) enclosure using multiple SAS ports. The zoning would then allow us to assign specific disks to a given server directly. I'm just trying to understand what I would need and where I need to brush up my knowledge before making a decision. Thanks. – Geek42 Jan 16 '14 at 21:07
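
To make the zoning idea from the comments concrete, here is a minimal Python sketch of a zoning plan in which every drive bay belongs to exactly one server. It is only a planning aid and does not talk to any switch or enclosure. The names `plan_zones`, `node01`..`node14`, and the bay numbering are hypothetical, the 70-bay figure is taken from the "35 disks each to two servers" comment, and whether a given SAS switch or enclosure supports zoning at per-bay granularity is an assumption to verify against the vendor documentation.

```python
def plan_zones(servers, bays):
    """Assign each drive bay to exactly one server, round-robin.

    Illustrative only: real zoning is configured on the SAS switch or
    enclosure, and its granularity depends on the hardware.
    """
    zones = {server: [] for server in servers}
    for i, bay in enumerate(bays):
        zones[servers[i % len(servers)]].append(bay)
    return zones


# Hypothetical layout: 14 nodes (as in the question) sharing 70 bays.
servers = [f"node{n:02d}" for n in range(1, 15)]
bays = list(range(1, 71))

zones = plan_zones(servers, bays)
for server, assigned in zones.items():
    print(server, assigned)
```

With these numbers each node ends up with five dedicated bays, which the operating system would see as ordinary directly attached disks, keeping the JBOD-style layout that HDFS prefers.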