
This is my first time building a machine with a hardware RAID card. We bought a Dell T620 with the H710P RAID controller (1GB NV Cache), a 160GB Solid State Drive (SATA Read Intensive MLC 3Gbps), and two 3TB 7.2K RPM Near-Line SAS 6Gbps Hard Drives. The solid state drive is pretty much dedicated to the OS to keep it "hoppin".

The two SAS drives are configured as RAID 0. We are treating this space as scratch for analyses, so we're not concerned about data loss. What we want is high-performance IO because we deal with lots of large files. For example, my current project involves 800 files ranging in size from 100GB to 200GB. Unfortunately, I have to transfer the files to the computer, analyze them, and delete them. Surprisingly (to me), when I had 8 jobs running simultaneously (transferring, analyzing, deleting), each job was on track for a 20+ hour run time (compared to ~3 hours for a single job). According to top, the processors were waiting on IO (time waiting for I/O completion was hovering around 20).

I realize that these are only 7.2k RPM drives, but I assumed they're pretty capable since Dell listed them at 6Gbps. BeowulfNode42 mentioned here that some drives get a 6Gbps interface for advertising, even though they can't even saturate a 3Gbps link. But I assume Dell wouldn't do that with a high-end server.

I strolled around google land to see if my expectations were unreasonable, but I didn't find anything definitive.

Question: What is a reasonable expectation for this setup? IO is obviously the bottleneck. The RAID card seems pretty nice and I thought the drives were pretty nice.

I ran hdparm to see what I'm getting. Here are the results:

>sudo /usr/sbin/hdparm -Tt /dev/sdb

/dev/sdb:
 Timing cached reads:   19542 MB in  2.00 seconds = 9778.47 MB/sec
 Timing buffered disk reads: 1028 MB in  3.00 seconds = 342.11 MB/sec

The cached reads are pretty rockin', but I expected more from buffered reads. I believe the theoretical output for two 6Gbps drives is 750MB/s, so I expected to get somewhere around 600MB/s.
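As a rough check on the link arithmetic, here is a minimal sketch. It assumes 8b/10b line encoding (which SAS and SATA use at 3Gbps and 6Gbps), so only 8 of every 10 bits on the wire carry payload. The function name is made up for illustration.

```python
def usable_mb_per_s(line_rate_gbps, payload_bits=8, line_bits=10):
    """Payload bandwidth of ONE link in MB/s (1 MB = 10**6 bytes).

    Assumes 8b/10b encoding: 8 payload bits per 10 bits on the wire.
    """
    raw_mb_per_s = line_rate_gbps * 1000 / 8        # line rate expressed in MB/s
    return raw_mb_per_s * payload_bits / line_bits  # remove the encoding overhead


MEASURED_MB_S = 342.11  # hdparm buffered read figure quoted above

if __name__ == "__main__":
    for gbps in (3, 6):
        raw = gbps * 1000 / 8
        usable = usable_mb_per_s(gbps)
        share = MEASURED_MB_S / usable
        print(f"{gbps}Gbps link: {raw:.0f} MB/s raw, {usable:.0f} MB/s usable, "
              f"measured array speed is {share:.0%} of one link")
```

Under that assumption a single 6Gbps link carries 750MB/s raw and about 600MB/s of payload, and each drive has its own link, so the interface speed is an upper bound on the connection rather than a statement about what the platters can deliver.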

I appreciate your help. Other relevant information listed below. Please let me know if I missed anything.

OS: opensuse 13.1
RAM: 256GB (1866)
CPUs: Dual Intel Xeon E5-2650v2 2.6GHz, 20M Cache, 8.0GT/s QPI
Mark Ebbert
  • Conventional HDDs (especially 7.2k RPM drives) cannot physically saturate a 3Gbps SATA link, let alone a 6Gbps SAS link. Your drives are the bottleneck. – Craig Watson Jul 19 '14 at 19:16
  • Thanks for the helpful responses! Why all the down votes? And why no explanation? – Mark Ebbert Jul 19 '14 at 19:22
  • I didn't vote down, but I can imagine a combination of being off-topic (a workstation configuration) and possibly a lack of research. Unfortunately, Dell's *help me choose* a disk is lacking in hard numbers and is more about relative bar charts, making research a bit hard. In short, the disks are wrong for your workload (or, as marketing would say, optimised for a different workload :-) – HBruijn Jul 19 '14 at 19:39
  • Thanks for the explanation. I wasn't aware this was off-topic for the forum, especially since I saw similar questions. And I totally agree about the Dell info. Not impressed with our Dell rep either. – Mark Ebbert Jul 19 '14 at 19:57
  • Ok, I apologize that my original question was not well-suited to this forum, but I have one other question that is hopefully more appropriate: Seems clear that the 7.2k RPM was a mistake, but since I have two with RAID 0, shouldn't they be able to saturate a 3Gbps link together? I'm wondering if the SSD (3Gbps) is limiting their output since they're all on the RAID controller. I'll try this Monday, but I'm trying to learn. Thanks for your patience. I keep asking myself why they would even make an SSD with a 3Gbps interface. I knew I shouldn't have gotten it, despite being SLC. Ugh... – Mark Ebbert Jul 19 '14 at 20:20
  • SAS is hardly consumer grade either, so not really OT here IMHO. Often the backplane to RAID controller connection will have 4 lanes, which at 3Gbps each still comes to 12Gbps total. – HBruijn Jul 19 '14 at 21:10

2 Answers


6Gbps is the speed of the SAS link, not the IO profile of a single disk.

Typically the speed in a SAS backplane will be negotiated down to the lowest common denominator so you'll find slow disks that still support high-speed SAS links to allow you to mix disks in a single (external) enclosure or backplane, or to benefit from parallelised IO spread out over a larger number of disks.

The HP IO profile for similar 3TB 7.2k 6Gbps SAS disks is:

SAS Midline drives are intended for servers and storage solutions where high capacities are required. These drives have moderately-priced reliability and performance for non-mission critical, low workload applications, such as disk backup, archival, and reference applications.

hdparm is at best an indication of raw disk performance. For instance, it completely bypasses the filesystem and does not simulate more random IO, AFAIK. Take a look at "What's a good free open source hard drive benchmark?"
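To illustrate the difference, here is a minimal sketch of timing a sequential read through the filesystem rather than against the raw device. It is not a substitute for a real benchmark tool. The helper names and the choice of directory are assumptions for illustration, and the demo file is far too small to give a meaningful number.

```python
import os
import tempfile
import time


def sequential_read_mb_per_s(path, block_size=1024 * 1024):
    """Read a file start to finish; return (MB/s, bytes read), 1 MB = 10**6 bytes."""
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return total_bytes / elapsed / 1e6, total_bytes


def make_test_file(directory, size_bytes):
    """Create a throwaway file of the given size (hypothetical helper for the demo)."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    # Assumption: replace this with a directory on the RAID 0 scratch volume.
    scratch_dir = tempfile.gettempdir()
    path = make_test_file(scratch_dir, 8 * 1024 * 1024)
    try:
        rate, nbytes = sequential_read_mb_per_s(path)
        print(f"read {nbytes} bytes at {rate:.1f} MB/s")
    finally:
        os.remove(path)
```

A file that was just written will mostly be served from the page cache, so a figure from this demo says more about memory than about the disks. For a number that reflects the array, the file would need to be much larger than RAM (256GB on this machine) or the cache would need to be dropped first.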

HBruijn
  • That was a super helpful explanation HBruijn. Explains why a 7.2k RPM drive still has 6Gbps interface. – Mark Ebbert Jul 19 '14 at 19:24
  • Just to clarify, even if I upgraded the HDDs, the IO would be limited to 3Gbps because of the SSD? – Mark Ebbert Jul 19 '14 at 19:30
  • Not necessarily; the OS disk might be connected directly to your motherboard, bypassing the 6Gbps backplane connected to your RAID controller. I don't know, contact a sales rep. – HBruijn Jul 19 '14 at 19:43
  • The SSD is attached to the RAID controller. This is most unfortunate. My original plan was to get consumer-level drives where I knew exactly the specs, but our university Dell rep was pushing SLC SSDs and commercial-grade HDDs for life span *and* performance. That was clearly a mistake. It didn't seem right, but I gave in. I'm much more concerned with performance than life span. – Mark Ebbert Jul 19 '14 at 19:56

hdparm -T essentially tests the speed of reading from the Linux buffer cache without touching the disk, so it measures CPU and memory. It shows the read speeds you would get when files are cached in memory (see the cache figures in the output of the free command).

The nearline SAS drives aren't full SAS drives. They have the same benefits as SAS drives as they use the SAS interface, but are still 7200rpm mechanical drives. The hdparm -t figure you gave is about on par for two drives in a RAID0. As a comparison, 3TB SATA drives are typically around 150MB/s.
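The comparison can be worked through with a short sketch. It assumes ideal linear scaling across a stripe, which is a best case for large sequential reads and not something a real array is guaranteed to reach. The function names are made up for illustration.

```python
MEASURED_RAID0_MB_S = 342.11  # hdparm -t figure from the question
DRIVES_IN_ARRAY = 2


def per_drive_mb_per_s(array_mb_per_s, drives):
    """Average sequential throughput each drive contributed to the stripe."""
    return array_mb_per_s / drives


def ideal_stripe_mb_per_s(drive_mb_per_s, drives):
    """Best-case RAID 0 throughput, ASSUMING perfectly linear scaling."""
    return drive_mb_per_s * drives


if __name__ == "__main__":
    each = per_drive_mb_per_s(MEASURED_RAID0_MB_S, DRIVES_IN_ARRAY)
    print(f"each drive delivers roughly {each:.0f} MB/s")
    for n in (2, 4, 6):
        print(f"{n} drives, ideal stripe: {ideal_stripe_mb_per_s(each, n):.0f} MB/s")
```

This puts each drive at roughly 171MB/s, in the same range as the 150MB/s quoted for comparable SATA drives. The ideal figures are an upper bound: several jobs reading and writing at once force the heads to seek, which pulls mechanical drives well below their sequential rate.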

For better performance, you will be looking at adding more drives. Take Hadoop for example, where the recommendation is to get the best price per gigabyte, use more drives, and add more servers to the cluster. If you need blazing HD performance, higher-capacity SSDs might be a better fit, though the heavy usage might cause early failures due to the more frequent read/write cycles.

Will
  • This is also super helpful. We deal with lots of large files, so we need space and purchasing tons of 600GB 15k RPM drives is too expensive (to get enough storage). But purchasing 2-4 more of the 7.2k drives would be a huge boon for storage and it sounds like I can expect a fairly linear increase in IO if I raid them all together. Is that accurate? – Mark Ebbert Jul 21 '14 at 15:33