I would be glad if you could answer some of my questions about the performance of our storage. The setup:
- HP P2000 SAS with 2GB cache
- 8x 1TB 7200 RPM SATA disks
- RAID6
- 3x hosts with SAS HBAs
- VMware vSphere 4.1
Basically, the main reason I had to look at our storage was the migration of the monitoring VM from the local disks of one of the hosts to the shared storage. So before doing any migration, I set up a new VM with iometer and ran the tests during the night, when there were no important jobs running in the cluster. There was only 1 dynamo worker thread from this VM.
Access Specification | IOPS | Read IOPS | Write IOPS | MB/s | Read MB/s | Write MB/s | Trans/s | Avg Resp (ms) | Avg Read Resp (ms)
512B; 100% Read; 0% random | 5617.19 | 5617.19 | 0.00 | 2.74 | 2.74 | 0.00 | 5617.19 | 0.18 | 0.18
512B; 75% Read; 0% random | 3190.52 | 2369.76 | 820.76 | 1.56 | 1.16 | 0.40 | 3190.52 | 0.31 | 0.32
512B; 50% Read; 0% random | 1055.81 | 524.82 | 530.99 | 0.52 | 0.26 | 0.26 | 1055.81 | 0.95 | 0.42
512B; 25% Read; 0% random | 1006.96 | 239.41 | 767.54 | 0.49 | 0.12 | 0.37 | 1006.96 | 0.85 | 0.69
512B; 0% Read; 0% random | 35.12 | 0.00 | 35.12 | 0.02 | 0.00 | 0.02 | 35.12 | 28.35 | 0.00
4K; 75% Read; 0% random | 3034.30 | 2247.85 | 786.45 | 11.85 | 8.78 | 3.07 | 3034.30 | 0.33 | 0.33
4K; 25% Read; 0% random | 2237.79 | 587.67 | 1650.12 | 8.74 | 2.30 | 6.45 | 2237.79 | 0.45 | 0.64
16K; 75% Read; 0% random | 627.85 | 474.80 | 153.06 | 9.81 | 7.42 | 2.39 | 627.85 | 1.59 | 1.84
16K; 25% Read; 0% random | 478.62 | 116.67 | 361.95 | 7.48 | 1.82 | 5.66 | 478.62 | 2.09 | 1.28
32K; 75% Read; 0% random | 848.27 | 649.37 | 198.89 | 26.51 | 20.29 | 6.22 | 848.27 | 1.18 | 1.33
32K; 25% Read; 0% random | 443.44 | 117.28 | 326.17 | 13.86 | 3.66 | 10.19 | 443.44 | 2.25 | 7.16
An hdparm read test (hdparm -t /dev/sda) gave 300 MB/s.
Our monitoring system collects information from about 40 VMs and 30 devices, and every host has at least 10 services, but it is actually Cacti that generates the majority of the IOPS: it updates its RRD files in bulk every minute. Despite this, I decided to migrate the VM to the storage. After the migration I measured the IOPS generated by the monitoring - the average value was 800 - but the response time of any read operation on every VM was horrible, 5-10 seconds; the monitoring actually killed some VMs because their kernels timed out on IO operations. hdparm gave 1.4 MB/s. I turned off the Cacti RRD processing and everything runs well again, but we have no graphs.
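In case it matters for the answers: this is roughly how I watched per-device IOPS and latency from inside a Linux VM - only a sketch, using iostat from the sysstat package (the 5-second interval is arbitrary, and the column names differ between sysstat versions):

```
# Extended per-device statistics, refreshed every 5 seconds
iostat -x 5
# r/s + w/s  -> IOPS the guest is issuing
# avgrq-sz   -> average request size, in 512-byte sectors
# await      -> average time (ms) an I/O spends waiting, including queueing
```

esxtop on the ESX host shows similar numbers from the hypervisor side (DAVG/KAVG/GAVG latencies in the disk views).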
My questions:
1) What do you think about the iometer results on this setup? Should they be better, are they OK, or should I look for some misconfiguration?
2) Would you recommend running the monitoring software on a separate physical host and not "bothering" the storage with this kind of IOPS?
3) This question is more general. From the storage test we get IOPS/MB/s for different block sizes, but how can I find out which block size an application mostly uses? For example, a database system often does about 75% reads, but what is the block size, so that I could compare it with my results? Without that information, my iometer tests are just numbers. (See the sketch below for what I mean.)
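To make question 3 more concrete, here is the kind of thing I have in mind - a sketch only; it assumes vscsiStats is available on ESX/ESXi 4.1, the worldGroupID 12345 is just a placeholder, and the exact flags may differ between versions:

```
# On the ESX host: per-virtual-disk histogram of I/O sizes
vscsiStats -l                     # list worldGroupIDs / virtual disk handles
vscsiStats -s -w 12345            # start collecting for that worldGroupID
# ... let the application run for a while ...
vscsiStats -p ioLength -w 12345   # print the I/O length (block size) histogram
vscsiStats -x -w 12345            # stop collecting

# Inside a Linux guest: avgrq-sz from "iostat -x" (sysstat package) is the
# average request size in 512-byte sectors, e.g. 32 sectors = 16K
iostat -x 5
```

The ioLength histogram should map directly onto the block sizes iometer tests.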
UPDATE 1: Thank you for the answers.
What we did is create a ramdisk for the RRD processing; all RRDs are synced to the monitoring disk every hour. Everything now works quite fast, but we will think about creating another RAID group with RAID 10 for this kind of write-heavy IOPS. A rough sketch of the ramdisk setup is below.
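Roughly, the ramdisk setup looks like this (a sketch only - the tmpfs size and the /var/lib/cacti/rra and /srv/monitoring/rra paths are illustrative, not our exact layout):

```
# Mount a tmpfs over the Cacti RRD directory and seed it from the persistent copy
mount -t tmpfs -o size=512m tmpfs /var/lib/cacti/rra
rsync -a /srv/monitoring/rra/ /var/lib/cacti/rra/

# Hourly cron entry (e.g. in /etc/crontab) that flushes the RRDs back to the
# monitoring disk:
# 0 * * * *  root  rsync -a /var/lib/cacti/rra/ /srv/monitoring/rra/
```

The obvious trade-off is that up to an hour of graph data can be lost if the VM goes down.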