9

My company is trying to figure out what type of SAN to purchase. This is specifically for database servers that are becoming IO constrained (storage is DAS right now, but we're hitting the limit of a single server and we'd like to add clustering as well).

We need a solution that will produce around 3000 IOPS long-term (we currently peak around 1000 IOPS). Most of our database operations are small reads/writes. Based on discussions with HP engineers and others online, an HP P2000 with 24 SAS HDDs in a RAID 10 configuration will deliver just short of that speed for ~$20K. Adding controllers and the other items needed to build out the SAN puts us right around our max budget of $30K.

But online, I see that many SAS SSDs deliver speeds of 80,000 IOPS+. Is this realistic to expect? If so, would it make sense to get a P2000 or similar entry-level SAN and throw a few SSDs in there? Our databases are small, only a couple TB total. If we did this, we'd have the money left over to buy a second SAN for mirroring/failover, which seems prudent.

Beep beep
  • I'll answer this shortly. – ewwhite Dec 13 '14 at 00:33
  • Can you link to the specific SAN model/type you're interested in using? There are many caveats that come with HP P2000-style storage arrays. How do you plan to connect? iSCSI? Fibre? SAS? – ewwhite Dec 13 '14 at 02:13
  • Also, what database platform is this? – ewwhite Dec 13 '14 at 02:16
  • Here is the model I was looking at, just with a different drive configuration: http://www.aventissystems.com/product-p/200385.htm . The DBMS is SQL Server Standard 2008 R2. We're really open to just about any configuration/vendor as long as we can "cheaply" scale in the future, and as long as we keep within our modest budget. – Beep beep Dec 13 '14 at 03:44
  • How much capacity do you need? – ewwhite Dec 13 '14 at 16:31
  • @ewwhite - Just a couple TB to start with. At 5 years we may need 25TB total, but that's in the stretch scenario (more likely we'll need 5-10TB at the 5 year mark) – Beep beep Dec 14 '14 at 17:29

4 Answers

6

The rule of thumb I use for disk IO is:

  • 75 IOPs per spindle for SATA.

  • 150 IOPs per spindle for FC/SAS

  • 1500 IOPs per spindle for SSD.

As well as IOPS per array, also consider IOPS per terabyte. It's not uncommon to end up with a very poor IOPS-per-TB ratio with SATA + RAID 6. That may not sound like a problem, but someone will often spot 'free space' on an array and want to use it. It's common for people to buy gigabytes and ignore IOPS, when in most enterprise systems it's really the opposite that matters.

Then add in the cost of the write penalty for RAID:

  • 2 for RAID1, RAID1+0
  • 4 for RAID5 (or RAID4)
  • 6 for RAID6.
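
Putting the per-spindle figures and these write penalties together gives a rough sizing sketch. This is only a back-of-the-envelope estimate in Python - the 3000 IOPS target comes from the question, but the 70/30 read/write split is an assumed example, so substitute your own measurements:

    import math

    # Rule-of-thumb figures from above (rough averages, not guarantees).
    IOPS_PER_SPINDLE = {"sata": 75, "fc_sas": 150, "ssd": 1500}
    WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}

    def backend_iops(frontend_iops, write_fraction, raid):
        """Frontend IOPS -> backend (physical) IOPS after the RAID write penalty."""
        reads = frontend_iops * (1 - write_fraction)
        writes = frontend_iops * write_fraction * WRITE_PENALTY[raid]
        return reads + writes

    def spindles_needed(frontend_iops, write_fraction, raid, disk_type):
        return math.ceil(backend_iops(frontend_iops, write_fraction, raid)
                         / IOPS_PER_SPINDLE[disk_type])

    # 3000 frontend IOPS, 30% writes (assumed), RAID 1+0 on FC/SAS spindles:
    print(spindles_needed(3000, 0.3, "raid10", "fc_sas"))   # -> 26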

Write penalty can be partially mitigated by nice big write caches, in the right circumstances. If you've got lots of sequential write IO (like DB logs) you can reduce those write penalties on RAID 5 and 6 quite significantly. If you can write a full stripe (e.g. one block per spindle) you don't have to read to compute parity.

Assume an 8+2 RAID 6 set. In normal operation, for a single write IO you need to:

  • Read the 'updated' block.
  • Read the first parity block
  • Read the second parity block
  • Recompute parity.
  • Write all three blocks back. (6 IOs in total: 3 reads, 3 writes.)

With a cached full stripe write - e.g. 8 consecutive 'chunks' that fill the RAID stripe - you can calculate parity across the whole lot without needing a read. So you only need 10 writes: one to each data disk, and two for parity.

This makes your write penalty 1.25 - 10 physical writes for 8 logical writes.
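
As a quick sanity check on that arithmetic (just the numbers above, nothing new):

    # 8+2 RAID 6: read-modify-write vs. a cached full-stripe write
    data_disks, parity_disks = 8, 2

    rmw_ios_per_logical_write = 3 + 3                # 3 reads + 3 writes
    full_stripe_penalty = (data_disks + parity_disks) / data_disks

    print(rmw_ios_per_logical_write)    # 6
    print(full_stripe_penalty)          # 1.25 - 10 physical writes per 8 logical writes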

You also need to bear in mind that write IO is easy to cache - you don't need to get it on disk immediately. It operates under a soft time constraint - as long as on average your incoming writes don't exceed spindle speed, it'll all be able to run at 'cache speed'.

Read IO on the other hand, suffers a hard time constraint - you cannot complete a read until the data has been fetched. Read caching and cache loading algorithms become important at that point - predictable read patterns (e.g. sequential, as you'd get from backup) can be predicted and prefetched, but random read patterns can't.

For databases, I'd generally suggest you assume that:

  • most of your 'database' IO is random read (i.e. the worst case for caching). If you can afford the overhead, RAID1+0 is good - because mirrored disks give you two sources for reads.

  • most of your 'log' IO is sequential write (i.e. good for caching, and contrary to what many DBAs will suggest, you probably want RAID50 rather than RAID10).

The ratio of the two is hard to say; it depends on what the DB does.

Because random read IO is a worst case for caching, it's where SSD really comes into its own - a lot of manufacturers don't bother caching SSD because it's about the same speed anyway. So especially for things like temp databases and indexes, SSD gives a good return on investment.
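
As a rough illustration - if the 3000 IOPS target from the question were all random read, the same rule-of-thumb figures give (a sketch only, real arrays will vary):

    import math

    # Random reads can't be hidden by cache, so each one has to hit a device.
    target_random_read_iops = 3000

    print(math.ceil(target_random_read_iops / 150))     # FC/SAS spindles: 20
    print(math.ceil(target_random_read_iops / 1500))    # SSDs: 2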

Sobrique
  • Thanks, we are 100% RAID10 because we're focused on databases. – Beep beep Dec 15 '14 at 16:48
  • It's common, but it's also a fallacy. RAID10 is actually pretty wasteful for primarily write-oriented workloads. Write-cached RAID5/RAID6 has a lower write penalty when you're doing sequential writes, e.g. database journal files. – Sobrique Dec 15 '14 at 17:14
4

I can speak on the specifics of what you're trying to accomplish. Honestly, I would not consider an entry-level HP P2000/MSA2000 for your purpose.

These devices have many limitations and, from a SAN feature-set perspective, are nothing more than a box of disks: no tiering, no intelligent caching, a maximum of 16 disks in a Virtual Disk group, low IOPS capability, and poor SSD support (especially on the unit you selected).

You would need to step up to the HP MSA2040 to see any performance benefit or official support with SSDs. Plus, do you really want to use iSCSI?

DAS may be your best option if you can tolerate local storage. PCIe flash storage will come in under your budget, but capacity will need to be planned carefully.

Can you elaborate on the specifications of your actual servers? Make/model, etc.

If clustering is a must-have, another option is the HP MSA2040, but using the SAS variant instead of iSCSI. It's less costly than the other models, lets you connect 4-8 servers, offers low latency and great throughput, and can still support SSDs. Even the Fibre or iSCSI models of this unit would give you more flexibility than the one you linked.

ewwhite
  • Thanks! We have all HP servers, primarily HP DL380s. The only reasons we were considering iSCSI are that it seemed easier to expand past 4 servers (if we wanted to push other server data onto the SAN), and that it was slightly faster (10Gb vs. 6Gb). – Beep beep Dec 15 '14 at 16:58
  • That said, I'll take a look at the MSA2040 ... not sure why it didn't pop up on our radar before. – Beep beep Dec 15 '14 at 17:00
  • I can see the confusion... If you don't plan to expand beyond 4 or 8 servers, SAS works quite well. And it's not 10Gb versus 6Gb... it's 10Gb versus a 4-lane 12Gbps SAS (48Gbps to the enclosure) interface. – ewwhite Dec 15 '14 at 17:01
  • I just saw that the Remote Snap software only works on iSCSI or FC ... we had hoped to utilize the remote snap to mirror the SAN for disaster recovery eventually. Or is there another process via SAS that would allow for the same capability? – Beep beep Dec 15 '14 at 17:28
  • @Beepbeep I'd probably use a different DR process. I tend not to rely on storage-appliance or SAN-level replication. But the 10GbE and FC versions of the MSA2040 may be a better fit for you. – ewwhite Dec 15 '14 at 17:30
3

Your analysis is pretty correct.

Use a few HDDs to get lots of GBs, but lots of HDDs to get even a few IOPS.
Use a few SSDs to get lots of IOPS, but lots of SSDs to get even a few GBs.

Which is more important for you? Space is the big cost-driver for SSD solutions, since the price-per-GB is much higher. If you're talking about a 200GB database needing 4K IOPS, a pair of SSDs will get you there. Or a 24-disk array of 15K drives will, leaving you lots of space for bulk storage.

How many IOPS you'll actually get out of those SSDs varies a lot based on the storage infrastructure (ewwhite will elaborate on that), but it's reasonable to get those kinds of speeds, especially with RAID 10, where parity isn't being computed.
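
Roughly speaking, you need enough drives to satisfy both the capacity and the IOPS requirement, whichever is larger. A sketch with the 200GB / 4K IOPS example above - the per-drive capacity and IOPS figures here are assumptions, not specs for any particular model:

    import math

    def drives_needed(capacity_gb, iops, drive_capacity_gb, drive_iops):
        """Enough drives to satisfy BOTH the space and the IOPS requirement."""
        for_capacity = math.ceil(capacity_gb / drive_capacity_gb)
        for_iops = math.ceil(iops / drive_iops)
        return max(for_capacity, for_iops)

    # 200GB database needing 4000 IOPS; per-drive figures are assumptions:
    print(drives_needed(200, 4000, 400, 20000))   # SSD (~400GB, ~20K IOPS): 1 (2 mirrored)
    print(drives_needed(200, 4000, 600, 175))     # 15K HDD (~600GB, ~175 IOPS): 23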

sysadmin1138
  • Thanks for the feedback! Is it reasonable to mix drives? I.e. set up 4 SSDs for performance-oriented tasks, then a bunch of HDD's for bulk storage? – Beep beep Dec 13 '14 at 00:38
  • @Beepbeep Yep, but be aware of the tradeoffs. Lots of IOPS will consume controller resources, which means you won't get max sequential throughput from the HDDs. Lots of sequential throughput on the HDDs can crowd out IO from the SSDs, increasing latency due to channel contention. If that matters to you, put them on different channels. – sysadmin1138 Dec 13 '14 at 03:07
0

I recently built a pair of storage servers for my employer, using Dell C2100 chassis, running FreeBSD 10.1 with twelve 2TB 7200rpm Western Digital "SE" enterprise SATA drives. The drives are in a single ZFS pool consisting of two 6-drive RAIDZ-2 virtual devices (vdevs). Attached to the pool is a pair of Intel DC S3500 SSDs, which are supercap-protected against power loss; they are used as both SLOG and L2ARC. Load testing this server over iSCSI, I was able to hit 7500-8200 IOPS. Our total cost including hard drives was about $2700 per server.

In the time that these BSD-based systems have been running, one of our HP MSA2012i SAN units has experienced two controller faults, and our other MSA2012i unit corrupted a large NTFS volume requiring 12 hours of downtime to repair.

Dell and HP sell you 10% hardware and 90% promise of support that you never end up being able to utilize.

cathode
  • This is true... Part of me wanted to recommend an HP server running ZFS (or a ZFS-based appliance OS), as it would perform much better than an MSA/P2000... but I didn't want to go off on a tangent. – ewwhite Dec 15 '14 at 18:13
  • Oh, I have no problem with HP. HP and Dell make great server hardware, usually loads better than some whitebox iStarUSA or Norco chassis. But when it comes to critical devices (SAN/NAS is critical in my book), I recommend a solution with as much transparency as possible. SAN appliances are big black boxes. They work great until they don't, and then you're up **** creek. – cathode Dec 15 '14 at 19:10