
We have just migrated from our old Fibre Channel SAN storage to an IBM Storwize V3700 with 24 x 600GB SAS disks.

This storage is connected directly to two IBM servers running ESXi 5.5, each with two 6Gbps multipath SAS controllers.

Up until now, I have configured our storage into multiple RAID5 groups, each for a different server/purpose: mainly Oracle DB, Oracle archive, SQL Server, and the rest (file server, mail, etc.). The most critical applications are Oracle and SQL Server.

My first concern is safety and then performance for our applications. So I have decided to go with RAID6 + spare(s).

My main question now is: since we are using ESXi, should I configure the entire storage as one single RAID array (saving space) and carve out a datastore volume for each server from ESXi, or is that bad practice and is it better to create separate hardware RAID groups?

teo
  • People who are voting to close this: this is not an opinion-based question, it's a best-practices question. There is an answer from the vendor, and it's on-topic at Server Fault. – Basil Aug 14 '15 at 13:41

3 Answers


Each vendor has their own recommendations, so start by asking IBM. You can usually open a ticket asking for configuration advice without paying for additional support; failing that, whoever sold it to you can advise.

Briefly googling, I discovered this Redbook. On page 212 you'll see that you likely want basic RAID 6, which means one spare and a drives-per-array goal of 12. That will mean two arrays: one of 12 drives and one of 11. I wouldn't recommend RAID 10, because you lose half your capacity. It does avoid parity, but that's something you only need to worry about on low-end or internal storage: your array will hide the parity overhead of random overwrites behind its write cache. My shop uses RAID 6 exclusively for half a petabyte of VMware 5.5, and it's fine.
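As a quick sanity check of what that layout gives you (just back-of-envelope arithmetic assuming 600GB drives and two drives' worth of parity per RAID 6 array, nothing Storwize-specific):

```python
# 24 x 600GB drives: 1 hot spare plus two RAID 6 arrays of 12 and 11 drives.
# Assumption: each RAID 6 array loses two drives' worth of capacity to parity.

DRIVE_GB = 600

def raid6_usable_gb(drives: int) -> int:
    """Usable capacity of a single RAID 6 array, in GB."""
    return (drives - 2) * DRIVE_GB

arrays = [12, 11]                      # 23 array members + 1 spare = 24 drives
usable_gb = sum(raid6_usable_gb(n) for n in arrays)
print(f"{arrays} + 1 spare -> {usable_gb / 1000:.1f} TB usable")  # ~11.4 TB raw
```

Actual numbers will come out a little lower once extent granularity and VMFS formatting take their share.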

You should read that Redbook and understand how they do MDisks and pools. Once your RAID groups are set up, you want to create a pool that wide-stripes across all your spindles.
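To illustrate the idea (a toy model of extent allocation only, not the actual Storwize algorithm): a wide-striped pool hands out a volume's extents across all of its MDisks, so every volume ends up backed by spindles from both RAID groups.

```python
from itertools import cycle

# Toy model: allocate a volume's extents round-robin across the pool's MDisks.
def allocate_extents(mdisks, num_extents):
    return [(i, mdisk) for i, mdisk in zip(range(num_extents), cycle(mdisks))]

pool = ["mdisk0 (12-drive RAID 6)", "mdisk1 (11-drive RAID 6)"]
for extent, mdisk in allocate_extents(pool, 6):
    print(f"extent {extent} -> {mdisk}")
```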

Basil

Disclaimer: this is highly opinion-based, and I have flagged the question as such, but I will attempt to offer an answer as I have quite recently configured almost exactly the same setup.

I highly doubt that any kind of database will perform well on a RAID5 or RAID6 array. Most vendors actively discourage (and in some cases even prohibit) the use of unnested parity-based RAID levels due to high rebuild times, which lead to an increased risk of a URE (unrecoverable read error) during a rebuild.
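To put a rough number on that risk (my own back-of-envelope figures, assuming a spec'd unrecoverable read error rate of 1 per 10^15 bits for enterprise SAS and 1 per 10^14 bits for desktop SATA):

```python
# Probability of hitting at least one URE while reading the surviving disks
# during a rebuild, given a per-bit unrecoverable read error rate.

def p_ure(data_read_tb: float, ure_rate_bits: float) -> float:
    bits_read = data_read_tb * 1e12 * 8           # TB -> bits
    return 1 - (1 - 1 / ure_rate_bits) ** bits_read

# Rebuilding one 600GB drive in an 11-disk RAID 5 means reading ~6 TB:
print(f"6 TB read, 1e15 spec : {p_ure(6, 1e15):.1%}")    # ~5%
# Rebuilding a 4TB desktop drive in a 10-disk array means reading ~36 TB:
print(f"36 TB read, 1e14 spec: {p_ure(36, 1e14):.1%}")   # ~94%
```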

I would personally split this into two distinct groups - a RAID10 for your high-IO load such as databases, and a RAID50 for the rest of your data. How many disks you dedicate for each array depends on how much data you need to store.

For example, with your 24-disk array you could lose two disks to enclosure spares and create four 2-disk spans (so 8 disks total) to get a logical RAID10 of around 2.4TB. That leaves you with 14 disks for your RAID50, in two spans of 7 disks each, for around 7.2TB of available space. Of course you can juggle the number of spans, but do bear in mind that RAID10 needs an even number of disks.
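The arithmetic behind those figures, for anyone who wants to juggle the span counts themselves (assuming 600GB disks):

```python
DRIVE_GB = 600

def raid10_usable_gb(mirror_pairs: int) -> int:
    return mirror_pairs * DRIVE_GB                     # one copy per pair is usable

def raid50_usable_gb(spans: int, disks_per_span: int) -> int:
    return spans * (disks_per_span - 1) * DRIVE_GB     # one parity disk per span

print(f"RAID 10, 4 x 2-disk spans: {raid10_usable_gb(4) / 1000:.1f} TB")     # ~2.4 TB
print(f"RAID 50, 2 x 7-disk spans: {raid50_usable_gb(2, 7) / 1000:.1f} TB")  # ~7.2 TB
# 2 spares + 8 + 14 = 24 disks accounted for.
```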

As for datastores, it doesn't really make a huge amount of difference if you're not using fancy features like Storage vMotion and DRS to shuffle resources around.

Also, to clarify your last paragraph: more, smaller disks are usually preferable to fewer, larger disks because of the time it takes to rebuild a failed disk and the load placed on the other disks during the rebuild.

Craig Watson
  • I have to agree with you and take your advice for RAID10 for the db. I'm not sure RAID50 is supported on v3700 since it goes up to RAID10. Is this done in a two stage setup from the storage manager? – teo Aug 13 '15 at 12:11
  • Your flag is incorrect: this is a specific question with a specific and correct answer. There's a Redbook from IBM that contains the information requested. Also, your comment about RAID 5 and 6 is incorrect for enterprise storage. Rebuild times on a 12-drive 600GB RAID are not the multi-week rebuilds you get on 7200 RPM 4TB drives. Additionally, RAID 50 doesn't exist on this storage; they use wide-striped pools containing extents from multiple RAID groups of RAID 5, 6, or 10. – Basil Aug 13 '15 at 13:15

I would never go for RAID6, or even RAID5 for that matter, for database-style workloads. Since they are parity-based, they incur a high write penalty and the rebuild times can be HUGE.

RAID 10 will give you the best performance: you can survive one failure from each mirrored pair, and you can allocate a hot spare or two to make sure the array gets its redundancy back quickly should a drive fail.
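For a feel of the write penalty involved (rule-of-thumb figures only: RAID 10 writes each block twice, RAID 5 does 4 I/Os per write, RAID 6 does 6, and I'm assuming roughly 175 IOPS per 10k SAS spindle; a large controller write cache will soften all of this):

```python
SPINDLE_IOPS = 175                     # assumed per 10k SAS drive
WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

def write_iops(level: str, spindles: int) -> float:
    """Rule-of-thumb sustained random-write IOPS for an array."""
    return spindles * SPINDLE_IOPS / WRITE_PENALTY[level]

for level in WRITE_PENALTY:
    print(f"{level:7} over 8 spindles: ~{write_iops(level, 8):.0f} write IOPS")
```

Treat the output as a relative comparison between levels rather than as absolute numbers for this box.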

In terms of storage division and presentation... I usually follow a scheme of one LUN per RAID group; each LUN then contains several VM disks.

tomstephens89
  • Rebuild times on a 600GB SAS disk are not huge. Your advice is true for local storage of 7200 RPM drives, not enterprise storage with a large controller write cache and small, fast spindles. Also, for VMware 5.5, two LUNs per datastore is recommended, and wide striping should be used whenever possible so that each LUN has access to the underlying performance of all the spindles, not just a single array. – Basil Aug 13 '15 at 13:11
  • I am talking about enterprise storage from the likes of HP, Dell, EMC and NetApp, all of which I have experience with. Enterprise storage can also include large arrays of big, slower disks, especially if the workload consists of high-volume sequential writes such as a backup system. Since faster disks are becoming bigger in capacity as well, rebuild time is now significant in the 2.5" arena too. Agreed, wide striping is a clever way of squeezing maximum performance out of spinning disks, but since SSDs are well and truly established, it isn't so beneficial anymore. – tomstephens89 Aug 13 '15 at 15:48
  • Either way, parity-based RAID is slow compared to stripe-and-mirror (in both performance and rebuild) and need only be used where capacity is the concern, not IOPS and throughput. – tomstephens89 Aug 13 '15 at 15:51
  • The problem with RAID 5 rebuilds is less the *time* it takes to complete and more the amount of data that needs to be read (namely: *all* data on the surviving disks) without a single unrecoverable read error. – Hagen von Eitzen Aug 13 '15 at 16:42
  • I'm not recommending RAID 5, Hagen; I'm recommending RAID 6. And Tom, the question states that they have 24 600GB SAS disks. Not a large array of big slow disks, nor any SSDs (no matter how established the technology). This question is very explicit and clear. – Basil Aug 13 '15 at 19:49
  • Still, RAID 10 for database workloads gets my recommendation, and that of the storage community as a whole I'd say. – tomstephens89 Aug 13 '15 at 20:00
  • Unrecoverable error rates on enterprise drives are typically a hundred times better than desktop models. That concern for rebuilds isn't particularly significant. – Sobrique Aug 13 '15 at 22:00
  • That's a rather vague assertion, Tom. A good counter example would be Netapp, which doesn't do raid 10 and hosts a large number of databases across their install base. Raid 10 is something you do when you're using internal disk, for sure, but I wouldn't go as far as to say that the "storage community as a whole" recommends it. Each vendor has their own recommendations. – Basil Aug 17 '15 at 18:38