For a Dell R920 with 24 x 1.2TB disks (and 1TB RAM), I'm looking to set up a RAID 5 configuration for fast IO. The server will be used to host KVM VMs that will be reading/writing files of all sizes, including very large files. I am not terribly interested in data safety because if the server fails for any reason, we'll just re-provision the server from bare metal after replacing the failed parts. So, performance is the main concern. We're considering RAID 5 because it allows us to distribute data over multiple spindles and therefore gives us better performance and, while not our main concern, also gives us some data protection. Our NIC is dual 10Gbps.
I'm limiting this question to RAID 5 only because we think this will give the best performance. Only if there is a compelling performance reason will we consider something else. But, I think I'd prefer answers that are related to RAID 5 configurations.
Okay, with the above stated, here are our present configuration thoughts:
- 24 Hard Drives: RMCP3: 1.2TB, 10K, 2.5" 6Gbps
- RAID Controller: H730P, 12Gbps SAS Support, 2GB NV Cache
- 1 Hot Spare (just to give us some longer life if a drive does fail)
- 23 Data Drives (of which 1 drive's worth of capacity goes to parity, leaving 22 for data)
- Stripe Size: 1MB (1MB / 22 data drives = ~46.5KB per disk, or do I misunderstand stripe size?)
- Read Policy: Adaptive Read Ahead
- Write Policy: Write Back
- Disk Cache Policy: Enabled
If the stripe size is the TOTAL across the data drives, then I figured ~46.5KB per drive will give us very good throughput. If the stripe size is per spindle, then I've got this all wrong.
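To make my arithmetic concrete, here is a rough Python sketch of the two interpretations I'm unsure about. The function names are my own, purely for illustration, and I don't know which interpretation the H730P actually uses.

```python
KIB = 1024
MIB = 1024 * KIB

def per_disk_if_total(stripe_bytes, data_disks):
    """Interpretation A: the configured stripe size is the total across all data disks."""
    return stripe_bytes / data_disks

def full_stripe_if_per_disk(stripe_bytes, data_disks):
    """Interpretation B: the configured stripe size is what lands on each disk."""
    return stripe_bytes * data_disks

data_disks = 22   # 23 drives in the array minus one drive's worth of parity
stripe = 1 * MIB  # the 1MB setting from the list above

# Interpretation A: roughly 46.5 KiB written to each disk per stripe
print(round(per_disk_if_total(stripe, data_disks) / KIB, 1), "KiB per disk")

# Interpretation B: 22 MiB of data in one full stripe
print(full_stripe_if_per_disk(stripe, data_disks) / MIB, "MiB per full stripe")
```

Under interpretation B, a full stripe would hold 22MB of data, which is a very different layout from the one I had in mind.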
Is the stripe size also the minimum space that a single file takes? For example, if there is a 2KB file, would choosing a stripe size of 1MB mean that we're wasting nearly an entire megabyte? Or can multiple files live within a stripe?
Lastly, when we install CentOS 6.5 (or latest), will we need to do something special to ensure that the filesystem uses RAID optimally? For example, mkfs.ext4 has an option -E stride that I'm told should correspond to the RAID configuration. But, during a CentOS installation, is there any way to have this done?
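From what I've read, stride is the per-disk chunk size divided by the filesystem block size, and the stripe width is the stride multiplied by the number of data disks. Here is a sketch of how I think those values would be worked out. The 64 KiB per-disk chunk and the 4 KiB block size are placeholder assumptions, not settings we have chosen, and `/dev/sdX` is a made-up device name. The script only prints a command line; it does not run anything.

```python
def ext4_raid_options(chunk_kib, block_kib, data_disks):
    """Return (stride, stripe_width) in filesystem blocks for a striped array."""
    if chunk_kib % block_kib != 0:
        raise ValueError("per-disk chunk size should be a multiple of the block size")
    stride = chunk_kib // block_kib     # filesystem blocks per per-disk chunk
    stripe_width = stride * data_disks  # filesystem blocks per full data stripe
    return stride, stripe_width

# Assumed values: 64 KiB per-disk chunk, 4 KiB ext4 blocks, 22 data disks
stride, stripe_width = ext4_raid_options(chunk_kib=64, block_kib=4, data_disks=22)

# /dev/sdX is a placeholder for whatever device the controller exposes
print(f"mkfs.ext4 -b 4096 -E stride={stride},stripe_width={stripe_width} /dev/sdX")
```

If that reasoning is right, the values depend entirely on how the controller defines its stripe size, which is why I'm asking about that first.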
Many thanks for your thoughts on configuring RAID 5 for fast IO.