6

I have a test box (PowerEdge 2950) with a Perc 6/i and 4x 15.5k SAS drives attached (with 512 byte block sizes). These are in a single RAID 5 virtual disk with a 64KB chunk size.

I am creating a single test partition that spans the whole drive. Should it be aligned to the 64KB chunk mark, or to the 512 byte block size? If the latter, the partition could start at 2048 bytes into the single virtual disk, meaning it would begin at the 2nd free block on the first drive (I assume)?

Also, I will add another two drives and recreate the RAID virtual disk at a later date for more testing, should the partition then be created at 6x512 bytes, so from 3072 bytes?

I have read a couple of similar questions on this, but I couldn't see from those how the chunk size of the RAID volume relates to partition alignment, or how it relates to the drive block size when using a single virtual disk.

jwbensley
  • 4,122
  • 11
  • 57
  • 89

3 Answers

8

If you use a starting sector of 2048 (512-byte) sectors, then your partition will start 1MB into the drive. This is the default on most newer installers. 1MB is evenly divisible by 64k, and by most other common chunk/block sizes.

If you are partitioning with fdisk, pass the -u flag so it reports values in 512-byte sectors instead of cylinders.

Since you are using ext*, you can use this calculator to determine the stride and stripe-width for the filesystem. By my calculation you would want to create your filesystem with these options: mkfs.ext3 -b 4096 -E stride=16,stripe-width=48. You might also want to try creating the filesystem without passing any options and seeing what mkfs detects and uses (check with tune2fs -l /dev/sdnn). These days it seems to do a pretty good job of automatically detecting the stride/stripe-width.
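For reference, the arithmetic behind those mkfs options is simple enough to sketch (a minimal example; the figures assume the question's setup of a 4-disk RAID 5 with 64KB chunks and a 4KB filesystem block size):

```python
# Derive ext* stride and stripe-width from RAID geometry.
# Assumes the question's setup: 4-disk RAID 5, 64KB chunk, 4KB fs block.
chunk_kb = 64       # RAID chunk size per disk, in KB
block_kb = 4        # filesystem block size, in KB
data_disks = 3      # a 4-disk RAID 5 has 3 data-bearing disks

stride = chunk_kb // block_kb        # fs blocks per chunk
stripe_width = stride * data_disks   # fs blocks in one full stripe

print(f"mkfs.ext3 -b {block_kb * 1024} -E stride={stride},stripe-width={stripe_width}")
# prints: mkfs.ext3 -b 4096 -E stride=16,stripe-width=48
```

If you later grow to 6 drives (5 data disks), only stripe-width changes: 16 * 5 = 80.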

Zoredache
  • 128,755
  • 40
  • 271
  • 413
  • @javano This is what I meant by just letting the OS (specifically the partitioner application) do what it thinks is right. These days unless you are going to squeeze every byte out of disk, using the defaults usually gets most of what you want. – StarNamer Aug 09 '12 at 20:31
0

Your partitions should be aligned to your stripe width (chunk size * number of data-bearing disks). You should be aware, however, that this barely scratches the surface of alignment optimisation: everything from the RAID chunk size, to the file system metadata, to the application I/O size needs to be aligned for optimal performance and to ensure there is no unnecessary read/write amplification. I wrote an article on this subject of optimising file system alignment, which you may find useful.
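As a quick check of that definition (a sketch only; it assumes 64KB chunks and RAID 5's single disk's worth of parity, matching the question's current and planned arrays):

```python
# Stripe width = chunk size * number of data-bearing disks.
# Assumes 64KB chunks and RAID 5 (one disk's worth of parity).
def stripe_width_bytes(chunk_kb, total_disks):
    return chunk_kb * 1024 * (total_disks - 1)

sw4 = stripe_width_bytes(64, 4)   # 4-disk array: 192KB stripe (196608 bytes)
sw6 = stripe_width_bytes(64, 6)   # 6-disk array: 320KB stripe (327680 bytes)

# The common 1MB (sector 2048) partition start is chunk-aligned,
# but not stripe-aligned for either geometry:
one_mb = 1024 * 1024
print(one_mb % (64 * 1024))   # 0     -> aligned to the 64KB chunk
print(one_mb % sw4)           # 65536 -> not a multiple of the 192KB stripe
```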

Gordan Bobić
  • 936
  • 4
  • 10
0

Your math is wrong. In a 4 disk RAID5 array there are (simplistically) 3 data disks and one parity disk, which is why if you have 4 80Gb drives, you get 3*80 or 240Gb of usable space on the RAID array. So, by your assumptions, starting a partition at 2048 bytes into the drive would start on the 2nd block of the 2nd drive.

But, in fact, your premise is wrong anyway. If you've ever watched the disk activity lights on a RAID5 array, you'd see that they all flash together, except when doing a rebuild. In other words, the RAID5 controller actually caches the disk reads & writes and executes them in parallel across all the drives (obviously, during a rebuild, all but one of the drives operate together while the rebuilding drive is usually on solid). This is so it can guarantee consistency.
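To make the chunk-level layout concrete, here is a sketch of mapping a logical byte offset onto a disk. This is purely illustrative: it assumes a simple left-asymmetric parity rotation with 64KB chunks, and a real controller like the Perc 6/i may lay data out differently.

```python
# Map a logical byte offset to (stripe, data disk, parity disk) in a
# simplified 4-disk RAID5 with 64KB chunks and rotating parity.
# Illustrative only; the actual Perc 6/i layout may differ.
CHUNK = 64 * 1024
DISKS = 4

def locate(offset):
    chunk_no = offset // CHUNK                     # logical data chunk index
    stripe = chunk_no // (DISKS - 1)               # full-stripe number
    parity_disk = (DISKS - 1) - (stripe % DISKS)   # parity rotates per stripe
    slot = chunk_no % (DISKS - 1)
    disk = slot if slot < parity_disk else slot + 1  # skip the parity disk
    return stripe, disk, parity_disk

# With 64KB chunks, byte offset 2048 is still inside the very first
# chunk, i.e. on the first data disk of stripe 0:
print(locate(2048))   # (0, 0, 3)
```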

Of course, it's reading and writing 64Kb chunks, so if you started your partition at a 192Kb boundary, you might just see a fractional improvement when accessing files right at the start of the partition. But, assuming this disk isn't going to have a few very large files (i.e. sized in multiples of 192Kb) being read sequentially, in normal operation the heads would be moving all over the disk(s), reading/writing files allocated in 4Kb chunks, which would swamp any gain from the alignment of the partition.

In conclusion, since the Perc 6/i is a hardware RAID controller, I'd just let the OS partition the drive as it recommends. The alignment of the partition is not going to have a noticeable effect on disk/file access speed.

StarNamer
  • 431
  • 1
  • 5
  • 15
  • 2
    "The alignment of the partition is not going to have a noticeable effect on disk/file access speed."- almost, but not quite. if the file system blocks cross a stripe boundary, then that will have an effect on performance. when creating the first partition/volume in the OS, the start offset should be equal to or a multiple of the stripe size. – longneck Aug 09 '12 at 17:26
  • "In a 4 disk RAID5 array there are (simplistically) 3 data disks and one parity disk, which is why if you have 4 80Gb drives, you get 3*80 or 240Gb of usable space on the RAID array. So, by your assumptions, starting a partition at 2048 bytes into the drive would start on the 2nd block of the 2nd drive" Is your maths in fact wrong? :) As RAID 5 is distributed parity, unlike RAID 3. Just a side note! Or would this be correct in this instance because it is at the start of the disk, so the parity is going to be on the fourth disk? Just curious, thanks! – jwbensley Aug 09 '12 at 17:31
  • So based on what you said longneck, should the partition start at 64KB to be in line with the RAID chunk size? – jwbensley Aug 09 '12 at 17:34
  • I'd start the partition at sector 12288 (for other alignment reasons as well as there being 3 data drives) or at 0 (e.g. whole drive, /dev/sdb instead of /dev/sdb1, in Linux). Your OS caching and I/O scheduling units should also be considered. – Skaperen Aug 09 '12 at 18:32
  • OK, following the comment from @longneck I went off and read http://support.microsoft.com/kb/929491 and http://download.paragon-software.com/doc/Paragon_Alignment_Tool-White_Paper.pdf . The discussions refer to Windows only, but, if you want to be on the safe side, you should align your partition according to the **stripe** size, irrespective of the number of disks or the chunk size. I also came across this (http://www.zdnet.com/blog/storage/chunks-the-hidden-key-to-raid-performance/130) article, which recommends that for better I/O throughput, you should use a *small* chunk size. – StarNamer Aug 09 '12 at 20:17
  • Regarding the parity distribution across disks, you are correct about RAID5, which is why I put in the "(simplistically)". As far as the math goes, you still lose a whole disk to parity whether it's RAID3 or RAID5. To be honest, I've never considered where a particular block gets written on a RAID array because, of course, for RAID to work, some part of each block actually gets written to every disk. – StarNamer Aug 09 '12 at 20:21
  • @StarNamer chunk size should not just be blindly set to a small value. performance of various values should be benchmarked. for example, microsoft exchange writes to the database in 256k pages and only writes complete pages so the chunk/stripe size should take that in to account. if benchmarking is impossible and you don't know the internals of the app you will be using, smaller is safer, but not guaranteed faster. – longneck Aug 09 '12 at 20:24
  • @longneck Agreed about not blindly setting values. Agreed also about benchmarking when adjusting anything. The problem I've found in the past comes when business users want to change applications but won't allow time/resource to test/adjust things for the way the new app works. – StarNamer Aug 09 '12 at 22:19