I'm trying to figure out the correct read-ahead values to set on a RAID10 array, and I'm wondering if the RAID stripe size should factor into my considerations.

I've heard conflicting information about this in the past. I once heard that you should always set your read-ahead value to a multiple of the RAID stripe size, and never below the stripe size, because that is the minimum amount of data the RAID controller will ever try to read at once.

Someone else told me, however, that setting read-ahead below the stripe size is fine, and can, in fact, increase the amount of parallel reads you can do across devices in the array, increasing performance and decreasing load on the array.

So which is it? Do read-ahead settings that aren't multiples of the stripe size make sense or not?

HopelessN00b
stbrody
  • Simple. For purely sequential I/O, read-ahead should be much bigger than the stripe size. For random I/O, read-ahead should be the size of a typical I/O, and the stripe should be many times larger than the read-ahead. Why? Because a typical I/O should then need to move the magnetic head of only one hard disk, without impacting all the other disks. – kubanczyk Oct 04 '12 at 18:25
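The arithmetic behind that comment can be sketched in a few lines. This is only an illustration, assuming plain striping where consecutive chunks land on consecutive disks (mirroring in RAID10 is ignored); the function name and the example sizes are made up.

```python
def chunks_touched(offset_bytes: int, length_bytes: int, chunk_bytes: int) -> int:
    """Count the stripe chunks covered by one read.

    Assumes plain striping: chunk i covers the byte range
    [i * chunk_bytes, (i + 1) * chunk_bytes).
    """
    if offset_bytes < 0 or length_bytes <= 0 or chunk_bytes <= 0:
        raise ValueError("offset must be >= 0; length and chunk size must be > 0")
    first = offset_bytes // chunk_bytes
    last = (offset_bytes + length_bytes - 1) // chunk_bytes
    return last - first + 1


KIB = 1024

# A 16 KiB read that sits inside one 256 KiB chunk stays on one chunk.
print(chunks_touched(0, 16 * KIB, 256 * KIB))

# The same read with 128 KiB of read-ahead added, on 64 KiB chunks,
# spreads over several chunks.
print(chunks_touched(0, (16 + 128) * KIB, 64 * KIB))
```

Each chunk touched is, up to the number of disks in the array, one more disk that has to seek for that single request.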

1 Answer

The logic for when Linux applies read-ahead is complicated. Kernel 2.6.23 introduced the considerably more sophisticated On-Demand Readahead; before that, the kernel used a simpler prediction mechanism. One design goal of read-ahead has always been to avoid doing it unless the read access pattern justifies it, so the idea that the stripe size is a relevant piece of data here is fundamentally unsound. Individual reads at the small end of the I/O size range, below the stripe size, normally won't trigger the read-ahead logic at all. Tiny read-ahead values effectively turn the feature off, and you don't want that.

When you really are doing sequential I/O to a large RAID10 array, the only way to reach the full throughput of many systems is to have read-ahead working for you. Otherwise Linux won't dispatch requests fast enough to keep the array reading at its full potential. The last few times I've tested larger RAID10 arrays, in the 24-disk range, large read-ahead settings (4096 or more, counted in 512-byte sectors, so 2048KB and up) have given 50 to 100% performance gains on sequential I/O, as measured by dd or bonnie++. Try it yourself: run bonnie++, increase read-ahead a lot, and see what happens. If you have a large array, that will quickly dispel the idea that read-ahead values smaller than typical stripe sizes make any sense.
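The read-ahead numbers here are in the units blockdev uses, 512-byte sectors, which is why 4096 corresponds to 2048KB. A small conversion sketch follows; the helper names are made up for illustration.

```python
SECTOR_BYTES = 512  # blockdev --getra / --setra count 512-byte sectors


def ra_sectors_to_kib(sectors: int) -> int:
    """Convert a blockdev read-ahead value (sectors) to KiB."""
    return sectors * SECTOR_BYTES // 1024


def ra_kib_to_sectors(kib: int) -> int:
    """Convert a read-ahead size in KiB to the sector count blockdev expects."""
    return kib * 1024 // SECTOR_BYTES


for sectors in (256, 768, 2048, 4096):
    print(f"{sectors:>5} sectors = {ra_sectors_to_kib(sectors):>5} KiB")
```

The sector count is what gets passed to `blockdev --setra`, for example `blockdev --setra 4096 /dev/md0` as root. The sysfs file `/sys/block/<dev>/queue/read_ahead_kb` exposes the same setting in kilobytes.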

The Linux kernel is so aware of this necessity that it even automatically increases read-ahead for you when you create some types of arrays. Check out this example from a system with a 2.6.32 kernel:

[root@toy ~]# blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0    905712320512   /dev/md1
rw   768   512   512          0    900026204160   /dev/md0

Why is read-ahead 256 (128KB) on md1 while it's 768 (384KB) on md0? That's because md0 is a 3-disk RAID0, and Linux increases read-ahead knowing it has no hope of achieving full speed across an array of that size with the default of 256. Even 768 is actually too low; it needs to be 2048 (1024KB) or larger to hit the maximum speed that small array is capable of.
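To see those RA values in kilobytes at a glance, the report can be post-processed. This sketch parses text in the seven-column layout shown above and converts RA from 512-byte sectors to KiB. Other util-linux versions may print different columns, so treat the column positions as an assumption; `parse_report` is a made-up helper name.

```python
SAMPLE = """\
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0    905712320512   /dev/md1
rw   768   512   512          0    900026204160   /dev/md0
"""


def parse_report(text: str) -> dict:
    """Map each device in `blockdev --report` output to its read-ahead in KiB.

    Assumes seven whitespace-separated columns with RA second and the
    device name last; lines that do not fit (such as the header) are skipped.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or not fields[1].isdigit():
            continue
        ra_sectors = int(fields[1])
        result[fields[6]] = ra_sectors * 512 // 1024
    return result


print(parse_report(SAMPLE))
```

On a live system the input could come from `subprocess.run(["blockdev", "--report"], capture_output=True, text=True).stdout`, which typically needs root.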

Much of the lore on low-level RAID settings like stripe sizes and alignment is just that: lore, not reality. Run some benchmarks at a few read-ahead settings yourself, see what happens, and then you'll have known good facts to work with instead.
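If you want something quicker than a full bonnie++ run, a rough sequential-read timer is easy to sketch. This is not a replacement for a real benchmark: it assumes you change read-ahead yourself between runs (for example with `blockdev --setra`) and clear the page cache each time, otherwise later runs just measure RAM. The function name and the path in the usage comment are hypothetical.

```python
import time


def sequential_read_mb_per_s(path: str, block_bytes: int = 1024 * 1024):
    """Read `path` front to back; return (bytes_read, throughput in MB/s)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffer; the kernel page cache and
    # kernel read-ahead still apply, which is what is being measured.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_bytes)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / elapsed / 1e6 if elapsed > 0 else float("inf")
    return total, rate


# Usage sketch (hypothetical path; point it at a file much larger than RAM):
#   n, rate = sequential_read_mb_per_s("/mnt/array/bigfile")
#   print(f"{n} bytes at {rate:.1f} MB/s")
```

Repeating the run at a few read-ahead values, say 256, 2048 and 4096 sectors, gives a first impression of where throughput stops improving on your hardware.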

Greg Smith
  • Hi Greg, thanks for the answer. The problem, however, is that I'm trying to run a database here which is doing almost entirely random I/O. There is very little sequential I/O, and I see that having readahead too high is causing my system not to utilize all the RAM available to it. I want to set the readahead pretty low, but not to zero because I want an entire database object to be read in by one disk read. – stbrody Sep 25 '12 at 18:26
  • Ran out of characters for the comment... So I've been experimenting with different readahead values to find the optimal one, but I want to better understand how the RAID stripe size affects the behavior of disk reads and how that interacts with the readahead settings. – stbrody Sep 25 '12 at 18:26