Is there a way to gather statistics about blocks being accessed on a disk?
I have a task that is both memory- and I/O-intensive, and I need to find a good balance between how much of the available RAM I assign to the process and how much I leave to the system for its I/O cache on the block device being used.
I suspect that most of the current I/O touches a rather small subset of files (or parts of large files), and that performance could be improved by making more RAM available for I/O buffering.
Ideally, I would be able to create something like a "heat-map" that shows me which parts of the files are accessed most of the time.
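To illustrate what I mean by a heat-map, here is a rough Python sketch. It assumes I already had a trace of (path, offset, length) access records from some tracing tool, which is exactly the part I don't know how to obtain. The names `build_heatmap` and `bytes_to_cover` and the 1 MiB bucket size are placeholders I made up for illustration.

```python
from collections import Counter

# Size of one heat-map cell in bytes (arbitrary choice for illustration).
BUCKET_SIZE = 1 << 20  # 1 MiB


def build_heatmap(accesses, bucket_size=BUCKET_SIZE):
    """Count how often each (path, bucket) region of a file is touched.

    `accesses` is an iterable of (path, offset, length) tuples in bytes.
    This input format is an assumption, not the output of any real tool.
    """
    heat = Counter()
    for path, offset, length in accesses:
        if length <= 0:
            continue  # ignore empty or malformed records
        first = offset // bucket_size
        last = (offset + length - 1) // bucket_size
        # An access that spans several buckets counts once in each of them.
        for bucket in range(first, last + 1):
            heat[(path, bucket)] += 1
    return heat


def bytes_to_cover(heat, fraction, bucket_size=BUCKET_SIZE):
    """Estimate how many bytes of the hottest buckets would have to be
    cached to cover `fraction` (0..1) of all recorded bucket touches."""
    total = sum(heat.values())
    if total == 0:
        return 0
    covered = 0
    buckets = 0
    # Walk buckets from hottest to coldest until the target is reached.
    for _, count in heat.most_common():
        if covered >= fraction * total:
            break
        covered += count
        buckets += 1
    return buckets * bucket_size
```

With something like this, `build_heatmap(trace).most_common(10)` would list the ten hottest file regions, and `bytes_to_cover(heat, 0.9)` would give a rough idea of how much cache is needed to serve 90% of the accesses.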
The setup is currently based on CentOS 5 on an AWS/EC2 m1.large instance. The disks are either ephemeral block devices in a RAID0 setup (LVM) or, alternatively, a single 500GB EBS volume.
Update: Originally, this question was talking about disk blocks, which was misleading as I am actually interested in the logical blocks being accessed and I don't care where they are on the physical devices. I changed this to make clear that it is parts of files I'm interested in. I apologize for the confusion.