You can use pidstat -d to get per-process IO statistics. Use -p if you want statistics for a specific process - e.g. getting disk stats for a java process every second:
pidstat 1 -d -p $(pgrep java)
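As a minimal sketch, you can also watch all processes at once (column names assume a reasonably recent sysstat):

pidstat -d 1 5    # all processes, 1-second interval, 5 samples
# Columns reported by -d: kB_rd/s, kB_wr/s, kB_ccwr/s (cancelled writes) and,
# on newer sysstat versions, iodelay (time the task spent waiting on block IO / swapin)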
Use iostat -x for extended disk stats like "average queue size", etc.:
iostat -xmdz 1
...
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz d/s dMB/s drqm/s %drqm d_await dareq-sz aqu-sz %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.01 1.00 33.33 0.50 6.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.40
See also https://medium.com/netflix-techblog/linux-performance-analysis-in-60-000-milliseconds-accc10403c55
Note that unless you're doing "raw disk IO", the file system / OS layer can have a huge impact on your application's performance - e.g. the Linux page cache keeps recently read file contents in memory, so read speeds can be much higher than what the disk alone can deliver.
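A minimal sketch of seeing the page cache effect yourself (the file path /tmp/bigfile is just an example - use any large file you have):

sync && echo 3 | sudo tee /proc/sys/vm/drop_caches   # drop the page cache first
dd if=/tmp/bigfile of=/dev/null bs=1M                # cold read - limited by the disk
dd if=/tmp/bigfile of=/dev/null bs=1M                # warm read - served from the page cache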
To monitor file system operations you can use BPF tools like vfsstat, biolatency, ext4slower, ext4dist, et al.
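For example, with the bcc collection installed (tool paths and names vary by distro - Ubuntu puts them under /usr/share/bcc/tools/ or installs them with a -bpfcc suffix):

sudo /usr/share/bcc/tools/vfsstat 1          # per-second counts of VFS read/write/open/fsync calls
sudo /usr/share/bcc/tools/biolatency 10 1    # one latency histogram of block device IO over 10 seconds
sudo /usr/share/bcc/tools/ext4slower 10      # trace ext4 operations slower than 10 ms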
You can also use hdparm to measure both raw disk IO speed and "cached IO" speed - e.g.
root@ubuntu-18:~# hdparm -tT /dev/sda
/dev/sda:
Timing cached reads: 11568 MB in 1.99 seconds = 5807.23 MB/sec
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Timing buffered disk reads: 1682 MB in 3.00 seconds = 560.58 MB/sec
Notice how "cached reads" reach almost 6 GB/s, a much higher throughput than the "buffered disk reads". The difference can be even larger on a moderately sized cloud/AWS instance.