I am experiencing very low IOPS with an SSD in my server. I noticed this while running a MySQL database server, which performed very badly when there were many (~100 per second) updates to an InnoDB database.
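For context on why ~100 updates/s is the tipping point: with its default settings, InnoDB flushes (fsyncs) its redo log on every transaction commit, so the sustainable update rate is capped by the drive's synchronous-write IOPS. A minimal way to confirm the setting (assuming a default install):
# With the default value of 1, every committed UPDATE/INSERT costs one synchronous write:
mysql -e "SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit'"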
Here are the server specs:
SSD: Model=Samsung SSD 850 EVO 120GB, FwRev=EMT01B6Q
Server: HP proliant DL320e Gen8
CPU: Intel(R) Xeon(R) CPU E3-1270 v3 @ 3.50GHz
OS: Ubuntu 14.04.1 LTS
Kernel: Linux h119 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
When checking the SSD while there's some load on it, I get the following results with iostat -kx 1 20:
root@h119:~# iostat -kx 1 20
Linux 3.13.0-44-generic (h119) 02/19/2015 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.14 0.00 0.08 1.06 0.00 98.72
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 5.14 14.75 0.80 17.68 378.51 258.59 68.94 0.22 12.03 1.76 12.50 7.90 14.60
sdb 0.00 19.89 0.00 18.44 0.05 636.16 68.99 0.26 14.32 0.43 14.33 8.22 15.17
md2 0.00 0.00 0.04 31.40 0.97 257.88 16.46 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 7.93 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.00 0.00 0.63 1.00 0.00 97.37
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 21.00 0.00 18.00 0.00 346.00 38.44 0.19 10.44 0.00 10.44 7.78 14.00
sdb 0.00 21.00 0.00 18.00 0.00 346.00 38.44 0.18 10.00 0.00 10.00 7.33 13.20
md2 0.00 0.00 0.00 36.00 0.00 356.00 19.78 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.25 0.00 0.75 1.50 0.00 96.50
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 19.00 0.00 19.00 0.00 218.00 22.95 0.20 9.47 0.00 9.47 9.89 18.80
sdb 0.00 19.00 0.00 19.00 0.00 218.00 22.95 0.20 9.68 0.00 9.68 10.11 19.20
md2 0.00 0.00 0.00 37.00 0.00 236.00 12.76 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.75 0.00 1.12 1.37 0.00 95.76
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 19.80 0.00 16.83 0.00 291.09 34.59 0.19 9.88 0.00 9.88 9.18 15.45
sdb 0.00 19.80 0.00 16.83 0.00 291.09 34.59 0.19 9.88 0.00 9.88 9.18 15.45
md2 0.00 0.00 0.00 35.64 0.00 336.63 18.89 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.00 0.00 0.75 1.00 0.00 97.25
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 18.00 0.00 22.00 0.00 310.00 28.18 0.16 8.91 0.00 8.91 6.36 14.00
sdb 0.00 18.00 0.00 22.00 0.00 310.00 28.18 0.16 9.09 0.00 9.09 6.55 14.40
md2 0.00 0.00 0.00 32.00 0.00 228.00 14.25 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.75 0.00 0.75 1.00 0.00 97.50
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 10.00 0.00 13.00 0.00 142.00 21.85 0.13 9.85 0.00 9.85 9.85 12.80
sdb 0.00 10.00 0.00 13.00 0.00 142.00 21.85 0.12 9.54 0.00 9.54 9.54 12.40
md2 0.00 0.00 0.00 19.00 0.00 140.00 14.74 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.26 0.00 0.88 1.26 0.00 96.61
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 34.00 0.00 12.00 0.00 333.00 55.50 0.12 10.00 0.00 10.00 10.33 12.40
sdb 0.00 34.00 0.00 12.00 0.00 333.00 55.50 0.12 10.00 0.00 10.00 10.33 12.40
md2 0.00 0.00 0.00 45.00 0.00 352.00 15.64 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.37 0.00 0.87 1.50 0.00 96.26
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 17.00 0.00 28.00 0.00 713.00 50.93 0.28 10.00 0.00 10.00 5.86 16.40
sdb 0.00 17.00 0.00 28.00 0.00 713.00 50.93 0.27 9.86 0.00 9.86 5.71 16.00
md2 0.00 0.00 0.00 43.00 0.00 692.00 32.19 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.50 0.00 0.75 1.38 0.00 96.37
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 35.00 0.00 20.00 0.00 361.00 36.10 0.18 9.20 0.00 9.20 8.00 16.00
sdb 0.00 35.00 0.00 20.00 0.00 361.00 36.10 0.18 9.20 0.00 9.20 8.00 16.00
md2 0.00 0.00 0.00 53.00 0.00 360.00 13.58 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.25 0.00 1.12 1.12 0.00 96.50
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 20.00 0.00 11.00 0.00 193.00 35.09 0.11 9.82 0.00 9.82 9.82 10.80
sdb 0.00 20.00 0.00 11.00 0.00 193.00 35.09 0.11 10.18 0.00 10.18 10.18 11.20
md2 0.00 0.00 0.00 29.00 0.00 192.00 13.24 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.00 0.00 0.50 0.62 0.00 97.88
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 12.00 0.00 9.00 0.00 137.00 30.44 0.09 9.78 0.00 9.78 8.44 7.60
sdb 0.00 12.00 0.00 9.00 0.00 137.00 30.44 0.09 10.22 0.00 10.22 8.89 8.00
md2 0.00 0.00 0.00 19.00 0.00 136.00 14.32 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.13 0.00 0.63 0.63 0.00 97.62
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 15.00 0.00 8.00 0.00 181.00 45.25 0.08 10.00 0.00 10.00 10.00 8.00
sdb 0.00 15.00 0.00 8.00 0.00 181.00 45.25 0.07 9.00 0.00 9.00 9.00 7.20
md2 0.00 0.00 0.00 25.00 0.00 244.00 19.52 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.88 0.00 0.63 0.88 0.00 97.60
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 28.00 0.00 15.00 0.00 301.00 40.13 0.14 9.33 0.00 9.33 6.93 10.40
sdb 0.00 28.00 0.00 15.00 0.00 301.00 40.13 0.14 9.33 0.00 9.33 6.93 10.40
md2 0.00 0.00 0.00 38.00 0.00 236.00 12.42 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.75 0.00 1.25 0.50 0.00 97.50
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 15.00 0.00 7.00 0.00 177.00 50.57 0.08 9.71 0.00 9.71 10.86 7.60
sdb 0.00 15.00 0.00 7.00 0.00 177.00 50.57 0.07 9.14 0.00 9.14 10.29 7.20
md2 0.00 0.00 0.00 22.00 0.00 188.00 17.09 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.76 0.00 0.38 1.13 0.00 97.73
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 17.00 0.00 17.00 0.00 205.00 24.12 0.16 9.65 0.00 9.65 7.06 12.00
sdb 0.00 17.00 0.00 17.00 0.00 205.00 24.12 0.16 9.65 0.00 9.65 7.06 12.00
md2 0.00 0.00 0.00 33.00 0.00 196.00 11.88 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.38 0.00 0.63 0.88 0.00 97.11
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 19.00 0.00 11.00 0.00 245.00 44.55 0.10 9.45 0.00 9.45 9.45 10.40
sdb 0.00 19.00 0.00 11.00 0.00 245.00 44.55 0.10 9.45 0.00 9.45 9.45 10.40
md2 0.00 0.00 0.00 28.00 0.00 244.00 17.43 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.76 0.00 0.50 1.39 0.00 97.35
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 28.00 0.00 20.00 0.00 305.00 30.50 0.19 9.40 0.00 9.40 8.60 17.20
sdb 0.00 28.00 0.00 20.00 0.00 305.00 30.50 0.19 9.40 0.00 9.40 8.60 17.20
md2 0.00 0.00 0.00 47.00 0.00 304.00 12.94 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.01 0.00 0.63 1.13 0.00 97.23
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 28.00 0.00 16.00 0.00 357.00 44.62 0.15 9.75 0.00 9.75 8.50 13.60
sdb 0.00 28.00 0.00 16.00 0.00 357.00 44.62 0.15 9.50 0.00 9.50 8.25 13.20
md2 0.00 0.00 0.00 44.00 0.00 356.00 16.18 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.13 0.00 0.75 1.50 0.00 96.62
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 19.00 0.00 22.00 0.00 297.00 27.00 0.19 8.91 0.00 8.91 5.82 12.80
sdb 0.00 19.00 0.00 22.00 0.00 297.00 27.00 0.22 10.36 0.00 10.36 6.36 14.00
md2 0.00 0.00 0.00 40.00 0.00 296.00 14.80 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.13 0.00 0.75 1.13 0.00 96.99
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 23.00 0.00 18.00 0.00 313.00 34.78 0.19 10.22 0.00 10.22 7.78 14.00
sdb 0.00 23.00 0.00 18.00 0.00 313.00 34.78 0.19 10.22 0.00 10.22 7.78 14.00
md2 0.00 0.00 0.00 39.00 0.00 312.00 16.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Note that w_await stays around 10 ms on both SSDs even at this light load. The following result is from fio --rw=write --name=test --size=20M --direct=1:
root@h119:~# fio --rw=write --name=test --size=20M --direct=1
test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 1 process
test: Laying out IO file(s) (1 file(s) / 20MB)
Jobs: 1 (f=1): [W] [100.0% done] [0KB/404KB/0KB /s] [0/101/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=11288: Sat Feb 21 19:59:52 2015
write: io=20480KB, bw=415022B/s, iops=101, runt= 50531msec
clat (usec): min=9674, max=21523, avg=9867.58, stdev=615.15
lat (usec): min=9675, max=21524, avg=9867.76, stdev=615.16
clat percentiles (usec):
| 1.00th=[ 9792], 5.00th=[ 9792], 10.00th=[ 9792], 20.00th=[ 9792],
| 30.00th=[ 9792], 40.00th=[ 9792], 50.00th=[ 9792], 60.00th=[ 9792],
| 70.00th=[ 9792], 80.00th=[ 9792], 90.00th=[ 9920], 95.00th=[10048],
| 99.00th=[12480], 99.50th=[12864], 99.90th=[18304], 99.95th=[18560],
| 99.99th=[21632]
bw (KB /s): min= 380, max= 411, per=100.00%, avg=405.33, stdev= 5.11
lat (msec) : 10=90.66%, 20=9.32%, 50=0.02%
cpu : usr=0.03%, sys=0.19%, ctx=5138, majf=0, minf=26
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=5120/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=20480KB, aggrb=405KB/s, minb=405KB/s, maxb=405KB/s, mint=50531msec, maxt=50531msec
Disk stats (read/write):
md2: ios=0/5580, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/5379, aggrmerge=0/191, aggrticks=0/52640, aggrin_queue=52640, aggrutil=99.00%
sda: ios=0/5379, merge=0/191, ticks=0/52508, in_queue=52508, util=98.63%
sdb: ios=0/5379, merge=0/191, ticks=0/52772, in_queue=52772, util=99.00%
As you can see, the IOPS are at 101, and I can only write 404 kB/s. Also, there's not much fluctuation in the IOPS while the tool is running; it stays between 101 and 103.
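That steadiness is exactly what the latencies predict: the average completion latency above is 9,868 µs, and 1,000,000 µs / 9,868 µs ≈ 101 writes per second, which at 4 KB per write gives 101 × 4 KB ≈ 404 kB/s. It looks as if each 4K direct write costs a flat ~10 ms.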
I also trust fio's results, as my database slows down once it hits about 100 updates/inserts per second.
With an identical server (hardware and software) and no RAID, I get exactly the same results, so this can't be caused by the software RAID:
root@h073:~# fio --rw=write --name=test --size=20M --direct=1
test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0KB/408KB/0KB /s] [0/102/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=15147: Sat Feb 21 20:52:14 2015
write: io=20480KB, bw=418492B/s, iops=102, runt= 50112msec
clat (usec): min=1504, max=17337, avg=9785.74, stdev=417.14
lat (usec): min=1504, max=17337, avg=9785.93, stdev=417.13
clat percentiles (usec):
| 1.00th=[ 9792], 5.00th=[ 9792], 10.00th=[ 9792], 20.00th=[ 9792],
| 30.00th=[ 9792], 40.00th=[ 9792], 50.00th=[ 9792], 60.00th=[ 9792],
| 70.00th=[ 9792], 80.00th=[ 9792], 90.00th=[ 9792], 95.00th=[ 9792],
| 99.00th=[12480], 99.50th=[12608], 99.90th=[13120], 99.95th=[15040],
| 99.99th=[17280]
bw (KB /s): min= 400, max= 416, per=100.00%, avg=408.75, stdev= 2.69
lat (msec) : 2=0.04%, 4=0.04%, 10=98.14%, 20=1.78%
cpu : usr=0.04%, sys=0.12%, ctx=5132, majf=0, minf=27
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=5120/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=20480KB, aggrb=408KB/s, minb=408KB/s, maxb=408KB/s, mint=50112msec, maxt=50112msec
Disk stats (read/write):
sda: ios=0/5132, merge=0/43, ticks=0/50048, in_queue=50048, util=99.52%
What is remarkable is how constant the speed is: 404 kB/s at 101-103 IOPS throughout.
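To pin down where the constant ~10 ms per write comes from, a few variations would be worth running (an untested sketch; option names as in fio 2.1.3 and stock hdparm, device name assumed to be /dev/sda):
# Same workload but buffered (no O_DIRECT) - should be fast if only the direct path is slow:
fio --rw=write --name=buffered --size=20M
# Queued random writes via libaio - shows whether the device can overlap requests
# despite the ~10 ms per-command latency:
fio --rw=randwrite --name=qd32 --size=20M --direct=1 --ioengine=libaio --iodepth=32
# Check whether the drive's volatile write cache is enabled; a disabled write
# cache is a classic cause of a ~10 ms floor per small write:
hdparm -W /dev/sda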
Copying files, however, is very fast and meets my expectations for an SSD:
root@h119:~# dd if=randomfile of=randomfile2
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 2.27684 s, 472 MB/s
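Note that this dd run goes through the page cache, so it measures buffered sequential throughput rather than per-write latency. A closer analogue to the fio job would bypass the cache (untested sketch, hypothetical file name):
# O_DIRECT with 4K blocks, comparable to the fio job above:
dd if=/dev/zero of=ddtest bs=4k count=5120 oflag=direct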
So it seems that only small direct 4K writes are slow as hell. If there's any more information you need, please let me know and I'll update the question. Thanks in advance for your answer.
Update: The issue does not occur on a non-HP server, so it must have something to do with how HP servers access the SSD. On the other hardware I get the following results, which are perfectly fine:
root@ca286:~# fio --rw=write --name=test --size=40M --direct=1
test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 1 process
test: Laying out IO file(s) (1 file(s) / 40MB)
test: (groupid=0, jobs=1): err= 0: pid=1285: Wed Feb 25 20:47:25 2015
write: io=40960KB, bw=130032KB/s, iops=32507, runt= 315msec
clat (usec): min=24, max=1588, avg=30.34, stdev=15.59
lat (usec): min=24, max=1589, avg=30.39, stdev=15.60
clat percentiles (usec):
| 1.00th=[ 26], 5.00th=[ 29], 10.00th=[ 29], 20.00th=[ 30],
| 30.00th=[ 30], 40.00th=[ 30], 50.00th=[ 30], 60.00th=[ 30],
| 70.00th=[ 30], 80.00th=[ 31], 90.00th=[ 31], 95.00th=[ 31],
| 99.00th=[ 33], 99.50th=[ 54], 99.90th=[ 56], 99.95th=[ 56],
| 99.99th=[ 114]
lat (usec) : 50=99.30%, 100=0.68%, 250=0.01%
lat (msec) : 2=0.01%
cpu : usr=2.55%, sys=18.47%, ctx=10247, majf=0, minf=26
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=10240/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=40960KB, aggrb=130031KB/s, minb=130031KB/s, maxb=130031KB/s, mint=315msec, maxt=315msec
Disk stats (read/write):
sda: ios=0/4806, merge=0/0, ticks=0/120, in_queue=120, util=48.19%
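(As a sanity check, these numbers are self-consistent: with an average completion latency of ~30 µs, 1,000,000 µs / 30 µs ≈ 33,000 writes per second, which matches the reported 32,507 IOPS.)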