
I have run two tests with different setups, as explained below. I'm using an Azure Standard D64s v3 VM.

For the first setup I mounted four Premium SSD disks of 4 TB each and ran IO with one fio thread writing to each disk. The fio job file is:

[global]
size=30g
direct=1
iodepth=256
ioengine=libaio
bs=8k
numjobs=1

[writer1]
rw=randwrite
directory=/datadrive

[writer2]
rw=randwrite
directory=/datadrive2

[writer4]
rw=randwrite
directory=/datadrive4

[writer5]
rw=randwrite
directory=/datadrive5
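
For completeness, each disk was formatted and mounted on its own directory beforehand, roughly like this (the device names and filesystem are illustrative, not necessarily the exact ones used):

# one filesystem and mount point per 4 TB data disk; repeated for the other three disks
sudo mkfs.xfs /dev/sdc
sudo mkdir -p /datadrive
sudo mount /dev/sdc /datadrive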

The fio output for this run is:

writer1: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=256
writer2: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=256
writer4: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=256
writer5: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=256
fio-3.1
Starting 4 processes
writer1: Laying out IO file (1 file / 30720MiB)
writer2: Laying out IO file (1 file / 30720MiB)
writer4: Laying out IO file (1 file / 30720MiB)
writer5: Laying out IO file (1 file / 30720MiB)
Jobs: 1 (f=1): [w(1),_(3)][26.1%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 02m:58s]
writer1: (groupid=0, jobs=1): err= 0: pid=58194: Tue Feb  2 13:12:50 2021
  write: IOPS=6041, BW=47.2MiB/s (49.5MB/s)(2978MiB/63096msec)
    slat (usec): min=3, max=118643, avg=26.05, stdev=350.12
    clat (usec): min=3, max=32369k, avg=42333.21, stdev=659122.27
     lat (usec): min=46, max=32369k, avg=42359.46, stdev=659122.00
    clat percentiles (usec):
     |  1.00th=[      76],  5.00th=[     108], 10.00th=[     174],
     | 20.00th=[    1254], 30.00th=[    3392], 40.00th=[    5014],
     | 50.00th=[    6521], 60.00th=[    8848], 70.00th=[   13042],
     | 80.00th=[   20055], 90.00th=[   30540], 95.00th=[   39584],
     | 99.00th=[  120062], 99.50th=[  851444], 99.90th=[ 7012877],
     | 99.95th=[16575890], 99.99th=[17112761]
   bw (  KiB/s): min=   10, max=153154, per=25.59%, avg=50460.17, stdev=63267.92, samples=121
   iops        : min=    1, max=19144, avg=6307.31, stdev=7908.42, samples=121
  lat (usec)   : 4=0.01%, 20=0.01%, 50=0.02%, 100=4.10%, 250=8.44%
  lat (usec)   : 500=3.32%, 750=1.71%, 1000=1.33%
  lat (msec)   : 2=4.04%, 4=10.16%, 10=30.03%, 20=16.68%, 50=17.47%
  lat (msec)   : 100=1.60%, 250=0.44%, 500=0.07%, 750=0.07%, 1000=0.06%
  lat (msec)   : 2000=0.11%, >=2000=0.35%
  cpu          : usr=2.82%, sys=11.92%, ctx=32608, majf=0, minf=15
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=0,381220,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256
writer2: (groupid=0, jobs=1): err= 0: pid=58195: Tue Feb  2 13:12:50 2021
  write: IOPS=6476, BW=50.6MiB/s (53.1MB/s)(3080MiB/60882msec)
    slat (usec): min=3, max=160087, avg=29.46, stdev=435.82
    clat (usec): min=15, max=36835k, avg=39494.66, stdev=876736.78
     lat (usec): min=57, max=36835k, avg=39524.33, stdev=876736.58
    clat percentiles (usec):
     |  1.00th=[     176],  5.00th=[    1647], 10.00th=[    3097],
     | 20.00th=[    4686], 30.00th=[    5669], 40.00th=[    6849],
     | 50.00th=[    8586], 60.00th=[   11338], 70.00th=[   16057],
     | 80.00th=[   23987], 90.00th=[   34341], 95.00th=[   43779],
     | 99.00th=[   88605], 99.50th=[  130548], 99.90th=[ 1400898],
     | 99.95th=[17112761], 99.99th=[17112761]
   bw (  KiB/s): min=   10, max=155414, per=30.24%, avg=59631.85, stdev=67499.68, samples=106
   iops        : min=    1, max=19426, avg=7453.75, stdev=8437.28, samples=106
  lat (usec)   : 20=0.01%, 50=0.01%, 100=0.39%, 250=1.15%, 500=1.21%
  lat (usec)   : 750=0.55%, 1000=0.54%
  lat (msec)   : 2=1.70%, 4=9.61%, 10=40.86%, 20=19.14%, 50=21.81%
  lat (msec)   : 100=2.09%, 250=0.78%, 500=0.01%, 750=0.01%, 1000=0.03%
  lat (msec)   : 2000=0.04%, >=2000=0.09%
  cpu          : usr=2.84%, sys=12.67%, ctx=8384, majf=0, minf=15
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=0,394287,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256
writer4: (groupid=0, jobs=1): err= 0: pid=58196: Tue Feb  2 13:12:50 2021
  write: IOPS=6522, BW=50.0MiB/s (53.4MB/s)(3110MiB/61022msec)
    slat (usec): min=3, max=291305, avg=20.22, stdev=486.94
    clat (nsec): min=1500, max=11376M, avg=39221241.22, stdev=371308305.44
     lat (usec): min=39, max=11376k, avg=39241.67, stdev=371308.58
    clat percentiles (usec):
     |  1.00th=[      53],  5.00th=[      67], 10.00th=[      77],
     | 20.00th=[      97], 30.00th=[     135], 40.00th=[     343],
     | 50.00th=[    1172], 60.00th=[    2442], 70.00th=[    4424],
     | 80.00th=[    7439], 90.00th=[   24249], 95.00th=[   36963],
     | 99.00th=[  943719], 99.50th=[ 2466251], 99.90th=[ 5939135],
     | 99.95th=[ 7079986], 99.99th=[11207181]
   bw (  KiB/s): min=   10, max=153250, per=26.70%, avg=52653.34, stdev=55397.61, samples=121
   iops        : min=    1, max=19156, avg=6581.43, stdev=6924.70, samples=121
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.58%
  lat (usec)   : 100=20.60%, 250=16.69%, 500=4.34%, 750=3.05%, 1000=2.67%
  lat (msec)   : 2=9.20%, 4=10.93%, 10=15.50%, 20=4.87%, 50=9.05%
  lat (msec)   : 100=0.87%, 250=0.21%, 500=0.22%, 750=0.13%, 1000=0.11%
  lat (msec)   : 2000=0.36%, >=2000=0.61%
  cpu          : usr=3.07%, sys=12.42%, ctx=88453, majf=0, minf=1392
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=0,398034,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256
writer5: (groupid=0, jobs=1): err= 0: pid=58197: Tue Feb  2 13:12:50 2021
  write: IOPS=6347, BW=49.6MiB/s (51.0MB/s)(2984MiB/60181msec)
    slat (usec): min=3, max=623424, avg=25.05, stdev=1061.43
    clat (usec): min=38, max=36942k, avg=40305.49, stdev=896283.77
     lat (usec): min=54, max=36942k, avg=40330.71, stdev=896284.15
    clat percentiles (usec):
     |  1.00th=[      81],  5.00th=[     293], 10.00th=[    2057],
     | 20.00th=[    3556], 30.00th=[    4686], 40.00th=[    5473],
     | 50.00th=[    6652], 60.00th=[    9110], 70.00th=[   12911],
     | 80.00th=[   25822], 90.00th=[   35390], 95.00th=[   45351],
     | 99.00th=[  102237], 99.50th=[  164627], 99.90th=[ 2004878],
     | 99.95th=[17112761], 99.99th=[17112761]
   bw (  KiB/s): min=   16, max=153843, per=31.67%, avg=62458.55, stdev=67494.99, samples=98
   iops        : min=    2, max=19230, avg=7807.10, stdev=8436.72, samples=98
  lat (usec)   : 50=0.01%, 100=2.27%, 250=2.52%, 500=1.31%, 750=0.92%
  lat (usec)   : 1000=0.57%
  lat (msec)   : 2=2.22%, 4=13.76%, 10=39.16%, 20=13.02%, 50=20.89%
  lat (msec)   : 100=2.29%, 250=0.76%, 500=0.04%, 750=0.09%, 1000=0.02%
  lat (msec)   : 2000=0.05%, >=2000=0.10%
  cpu          : usr=2.42%, sys=10.44%, ctx=13976, majf=0, minf=704
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=0,381975,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256

For the second run, I created a RAID 0 array from four 4 TB disks and mounted it at /datadrive9 (a sketch of how the array was assembled follows the job file). The fio job is:

[global]
size=30g
direct=1
iodepth=256
ioengine=libaio
bs=8k
numjobs=1

[writer1]
rw=randwrite
directory=/datadrive9

[writer2]
rw=randwrite
directory=/datadrive9

[writer4]
rw=randwrite
directory=/datadrive9

[writer5]
rw=randwrite
directory=/datadrive9
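
The array itself was assembled with mdadm roughly as below (the filesystem is illustrative and the default chunk size is assumed; the member devices match the disk stats at the end of the output):

# stripe the four data disks into a single md device and mount it where the fio job expects it
sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdc /dev/sdk /dev/sdl /dev/sdm
sudo mkfs.xfs /dev/md0
sudo mkdir -p /datadrive9
sudo mount /dev/md0 /datadrive9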

The fio output for this run is:

writer1: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=256
writer2: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=256
writer4: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=256
writer5: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=256
fio-3.1
Starting 4 processes
writer2: Laying out IO file (1 file / 30720MiB)
writer4: Laying out IO file (1 file / 30720MiB)
writer5: Laying out IO file (1 file / 30720MiB)
Jobs: 4 (f=4): [w(4)][10.4%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 09m:43s]
writer1: (groupid=0, jobs=1): err= 0: pid=22209: Tue Feb  2 13:53:40 2021
  write: IOPS=6011, BW=46.0MiB/s (49.2MB/s)(3210MiB/68338msec)
    slat (usec): min=4, max=19116, avg=10.13, stdev=52.29
    clat (usec): min=21, max=13267k, avg=42549.89, stdev=517341.32
     lat (usec): min=42, max=13267k, avg=42560.22, stdev=517341.29
    clat percentiles (usec):
     |  1.00th=[      69],  5.00th=[     103], 10.00th=[     223],
     | 20.00th=[     578], 30.00th=[     873], 40.00th=[    1303],
     | 50.00th=[    1680], 60.00th=[    2147], 70.00th=[    2737],
     | 80.00th=[    4490], 90.00th=[   15270], 95.00th=[   27919],
     | 99.00th=[  333448], 99.50th=[ 2038432], 99.90th=[ 9328133],
     | 99.95th=[11744052], 99.99th=[12549358]
   bw (  KiB/s): min=   16, max=347020, per=35.88%, avg=85557.71, stdev=105200.55, samples=77
   iops        : min=    2, max=43377, avg=10694.30, stdev=13150.00, samples=77
  lat (usec)   : 50=0.07%, 100=4.68%, 250=5.81%, 500=6.68%, 750=8.72%
  lat (usec)   : 1000=7.22%
  lat (msec)   : 2=23.63%, 4=21.15%, 10=9.07%, 20=4.43%, 50=6.50%
  lat (msec)   : 100=0.52%, 250=0.37%, 500=0.33%, 750=0.12%, 1000=0.10%
  lat (msec)   : 2000=0.10%, >=2000=0.50%
  cpu          : usr=2.32%, sys=7.07%, ctx=47306, majf=0, minf=699
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=0,410837,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256
writer2: (groupid=0, jobs=1): err= 0: pid=22210: Tue Feb  2 13:53:40 2021
  write: IOPS=6393, BW=49.9MiB/s (52.4MB/s)(3412MiB/68310msec)
    slat (usec): min=3, max=325702, avg=31.82, stdev=630.62
    clat (usec): min=32, max=13313k, avg=40001.34, stdev=469309.55
     lat (usec): min=54, max=13313k, avg=40033.41, stdev=469313.55
    clat percentiles (usec):
     |  1.00th=[     100],  5.00th=[     196], 10.00th=[     453],
     | 20.00th=[    1045], 30.00th=[    1549], 40.00th=[    2180],
     | 50.00th=[    3195], 60.00th=[    4686], 70.00th=[    7242],
     | 80.00th=[   11863], 90.00th=[   24249], 95.00th=[   35390],
     | 99.00th=[  156238], 99.50th=[ 2105541], 99.90th=[ 9462350],
     | 99.95th=[ 9865004], 99.99th=[13086229]
   bw (  KiB/s): min=   16, max=243815, per=31.19%, avg=74378.34, stdev=68597.51, samples=94
   iops        : min=    2, max=30476, avg=9297.04, stdev=8574.66, samples=94
  lat (usec)   : 50=0.01%, 100=1.02%, 250=5.44%, 500=4.23%, 750=3.57%
  lat (usec)   : 1000=4.77%
  lat (msec)   : 2=18.39%, 4=18.58%, 10=20.83%, 20=10.44%, 50=10.46%
  lat (msec)   : 100=1.01%, 250=0.39%, 500=0.14%, 750=0.05%, 1000=0.05%
  lat (msec)   : 2000=0.12%, >=2000=0.51%
  cpu          : usr=3.29%, sys=16.53%, ctx=25760, majf=0, minf=705
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=0,436710,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256
writer4: (groupid=0, jobs=1): err= 0: pid=22211: Tue Feb  2 13:53:40 2021
  write: IOPS=8253, BW=64.5MiB/s (67.6MB/s)(4405MiB/68309msec)
    slat (usec): min=3, max=984211, avg=29.89, stdev=1823.60
    clat (usec): min=39, max=13365k, avg=30969.61, stdev=406333.66
     lat (usec): min=60, max=13365k, avg=30999.67, stdev=406337.80
    clat percentiles (usec):
     |  1.00th=[      94],  5.00th=[     196], 10.00th=[     457],
     | 20.00th=[    1254], 30.00th=[    1860], 40.00th=[    2376],
     | 50.00th=[    2868], 60.00th=[    3425], 70.00th=[    4359],
     | 80.00th=[    6718], 90.00th=[   17957], 95.00th=[   28967],
     | 99.00th=[  154141], 99.50th=[  994051], 99.90th=[ 8355054],
     | 99.95th=[ 9328133], 99.99th=[13086229]
   bw (  KiB/s): min=   16, max=325328, per=39.05%, avg=93104.42, stdev=94687.86, samples=97
   iops        : min=    2, max=40666, avg=11637.75, stdev=11835.95, samples=97
  lat (usec)   : 50=0.01%, 100=1.36%, 250=5.02%, 500=4.31%, 750=3.65%
  lat (usec)   : 1000=2.97%
  lat (msec)   : 2=15.06%, 4=34.67%, 10=17.73%, 20=6.39%, 50=7.02%
  lat (msec)   : 100=0.64%, 250=0.33%, 500=0.17%, 750=0.07%, 1000=0.11%
  lat (msec)   : 2000=0.14%, >=2000=0.34%
  cpu          : usr=3.06%, sys=17.32%, ctx=27740, majf=0, minf=700
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=0,563804,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256
writer5: (groupid=0, jobs=1): err= 0: pid=22212: Tue Feb  2 13:53:40 2021
  write: IOPS=9155, BW=71.5MiB/s (75.0MB/s)(4886MiB/68308msec)
    slat (usec): min=3, max=1253.7k, avg=33.07, stdev=2124.81
    clat (usec): min=41, max=13312k, avg=27912.10, stdev=371043.54
     lat (usec): min=56, max=13312k, avg=27945.40, stdev=371050.01
    clat percentiles (usec):
     |  1.00th=[     116],  5.00th=[     302], 10.00th=[     619],
     | 20.00th=[    1680], 30.00th=[    2769], 40.00th=[    3851],
     | 50.00th=[    4752], 60.00th=[    5997], 70.00th=[    7963],
     | 80.00th=[   11076], 90.00th=[   21627], 95.00th=[   33817],
     | 99.00th=[  164627], 99.50th=[  497026], 99.90th=[ 7616857],
     | 99.95th=[ 9596568], 99.99th=[13086229]
   bw (  KiB/s): min=   16, max=363341, per=36.27%, avg=86484.03, stdev=82997.59, samples=116
   iops        : min=    2, max=45417, avg=10810.06, stdev=10374.71, samples=116
  lat (usec)   : 50=0.01%, 100=0.62%, 250=3.40%, 500=4.25%, 750=3.26%
  lat (usec)   : 1000=2.37%
  lat (msec)   : 2=8.40%, 4=19.23%, 10=35.58%, 20=11.98%, 50=8.25%
  lat (msec)   : 100=1.27%, 250=0.64%, 500=0.24%, 750=0.06%, 1000=0.07%
  lat (msec)   : 2000=0.16%, >=2000=0.21%
  cpu          : usr=3.77%, sys=21.57%, ctx=26577, majf=0, minf=17
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=0,625402,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
  WRITE: bw=233MiB/s (244MB/s), 46.0MiB/s-71.5MiB/s (49.2MB/s-75.0MB/s), io=15.5GiB (16.7GB), run=68308-68338msec

Disk stats (read/write):
    md0: ios=0/2036771, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/509201, aggrmerge=0/0, aggrticks=0/14988260, aggrin_queue=14611762, aggrutil=76.96%
  sdm: ios=0/508620, merge=0/0, ticks=0/16830518, in_queue=16437572, util=75.02%
  sdl: ios=0/508786, merge=0/0, ticks=0/13055450, in_queue=12693024, util=76.96%
  sdk: ios=0/508787, merge=0/0, ticks=0/14998518, in_queue=14628168, util=75.91%
  sdc: ios=0/510612, merge=0/0, ticks=0/15068554, in_queue=14688284, util=74.71%

I am noticing a difference in the latency distribution between these two runs. In the non-RAID setup (first run), writer1's millisecond latency buckets are: 4=10.16%, 10=30.03%, 20=16.68%, 50=17.47%

and in the second run (RAID setup), writer4's buckets are:

2=15.06%, 4=34.67%, 10=17.73%, 20=6.39%, 50=7.02%

The distribution is shifted toward lower latencies with the RAID setup.

Why does this difference exist?

Sorry for the long question. I'm hoping somebody who knows how disk writes work can help with this.

user3740951
  • An interesting and nicely written question. You may want to run BaronSamedi1958's scenario though (also don't forget to accept an answer if a good one eventually turns up or you find out the answer and can add your own). – Anon Feb 06 '21 at 18:31

2 Answers


You have injected an extra layer into the storage stack, and it brings extra processing latency. TL;DR: nothing unexpected. If you run with a queue depth of 1 (iodepth=1) you'll get a better understanding of how much you lose.
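
A minimal sketch of such a comparison, reusing the parameters and paths from the question (runtime capped so the depth-1 runs finish quickly):

# single outstanding write against one plain disk, then against the RAID 0 mount
fio --name=qd1 --rw=randwrite --bs=8k --direct=1 --ioengine=libaio \
    --iodepth=1 --size=30g --runtime=60 --time_based --directory=/datadrive
fio --name=qd1 --rw=randwrite --bs=8k --direct=1 --ioengine=libaio \
    --iodepth=1 --size=30g --runtime=60 --time_based --directory=/datadrive9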

BaronSamedi1958
  • Could you please say a little more about this extra processing latency? If the RAID layer adds it, how is latency lower with the RAID setup? I have configured an equal iodepth for both runs. – user3740951 Feb 03 '21 at 04:05

RAID 0 (or, as some of my friends like to call it, "JBOD", because where's the redundancy?) allows striping of writes: in-flight writes to different areas can be distributed across multiple disks simultaneously rather than all being queued up against the same disk.

Let's say a given disk can support 32 in-flight writes at any moment and I choose to queue up 64 writes. The latency of the writes that have to wait for a slot (the 33rd onward) will be much higher. In the RAID 0 case, however, those overflow writes might be serviced by another disk that can accept another 32 in-flight writes before anything has to queue.

Further, we don't know how latency grows as the number of simultaneous writes increases on just a single one of your disks. Who's to say that write latency at a depth of 1 is the same as at a depth of 2, 4, or 8? If more I/O queued simultaneously means a higher queue depth per disk, then RAID 0 being able to spread the I/Os out among the disks could keep each disk's queue low(er), and thus its latencies better. This last point depends massively on the I/O size versus the stripe size, i.e. at what boundary RAID 0 can distribute the writes.
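
One way to probe that last point is to sweep the queue depth against a single plain disk and watch how the completion latency (clat) percentiles move; a rough sketch (the directory and runtime are placeholders):

# same 8k random-write workload, varying only the number of outstanding IOs
for depth in 1 2 4 8 16 32; do
    fio --name=sweep_qd$depth --rw=randwrite --bs=8k --direct=1 \
        --ioengine=libaio --iodepth=$depth --size=30g \
        --runtime=60 --time_based --directory=/datadrive
done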

Anon