I/O scheduling on LVM on dm-crypt

Question

I have the following setup:

Stock Debian stable (Linux kernel 3.16.7) running on Dell PowerEdge R320
Two SATA hard drives in RAID1 on a PERC H310 Mini controller, visible as /dev/sda
dm-crypt target dm-0 backed by /dev/sda2
A single LVM volume group whose only physical volume is dm-0
Multiple logical volumes (dm-1, dm-2, etc.) in that volume group, formatted as ext4

What happens is that when a single process starts bulk writing to a logical volume, all other processes get severely I/O starved: the system becomes very unresponsive (with delays of up to 30 seconds when I/O is involved).

I thought it's the I/O scheduler that's in charge of making sure that doesn't happen. I see that sda uses cfq:

# cat /sys/block/sda/queue/scheduler 
noop deadline [cfq]

But every other device mapper target reports having no I/O scheduler:

# cat /sys/block/dm-0/queue/scheduler 
none
# cat /sys/block/dm-1/queue/scheduler 
none
# cat /sys/block/dm-2/queue/scheduler 
none

My question is, why is there no I/O scheduling for device mapper targets, can I enable it, or how can I otherwise make this system responsive under heavy I/O load?

score 3 · Accepted Answer · answered Jul 01 '16 at 20:00

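First, it is perfectly normal that DM devices do not have any I/O scheduler, as (with specific exceptions) they just remap I/O and pass it on to the underlying device, where the actual scheduler (cfq on sda, in your case) runs.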
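
Since a device-mapper node is stacked on other block devices, one way to see which scheduler actually orders its requests is to follow the sysfs `slaves` links down to the physical disk. The following is only a minimal sketch: the function name `backing_schedulers` and the `SYSFS` override variable are made up for illustration, and it assumes GNU `readlink -f` is available.

```shell
#!/bin/sh
# SYSFS is a hypothetical override so the sketch can be tried on a fake tree;
# on a real system it defaults to /sys.
SYSFS="${SYSFS:-/sys}"

# Print "<disk>: <scheduler line>" for every disk found underneath block device $1.
# The body is written as "( ... )" so that it runs in a subshell and the
# recursive calls cannot clobber each other's variables.
backing_schedulers() (
    node="$SYSFS/block/$1"
    found=0
    for slave in "$node"/slaves/*; do
        [ -e "$slave" ] || continue          # empty directory: glob did not match
        found=1
        if [ -e "$slave/partition" ]; then
            # A partition (e.g. sda2) has no queue of its own; use its parent disk.
            disk=$(basename "$(dirname "$(readlink -f "$slave")")")
        else
            disk=$(basename "$slave")
        fi
        backing_schedulers "$disk"
    done
    # No slaves: this is the bottom of the stack, so report its scheduler.
    if [ "$found" -eq 0 ] && [ -r "$node/queue/scheduler" ]; then
        echo "$1: $(cat "$node/queue/scheduler")"
    fi
)

# Optional command-line use: pass a device name such as dm-1.
if [ "$#" -gt 0 ]; then
    backing_schedulers "$1"
fi
```

On the setup from the question, calling it with dm-1 should walk dm-1, then dm-0, then sda2, and end up reporting the scheduler line of sda.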

As for the low performance you recorded, consider that your H310 controller not only has no cache of its own, but it also disables the physical disks' DRAM cache, meaning your system has no way to lower latency via caching.

Combining that with encryption, where read-modify-write is common behavior (due to unaligned write access to the encrypted container), results in exceptionally poor write I/O performance.
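
Whether the encrypted container starts on an aligned boundary can be checked with simple arithmetic on the partition's start offset, which sysfs reports in 512-byte sectors. This is a rough sketch only: the helper name `is_aligned` is made up, and the 1 MiB default alignment is an assumption, not something stated above.

```shell
#!/bin/sh
# Succeed if a start offset given in 512-byte sectors lies on a multiple of
# the requested alignment in bytes.
# $1 = start offset in 512-byte sectors (as found in /sys/block/<disk>/<part>/start)
# $2 = alignment in bytes (defaults to 1 MiB, which is an assumed target)
is_aligned() {
    start_sectors="$1"
    align_bytes="${2:-1048576}"
    [ $(( start_sectors * 512 % align_bytes )) -eq 0 ]
}

# Example with fixed numbers: sector 2048 is exactly 1 MiB into the disk,
# while the legacy start sector 63 is not even 4 KiB aligned.
if is_aligned 2048; then echo "2048: aligned to 1 MiB"; fi
if ! is_aligned 63 4096; then echo "63: not aligned to 4 KiB"; fi
```

For the layout in the question, the value to test would come from /sys/block/sda/sda2/start, for example `is_aligned "$(cat /sys/block/sda/sda2/start)"`. The same check could be applied to the LUKS payload offset, if the container reports one in sectors.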

shodanshok

But the I/O performance is quite good, actually. Simple file copying happens at a sustained ~150MB/s, for example. The problem is that one process can totally hog the entire system, making every other process block for many seconds. – dragonroot Jul 01 '16 at 20:19
Because a file copy generally uses a large block size (>= 128KB). So, the single-threaded copy can proceed quite fast, but any other write activity can be stalled for a (comparatively) long time. Anyway, consider that one process with heavy read/write activity can hog even a non-encrypted, cache-enabled machine. So much more for an encrypted, non-cached one. – shodanshok Jul 01 '16 at 20:55
I think you're right about H310, looks like a lot of people are having performance problems with it. – dragonroot Jul 02 '16 at 01:36
