
When you have LVM, there is a scheduler entry in /sys/block not only for your physical volumes, but also for each individual logical volume and for the raw device.
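
For example, they can all be listed with:

ls /sys/block/*/queue/scheduler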

We have a Debian 6 LTS x64, kernel 2.6.32 system running Xen hypervisor 4.0 (3Ware 9650 SE hardware RAID1). When running virtual machines on each logical volume, on which one do you need to set the scheduler if you want to influence how they get scheduled by the OS? If you set the logical volume to deadline, will that even do anything when the physical volume is set to cfq? And if you do set it to deadline on the logical volume, will those deadlines be honoured even when the disk is slowing down because of IO on other LVs that are set to cfq?

The question relates to IO on VMs slowing down other VMs too much. All guests use noop as their scheduler internally.

Edit: According to this, in a multipath environment, only the DM's scheduler will take effect. So if I want to handle IO between virtual machines in a deadline manner, I have to set the DM path of the physical volume (dm-1 in my case) to deadline. Is that right? There is also a scheduler for sdc, which is the original block device of my dm-1. Why shouldn't it be done on that?
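
If dm-1 is indeed the one in charge, I assume the change itself would simply be something like:

echo deadline > /sys/block/dm-1/queue/scheduler
cat /sys/block/dm-1/queue/scheduler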

Edit 2: But then someone says in the comments that dm-0/dm-1 doesn't have a scheduler in newer kernels:

famzah@VBox:~$ cat /sys/block/dm-0/queue/scheduler
none

On my system (Debian 6, kernel 2.6.32), I have:

cat /sys/block/dm-1/queue/scheduler 
noop anticipatory [deadline] cfq

The question is also: do I have a multipath setup? pvs shows:

# pvs
PV         VG                 Fmt  Attr PSize PFree
/dev/dm-0  universe           lvm2 a-   5,41t 3,98t
/dev/dm-1  alternate-universe lvm2 a-   1,82t 1,18t

But they were created with /dev/sd[bc]. Does that mean I have multipath, even though it's a standard LVM setup?
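
I guess I could check what the device-mapper targets actually are: dmsetup table shows the target type (multipath vs. linear), and multipath -ll (from multipath-tools) lists the paths, if there are any:

dmsetup table
multipath -ll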

The main question, I guess, is: do I have to set the scheduler on sdc or on dm-1? If I do iostat, I see a lot of access on both:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdc               0,00     0,00   13,02   25,36   902,71   735,56    42,68     0,08    2,17   0,73   2,79
dm-1             82,25    57,26   12,97   25,36   902,31   735,56    42,72     0,18    4,73   0,84   3,23

So, what is what and who is the boss? If it's sdc, I can tell you that setting it to deadline doesn't do a thing for the performance of my VMs. Looking at the difference in the 'requests merged' columns (first two), I'd say it's dm-1 that controls the scheduling.

Halfgaar

3 Answers


So, the answer turned out to be simply: the underlying device. Newer kernels only have 'none' in /sys/block/*/queue/scheduler when there is no scheduler to configure.

However, for a reason unknown to me, the devices on this server are created as multipath devices, which is why my fiddling with the scheduler on /dev/sd[bc] never did anything in the past. Now I have set dm-1 and dm-0 to deadline with read_expire=100 and write_expire=1500 (much more stringent than normal) and the results seem very good.
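
In sysfs terms, that comes down to something like the following (shown for dm-1, same for dm-0):

echo deadline > /sys/block/dm-1/queue/scheduler
echo 100 > /sys/block/dm-1/queue/iosched/read_expire
echo 1500 > /sys/block/dm-1/queue/iosched/write_expire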

This graph shows the effect on disk latency in a virtual machine, caused by another virtual machine with an hourly task:

[Graph: disk latency over 24 h, in ms]

You can clearly see the moment where I changed the scheduler parameters.

Halfgaar

Hmm, Debian...

Well, I can share how Red Hat approaches this with their tuned framework. There are profiles for "virtual-host" and "virtual-guest". The profile descriptions are explained in detail here, and the following excerpt shows which devices are impacted. The "dm-*" and "sdX" devices have their schedulers changed.

# This is the I/O scheduler ktune will use.  This will *not* override anything
# explicitly set on the kernel command line, nor will it change the scheduler
# for any block device that is using a non-default scheduler when ktune starts.
# You should probably leave this on "deadline", but "as", "cfq", and "noop" are
# also legal values.  Comment this out to prevent ktune from changing I/O
# scheduler settings. 
ELEVATOR="deadline"

# These are the devices, that should be tuned with the ELEVATOR 
ELEVATOR_TUNE_DEVS="/sys/block/{sd,cciss,dm-,vd,zd}*/queue/scheduler"
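
In effect, applying that boils down to something like this (a sketch of the idea, not the actual ktune script):

# push the configured elevator to every matching block device that exists
for f in /sys/block/{sd,cciss,dm-,vd,zd}*/queue/scheduler; do
    [ -e "$f" ] && echo deadline > "$f"
done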

Also see:
CentOS Tuned Equivalent For Debian and Understanding RedHat's recommended tuned profiles

ewwhite
  • It's already useful to know that deadline is the preferred one. Thanks. I'm going to look into this. – Halfgaar Jan 19 '15 at 12:58
  • I added iostat output. The 'requests merged' columns might tell the story? – Halfgaar Jan 19 '15 at 13:41
  • I'd do both... but I don't have any dm devices to check with. – ewwhite Jan 19 '15 at 13:42
  • I did do both now, but it's a question of *knowing* that my change is actually doing something. I need to know if setting deadline on sdc didn't do anything, or had an effect but just didn't help. – Halfgaar Jan 19 '15 at 14:13

As VMware recommends, it is better to use the noop scheduler if your guests are using a file as virtual disk. That way your guests pass the IO to your physical host directly, without reorganizing the IO twice: once in the guest and once in the physical host.
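
Inside a guest that is typically something like this (xvda is just a common Xen guest disk name, adjust to your setup), or persistently via elevator=noop on the guest's kernel command line:

echo noop > /sys/block/xvda/queue/scheduler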

c4f4t0r
  • I forgot to mention that; all my guests use noop internally. – Halfgaar Jan 19 '15 at 13:13
  • I think the noop is the best choice for the reason I told you in my answer :) – c4f4t0r Jan 19 '15 at 14:19
  • @c4f4t0r question is not about guest scheduler. question is about setting scheduler on the host. It's not even about whether to use deadline or not, it's about *which device(s)* you need to set deadline/noop/whatever to have any effect, is it the logical volume `/dev/dm-0`, or is it the physical hard drive `/dev/sdc` ? – sourcejedi Dec 02 '18 at 16:50