
We are running XenServer 6.1 in production, and the OOM killer recently fired and eventually took down one of our blades. The kernel log from that OOM-killer event suggests that plenty of memory was available from Dom0's perspective:

Jul 24 02:29:24 xenserver4 kernel: [2091564.792646] DMA free:2832kB min:76kB low:92kB high:112kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:16256kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:144kB slab_unreclaimable:7344kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jul 24 02:29:24 xenserver4 kernel: [2091564.792676] lowmem_reserve[]: 0 698 2016 2016
Jul 24 02:29:24 xenserver4 kernel: [2091564.792696] Normal free:180036kB min:3340kB low:4172kB high:5008kB active_anon:0kB inactive_anon:0kB active_file:72kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:693240kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:14988kB slab_unreclaimable:385960kB kernel_stack:3568kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jul 24 02:29:24 xenserver4 kernel: [2091564.792728] lowmem_reserve[]: 0 0 10540 10540
Jul 24 02:29:24 xenserver4 kernel: [2091564.792747] HighMem free:829152kB min:512kB low:2132kB high:3756kB active_anon:181880kB inactive_anon:64204kB active_file:118268kB inactive_file:101744kB unevictable:55640kB isolated(anon):0kB isolated(file):0kB present:1357380kB mlocked:55640kB dirty:352kB writeback:0kB mapped:30296kB shmem:1052kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no

Going by the HighMem free value alone, there is 800MB+ available; combined with Normal free, it's about 1GB.
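The estimate above can be checked with a short script. This is only a sketch: `parse_zones` is a hypothetical helper name, and the sample lines are abridged from the log above to the fields used here. It also prints what share of the Normal zone is unreclaimable slab, which may be worth a look since HighMem cannot satisfy low-memory allocations.

```python
import re

# Zone lines abridged from the OOM-killer log above (only the fields used here).
LOG = """\
kernel: DMA free:2832kB min:76kB present:16256kB slab_unreclaimable:7344kB
kernel: lowmem_reserve[]: 0 698 2016 2016
kernel: Normal free:180036kB min:3340kB present:693240kB slab_unreclaimable:385960kB
kernel: HighMem free:829152kB min:512kB present:1357380kB slab_unreclaimable:0kB
"""

ZONE_RE = re.compile(r"\b(DMA|Normal|HighMem) (free:\d+kB.*)")
FIELD_RE = re.compile(r"(\w+):(\d+)kB")


def parse_zones(log_text):
    """Return {zone_name: {field: value_in_kB}} for each zone line found."""
    zones = {}
    for line in log_text.splitlines():
        match = ZONE_RE.search(line)
        if match:
            name, fields = match.groups()
            zones[name] = {key: int(val) for key, val in FIELD_RE.findall(fields)}
    return zones


zones = parse_zones(LOG)
total_free_kb = sum(zone["free"] for zone in zones.values())
low_free_kb = zones["DMA"]["free"] + zones["Normal"]["free"]
normal_slab_share = zones["Normal"]["slab_unreclaimable"] / zones["Normal"]["present"]

print(f"total free:        {total_free_kb} kB ({total_free_kb / 1024:.0f} MiB)")
print(f"low (DMA+Normal):  {low_free_kb} kB")
print(f"Normal unreclaimable slab share: {normal_slab_share:.0%}")
```

The same function should work on the full log lines, since it ignores fields it cannot parse as `name:NNNkB`.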

I followed Citrix's recommended procedure for allocating additional memory to Dom0, which is meant to help with performance and with running more VMs per host: http://support.citrix.com/article/CTX134951

In the kernel line, both of these values are set:

mem=1024G 
dom0_mem=2048M,max:2048M

Here's the complete kernel line:

# XenServer
kernel mboot.c32
append /boot/xen.gz mem=1024G watchdog_timeout=300 cpuid_mask_xsave_eax=0 lowmem_emergency_pool=1M crashkernel=64M@32M console=vga vga=mode-0x0311 dom0_mem=2048M,max:2048M dom0_max_vcpus=1-8 --- /boot/vmlinuz-2.6-xen blkbk.max_ring_page_order=2 root=LABEL=root-phjwmuox ro xencons=hvc console=hvc0 console=tty0 quiet vga=785 splash --- /boot/initrd-2.6-xen.img
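To see which component receives each option, the append line can be split on the `---` separators. This is a sketch that assumes the usual mboot.c32 convention that `---` separates the modules (hypervisor, dom0 kernel, initrd); `split_modules` is a hypothetical helper name.

```python
# The append line from the boot configuration above, copied verbatim.
APPEND = (
    "append /boot/xen.gz mem=1024G watchdog_timeout=300 cpuid_mask_xsave_eax=0 "
    "lowmem_emergency_pool=1M crashkernel=64M@32M console=vga vga=mode-0x0311 "
    "dom0_mem=2048M,max:2048M dom0_max_vcpus=1-8 --- /boot/vmlinuz-2.6-xen "
    "blkbk.max_ring_page_order=2 root=LABEL=root-phjwmuox ro xencons=hvc "
    "console=hvc0 console=tty0 quiet vga=785 splash --- /boot/initrd-2.6-xen.img"
)


def split_modules(append_line):
    """Split an mboot.c32 append line into (image, [options]) pairs.

    Assumption: ' --- ' separates the modules, and the first token of each
    segment is the image path.
    """
    body = append_line.strip()
    if body.startswith("append "):
        body = body[len("append "):]
    modules = []
    for segment in body.split(" --- "):
        tokens = segment.split()
        if tokens:
            modules.append((tokens[0], tokens[1:]))
    return modules


modules = split_modules(APPEND)
for image, options in modules:
    print(image)
    for option in options:
        print("    " + option)
```

Under that assumption, both `mem=1024G` and `dom0_mem=2048M,max:2048M` sit in the `/boot/xen.gz` segment, so they would be read by the hypervisor rather than by the dom0 Linux kernel.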

From http://oreilly.com/linux/excerpts/9780596100797/kernel-boot-command-line-parameter-reference.html :

mem= n[KMG]

Set the specific amount of memory used by the kernel. When used with the memmap= option, physical address space collisions can be avoided. Without the memmap= option, this option could cause PCI devices to be placed at addresses that belong to unused RAM. n specifies the amount of memory to force and is measured in units of kilobytes (K), megabytes (M), or gigabytes (G).

Is Citrix's script for increasing Dom0 memory buggy, in that it doesn't let Dom0 use any of the dom0_mem above 1GB?

Steve R.

1 Answer


You can definitely have more than 1GB of RAM for dom0 in XenServer 6.1, and your syntax looks correct. When changing the amount of dom0 RAM, though, it is recommended that you use the proper interface, which also helps prevent typos. See this document:

http://support.citrix.com/article/CTX134951

I'm not so sure about the total amount of low memory you are seeing, though. Has the problem happened again since you increased the amount of dom0 RAM? If it has, can you collect the output of /proc/meminfo and edit your question to include it?
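If it helps when collecting that output, here is a rough sketch for pulling the low/high memory figures out of /proc/meminfo. The `LowTotal`, `LowFree`, `HighTotal` and `HighFree` fields are only present on 32-bit kernels built with highmem support, which the HighMem zone in your log suggests is the case here. The function names are hypothetical, and the sample text uses made-up placeholder values, not numbers from your host.

```python
# Placeholder sample only: these values are invented for illustration.
# On the host, use: text = open("/proc/meminfo").read()
SAMPLE = """\
MemTotal:        2000000 kB
MemFree:         1000000 kB
HighTotal:       1300000 kB
HighFree:         800000 kB
LowTotal:         700000 kB
LowFree:          200000 kB
Slab:             400000 kB
SUnreclaim:       380000 kB
"""


def parse_meminfo(text):
    """Return {field: integer value} for each 'Name: value [kB]' line."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def low_memory_report(info):
    """Summarise low memory; returns None if the kernel has no LowTotal field."""
    if "LowTotal" not in info or "LowFree" not in info:
        return None
    return {
        "low_total_kb": info["LowTotal"],
        "low_free_kb": info["LowFree"],
        "low_free_pct": 100.0 * info["LowFree"] / info["LowTotal"],
        "slab_unreclaim_kb": info.get("SUnreclaim"),
    }


report = low_memory_report(parse_meminfo(SAMPLE))
print(report)
```

Running it periodically and comparing `low_free_kb` with `slab_unreclaim_kb` over time would show whether low memory is shrinking while total free memory still looks healthy.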

Also, if you have many SRs with VDIs plugged into VMs on one particular host, the amount of available LowMem on that host is reduced, which can cause the OOM killer to kick in. This is due to the memory allocated to blkback page pools, which is configurable. Have a look at this document to learn more about it:

http://support.citrix.com/article/CTX136861

Last but not least, make sure you have all hotfixes applied.

Cheers

Felipe