
I have a desktop running Ubuntu Quantal with OpenStack Folsom on an Intel i5 with 32 GB RAM and 2 GB of swap. I'm running 7 VMs under KVM, each sized like an EC2 m1.small (1.7 GB RAM each).
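
(That's roughly 7 × 1.7 GB ≈ 12 GB of guest RAM in total, well under the host's 32 GB. One way to confirm the per-guest allocation, assuming the guests are managed through libvirt/virsh:)

$ for vm in $(virsh list --name); do virsh dominfo "$vm" | grep -E '^(Name|Max memory)'; done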

As I get up to running 5 or 6 concurrently, the host starts to swap them out:

top - 23:45:42 up 3 days,  1:51, 10 users,  load average: 0.37, 0.75, 1.15
Tasks: 418 total,   2 running, 413 sleeping,   3 stopped,   0 zombie
%Cpu(s):  8.8 us,  2.1 sy,  0.0 ni, 88.8 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
KiB Mem:  32864580 total, 32586956 used,   277624 free,   574236 buffers
KiB Swap:  1998844 total,  1113352 used,   885492 free, 16498252 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND           
24652 libvirt-  20   0 4169m 1.7g 7756 S   3.6  5.4   4:49.37 kvm               
25233 libvirt-  20   0 4450m 1.6g 7756 S   1.2  5.2   4:35.12 kvm               
25589 libvirt-  20   0 4163m 1.6g 7756 S   2.4  5.1   4:40.31 kvm               
 6562 root      39  19 2935m 658m 7460 S   0.0  2.1 100:05.62 java              
28393 libvirt-  20   0 4149m 624m 7756 S   0.0  1.9   2:25.01 kvm               
28106 libvirt-  20   0 4170m 617m 7756 S   0.0  1.9   2:18.17 kvm               
26519 libvirt-  20   0 4167m 590m 7756 S   0.0  1.8   2:22.16 kvm               
29399 libvirt-  20   0 4159m 589m 7756 S   0.0  1.8   2:19.94 kvm               


$ free -m
             total       used       free     shared    buffers     cached
Mem:         32094      31868        225          0        559      16175
-/+ buffers/cache:      15134      16959
Swap:         1951       1087        864
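
(The -/+ buffers/cache line is just arithmetic on the line above it: used - buffers - cached = 31868 - 559 - 16175 = 15134 MB genuinely in use, and free + buffers + cached = 225 + 559 + 16175 = 16959 MB that could in principle be reclaimed from cache.)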


# /tmp/swap-used.sh |grep kvm
PID=944 - Swap used: 0 - (kvm-irqfd-clean )
PID=24652 - Swap used: 102468 - (kvm )
PID=25233 - Swap used: 108644 - (kvm )
PID=25589 - Swap used: 155768 - (kvm )
PID=26519 - Swap used: 192216 - (kvm )
PID=28106 - Swap used: 150796 - (kvm )
PID=28393 - Swap used: 208488 - (kvm )
PID=29399 - Swap used: 187388 - (kvm )
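
(For reference, /tmp/swap-used.sh is a small script along these lines; a minimal sketch that just reads the VmSwap field from /proc/<pid>/status, not necessarily the exact original:)

#!/bin/sh
# Report per-process swap usage by reading VmSwap from /proc/<pid>/status.
for status in /proc/[0-9]*/status; do
    pid=${status#/proc/}; pid=${pid%/status}
    name=$(awk '/^Name:/   {print $2}' "$status" 2>/dev/null)
    swap=$(awk '/^VmSwap:/ {print $2}' "$status" 2>/dev/null)
    [ -n "$swap" ] && echo "PID=$pid - Swap used: $swap - ($name )"
done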

I've already tried setting swappiness to 20, then 10, and finally to 0, none of which has made a difference:

# cat /proc/sys/vm/swappiness 
0
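
(For reference, I've been changing it roughly like this: sysctl at runtime, plus /etc/sysctl.conf so it survives a reboot.)

# sysctl -w vm.swappiness=0
# echo 'vm.swappiness = 0' >> /etc/sysctl.conf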

I haven't rebooted the host since changing it from 60 to 0 (do I need to reboot?). I've also tried flushing swap entirely with /sbin/swapoff -a; /bin/swapon -a. Immediately after re-enabling swap, I see this:

$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  0  11384 247572 652736 16539748   11   10   291   228   13   12  7  3 87  3
 0  0  12968 234360 652756 16554432    0  317  8576  1240 3508 5470 17  2 75  5
 1  0  17068 243512 652768 16559508    0  820  9448  1216 3687 4845 20  2 77  2
 1  0  20040 233300 652772 16576152    0  594 12262   677 4436 5063 29  2 68  1
 1  0  22764 219156 652792 16594448    6  546 11962   727 3870 4559 28  1 68  2
 3  0  40832 229384 652776 16602440    0 3614 58404  4176 2051 6231 21  2 66 10
 1  0  52420 232236 652784 16613320    0 2318 42174  2512 1819 4026 15  2 77  6

I've got 15 GB of free memory that can be used without having the processes swapped out. How do I get the kernel to keep the VM processes resident instead of swapping them out in favor of the page cache?

Blair Zajac
  • If you merely set `vm.swappiness`, that doesn't automatically swap in anything already in swap. Try `swapoff -a; swapon -a` to do that. – Michael Hampton Feb 12 '13 at 08:04
  • I've done that a couple of times and it goes back to swapping the VMs immediately. – Blair Zajac Feb 12 '13 at 08:17
  • Is it only ever swapping out (so) and never/rarely in (si)? If so, it's nothing to worry about. It's chunks of memory that are just not being used, and thus the OS decided that the page cache is more important. That said, KSM might also be extremely useful for you. It should lower the memory pressure if each guest OS is similar. – R. S. Feb 12 '13 at 08:35
  • Because your OS is far better at memory management than you believe you are. – Tom O'Connor Feb 12 '13 at 10:04
  • @TomO'Connor I've done what the linked-to article suggests about changing swappiness and it had no effect, so this isn't a dup. – Blair Zajac Feb 12 '13 at 18:31
  • @kormoc and TomO'Connor: You obviously don't have mostly idle VMs on a host with spinning disk drives. The VMs are painfully slow after a few days of not using them. There is nothing wrong with asking how to prioritise memory for the VMs over the filesystem cache and buffers. – Joachim Wagner Mar 13 '18 at 11:35
  • https://serverfault.com/questions/561446/how-can-i-keep-important-vms-in-memory-without-disabling-swap has 2 good pointers. Adding `` after `` in the qemu xml file fixed the problem for me. – Joachim Wagner Sep 18 '18 at 11:00
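
(The exact element in that last comment wasn't preserved. A likely candidate, though this is an assumption rather than something the comment confirms, is libvirt's memory-locking setting from the linked question, added to the domain XML via virsh edit <domain>:)

<memoryBacking>
  <locked/>   <!-- locks the guest's RAM in host memory so the host won't swap it out -->
</memoryBacking>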

1 Answer


I've got 15 GB of free memory that can be used without having the processes swapped out.

No, you don't.

           total       used       free     shared    buffers     cached
Mem:       32094      31868        225          0        559      16175

You have 225 MB of memory free.

KiB Swap: 1998844 total, 1113352 used, 885492 free, 16498252 cached

Look at all that wonderful cache the system has made, in part by getting unused junk out of precious RAM. Why are you trying to make it waste precious RAM?

If you have 1GB of data in RAM that hasn't been used in hours, which makes more sense:

  1. You swap that out to disk when you're not busy and have 1GB more cache to use.

  2. You keep it in memory. You have 1GB less cache to use, and if that memory is ever needed for something else, you'll have to swap it out while you are busy.
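
If you want to see whether that swapped-out data is ever actually read back in (i.e. whether the swapping is costing you anything), watch the si column of vmstat, or the cumulative counters:

$ grep -E '^pswp(in|out) ' /proc/vmstat

If pswpin barely moves while pswpout grows, the kernel is writing out pages nobody asks for again, and you're losing nothing.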

But if you think you know more about memory management than the people who wrote your operating system's memory management logic, go ahead and keep trying to find that magic "do everything faster and better" switch.

David Schwartz
  • (It's a useful observation, but don't you think the tone could be more constructive?) – Michael McNally Feb 12 '13 at 09:23
  • I've tried every tone I know of for several years now. I've yet to find one that works. People continue to insist on thwanging delicately tuned knobs in various directions in response to imagined problems no matter what. But I'm not bitter. – David Schwartz Feb 12 '13 at 09:35
  • This answer is incorrect in regards to the amount of "free" memory that the OS can reallocate to different tasks, see http://www.linuxatemyram.com/ . The OS will use all available memory and over time the `top` output will always show free memory close to 0 which is what you want. But if you look at the `free -m` output on the `-/+ buffers/cache` line above it shows that if you account for the buffer cache where pages can be dropped, there's 16959 MB of memory that the OS can use. I would rather the OS leave the VMs in memory and drop buffer cache pages than otherwise. – Blair Zajac Feb 12 '13 at 18:17
  • That cache is being used to make the active VMs faster. What you are proposing is that you make all your VMs equally slow - instead of allowing the OS to make the active VMs faster and the inactive VMs slower. This doesn't make sense unless you're trying to provide some sort of incredibly tight SLA. – Chris S Feb 12 '13 at 18:45
  • In which case you should investigate using cgroups around your VMs (see the sketch after these comments). – mattdm Feb 12 '13 at 19:04
  • @ChrisS I have 7 VMs at 1.7 GB each = 11.9 GB on a host system with 32 GB of RAM, so I can afford to keep them all in RAM. – Blair Zajac Feb 12 '13 at 19:11
  • @BlairZajac: That's gibberish and it's the reason I don't link to linuxatemyram. There's no "16959 MB of memory that the OS can use". The OS can use the entire 32GB of memory. Even memory that is currently used by a process can be switched to some other purpose by the OS. Unless memory is locked (which is almost always an insignificant amount) the OS can always switch memory to a "better" use should one arise. In this sense, almost all RAM is free. – David Schwartz Feb 13 '13 at 00:00
  • @DavidSchwartz: What's gibberish? Maybe the phrasing isn't ideal, but what do you think about the point that there's 16959 MB of memory that the OS could leave for the KVM processes and not swap out to make more space for buffer cache. – Blair Zajac Feb 13 '13 at 01:42
  • @BlairZajac: It's gibberish to say there's "16959 MB of memory that the OS can use". The OS can, and does, use *all* memory for whatever purpose it thinks is best. The algorithms it uses to make these decisions are designed and tweaked by people who understand OS memory use because they've devoted years to understanding it. Yes, you've cited one factor in favor of not swapping. Do you think the guys who wrote your OS's swapping algorithm didn't understand that argument? Do you think they failed to give it the weight it deserved? – David Schwartz Feb 13 '13 at 02:03
  • @DavidSchwartz Right, we agreed that the phrase "16959 MB of memory that the OS can use" isn't good. But you agree there's 16 GB of memory used by the buffer cache? Why does the kernel think its best to have 16 GB of buffer cache and swap processes out when swappiness is set to 0? – Blair Zajac Feb 13 '13 at 03:14
  • @BlairZajac: Because it's not busy and might need that memory for something else in the future. Imagine if right now there's a huge memory crunch. Would you rather the machine have to swap out then, when it's under load rather than now, when it's free to write to disk because it's not busy? (Perhaps you assume that it can't both write something to swap and keep it in memory. False assumption -- of course it can. There is no need to make a painful tradeoff here. RAM is plentiful. Swap is plentiful. Load is low. You *can* have it both ways. Why do you not want the obviously optimal solution?!) – David Schwartz Feb 13 '13 at 04:41
  • Your operating system's memory management code was written by the experts in that field. The OS has a complete view of exactly how memory is actually being used on your system. It's absurd for you to insist that you must know better than them how memory should be used on your system. And your effort to turn sensitive knobs huge amounts in search of a magic "go faster" button that doesn't exist is silly. Do you really think they don't know that loading from disk is slower than loading from RAM? – David Schwartz Feb 13 '13 at 05:10
  • No, David. There is no single objective function that everybody agrees on. Different users want different OS behaviour. I also find it annoying that my idle VMs get swapped out and are super slow when I come back to use them. Thanks to @mattdm for making a constructive suggestion (cgroups). – Joachim Wagner Mar 13 '18 at 11:41
  • I just witnessed complete swap out of a 4 GB VM within less than 5 minutes while I was using it. There was no memory pressure on the 32 GB host, just an application writing two files with 6 GB each. The VM became unusable. It took 24 minutes to restart the VM. – Joachim Wagner Sep 18 '18 at 10:37
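
(Regarding the cgroups suggestion above: a minimal sketch, assuming cgroup v1 and a typical libvirt cgroup layout. The exact path varies by libvirt version and distro, so check where the VM's qemu process actually lives first:)

# cat /proc/<qemu-pid>/cgroup
# echo 0 > /sys/fs/cgroup/memory/libvirt/qemu/<vm-name>/memory.swappiness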