
I have an idle CentOS Linux system, and yet kswapd is using 100% CPU.

All I have running is a single bash session with top running. I have 32 GB of RAM, and yet kswapd has been constantly using 100% CPU for over 4 hours.

Deshawn
  • Can we get the output of top and ps -ef? Also, an output from /proc/cpuinfo would be nice. – Rilindo Sep 29 '11 at 20:42
  • possible duplicate of [How do I tell what process is causing kswapd to be in use?](http://serverfault.com/questions/316560/how-do-i-tell-what-process-is-causing-kswapd-to-be-in-use) – mailq Sep 29 '11 at 20:51
  • What kernel version do you have? And can you paste the output of `free`? – David Schwartz Sep 30 '11 at 07:24
  • Linux version 2.6.18-164.el5 (mockbuild@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Thu Sep 3 03:28:30 EDT 2009 – Deshawn Sep 30 '11 at 18:11
  •              total       used       free     shared    buffers     cached
    Mem:      32425584   32293300     132284          0      14932    9179804
    -/+ buffers/cache:   23098564    9327020
    Swap:     25165812      94624   25071188
    – Deshawn Sep 30 '11 at 18:11
  • Something, other than cache, is using most of your memory. You need to figure out what. Try `ps axv --sort=-rss | head -10` and look at the `RSS` fields. – David Schwartz Oct 01 '11 at 04:35

1 Answer


AFAICS this is related to neither free RAM nor swap. We have the same problem here, which sometimes hits production machines even though plenty of RAM is free, quite often more than 700 MB, with no dirty buffers to sync and 0 bytes of swap used. It definitely looks like a severe kernel bug caused by some unknown race condition.

Currently we run CentOS kernel 2.6.18-194.el5 and will try to replace it with a newer kernel, because we think this might help.

Update:

Red Hat has confirmed that it is a kernel issue in 2.6.18-194.el5.

Solutions:

Minimum: kernel-2.6.18-194.32.1.el5 contains the immediate bugfix
Better: kernel-2.6.18-238.el5 contains additional kswapd-related bugfixes
Best: kernel-2.6.18-348.4.1.el5, the latest kernel that runs with RHEL 5.5 without changes
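
To see whether a given machine already runs a kernel with the immediate fix, the running release can be compared with the minimum version listed above. The following is only a rough sketch: `kernel_at_least` is a hypothetical helper name, and it assumes the RHEL 5 style release string (numeric fields followed by an `.el5` tag). It compares only the numeric fields and is not a full RPM version comparison.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not part of any distribution tool):
# succeeds if kernel release $1 is at least kernel release $2.
# Assumption: both look like "2.6.18-194.32.1.el5"; everything from ".el"
# onward is ignored and the remaining fields are compared numerically.
kernel_at_least() {
    local have want h w
    # Drop the distribution tag and split the rest on "." and "-".
    have=$(printf '%s' "${1%%.el*}" | tr '.-' '  ')
    want=$(printf '%s' "${2%%.el*}" | tr '.-' '  ')
    set -- $want
    for h in $have; do
        w=${1:-0}                    # missing fields count as 0
        [ $# -gt 0 ] && shift
        [ "$h" -gt "$w" ] && return 0
        [ "$h" -lt "$w" ] && return 1
    done
    # Every field of $have matched; any extra non-zero field left in the
    # required release means the running kernel is older.
    for w in "$@"; do
        [ "$w" -gt 0 ] && return 1
    done
    return 0
}

# Example use on the machine itself:
#   kernel_at_least "$(uname -r)" 2.6.18-194.32.1.el5 && echo "immediate fix present"
```

With the versions from this thread, `2.6.18-164.el5` and `2.6.18-194.el5` compare as older than the minimum, while `2.6.18-238.el5` and `2.6.18-348.4.1.el5` compare as new enough.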

In the meantime, there is a script that detects the 100% CPU situation quite well. Our monitoring calls it every minute to inform us about the situation. If the situation persists for too long, affected machines lock up completely as more and more unkillable processes use 100% CPU, until the machine becomes completely unmanageable.

Currently, the only known way to solve the problem is to manually hard-reboot the affected machine. /sbin/reboot fails because the machine hangs on shutdown far too often.

To hard-reboot a machine from any root shell command line without direct access to the console, run:
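
For illustration only, a check of this kind could look roughly like the sketch below. It is not the script mentioned above; the function names, the 90% threshold and the 5-second sampling interval are all assumptions. It samples the CPU ticks of each kswapd thread twice from /proc and reports any thread that used most of one CPU in between.

```shell
#!/bin/sh
# Hypothetical monitoring check (NOT the script referred to in the answer).

# Print the CPU ticks (utime + stime) consumed so far by PID $1.
cpu_ticks() {
    local stat
    stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
    # Strip "pid (comm) " so spaces in the process name cannot shift fields.
    stat=${stat##*) }
    set -- $stat
    # Fields 14 (utime) and 15 (stime) of the full line are now $12 and $13.
    echo $(( ${12} + ${13} ))
}

# Percent of one CPU used between two tick samples.
# Arguments: ticks_before ticks_after interval_seconds [ticks_per_second]
cpu_percent() {
    local before=$1 after=$2 interval=$3 hz=${4:-100}
    echo $(( (after - before) * 100 / (hz * interval) ))
}

# Returns 1 and prints a warning if any kswapd thread is at or above the
# threshold. Both defaults are assumptions chosen for the sketch.
check_kswapd() {
    local threshold=${1:-90} interval=${2:-5} hz pid before after pct rc=0
    hz=$(getconf CLK_TCK)
    for pid in $(pgrep kswapd); do
        before=$(cpu_ticks "$pid") || continue
        sleep "$interval"
        after=$(cpu_ticks "$pid") || continue
        pct=$(cpu_percent "$before" "$after" "$interval" "$hz")
        if [ "$pct" -ge "$threshold" ]; then
            echo "WARNING: kswapd pid $pid at ${pct}% CPU"
            rc=1
        fi
    done
    return $rc
}
```

A monitoring system could call `check_kswapd` once per minute and alert on a non-zero exit status. Sampling over an interval is used here because the `%CPU` column of `ps` is averaged over the whole lifetime of the process and would hide a recent spike.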

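# Reboot automatically 10 seconds after a kernel panic
echo 10 > /proc/sys/kernel/panic
# Enable all magic SysRq functions
echo 1 > /proc/sys/kernel/sysrq
# Sync all mounted filesystems
echo s > /proc/sysrq-trigger
sleep 5
# Sync again to flush anything written in the meantime
echo s > /proc/sysrq-trigger
sleep 1
# Reboot immediately, without unmounting or syncing
echo b > /proc/sysrq-trigger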

Keep in mind to do this only after quiescing the machine, so that no process is still writing to the disks. This should prevent fsck from running into severe trouble after the reboot.

Sorry, no real solution, but HTH. Also keep in mind that things other than those described here might cause a 100% CPU situation on kswapd, so automating a reboot in this case is probably a bad idea.

Tino
  • Please note that kswapd hanging at 100% was confirmed to be a Linux bug. It was fixed in mid/end 2012 kernels. – Tino Jan 03 '14 at 13:49
  • Do you have a link to which versions have it and from which version it is fixed? – Wim Deblauwe Jan 19 '15 at 16:03
  • All I have is from the link which is in my post: `2.6.18-194.el5` (RHEL) is affected, `2.6.18-194.32.1.el5` and above is fixed. No idea for other kernel series, sorry. – Tino Jan 23 '15 at 02:14
  • Err, I'm using Linux 4.0, and I'm having this problem. – Zaz Jun 01 '15 at 19:21
  • @Zaz This Q/A is from 2011/2013, but kernel 4.0 is from 2015, so apparently you hit something else. I recommend that you open a new question like "why is kswapd on kernel 4.0 using high CPU on idle system" – Tino Jun 02 '15 at 15:30
  • @Tino: Sure. Here's the link if interested: [kswapd often uses 100% CPU when swap is in use](http://serverfault.com/q/696156/49785) – Zaz Jun 02 '15 at 17:55
  • I'm seeing this shit on 4.2.x on latest Xubuntu 15.10. Amazing this is still a problem on a supposedly "serious" OS like Linux today... – lnostdal Oct 31 '15 at 16:44