Running a CPU-intensive process, but top/htop show 0% CPU for ALL processes?

Question

Hi all. I have some servers behaving strangely in a way I cannot explain, as follows:

htop

1  [|||||||||||||||                              28.5%]     Tasks: 53 total, 1 running
2  [||||||||||||||||                             31.1%]     Load average: 0.00 0.00 0.00 
3  [||||||||||||||||                             30.5%]     Uptime: 211 days(!), 02:21:04
4  [|                                             0.7%]
Mem[||||                                   171/16077MB]
Swp[                                         0/11610MB]

  PID USER     PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command                                                   
    1 root      20   0  8352   840   704 S  0.0  0.0  1:02.48 init [2]
23764 root      20   0 10584  1364  1172 S  0.0  0.0  0:00.00  `- bash -c while sleep 0.000001; do echo 29150 | md5sum ; done

top

top - 01:36:46 up 211 days,  2:40,  5 users,  load average: 0.00, 0.00, 0.00
Tasks: 108 total,   2 running, 106 sleeping,   0 stopped,   0 zombie
Cpu(s):  4.8%us, 18.0%sy,  0.0%ni, 77.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16463184k total,   797364k used, 15665820k free,   122992k buffers
Swap: 11889656k total,        0k used, 11889656k free,   499496k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                   
1 root      20   0  8352  840  704 S    0  0.0   1:02.48 init                                                       
2 root      20   0     0    0    0 S    0  0.0   0:00.15 kthreadd                                                   
3 root      RT   0     0    0    0 S    0  0.0   0:00.14 migration/0                                                
4 root      20   0     0    0    0 S    0  0.0   0:00.22 ksoftirqd/0                                                
5 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0                                                 
6 root      RT   0     0    0    0 S    0  0.0   0:00.16 migration/1                                                
7 root      20   0     0    0    0 S    0  0.0   0:00.09 ksoftirqd/1                                                
8 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1

The server had been in this state for some days, so I used the "while ... md5sum" loop to put pressure on the CPU. Not only was that loop's CPU/memory usage shown as 0%, but every other process's usage was shown as 0% as well.

As expected, when I killed that while loop, the htop bars went down to a true 0% (the server doesn't really have much work to do).

I also double-checked the result of "md5sum $(which htop)" (and the same for top) against another, NORMAL server: the binaries have exactly the same md5 result.
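For anyone repeating the checksum comparison described above, here is a minimal sketch of how it could be scripted. The function names bin_md5 and same_binary are made up for illustration, and the reference checksum is assumed to have been recorded on a known-good host beforehand. A match only shows that the two binaries are identical, nothing more.

```shell
#!/bin/sh
# Sketch: compare the MD5 of a local binary with a checksum recorded on a
# known-good host. Function names are illustrative, not from any package.

# bin_md5 NAME: print the MD5 of the executable that NAME resolves to in PATH.
bin_md5() {
    path=$(command -v "$1") || return 1
    md5sum "$path" | awk '{ print $1 }'
}

# same_binary NAME REFERENCE_MD5: succeed only if the local checksum matches.
same_binary() {
    sum=$(bin_md5 "$1") || return 1
    [ "$sum" = "$2" ]
}
```

Usage would look something like `same_binary htop "$reference" && echo identical`, where `$reference` holds the md5 value copied from the normal server.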

So, any ideas? Have I been badly rootkitted? I have already run rkhunter and chkrootkit, and they found nothing.

Good point. Actually, I have a group of these servers. Some behave this strangely; the others seem all right. I rebooted one of the weird ones, and when I then ran the while loop on it, top/htop worked normally. I have kept this weird one without rebooting, and then came to post this question. I strongly suspect I have been rooted, which would be unacceptable for these servers/services. – kiiwii, Nov 20 '11 at 01:17

Andrew Case · Answer 1 · 2011-11-19T23:14:04.640

3

To check if top or associated libraries are hiding processes due to a rootkit, you can compile a static version of top on another system. Then copy that version over and run it. If you've been rootkitted, the hidden processes should show up in that static top since it won't be using any of the rootkit libraries.
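As a complementary check that does not depend on top or htop at all, the aggregate counters in /proc/stat can be sampled directly. The following is only a rough sketch: the function names cpu_totals and cpu_busy_pct are invented here, idle time is assumed to be the idle plus iowait columns, and only the first eight counters are summed so that guest time is not counted twice.

```shell
#!/bin/sh
# Sketch: overall CPU busy percentage from two samples of the aggregate
# "cpu" line in /proc/stat, without going through top/htop.

# cpu_totals FILE: print "<busy> <total>" jiffies from the aggregate cpu line.
# Columns 2-9 are user, nice, system, idle, iowait, irq, softirq, steal.
cpu_totals() {
    awk '$1 == "cpu" {
        total = 0
        for (i = 2; i <= 9 && i <= NF; i++) total += $i
        idle = $5 + $6            # assumption: idle + iowait count as idle
        print total - idle, total
        exit
    }' "$1"
}

# cpu_busy_pct BUSY1 TOTAL1 BUSY2 TOTAL2: integer busy % between two samples.
cpu_busy_pct() {
    dt=$(( $4 - $2 ))
    if [ "$dt" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( 100 * ($3 - $1) / dt ))
}

# Take two samples one second apart and report the difference.
if [ -r /proc/stat ]; then
    set -- $(cpu_totals /proc/stat)
    b1=$1 t1=$2
    sleep 1
    set -- $(cpu_totals /proc/stat)
    echo "busy: $(cpu_busy_pct "$b1" "$t1" "$1" "$2")%"
fi
```

If this figure tracks the summary line in top while the per-process columns stay at 0, the discrepancy is in per-process accounting rather than in the tool's binaries.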

Some suggestions for determining what else might be causing the problem:

Disable as many unnecessary services (networking, iptables, auditd, selinux, sendmail, nfs, netfs, nscd, etc.) as possible to limit where the extra CPU cycles could be going.
Look in /var/log/* to see if anything is spitting out errors.
Enable logging or more verbosity in your services.
Use a program like dtrace or systemtap to see what is going on in

edited Nov 19 '11 at 23:14

answered Nov 19 '11 at 21:35

Andrew Case


Thanks a lot for the information. Following your instructions, I found this statically linked version of htop (http://htop.sourceforge.net/htop-0.6.6-static.gz) and ran it on the weird box. I loaded the CPU with that loop, but it still displays all processes as USING 0% CPU. What an awful day and night I had... Can we assume a Debian/Linux issue? All my boxes run Debian squeeze: "Linux XXX 2.6.32-5-amd64 #1 SMP Mon Mar 7 21:35:22 UTC 2011 x86_64 GNU/Linux". A reboot did help, but I cannot accept that Windows motto, "restart/reboot to rescue the world"... So, in the end, is a full reinstall all I can do? – kiiwii Nov 20 '11 at 08:55
Any errors in /var/log/messages? It could be your HD is having IO problems and causing system calls to stall waiting on IO. Have you tried enabling smartd and running smartctl --all /dev/sda or whatever your hard drive is? Yes, I think you can eliminate rootkit as the cause. – Andrew Case Nov 21 '11 at 09:01
dropbear accepted my user/pass but failed to provide a shell. Even the shell I had left inside screen on that box became unresponsive; a simple ls or w command failed to execute. With no other way in, only snmpd reporting to my zenoss server could still tell me the server's status: extremely high CPU load/usage, with no extra network, IO, or memory activity. Just CPU. After an hour or so, the load went down and let me back onto the box to check its status. Nothing special, as if nothing had happened: no process causing any problem, like a ghost that was never there... What a horrible box... – kiiwii Nov 21 '11 at 11:10
I reported this issue to the company, and we decided to do a full reinstall. Thanks, all. – kiiwii Nov 21 '11 at 11:11
SORRY, not "much" but "NOT much", my apologies. I meant to say "little" or "not much", definitely NOT "much". Sorry again. I hope there is no rootkit, too. There are no clues or traces in /var/log/auth.log, syslog, or the other logs. Also, I have no unnecessary services on that box, and all the servers are doing their own jobs under **NOT** very much pressure/load. You may wonder why I am curious about this weird event if it is only skin-deep. For more information: the box's load sometimes went from around 0-1 to 40-100, and sshd rejected my login (port 22 closed instantly), – kiiwii Nov 21 '11 at 11:15
Yeah a reinstall is probably the easiest bet. Are you using NFS? Did you try taking it off the network and see if it was slow locally (run level 1 or 2?) It would be easier to diagnose with more information from already suggested posts. – Andrew Case Nov 22 '11 at 18:26
Yes, thanks for all of your help. No NFS, run level 2 (Debian default); typical STANDALONE Java / Apache / MySQL servers, doing their own light jobs. No complicated network structure, low network traffic, and low IO, so that weird scene concerned me a lot. Thanks again, all. You have been a great help with the diagnosis. – kiiwii Nov 23 '11 at 01:32
Did you try running smartd? What about smartctl output? – Andrew Case Nov 23 '11 at 05:57
Sorry for not replying to this query before. No smartd. It is a very compact Debian/Linux server, almost entirely default configuration plus some particular services like Apache2, MySQL, Java, etc. A pretty simple box. – kiiwii Nov 23 '11 at 09:36
Well if you want to diagnose the problem, I'd suggest installing the proper diagnostic tools. Did you try systemtap or dtrace? This is probably the most useful suggestion at this point. – Andrew Case Nov 24 '11 at 23:27

score 0 · Answer 2 · answered Nov 19 '11 at 19:45

0

On what tty are you running the while loop? I have a vserver with 8 CPUs that I log in to via ssh. The loop runs on 4 CPUs at first and only on 2 CPUs later, but the load goes up to 1.5 while %CPU remains at 0. Can you check /proc/loadavg (if you have it) while the loop is running?
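A small sketch of one way to read /proc/loadavg in a labelled form while the loop runs. The function name parse_loadavg is made up for this example; the fourth field of the file is the number of currently runnable scheduling entities over the total.

```shell
#!/bin/sh
# Sketch: print the fields of /proc/loadavg with labels.

# parse_loadavg FILE: print the 1/5/15-minute load averages and the
# runnable/total scheduling-entity counts found in FILE.
parse_loadavg() {
    awk '{
        split($4, rt, "/")
        printf "1m=%s 5m=%s 15m=%s running=%s total=%s\n", $1, $2, $3, rt[1], rt[2]
    }' "$1"
}

# Print one sample if the file is available on this system.
if [ -r /proc/loadavg ]; then
    parse_loadavg /proc/loadavg
fi
```

Calling it every second or so from a second terminal while the md5sum loop is running shows whether the load average moves at all.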

answered Nov 19 '11 at 19:45

ott--


ssh THATSERVER "while...md5..." – kiiwii Nov 19 '11 at 20:16
I can reduce the load from 1.5 to 0.5 when I redirect the output to /dev/null. What does your /proc/loadavg report? – ott-- Nov 19 '11 at 20:33
All around 0.13 0.03 0.01, bro. /proc/loadavg agrees with top/htop, uptime, and so on... – kiiwii Nov 19 '11 at 21:30
