
I am running Fedora 30 (kernel 5.3.12-200.fc30.x86_64) and testing simple TCP client/server code. I tried different ps commands, but I cannot get any value displayed in the wchan field, even though both the server and the client are sleeping.



After further testing, it looks like this depends on the distribution. Kali and Devuan compute the wchan value, while Fedora and OpenSUSE Tumbleweed do not. Does anyone have a clue why that might be the case? The only kernel configuration entry I found that relates to wchan

(CONFIG_SCHED_OMIT_FRAME_POINTER=y)

is configured the same way in Kali, Tumbleweed and Fedora. I could not find that parameter in Devuan.


Following is the output of the ps command on Fedora. No matter what I do, wchan only shows a hyphen, as if the process were running. I tried different ps commands displaying all processes, but the wchan column never shows anything other than a hyphen.

$ ps -o pid,ppid,wchan=WIDE-WCHAN-COLUMN -o comm -o stat -t pts/6 -t pts/7
   PID   PPID WIDE-WCHAN-COLUMN COMMAND         STAT
 58565   4247 -                 bash            Ss
102840  58565 -                 su              S
102848 102840 -                 bash            S
103048   4247 -                 bash            Ss
122844 102848 -                 tcpserv01       S+
122848 103048 -                 tcpcli01        S+
122849 122844 -                 tcpserv01       S+

I checked wchan in /proc and all I get is 0.

$ cat /proc/122844/wchan
0
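To check several PIDs at once, the same lookup can be wrapped in a small helper. This is only a sketch: the function name wchan_of and the PROC_ROOT override are made up for illustration, and it assumes that an empty value or 0 means the kernel reported no wait channel, as in the output above.

```shell
# Print the wait channel the kernel reports for a PID.
# wchan_of and PROC_ROOT are hypothetical names; PROC_ROOT only exists
# so the function can be pointed at a directory other than /proc.
wchan_of() {
    wchan_file="${PROC_ROOT:-/proc}/$1/wchan"
    if [ ! -r "$wchan_file" ]; then
        echo "unreadable"
        return 1
    fi
    wchan_value=$(cat "$wchan_file")
    case "$wchan_value" in
        ''|0) echo "none" ;;          # kernel did not report a symbol
        *)    echo "$wchan_value" ;;  # symbolic name of the wait channel
    esac
}
```

For example, `wchan_of 122844` would print "none" on the Fedora system described here.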

The server's strace does not get past accept(), which is exactly what I expected.

# strace -p 122844
strace: Process 122844 attached
accept(3, 

The client's strace is blocked at read(), as expected.

# strace -p 122848
strace: Process 122848 attached
read(0, 

But they don't show in wchan. What am I missing?



As a side note, I also have a FreeBSD VM on the same machine, and on FreeBSD 12.0-RELEASE ps shows wchan correctly, so I am pretty sure this has something to do with Fedora.

$ ps aux -o pid,wchan=WIDE-WCHAN-COLUMN -o comm -o stat
USER  PID  %CPU %MEM   VSZ   RSS TT  STAT STARTED      TIME COMMAND          PID WIDE-WCHAN-COLUMN COMMAND          STAT
root   11 599.0  0.0     0    96  -  RNL  13:01   360:26.86 [idle]            11 -                 idle             RNL
root    0   0.0  0.0     0   528  -  DLs  13:01     0:00.01 [kernel]           0 swapin            kernel           DLs
root    1   0.0  0.0  9952  1016  -  ILs  13:01     0:00.01 /sbin/init --      1 wait              init             ILs
root    2   0.0  0.0     0    16  -  DL   13:01     0:00.00 [crypto]           2 crypto_w          crypto           DL


EDIT: I found the following in man ps:

-n Set namelist file. Identical to N. The namelist file is needed for a proper WCHAN display, and must match the current Linux kernel exactly for correct output. Without this option, the default search path for the namelist is: $PS_SYSMAP...

So I've set

PS_SYSMAP=/boot/System.map-$(uname -r)

But I still do not get any output from wchan. If I run the same command as before but with -n, I get

$ ps -n -o pid,ppid,wchan=WIDE-WCHAN-COLUMN -o comm -o stat -t pts/2 -t pts/3 -t pts/4
   PID   PPID WIDE-WCHAN-COLUMN COMMAND         STAT
  4830   4829                 - bash            Ss
  6201   4829                 - bash            Ss
  6251   6201                 - tcpserv01       S+
  6252   4829                 - bash            Ss
  6292   6251                 - tcpse <defunct> Z+
  6356   6252                 - tcpcli01        S+
  6357   6251                 - tcpserv01       S+
  6481   4830                 - ps              R+

With the -n option, wchan does not even show a hyphen as before.




EDIT 2: The answer to the following question is no. Kali's kernel sets that parameter exactly as Fedora's does, yet on Kali wchan values are computed. OpenSUSE Tumbleweed behaves just like Fedora and does not compute wchan values. Devuan computes wchan.

Could missing wchan values be due to

CONFIG_SCHED_OMIT_FRAME_POINTER: Single-depth WCHAN output

which in my kernel is configured as

CONFIG_SCHED_OMIT_FRAME_POINTER=y

Paul

1 Answer


This is a Fedora bug which affects at least Fedora 31 and 32 (#1879450).


Note that the ps man page you are quoting from is outdated - current ps versions directly read the symbolic wchan information from /proc/$pid/wchan.


Looking at the kernel's stack-walking procedure, it seems that it simply doesn't work without frame pointers. CONFIG_SCHED_OMIT_FRAME_POINTER isn't the only relevant parameter here; the following also seem to be relevant (on Fedora):

CONFIG_UNWINDER_ORC=y
# CONFIG_UNWINDER_FRAME_POINTER is not set
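To compare these settings across distributions, something like the following sketch may help. It assumes the distribution installs the kernel configuration as /boot/config-$(uname -r), which is not the case everywhere; the function name unwinder_options is made up for illustration, and another file can be passed as the first argument.

```shell
# List the unwinder and frame-pointer related options from a kernel config.
# unwinder_options is a hypothetical helper; the default path is an
# assumption about where the distribution installs the config file.
unwinder_options() {
    config_file=${1:-/boot/config-$(uname -r)}
    # Match both enabled options ("CONFIG_X=y") and disabled ones
    # ("# CONFIG_X is not set").
    grep -E '^(# )?CONFIG_(UNWINDER_[A-Z_]+|SCHED_OMIT_FRAME_POINTER)[= ]' "$config_file"
}
```

Running `unwinder_options` on each system and comparing the output should show whether the distributions that compute wchan use a different unwinder.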

See also this 2013 Debian bug report about wchan breakage.


/proc/$pid/stack can be used as a substitute for missing wchan information.
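As a rough sketch of that substitution: each line of /proc/$pid/stack has the form "[<address>] symbol+offset/size", so the symbol on the first line can stand in for the wchan value. The function name top_frame and the PROC_ROOT override are made up for illustration, and reading the file normally requires root.

```shell
# Print the symbol of the topmost kernel stack frame of a PID.
# top_frame and PROC_ROOT are hypothetical names; PROC_ROOT only exists
# so the function can be pointed at a directory other than /proc.
top_frame() {
    stack_file="${PROC_ROOT:-/proc}/$1/stack"
    # Keep only the first line, drop the "[<address>] " prefix and
    # everything from "+offset/size" onwards.
    sed -n '1s/^\[<[^>]*>] \([^+]*\).*/\1/p' "$stack_file"
}
```

As root, `top_frame 122844` would then print the function in which the server is blocked.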

See also my answer to a similar question.

maxschlepzig