Why does a system appear to "hang" when there are many processes in 'D' state?

1

I have noticed that on a Linux system, if there are many processes in "D" (uninterruptible sleep) state, the system starts "hanging". The D state is mostly due to processes waiting on I/O.

By "hang" I mean that I can't type commands on the console, or that starting new processes is very slow, which leads me to believe that the CPU is "busy" doing something. But my understanding is that a process in D state is not doing anything, just waiting. This should not need any CPU computation, and should in fact free the CPU and let it schedule other tasks.

I'm definitely missing something, because the D state processes also add to the load average of a system. I don't understand why this is done: how does a process in uninterruptible sleep contribute to load?

rags

Posted 2011-02-14T16:14:29.513

Reputation: 293

If there are many processes waiting, something is wrong. Is, for example, "top" showing a large percentage in iowait (field "wa" in the Cpu line)? – Olli – 2011-02-14T16:24:16.817

@Olli, this is from my general experience. I don't have a system with these symptoms, but I'd like to set up a test now. – rags – 2011-02-15T21:25:04.987

Answers

1

When you type a command on the console, this "command" (if it is not a shell builtin) is also a file on disk, an executable one. When you type "ls", the directories in your PATH variable are searched in order until an executable called "ls" is found, at least as long as "ls" wasn't hashed before.

This search means accessing the disc several times, across several directories. If, as you wrote, you have many processes waiting for I/O, your shell will also have to wait before it can search for the command you typed.
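To make the PATH search concrete, here is a rough Python sketch of what a shell does for a command it has not hashed yet. This is only an illustration, not how bash is actually implemented; the function name find_in_path is made up for this example, and skipping empty PATH entries is a simplification (a real shell treats them as the current directory).

```python
import os


def find_in_path(command, path=None):
    """Return the first executable called `command` found in PATH, or None.

    Hypothetical helper for illustration only. Every candidate that is
    checked costs at least one filesystem lookup, which may have to wait
    behind other I/O if the directory entry is not cached.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # simplification: ignore empty PATH entries
        candidate = os.path.join(directory, command)
        # Both checks ask the filesystem about the candidate file.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate  # stop at the first match, like a shell does
    return None


if __name__ == "__main__":
    print(find_in_path("ls"))
```

On a system where the disc is saturated, each of those per-directory checks can stall whenever the answer is not already in the kernel's cache.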

Some commands you type will also want to read and/or write files on the disk, adding to the number of processes waiting for I/O ...

This doesn't add CPU load, but it adds I/O load, which in Linux also counts toward the "total load". Why? I think this is just a matter of definition. See http://en.wikipedia.org/wiki/Load_%28computing%29 .

Therefore, distributing your I/O over several discs (not partitions) could improve your performance. Having the system on one disc and your data on other discs will help. Keep in mind too that just starting a new shell makes it search your $HOME for files like .profile, .bashrc and others, and if the disc is busy with a lot of other I/O this can take some time.
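If you want to see the load definition mentioned above at work, a small sketch like the following compares the 1-minute load average with the number of processes currently in R (runnable) and D (uninterruptible) state. It assumes a Linux /proc filesystem; the helper names are made up for this example, and it counts processes rather than individual threads, so the numbers will not match the load average exactly.

```python
import os


def parse_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the LAST ')' and take the next field.
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(proc_root="/proc"):
    """Count processes per state letter (R, S, D, ...) under proc_root."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat"), errors="replace") as f:
                state = parse_state(f.read())
        except (OSError, IndexError):
            continue  # process exited, or the line was not parseable
        counts[state] = counts.get(state, 0) + 1
    return counts


def main():
    try:
        counts = count_states()
        with open("/proc/loadavg") as f:
            load1 = f.read().split()[0]
    except OSError:
        print("no readable /proc here; this sketch needs Linux")
        return
    print("1-minute load average:", load1)
    print("runnable (R):", counts.get("R", 0))
    print("uninterruptible (D):", counts.get("D", 0))


if __name__ == "__main__":
    main()
```

On a box with many D state processes you would expect a high load average even while the R count stays small.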

rems

Posted 2011-02-14T16:14:29.513

Reputation: 1 850

Command line processing doesn't look at all PATH dirs; some shells (bash) keep hashes of previous command locations. If it was the PATH search, this would be a consistent slowness, yet the OP is talking about occasional slowness in a certain system state. – Rich Homolka – 2011-02-15T18:46:58.437

If you read carefully, you will see I mentioned the bash hashing feature at the end of the first paragraph. Second, if the command he is typing is accessing the disc, think of ls or find or any command that does some I/O (and most system commands do some I/O), it will be slow and appear to hang if many others are also doing I/O. – rems – 2011-02-15T19:13:53.797

So basically, most commands tend to do a disk access, and they may hang for the same reason the D state processes are stuck? I'm not sure, but setting up a test case should confirm this. – rags – 2011-02-15T21:21:57.703

2

D is uninterruptible sleep. Yes, in theory processes in D state are sleeping and not directly using I/O. But from my experience with processes in this state, it most likely means something is gorked in a device driver. If you have many 'D' state processes, then Something Very Bad(tm) is happening to your OS. This may be spin loops in a driver (wasting CPU cycles), buffers being read and thrown away, whatever. I'd check your system logs and dmesg for any device or device driver errors.

In conclusion, D isn't a natural state. An occasional process in D might be ignorable, but if you have lots of D procs and are experiencing slowness, you have some spelunking to do.
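As a possible starting point for that spelunking, the sketch below lists the processes currently in D state together with the kernel function they are sleeping in, as reported by /proc/<pid>/wchan. It assumes a Linux /proc where wchan is readable (on some kernels it shows only "0" unless you are root); the function names are invented for this example.

```python
import os


def read_text(path):
    """Return the stripped contents of path, or '' if it cannot be read."""
    try:
        with open(path, errors="replace") as f:
            return f.read().strip()
    except OSError:
        return ""


def d_state_processes(proc_root="/proc"):
    """Return a sorted list of (pid, name, wchan) for processes in D state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        stat = read_text(os.path.join(proc_root, entry, "stat"))
        if "(" not in stat or ")" not in stat:
            continue  # process vanished or unexpected format
        # The state is the first field after the last ')'.
        fields = stat.rsplit(")", 1)[1].split()
        if not fields or fields[0] != "D":
            continue
        name = stat[stat.index("(") + 1:stat.rindex(")")]
        wchan = read_text(os.path.join(proc_root, entry, "wchan")) or "?"
        found.append((int(entry), name, wchan))
    return sorted(found)


if __name__ == "__main__":
    try:
        for pid, name, wchan in d_state_processes():
            print(pid, name, wchan)
    except OSError:
        print("no readable /proc here; this sketch needs Linux")
```

If most of the stuck processes report the same wait channel, that name is a reasonable thing to search for next to whatever dmesg shows.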

Rich Homolka

Posted 2011-02-14T16:14:29.513

Reputation: 27 121