
There is this popular job interview question:

Given a hung machine (let's say it runs RHEL), how do you troubleshoot the problem?

My answer would be:

1) I'd use the remote console (what is the name of that server BIOS feature which allows you to connect to its console?) or go down to the server room, connect a monitor and keyboard to the machine, and log in as root.

2) Then I'd run "top" to see if some process has very high CPU usage.

3) Then I'd check memory (with "top" again?), the total number of processes ("ps uawx"), and the system limit (how? Would "limit" give me the correct number?).

And then I don't know. Maybe run "vm"? But what would it tell me?

Please give me some good advice and a few impressive sentences for the recruiter.

Alexander Farber
  • It doesn't really hang if you can still log in and run diagnostics; maybe change the title to "troubleshoot overloaded linux machine"? – Sgaduuw Aug 04 '11 at 20:18
  • I've experienced that a lot: when you can only log in as root at the console (real display + keyboard) and not by SSH. And what do you mean by "run diagnostics"? Have you understood my question? – Alexander Farber Aug 04 '11 at 20:26
  • If it's really hanging, try to crash it and analyse the memdump. `sysrq` is your friend there – dyasny Aug 04 '11 at 20:28
  • If you cannot log in via ssh but the console works, you probably have a network issue, or sshd is down or malfunctioning. – dyasny Aug 04 '11 at 20:29
  • I don't understand how my question about diagnosing a hanging Linux server is "off topic" – Alexander Farber Aug 05 '11 at 09:50

1 Answer


You can

  • check /var/log/messages for hints,
  • analyze sar -A output,
  • take a look at vmstat,
  • iotop,
  • dstat
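If none of those tools are installed, much of what they report can be read straight from /proc. Here is a rough Python sketch of that idea, not a replacement for sar or vmstat. The function names and the fallback estimate for available memory are my own assumptions, and it also shows the total task count asked about in the question.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    # Fourth field looks like "3/412": runnable tasks / total tasks.
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "total_tasks": int(total),
    }


def parse_meminfo(text):
    """Parse /proc/meminfo into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def summarize(loadavg_text, meminfo_text, cpu_count):
    """Combine both files into a few numbers worth looking at first."""
    load = parse_loadavg(loadavg_text)
    mem = parse_meminfo(meminfo_text)
    # Older kernels do not report MemAvailable; this fallback is only
    # a rough estimate (an assumption of this sketch).
    available = mem.get(
        "MemAvailable",
        mem.get("MemFree", 0) + mem.get("Buffers", 0) + mem.get("Cached", 0),
    )
    return {
        "load_per_cpu": load["load1"] / cpu_count,
        "total_tasks": load["total_tasks"],
        "mem_available_kb": available,
        "swap_used_kb": mem.get("SwapTotal", 0) - mem.get("SwapFree", 0),
    }


if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as f:
        loadavg_text = f.read()
    with open("/proc/meminfo") as f:
        meminfo_text = f.read()
    print(summarize(loadavg_text, meminfo_text, os.cpu_count() or 1))
```

A 1-minute load well above 1.0 per CPU, very little available memory, or steadily growing swap usage each point to a different next step (top, the OOM messages in the logs, or iotop respectively).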

For really bad lockups, you also have the Magic SysRq key to squeeze some info from the system.
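Before relying on SysRq during an incident, it is worth checking which functions the machine actually allows. The sketch below decodes the value in /proc/sys/kernel/sysrq. The bit meanings are as I remember them from the kernel's sysrq documentation, so treat the table as an assumption and verify it against your kernel version; the function name is made up for this example.

```python
import os

# Bitmask meanings for /proc/sys/kernel/sysrq (assumed from the kernel
# sysrq documentation; verify against your kernel version).
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def decode_sysrq(value):
    """Return the list of SysRq function groups enabled by `value`."""
    if value == 0:
        return []  # SysRq disabled completely
    if value == 1:
        return list(SYSRQ_BITS.values())  # all functions enabled
    return [desc for bit, desc in SYSRQ_BITS.items() if value & bit]


if __name__ == "__main__" and os.path.exists("/proc/sys/kernel/sysrq"):
    with open("/proc/sys/kernel/sysrq") as f:
        current = int(f.read().strip())
    print("sysrq =", current)
    for desc in decode_sysrq(current):
        print("  enabled:", desc)
```

If "debugging dumps of processes" is not in the list, the task and memory dumps that make SysRq useful for a hang will not be available from the keyboard, so this is something to set up before the machine locks up, not after.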

Another place to look is the CMDB: see whether any previous problems have been logged for the server, and whether there is an accepted workaround and/or a planned fix. You can even ask coworkers. There is more to a job than just technical prowess.

Sgaduuw