
I've been running a couple of old HP machines on Debian for a while, and only recently noticed that they recognize and use only one processor. `cat /proc/cpuinfo` shows output only for processor #0, and the same goes for `top` and similar tools. When I pulled the system covers off and felt the heatsinks, only one heatsink in each machine was hot. I'm pretty sure the second processor in each isn't dead, because the problem is the same on both machines.

I've been told that I need to install an SMP kernel (these systems are 32-bit, by the way, as they're quite old), but when I run `uname -a`, I get:

Linux DL360-G3-3 2.6.32-5-686 #1 SMP Mon Feb 25 01:04:36 UTC 2013 i686 GNU/Linux

The SMP part of that leads me to believe that SMP is enabled in my kernel, but the systems are still displaying and using only 1 processor.
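To take the guesswork out of reading `/proc/cpuinfo` by eye, the check can be scripted. The following is only a minimal sketch: the function name `count_processors` is made up for illustration, and it assumes the usual x86 layout where each logical CPU has its own `processor : N` line.

```python
def count_processors(cpuinfo_text: str) -> int:
    """Count 'processor : N' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        # Each line is "key<tab>: value"; blank lines separate CPU stanzas.
        key, sep, _value = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count


def count_local_processors(path: str = "/proc/cpuinfo") -> int:
    """Read the given file (Linux only) and count its processor entries."""
    with open(path) as f:
        return count_processors(f.read())
```

Calling `count_local_processors()` on the affected box should return the same number of entries that `cat /proc/cpuinfo` shows. With Hyper-Threading active, each physical Xeon would normally appear as two entries.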

Does anybody know what's wrong here?

EDIT:

Output of `ls /sys/devices/system/cpu`:

cpu0  cpufreq  cpuidle  kernel_max  offline  online  perf_events  possible  present
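The `possible`, `present`, `online` and `offline` files in that directory each hold a CPU list such as `0` or `0-3`, and comparing them can be more telling than the directory listing alone. Here is a rough sketch of reading them; the helper names are hypothetical, and the interpretation in the note afterwards is a rule of thumb rather than a definitive diagnosis.

```python
import os
from typing import Dict, List


def parse_cpu_list(text: str) -> List[int]:
    """Expand a sysfs CPU list like '0,2-3' into [0, 2, 3]."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file (e.g. 'offline') yields no CPUs
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_cpu_lists(base: str = "/sys/devices/system/cpu") -> Dict[str, List[int]]:
    """Read whichever of the four list files exist under 'base'."""
    lists: Dict[str, List[int]] = {}
    for name in ("possible", "present", "online", "offline"):
        path = os.path.join(base, name)
        if os.path.exists(path):
            with open(path) as f:
                lists[name] = parse_cpu_list(f.read())
    return lists
```

If `present` lists more CPUs than `online`, the kernel knows about them but they are offline. If `possible` and `present` only contain CPU 0, the kernel never enumerated the others at boot, which points towards firmware tables or boot parameters rather than hotplug state.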

Output of dmidecode (cut to just the CPU info):

Processor Information
    Socket Designation: Proc 1
    Type: Central Processor
    Family: Xeon
    Manufacturer: Intel
    ID: 29 0F 00 00 FF FB EB BF
    Signature: Type 0, Family 15, Model 2, Stepping 9
    Flags:
            FPU (Floating-point unit on-chip)
            VME (Virtual mode extension)
            DE (Debugging extension)
            PSE (Page size extension)
            TSC (Time stamp counter)
            MSR (Model specific registers)
            PAE (Physical address extension)
            MCE (Machine check exception)
            CX8 (CMPXCHG8 instruction supported)
            APIC (On-chip APIC hardware supported)
            SEP (Fast system call)
            MTRR (Memory type range registers)
            PGE (Page global enable)
            MCA (Machine check architecture)
            CMOV (Conditional move instruction supported)
            PAT (Page attribute table)
            PSE-36 (36-bit page size extension)
            CLFSH (CLFLUSH instruction supported)
            DS (Debug store)
            ACPI (ACPI supported)
            MMX (MMX technology supported)
            FXSR (Fast floating-point save and restore)
            SSE (Streaming SIMD extensions)
            SSE2 (Streaming SIMD extensions 2)
            SS (Self-snoop)
            HTT (Hyper-threading technology)
            TM (Thermal monitor supported)
            PBE (Pending break enabled)
    Version: Not Specified
    Voltage: 1.5 V
    External Clock: 533 MHz
    Max Speed: 3600 MHz
    Current Speed: 3066 MHz
    Status: Populated, Idle
    Upgrade: ZIF Socket
    L1 Cache Handle: 0x0716
    L2 Cache Handle: 0x0726
    L3 Cache Handle: 0x0736
Handle 0x0400, DMI type 4, 32 bytes
Processor Information
    Socket Designation: Proc 2
    Type: Central Processor
    Family: Xeon
    Manufacturer: Intel
    ID: 25 0F 00 00 FF FB EB BF
    Signature: Type 0, Family 15, Model 2, Stepping 5
    Flags:
            FPU (Floating-point unit on-chip)
            VME (Virtual mode extension)
            DE (Debugging extension)
            PSE (Page size extension)
            TSC (Time stamp counter)
            MSR (Model specific registers)
            PAE (Physical address extension)
            MCE (Machine check exception)
            CX8 (CMPXCHG8 instruction supported)
            APIC (On-chip APIC hardware supported)
            SEP (Fast system call)
            MTRR (Memory type range registers)
            PGE (Page global enable)
            MCA (Machine check architecture)
            CMOV (Conditional move instruction supported)
            PAT (Page attribute table)
            PSE-36 (36-bit page size extension)
            CLFSH (CLFLUSH instruction supported)
            DS (Debug store)
            ACPI (ACPI supported)
            MMX (MMX technology supported)
            FXSR (Fast floating-point save and restore)
            SSE (Streaming SIMD extensions)
            SSE2 (Streaming SIMD extensions 2)
            SS (Self-snoop)
            HTT (Hyper-threading technology)
            TM (Thermal monitor supported)
            PBE (Pending break enabled)
    Version: Not Specified
    Voltage: 1.5 V
    External Clock: 533 MHz
    Max Speed: 3600 MHz
    Current Speed: 3066 MHz
    Status: Populated, Enabled
    Upgrade: ZIF Socket
    L1 Cache Handle: 0x0710
    L2 Cache Handle: 0x0720
    L3 Cache Handle: 0x0730
Handle 0x0716, DMI type 7, 19 bytes

As you can see, the first processor has a status of "Populated, Idle", while the second processor has a status of "Populated, Enabled". I'm pretty sure this means it's a kernel issue. Does anyone have any other thoughts?
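For comparing the two boxes, the per-socket status can be pulled out of the dump automatically. This is a sketch only: `processor_status` is a hypothetical helper, and it assumes the text comes from `dmidecode -t processor` and has the same layout as the excerpt above.

```python
from typing import Dict, Optional


def processor_status(dmidecode_text: str) -> Dict[str, str]:
    """Map each 'Socket Designation' to its 'Status' for processor records."""
    status: Dict[str, str] = {}
    in_processor = False
    socket: Optional[str] = None
    for raw in dmidecode_text.splitlines():
        line = raw.strip()
        if line.startswith("Handle "):
            # A new DMI record starts; forget the previous one.
            in_processor = False
            socket = None
        elif line == "Processor Information":
            in_processor = True
        elif in_processor:
            key, sep, value = line.partition(":")
            if not sep:
                continue  # flag lines such as 'FPU (...)' have no colon
            if key == "Socket Designation":
                socket = value.strip()
            elif key == "Status" and socket is not None:
                status[socket] = value.strip()
    return status
```

Fed the excerpt above, this would report `Proc 1` as "Populated, Idle" and `Proc 2` as "Populated, Enabled".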

Libbux
  • Does the BIOS show both processors? – Nathan C Jun 03 '13 at 14:31
  • @NathanC Yes, that's why I'm 99% sure it isn't a hardware fault. Especially since the problem is the same on two boxes. – Libbux Jun 03 '13 at 14:38
  • Can you update your post with the output of `ls /sys/devices/system/cpu` for me? I have a hunch. – Nathan C Jun 03 '13 at 14:45
  • Interesting ...even the kernel is only showing one processor. My question came from http://www.cyberciti.biz/faq/debian-rhel-centos-redhat-suse-hotplug-cpu/ - hopefully someone will come by who's seen this before. – Nathan C Jun 03 '13 at 14:49
  • Can you check the BIOS and make sure someone hasn't turned off the second processor there? – Zypher Jun 03 '13 at 16:37
  • @Zypher Would the processor still show up in `dmidecode` if it was disabled in BIOS? – Libbux Jun 03 '13 at 17:55
  • @TheLibbster no, if it was disabled in BIOS it would be invisible to any OS level tools. – Zypher Jul 02 '13 at 20:25
  • HP Proliant servers seem to need some special BIOS love, see this [VMWare KB](http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1081) and check your BIOS settings. – dlu Jul 03 '13 at 18:11
  • If you temporarily boot another live distro, such as a Fedora LiveCD, how many processors do you see in `/proc/cpuinfo`? That would tell you whether it's a software problem. You could also move cpu1 into the cpu0 socket and boot with just that one CPU; if that works, cpu1 is OK. – Christian Jul 04 '13 at 19:13
  • @Christian I've already ruled out a physical processor error. – Libbux Jul 05 '13 at 00:41
  • @Zypher Both CPUs show up in BIOS, I can't find a way to disable them, so I can't see how one would be disabled. – Libbux Jul 05 '13 at 00:43
  • Just a wild guess: your question seems [very similar to an askubuntu question]. – user Jul 05 '13 at 15:34
  • @user Your link doesn't appear. – Libbux Jul 09 '13 at 06:15
  • @TheLibbster Sorry, here it is: [http://askubuntu.com/questions/127815/using-quad-core-but-only-1-cpu-entry-in-proc-cpuinfo-is-smp-running-on-my-c](http://askubuntu.com/questions/127815/using-quad-core-but-only-1-cpu-entry-in-proc-cpuinfo-is-smp-running-on-my-c) – user Jul 09 '13 at 08:19
  • Can you post the contents of `/etc/default/grub`? I had the same problem with a newer Dell R210 II and Debian. I had to remove `noapic` and `nolapic` from the **GRUB_CMDLINE_LINUX_DEFAULT** variable. – norrland Jul 24 '13 at 05:39
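The boot-parameter suggestion in the last comment can be checked against the running kernel by looking at `/proc/cmdline`, which holds the parameters the kernel was actually booted with. A small sketch follows; `noapic` and `nolapic` come from the comment, while `nosmp`, `maxcpus=` and `nr_cpus=` are added here as other kernel parameters that can limit the CPU count. The function names are hypothetical.

```python
from typing import List

# Bare flags and key=value prefixes that can keep extra CPUs from coming up.
SUSPECT_FLAGS = ("noapic", "nolapic", "nosmp")
SUSPECT_PREFIXES = ("maxcpus=", "nr_cpus=")


def suspicious_params(cmdline: str) -> List[str]:
    """Return the boot parameters in 'cmdline' that may restrict SMP."""
    found: List[str] = []
    for token in cmdline.split():
        if token in SUSPECT_FLAGS or token.startswith(SUSPECT_PREFIXES):
            found.append(token)
    return found


def check_running_kernel(path: str = "/proc/cmdline") -> List[str]:
    """Apply the check to the running kernel's command line (Linux only)."""
    with open(path) as f:
        return suspicious_params(f.read())
```

An empty result from `check_running_kernel()` only rules out these particular parameters; it says nothing about BIOS settings such as the MPS table mode.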

2 Answers


This VMWare article may be useful (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1081)

This is relevant because the classic ESX service console is Linux-based.

Basically, modify the BIOS settings like this:

  • System->OS Selection: Windows 2000
  • Advanced Options->MPS Table Mode: Full Table APIC

I haven't explicitly tested this resolution on the system you are using, but I have seen similar issues on hardware of the same age.

RussellM

OK, so after all this time, it turns out that for some reason it just 'started working'. In fact, it may have been working the entire time without my realizing it. It's a little odd, but CPUs 2 and 4 get ~90% of the work while CPUs 1 and 3 get ~10%, which would explain why I felt a physical difference in the temperature of the processors when I pulled the machine apart. Thank you for all of your answers nonetheless.
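A split like that can be measured rather than eyeballed by sampling `/proc/stat` twice and comparing the per-CPU tick counters. The sketch below is an illustration, not the method used above: the function names are hypothetical, and "busy" is taken to mean everything except the idle and iowait columns.

```python
from typing import Dict, Tuple


def parse_proc_stat(text: str) -> Dict[str, Tuple[int, int]]:
    """Return {cpuN: (idle_ticks, total_ticks)} from /proc/stat-style text."""
    stats: Dict[str, Tuple[int, int]] = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep 'cpu0', 'cpu1', ...; skip the aggregate 'cpu' line and others.
        if not fields or not fields[0].startswith("cpu") or not fields[0][3:].isdigit():
            continue
        # user nice system idle iowait irq softirq steal (guest is part of user)
        ticks = [int(v) for v in fields[1:9]]
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        stats[fields[0]] = (idle, sum(ticks))
    return stats


def busy_fraction(before: str, after: str) -> Dict[str, float]:
    """Fraction of non-idle time per CPU between two /proc/stat samples."""
    first, second = parse_proc_stat(before), parse_proc_stat(after)
    result: Dict[str, float] = {}
    for cpu, (idle1, total1) in second.items():
        idle0, total0 = first.get(cpu, (0, 0))
        elapsed = total1 - total0
        result[cpu] = 0.0 if elapsed <= 0 else 1.0 - (idle1 - idle0) / elapsed
    return result
```

To use it, read `/proc/stat`, wait a few seconds under load, read it again, and pass both strings to `busy_fraction`. Note that the kernel numbers CPUs from 0 here.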

Libbux