How do I troubleshoot CPU temperatures in Ubuntu?

1

1

I have some questions concerning CPU temperature on my Ubuntu 8.10.

  1. My laptop shuts itself down, which I guess is due to high CPU temperature. I want to confirm this by looking at a system log file that records why the system shut down. Is there such a log file, and where is it stored?

  2. Also, I have installed libsensors, which reports several different temperatures:

    $ sensors  
    acpitz-virtual-0  
    Adapter: Virtual device  
    temp1:       +49.0°C  (crit = +97.0°C)                    
    
    k8temp-pci-00c3  
    Adapter: PCI adapter  
    Core0 Temp:  +57.0°C 
    

    What do "acpitz-virtual-0" and "k8temp-pci-00c3" mean, and what do "temp1" and "Core0 Temp" mean? Are both of these CPU temperatures?

    Is the temperature given by

    acpi -t
    

    yet another measure of CPU temperature?

  3. I also wonder what you do when the CPU temperature exceeds a limit that you consider dangerous. I have installed Computer Temperature Monitor (computertemp), which lets me set an alarm temperature and a command to execute when that limit is reached. What command would you run, or what would you do, when the temperature exceeds the set limit, to protect the laptop instead of letting it shut itself down?

Tim

Posted 2009-11-07T20:31:58.667

Reputation: 12 647

Most likely, it's not the OS shutting down itself, but the BIOS is doing it to protect the hardware. – Bogdacutu – 2012-06-24T17:00:55.550

Answers

3

On Ubuntu you will have /var/log/pm-*.log as well as the usual syslog.
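As a rough sketch of how those logs could be searched, the helper below greps a log file for thermal-related messages. The function name `find_thermal_events` and the keyword list are assumptions made for illustration; the exact wording of kernel messages varies between kernel versions and drivers.

```shell
# Hypothetical helper: print lines in a log file that look thermal-related.
# The keyword list is a guess, not a complete list of kernel messages.
find_thermal_events() {
    log_file="$1"
    grep -i -E 'thermal|temperature|overheat' "$log_file"
}

# Example usage (reading the log may require root):
#   find_thermal_events /var/log/syslog
```

If the search returns nothing, that does not rule out overheating: when the firmware cuts power by itself, the OS may never get a chance to write a log entry.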

acpitz-virtual-0 is an unhelpful label for an ACPI thermal zone, probably read from an ACPI table at runtime.

k8temp* comes partly from /etc/sensors3.conf and partly from where the system found the chip. You probably have a default ABit configuration that looks like:

chip "k8temp-*"

   label temp1 "Core0 Temp"
   label temp2 "Core0 Temp"
   label temp3 "Core1 Temp"
   label temp4 "Core1 Temp"

If it is in fact an ABit system board, you should check the BIOS, as it may have better descriptions.

Regarding acpi -t, sensors(1) is checking both acpi and hardware devices it knows about. In a perfect world sensors(1) would report a superset of what acpi is able to report.
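To cross-check what the tools report, the raw thermal zone values can be read directly, assuming the kernel exposes the generic thermal interface under /sys/class/thermal (older kernels may only offer /proc/acpi/thermal_zone). This is a minimal sketch; the function name `print_thermal_zones` and its optional directory argument are hypothetical and exist only for illustration.

```shell
# Print each thermal zone's type and temperature in whole degrees Celsius.
# Assumes the sysfs convention that "temp" holds millidegrees Celsius.
# The optional first argument overrides the base directory.
print_thermal_zones() {
    base_dir="${1:-/sys/class/thermal}"
    for zone in "$base_dir"/thermal_zone*; do
        # Skip entries without a readable temp file (including an unmatched glob).
        [ -r "$zone/temp" ] || continue
        millideg=$(cat "$zone/temp")
        zone_type=$(cat "$zone/type" 2>/dev/null || echo unknown)
        echo "$zone_type: $((millideg / 1000)) C"
    done
}

# Example usage:
#   print_thermal_zones
```

A zone of type acpitz here should correspond to the acpitz-virtual-0 entry that sensors prints.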

It really shouldn't be possible to overheat a laptop unless it is operated in a rather hot environment. It's more likely that the configuration file or BIOS settings are off, or perhaps some filters need cleaning. If the notebook heatsink was installed using thermal grease, that's known to not age well. (However, thermal grease is unlikely to have been used for original production.) You might be able to regrease it or use a modern thermal interface pad. Don't remove the heatsink unless you are prepared to throw away the old thermal interface and install a new one.

DigitalRoss

Posted 2009-11-07T20:31:58.667

Reputation: 2 968

DigitalRoss, thank you so much! Some questions regarding your reply: (1) I checked my /var/log/pm-*.log and syslog but did not find anything that mentions yesterday's crash of my system. Perhaps logging has not been enabled? (2) How do I check the ABit configuration? (3) How do I check whether the configuration file or BIOS settings are off, and how do I correct them? Thanks! – Tim – 2009-11-07T22:18:22.197

First, get an idea of the real sensor array and its BIOS setup by rebooting the box and hitting F2 or F12 or whatever pops you into the BIOS. Then find whatever page has thermal data and settings, read it carefully, and make sure all settings are sane. Then try man 5 sensors.conf and go to http://www.lm-sensors.org. You could also bring up the grub menu and try starting with the boot option acpi=off

– DigitalRoss – 2009-11-07T23:15:33.470