I am trying to determine what is causing an embedded industrial computer (ARK-1550-S9A1E) with a 4th-gen Intel Core i5-4300U (dual core) to scale all of its cores down from 1.90 GHz to roughly 200 MHz.
Several tools (turbostat and the MSR tools) indicate that the reason for the scaling is ThermStatus, and the "Digital Readout" field shows 65 C/149 F.
The device is running Ubuntu 18.04 LTS Server (no GUI, headless application), and the applications running on it use at most 20% of the CPU. Nothing should be spiking CPU utilization, so it is very surprising that it would be overheating. It is a fanless industrial PC, so it does have a lot of hardware to dissipate heat.
Below is the output from the MSR tools and turbostat, with the register readings in detail.
user1@ubuntu-18.04_64:~$ cat /proc/cpuinfo | grep "MHz"
cpu MHz : 230.404
cpu MHz : 227.324
cpu MHz : 217.117
cpu MHz : 174.135
user1@ubuntu-18.04_64:~$
user1@ubuntu-18.04_64:~$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
user1@ubuntu-18.04_64:~$
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x770 -f 63:0
rdmsr: CPU 0 cannot read MSR 0x00000770
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x771 -f 63:0
rdmsr: CPU 0 cannot read MSR 0x00000771
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x772 -f 63:0
rdmsr: CPU 0 cannot read MSR 0x00000772
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x773 -f 63:0
rdmsr: CPU 0 cannot read MSR 0x00000773
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x775 -f 63:0
rdmsr: CPU 0 cannot read MSR 0x00000775
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x777 -f 63:0
rdmsr: CPU 0 cannot read MSR 0x00000777
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x19C -f 63:0
88410800
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x64E -f 63:0
rdmsr: CPU 0 cannot read MSR 0x0000064e
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x64F -f 63:0
rdmsr: CPU 0 cannot read MSR 0x0000064f
user1@ubuntu-18.04_64:~$ sudo rdmsr 0x19B -f 63:0
13
user1@ubuntu-18.04_64:~$
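For reference, the raw 0x19C value above can also be unpacked by hand. The sketch below is my own reading of the IA32_THERM_STATUS layout in the Intel SDM, not something taken from a tool: `decode_therm_status` and its field names are made up for illustration, and the default TjMax of 100 C is an assumption taken from the MSR_IA32_TEMPERATURE_TARGET line that turbostat prints. As far as I understand the SDM, the digital readout field is an offset below TjMax rather than an absolute temperature, which is worth double-checking against the 65 C figure.

```python
def decode_therm_status(value: int, tj_max_c: int = 100) -> dict:
    """Unpack selected fields of an IA32_THERM_STATUS (0x19C) reading.

    tj_max_c is an assumption: 100 C, as reported for this CPU by
    MSR_IA32_TEMPERATURE_TARGET in the turbostat output.
    """
    def bit(n: int) -> bool:
        return bool((value >> n) & 1)

    readout = (value >> 16) & 0x7F  # bits 22:16
    return {
        "thermal_status": bit(0),
        "thermal_status_log": bit(1),
        "prochot_event": bit(2),
        "prochot_log": bit(3),
        "critical_temp_status": bit(4),
        "critical_temp_log": bit(5),
        "power_limit_status": bit(10),
        "power_limit_log": bit(11),
        "digital_readout": readout,
        "resolution_c": (value >> 27) & 0xF,  # bits 30:27
        "reading_valid": bit(31),
        # Assumes the readout counts degrees C *below* TjMax.
        "temperature_c": tj_max_c - readout,
    }


fields = decode_therm_status(0x88410800)
print(fields["digital_readout"], fields["temperature_c"], fields["power_limit_log"])
```

With 0x88410800 this gives a readout of 65 and, under the offset interpretation, 100 - 65 = 35 C, which would line up with the CoreTmp value turbostat reports for the same machine.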
decs@ubuntu-18.04_64$ ./intel-reg-pp.out
hello from intel_reg_pp!
[19CH] IA32_THERM_STATUS Register With HWP Feedback
Command to read: sudo rdmsr 0x19c - f 63:0
Value of register is: 88410800
64 60 50 40 30 20 10
43210987654321098765432109876543210987654321098765432109876543210
0b00000000000000000000000000000000010001000010000010000100000000000
└───────────────┬───────────────┘│└─┬┘└─┬┘└──┬──┘││││││││││││││││
Reserved │ │ │ │ ││││││││││││││││
Reading Valid ─────────────────────┘ │ │ │ ││││││││││││││││
Reading in Deg. Celcius ──────────────┘ │ │ ││││││││││││││││
Reserved ─────────────────────────────────┘ │ ││││││││││││││││
Digital Readout ───────────────────────────────┘ ││││││││││││││││ 65 C -> 149 F
Cross-domain Limit Log ────────────────────────────┘│││││││││││││││
Cross-domain Limit Status ──────────────────────────┘││││││││││││││
Current Limit Log ───────────────────────────────────┘│││││││││││││
Current Limit Status ─────────────────────────────────┘││││││││││││
Power Limit Notification Log ──────────────────────────┘│││││││││││
Power Limit Notification Status ────────────────────────┘││││││││││
Thermal Threshold #2 Log ────────────────────────────────┘│││││││││
Thermal Threshold #2 Status ──────────────────────────────┘││││││││
Thermal Threshold #1 Log ──────────────────────────────────┘│││││││
Thermal Threshold #1 Status ────────────────────────────────┘││││││
Critical Temperature Log ────────────────────────────────────┘│││││
Critical Temperature Status ──────────────────────────────────┘││││
PROCHOT# or FORCEPR# Log ──────────────────────────────────────┘│││
PROCHOT# or FORCEPR# Event ─────────────────────────────────────┘││
Thermal Status Log ──────────────────────────────────────────────┘│
Thermal Status ───────────────────────────────────────────────────┘
[64FH] MSR_CORE_PERF_LIMIT_REASONS
Command to read: sudo rdmsr 0x64f - f 63:0
Value of register is: 1c220002
64 60 50 40 30 20 10
43210987654321098765432109876543210987654321098765432109876543210
0b00000000000000000000000000000000000011100001000100000000000000010
└───────────────┬───────────────┘││││││└─┬─┘│││││││││││└─┬─┘││││
Reserved ││││││ │ │││││││││││ │ ││││
Maximum Efficiency Frequency Log ───┘│││││ │ │││││││││││ │ ││││
Turbo Transistion Attenuation Log ───┘││││ │ │││││││││││ │ ││││
Electical Design Point Log ───────────┘│││ │ │││││││││││ │ ││││
Max Turbo Limit Log ───────────────────┘││ │ │││││││││││ │ ││││
VR Them Alert Log ──────────────────────┘│ │ │││││││││││ │ ││││
Core Power Limiting Log ─────────────────┘ │ │││││││││││ │ ││││
Reserved ───────────────────────────────────┘ │││││││││││ │ ││││
Package-Level PL2 Power Limiting Log ──────────┘││││││││││ │ ││││
Package-Level PL1 Power Limiting Log ───────────┘│││││││││ │ ││││
Thermal Log ─────────────────────────────────────┘││││││││ │ ││││
PROCHOT Log ──────────────────────────────────────┘│││││││ │ ││││
Reserved ──────────────────────────────────────────┘││││││ │ ││││
Maximum Efficiency Frequency Status (R0)────────────┘│││││ │ ││││
Turbo Transition Attenuation Status (R0)─────────────┘││││ │ ││││
Electrical Design Point Status (R0)───────────────────┘│││ │ ││││
Max Turbo Limit Status (R0) ───────────────────────────┘││ │ ││││
VR Therm Alert Status (R0)──────────────────────────────┘│ │ ││││
Core Power Limiting Status (R0)──────────────────────────┘ │ ││││
Reserved ───────────────────────────────────────────────────┘ ││││
Package-Level PL2 Power Limiting Status (R0) ──────────────────┘│││
Package-Level Power Limiting PL1 Status (R0)────────────────────┘││
Thermal Status (R0) ─────────────────────────────────────────────┘│
PROCHOT Status (R0) ──────────────────────────────────────────────┘
[19BH] IA32_THERM_INTERRUPT
Command to read: sudo rdmsr 0x64f - f 63:0
Value of register is: 00000013
64 60 50 40 30 20 10
43210987654321098765432109876543210987654321098765432109876543210
0b10000000000000000000000000000000000000000000000000000000000010011
└───────────────┬──────────────────────┘│└──┬──┘│└──┬──┘└┬┘│││││
Reserved │ │ │ │ │ │││││
Threshold #2 INT Enable ───────────────────┘ │ │ │ │ │││││
Threshold #2 Value ────────────────────────────┘ │ │ │ │││││
Threshold #1 INT Enable ───────────────────────────┘ │ │ │││││
Threshold #1 Value ────────────────────────────────────┘ │ │││││
Reserved ───────────────────────────────────────────────────┘ │││││
Critical Temperature Enable ──────────────────────────────────┘││││
FORCEPR# INT Enable ───────────────────────────────────────────┘│││
PROCHOT# INT enable ────────────────────────────────────────────┘││
Low-Temperature INT enable ──────────────────────────────────────┘│
High-Temperature INT Enable ──────────────────────────────────────┘
decs@ubuntu:~/projects/intel-reg-pp/bin/x86/Debug$
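As a cross-check on the MSR_CORE_PERF_LIMIT_REASONS value 1c220002, here is a minimal sketch that splits it into "active" and "logged" reasons. The short reason names are the labels turbostat prints; the bit positions are my assumption from matching those labels against the core register (status bits in the low 16 bits, the corresponding log bits 16 positions higher), and `decode_limit_reasons` is a hypothetical helper. It is only meant for the core register, not the GFX or ring variants.

```python
# Status bit -> short label (labels as printed by turbostat; bit positions
# are an assumption for the core register on this CPU generation).
LIMIT_REASONS = {
    13: "Transitions",
    12: "MultiCoreTurbo",
    11: "PkgPwrL2",
    10: "PkgPwrL1",
    9: "CorePwr",
    8: "Amps",
    6: "VR-Therm",
    5: "Auto-HWP",
    4: "Graphics",
    1: "ThermStatus",
    0: "PROCHOT",
}


def decode_limit_reasons(value: int) -> tuple[list[str], list[str]]:
    """Return (active, logged) reason labels for a core perf-limit reading."""
    active = [name for b, name in LIMIT_REASONS.items() if (value >> b) & 1]
    # Log bits mirror the status bits, shifted up by 16.
    logged = [name for b, name in LIMIT_REASONS.items() if (value >> (b + 16)) & 1]
    return active, logged


active, logged = decode_limit_reasons(0x1C220002)
print("Active:", ", ".join(active))
print("Logged:", ", ".join(logged))
```

For 0x1c220002 this yields ThermStatus as the only active reason, with MultiCoreTurbo, PkgPwrL2, PkgPwrL1, Auto-HWP and ThermStatus logged.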
user1@ubuntu-18.04_64:~$ sudo turbostat
turbostat version 17.06.23 - Len Brown <lenb@kernel.org>
CPUID(0): GenuineIntel 13 CPUID levels; family:model:stepping 0x6:45:1 (6:69:1)
CPUID(1): SSE3 MONITOR SMX EIST TM2 TSC MSR ACPI-TM TM
CPUID(6): APERF, TURBO, DTS, PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, EPB
cpu3: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST No-MWAIT PREFETCH TURBO)
CPUID(7): No-SGX
cpu3: MSR_MISC_PWR_MGMT: 0x00400000 (ENable-EIST_Coordination DISable-EPB DISable-OOB)
RAPL: 17476 sec. Joule Counter Range, at 15 Watts
cpu3: MSR_PLATFORM_INFO: 0x8083df3011900
8 * 100.0 = 800.0 MHz max efficiency frequency
25 * 100.0 = 2500.0 MHz base frequency
cpu3: MSR_IA32_POWER_CTL: 0x0004005d (C1E auto-promotion: DISabled)
cpu3: MSR_TURBO_RATIO_LIMIT: 0x1a1a1a1d
26 * 100.0 = 2600.0 MHz max turbo 4 active cores
26 * 100.0 = 2600.0 MHz max turbo 3 active cores
26 * 100.0 = 2600.0 MHz max turbo 2 active cores
29 * 100.0 = 2900.0 MHz max turbo 1 active cores
cpu3: MSR_CONFIG_TDP_NOMINAL: 0x00000013 (base_ratio=19)
cpu3: MSR_CONFIG_TDP_LEVEL_1: 0x0008005c (PKG_MIN_PWR_LVL1=0 PKG_MAX_PWR_LVL1=0 LVL1_RATIO=8 PKG_TDP_LVL1=92)
cpu3: MSR_CONFIG_TDP_LEVEL_2: 0x001900c8 (PKG_MIN_PWR_LVL2=0 PKG_MAX_PWR_LVL2=0 LVL2_RATIO=25 PKG_TDP_LVL2=200)
cpu3: MSR_CONFIG_TDP_CONTROL: 0x00000000 ( lock=0)
cpu3: MSR_TURBO_ACTIVATION_RATIO: 0x00000012 (MAX_NON_TURBO_RATIO=18 lock=0)
cpu3: MSR_PKG_CST_CONFIG_CONTROL: 0x1e008408 (UNdemote-C3, UNdemote-C1, demote-C3, demote-C1, locked: pkg-cstate-limit=8: unlimited)
cpu3: POLL: CPUIDLE CORE POLL IDLE
cpu3: C1: MWAIT 0x00
cpu3: C1E: MWAIT 0x01
cpu3: C3: MWAIT 0x10
cpu3: C6: MWAIT 0x20
cpu3: C7s: MWAIT 0x32
cpu3: C8: MWAIT 0x40
cpu3: C9: MWAIT 0x50
cpu3: C10: MWAIT 0x60
cpu3: cpufreq driver: intel_pstate
cpu3: cpufreq governor: performance
cpufreq intel_pstate no_turbo: 0
cpu3: MSR_MISC_FEATURE_CONTROL: 0x00000000 (L2-Prefetch L2-Prefetch-pair L1-Prefetch L1-IP-Prefetch)
cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x00000006 (balanced)
cpu0: MSR_CORE_PERF_LIMIT_REASONS, 0x1c220002 (Active: ThermStatus, ) (Logged: MultiCoreTurbo, PkgPwrL2, PkgPwrL1, Auto-HWP, ThermStatus, )
cpu0: MSR_GFX_PERF_LIMIT_REASONS, 0x14020002 (Active: ThermStatus, ) (Logged: ThermStatus, PkgPwrL1, )
cpu0: MSR_RING_PERF_LIMIT_REASONS, 0x0c020000 (Active: ) (Logged: ThermStatus, PkgPwrL1, PkgPwrL2, )
cpu0: MSR_RAPL_POWER_UNIT: 0x000a0e03 (0.125000 Watts, 0.000061 Joules, 0.000977 sec.)
cpu0: MSR_PKG_POWER_INFO: 0x00000078 (15 W TDP, RAPL 0 - 0 W, 0.000000 sec.)
cpu0: MSR_PKG_POWER_LIMIT: 0x804280c800dd80c8 (locked)
cpu0: PKG Limit #1: ENabled (25.000000 Watts, 28.000000 sec, clamp ENabled)
cpu0: PKG Limit #2: ENabled (25.000000 Watts, 0.002441* sec, clamp DISabled)
cpu0: MSR_PP0_POLICY: 0
cpu0: MSR_PP0_POWER_LIMIT: 0x00000000 (UNlocked)
cpu0: Cores Limit: DISabled (0.000000 Watts, 0.000977 sec, clamp DISabled)
cpu0: MSR_PP1_POLICY: 0
cpu0: MSR_PP1_POWER_LIMIT: 0x00000000 (UNlocked)
cpu0: GFX Limit: DISabled (0.000000 Watts, 0.000977 sec, clamp DISabled)
cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x00640000 (100 C)
cpu0: MSR_IA32_PACKAGE_THERM_STATUS: 0x88400800 (36 C)
cpu0: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x00000003 (100 C, 100 C)
cpu3: MSR_PKGC3_IRTL: 0x00008842 (valid, 67584 ns)
cpu3: MSR_PKGC6_IRTL: 0x00008873 (valid, 117760 ns)
cpu3: MSR_PKGC7_IRTL: 0x00008891 (valid, 148480 ns)
cpu3: MSR_PKGC8_IRTL: 0x000088e4 (valid, 233472 ns)
cpu3: MSR_PKGC9_IRTL: 0x00008945 (valid, 332800 ns)
cpu3: MSR_PKGC10_IRTL: 0x000089ef (valid, 506880 ns)
Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ SMI C1 C1E C3 C6 C7s C8 C9 C10 C1% C1E% C3% C6% C7s% C8% C9% C10% CPU%c1 CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp GFX%rc6 Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 Pkg%pc8 Pkg%pc9 Pk%pc10 PkgWatt CorWatt GFXWatt
- - 157 69.94 225 2494 22821 0 447 1810 8751 389 1496 971 329 5 0.09 0.73 11.99 1.14 6.28 7.17 3.16 0.00 20.58 6.78 0.25 2.46 35 36 99.38 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.67 0.22 0.00
0 0 151 64.78 233 2494 6150 0 139 547 2166 145 501 335 80 0 0.11 0.94 11.59 1.74 8.75 9.61 3.02 0.00 22.16 9.01 0.30 3.75 35 36 99.38 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.67 0.22 0.00
0 2 146 68.06 216 2494 6206 0 120 418 2532 82 362 229 96 2 0.09 0.66 13.98 0.88 5.84 7.01 4.02 0.00 18.88
1 1 202 87.77 231 2494 3457 0 68 206 876 35 153 104 34 2 0.07 0.34 4.57 0.41 2.46 3.30 1.27 0.00 6.32 4.55 0.19 1.17 35
1 3 128 59.14 217 2494 7008 0 120 639 3177 127 480 303 119 1 0.09 1.00 17.82 1.52 8.09 8.76 4.33 0.00 34.95
^C
user1@ubuntu-18.04_64:~$
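To rule out the package power limits, the MSR_PKG_POWER_LIMIT value from the turbostat output can be decoded against MSR_RAPL_POWER_UNIT. This is a sketch under my understanding of the RAPL register layout in the Intel SDM (power in bits 14:0, enable in bit 15, clamp in bit 16, time window in bits 23:17, repeated in the upper half for limit #2); `decode_pkg_power_limit` is a hypothetical helper, not part of any tool.

```python
def decode_pkg_power_limit(limit_msr: int, unit_msr: int) -> dict:
    """Decode MSR_PKG_POWER_LIMIT using the units from MSR_RAPL_POWER_UNIT."""
    power_unit_w = 1.0 / (1 << (unit_msr & 0xF))          # bits 3:0
    time_unit_s = 1.0 / (1 << ((unit_msr >> 16) & 0xF))   # bits 19:16

    def one_limit(field: int) -> dict:
        y = (field >> 17) & 0x1F   # time window exponent
        z = (field >> 22) & 0x3    # time window fraction, in quarters
        return {
            "watts": (field & 0x7FFF) * power_unit_w,
            "enabled": bool((field >> 15) & 1),
            "clamp": bool((field >> 16) & 1),
            "window_s": (1 << y) * (1.0 + z / 4.0) * time_unit_s,
        }

    return {
        "pl1": one_limit(limit_msr & 0xFFFFFFFF),
        "pl2": one_limit(limit_msr >> 32),
        "locked": bool((limit_msr >> 63) & 1),
    }


limits = decode_pkg_power_limit(0x804280C800DD80C8, 0x000A0E03)
for name in ("pl1", "pl2"):
    print(name, limits[name])
print("locked:", limits["locked"])
```

For the values above this gives 25 W over a 28 s window for limit #1 and 25 W over roughly 2.44 ms for limit #2, both enabled and locked, matching what turbostat prints. With PkgWatt sitting around 2.67 W, the package looks far below either limit.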
What would be a good way to determine what is causing the frequency to scale down from 1.9 GHz to about 200 MHz?
65C should still be fine – NiallUK – 2019-07-29T15:53:38.710
Yes, 65 C/149 F is pretty hot for a human, but it doesn't seem out of the ordinary for a processor running at 1.9 GHz. – Kris – 2019-07-29T15:57:05.870