Asus ZenBook Pro with Ubuntu 16.04 has massive performance drops

3

4

Background

I recently bought the Asus ZenBook Pro. I use it for running deep learning experiments locally. These experiments are often quite compute-intensive on both the CPU and the GPU. Recently I've experienced some huge performance drops when doing heavy computations.

I have Ubuntu 16.04 installed.

Problem

The problem arises when I, for example, start a training job with TensorFlow or Keras, or run a CPU- and GPU-heavy job in ROS or Python. After about 30-60 seconds of the expected (i.e. good and high) performance, performance suddenly collapses and the entire computer becomes almost unresponsive. A complete reboot is needed to recover functionality.

Using top, nvidia-smi or the system monitor, I see no sudden spike in any process's use of CPU or memory. No other process starts using the CPU or GPU.

While in the unresponsive state, I also see no process using any noticeable amount of processing power.

I suspect that Ubuntu's power management is causing the problem, since my fan also behaves erratically from time to time, but I'm no Linux expert. In case it helps: when I installed Ubuntu, I had to do the initial boot with acpi=off.

EDIT: I have tested the same code on other computers with Ubuntu 16.04 installed and see no issues there.

I appreciate any help in locating the problem or guiding me to somewhere I can research myself.

marcopah

Posted 2018-06-27T08:29:22.250

Reputation: 131

I suggest tracking the temperatures of the CPU and GPU; the problem might occur if they spike. This particular laptop may have ineffective thermal paste on the CPU, or a similar defect. I don't use Ubuntu, but under Windows one can have them displayed continuously on the taskbar. – harrymc – 2018-09-07T06:10:09.513

I can confirm what @harrymc said: a fan died in my Thinkpad. I had it replaced, but got a cheap one with just 3 speeds that are not reported back to the system, so now the CPU slows down when it overheats while the Thinkpad thinks the fan is running at top speed, which is not the case. – Pawel Debski – 2018-09-08T10:21:28.020

Answers

3

This could be an issue with the NVIDIA driver. Did you install it from the .run file downloadable from the NVIDIA website, or did you use the one provided by Ubuntu? The latter should be available through the driver manager, and a guide to installing it is easy to find with a web search.

My personal suggestion is to use the proprietary NVIDIA driver from the Linux distribution's repository. The open-source nouveau driver works fine, but when performance matters (as in your case) it is not the best solution. Downloading the driver from the manufacturer's site is also not the best solution here, because they write a generic Linux driver, which may well give you more performance but also more bugs. Another suggestion is to test different versions of the driver.
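To see which driver is actually loaded before changing anything, the output of lsmod can be inspected. The following is only a minimal sketch: the function name gpu_driver_from_lsmod is made up for illustration, and it assumes the kernel modules are named nvidia and nouveau, as they normally are.

```shell
#!/bin/bash
# Classify the GPU kernel driver from `lsmod`-style text read on stdin.
# Prints "nvidia", "nouveau", or "none".
# NOTE: gpu_driver_from_lsmod is a hypothetical helper, not a system tool.
gpu_driver_from_lsmod() {
    awk '
        # The first column of lsmod output is the module name.
        $1 == "nvidia"                 { found = "nvidia" }
        $1 == "nouveau" && found == "" { found = "nouveau" }
        END { print (found == "" ? "none" : found) }
    '
}
```

Running `lsmod | gpu_driver_from_lsmod` on the affected machine should then report which of the two drivers is in use.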

AtomiX84

Posted 2018-06-27T08:29:22.250

Reputation: 637

1

A laptop can get quite hot if its cooling is insufficient. Your CPU is a modern Intel i7, and most modern (costly) high-end processors automatically reduce their clock speed when they get too hot, in order to avoid a meltdown, and they do not always return to normal speed.

This theory is supported by the fact that the problem only arises when the computer is under heavy load. It might be a problem with the CPU, the GPU, or both.
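Whether the CPU has dropped its clock speed can be checked by comparing the current frequency with the maximum one. This is a rough sketch, assuming the kernel exposes the usual cpufreq files under /sys/devices/system/cpu; the function name cpu_freq_report is hypothetical.

```shell
#!/bin/bash
# For each CPU under a sysfs-style root (default: /sys/devices/system/cpu),
# print "<cpu> <current MHz> <max MHz>". The kernel reports values in kHz.
# NOTE: cpu_freq_report is a hypothetical helper name.
cpu_freq_report() {
    local root="${1:-/sys/devices/system/cpu}" dir cur max
    for dir in "$root"/cpu[0-9]*; do
        # Skip CPUs without readable cpufreq information.
        [ -r "$dir/cpufreq/scaling_cur_freq" ] || continue
        [ -r "$dir/cpufreq/cpuinfo_max_freq" ] || continue
        cur=$(cat "$dir/cpufreq/scaling_cur_freq")
        max=$(cat "$dir/cpufreq/cpuinfo_max_freq")
        echo "${dir##*/} $((cur / 1000)) $((max / 1000))"
    done
}
```

If the current value stays far below the maximum while a heavy job is running, throttling is a plausible explanation.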

I suggest adding some indicators of the temperatures of the CPU and GPU, so that you can visually follow their evolution. The following might help:

If the problem is indeed overheating, there are some steps that you may take:
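One way to get such readings without installing anything is to read the kernel's thermal zones directly. This is a minimal sketch that assumes the zones are exposed under /sys/class/thermal (which zones exist depends on the hardware); read_thermal_zones is a hypothetical helper name.

```shell
#!/bin/bash
# Print "<zone type> <temperature in °C>" for every thermal zone under a
# sysfs-style root (default: /sys/class/thermal).
# The kernel stores temperatures in millidegrees Celsius.
# NOTE: read_thermal_zones is a hypothetical helper name.
read_thermal_zones() {
    local root="${1:-/sys/class/thermal}" zone name
    for zone in "$root"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        name="$(cat "$zone/type" 2>/dev/null || echo unknown)"
        LC_ALL=C awk -v name="$name" \
            '{ printf "%s %.1f\n", name, $1 / 1000 }' "$zone/temp"
    done
}
```

Calling it in a loop, for example `while sleep 2; do read_thermal_zones; done`, gives a running view in a terminal. For the GPU, `nvidia-smi --query-gpu=temperature.gpu,clocks.sm --format=csv -l 2` should print comparable readings every 2 seconds, assuming the proprietary driver is installed.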

  • A cooling pad may improve the situation
  • Make sure that all air passages are clean
  • If your environment is dusty, cleaning the interior might help
  • If the computer is still under warranty, use it
  • If it is not under warranty, the thermal paste of the CPU might require replacement
  • The cooling fan(s) might be defective

harrymc

Posted 2018-06-27T08:29:22.250

Reputation: 306 093

1

Your CPU may be overheating. Given that your system becomes essentially unresponsive, you need to set up a way to monitor the temperature, clock speed, and other parameters and write them to disk, so that after you reboot you have post-mortem information.

You could use a script like the one below, which checks fan speeds, various temperatures, and the CPU clock frequency. This will likely give you enough information to figure out (or hint at) what is happening. Anything else would likely be a shot in the dark (which doesn't mean it won't be on target).

Fancier output formatting can be obtained by using sed, grep and/or awk; there are several examples out there (see below). There are also other pieces of information that you could gather (see below), but I guess this would be enough.

This will hopefully help you find the problem (your question!)... now, as for the solution, that is worth another question.


Script for monitoring various parameters.
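#!/bin/bash

# Start with an empty log file
: > monitor.log
while true ; do
    date +"%H:%M:%S" >> monitor.log
    # Fan speeds and temperatures (requires the lm-sensors package)
    sensors | sed 's/^/    /' >> monitor.log
    # Per-core clock frequency
    grep -E 'processor|cpu MHz' /proc/cpuinfo | sed 's/^/    /' >> monitor.log
    echo "" >> monitor.log
    # Flush to disk so the log survives a forced reboot
    sync
    # Write output every 2 seconds
    sleep 2
done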
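After a reboot, the log can be condensed to make a clock-speed drop easier to spot. The following is only a sketch: it assumes the log has the layout produced by the script above (a bare HH:MM:SS line followed by indented readings), and min_mhz_per_sample is a hypothetical helper name.

```shell
#!/bin/bash
# Print "<timestamp> <lowest core clock in MHz>" for each sample in the log
# (default file: monitor.log). Assumes each sample starts with a bare
# HH:MM:SS line and contains indented "cpu MHz : <value>" lines.
# NOTE: min_mhz_per_sample is a hypothetical helper name.
min_mhz_per_sample() {
    LC_ALL=C awk '
        function flush() {
            if (ts != "")
                printf "%s %s\n", ts, (min == "" ? "n/a" : sprintf("%.0f", min))
        }
        /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            flush(); ts = $0; min = ""
            next
        }
        /cpu MHz/ {
            mhz = $NF + 0
            if (min == "" || mhz < min) min = mhz
        }
        END { flush() }
    ' "${1:-monitor.log}"
}
```

A sudden fall in the second column around the time the machine became unresponsive would point towards throttling.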


References on how to format output from sensors, etc.

https://unix.stackexchange.com/questions/79060/personalize-sensors-output-and-save-it-to-file


References for other pieces of information.

https://askubuntu.com/questions/450045/show-cpu-usage-using-a-command

sancho.s Reinstate Monica

Posted 2018-06-27T08:29:22.250

Reputation: 2 404