I have a brand-new gaming laptop that crashes every time I run Linux. If I run Linux natively, it freezes completely (including the mouse cursor) after a seemingly random period. If I run Linux in VMware Player under Windows 8.1, Linux eventually hangs in the same way, and Windows then crashes as well, displaying a Blue Screen of Death (BSOD) after a short delay. The BSOD always says MACHINE_CHECK_EXCEPTION, and the BugCheck log indicates a code of 0x9c.
The Linux variants that I have tried are:
- Ubuntu MATE 15.10 64-bit
- Ubuntu 15.04 64-bit
- Ubuntu MATE 15.04 64-bit
- Ubuntu MATE 14.04.2 64-bit
- Ubuntu MATE 14.04.2 32-bit
Other than these more-or-less random hangs, Linux runs fine -- and I have been able to use it for many hours in between crashes.
I had assumed that this was a hardware problem, but the difficulty is that I cannot get Windows to crash unless I am also running Linux in a VM. I've tried simultaneously launching every available application (around 30) while playing YouTube videos and running stress-test apps such as Prime95. I've also done some graphics-heavy gaming.
I have run the Windows Memory Diagnostic tool and other memory tests, and none of them reported any problems.
One guess is that Linux is somehow exercising CPU features that Windows doesn't use, but it isn't clear why this would trigger random hardware failures.
How can I definitively prove that I have faulty hardware (or that I don't)?
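One thing that might help narrow this down is decoding the machine-check status value behind the MACHINE_CHECK_EXCEPTION. The following is a minimal sketch, assuming you can obtain a raw 64-bit status value (for example from a crash dump or a Linux kernel log); where exactly that value appears is not established here. The function name `decode_mci_status` and the example value are hypothetical, and only the architectural flag bits and the two 16-bit error-code fields are decoded.

```python
# Bit positions of the architectural flags in an IA32_MCi_STATUS register.
FLAG_BITS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # the MISC register holds extra information
    "ADDRV": 58,  # the ADDR register holds a valid address
    "PCC": 57,    # processor context may be corrupt
}


def decode_mci_status(status):
    """Split a 64-bit machine-check status value into flags and error codes."""
    flags = [name for name, bit in FLAG_BITS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


if __name__ == "__main__":
    # Hypothetical value for illustration only; NOT taken from the crash above.
    example = 0xB200000000000005
    print(decode_mci_status(example))
```

An uncorrected error (UC) flagged by the CPU itself would at least point at hardware or firmware rather than at an ordinary software bug, although it would not say which of the two is at fault.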
EDIT: I seem to be having some luck fixing the Linux problems by disabling some features in the BIOS. I haven't seen any crashes since doing so. The changes I made initially (just based on guessing):
- Virtualization Technology: Disabled
- Fast Boot: Disabled
- SpeedStep: Disabled
- PCI Latency Timer: 64 Clocks (was 32)
Based on subsequent testing of variations of these settings, it appears that both Virtualization Technology and SpeedStep need to be disabled, and SpeedStep for sure. Does this make it easier to isolate the crashes as being caused by a hardware defect? Or could this possibly be a software problem in Ubuntu/Linux?
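To confirm from within Linux whether frequency scaling is actually active after changing the SpeedStep setting, something like the following could be used. It is a rough sketch that reads the standard cpufreq files under `/sys/devices/system/cpu`; the helper name `read_cpufreq` is made up for illustration, and whether the `cpufreq` directory disappears when SpeedStep is disabled is an assumption that may depend on the kernel and driver.

```python
from pathlib import Path

# Standard cpufreq sysfs attributes; any that are missing are simply skipped.
CPUFREQ_FIELDS = (
    "scaling_governor",
    "scaling_min_freq",
    "scaling_max_freq",
    "scaling_cur_freq",
)


def read_cpufreq(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: {field: value}} for every CPU exposing a cpufreq directory."""
    result = {}
    for cpufreq_dir in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq")):
        values = {}
        for field in CPUFREQ_FIELDS:
            path = cpufreq_dir / field
            if path.is_file():
                values[field] = path.read_text().strip()
        result[cpufreq_dir.parent.name] = values
    return result


if __name__ == "__main__":
    info = read_cpufreq()
    if not info:
        print("No cpufreq directories found; scaling may be disabled or unsupported.")
    for cpu, values in info.items():
        print(cpu, values)
```

If `scaling_min_freq` and `scaling_max_freq` differ and `scaling_cur_freq` changes between runs, the CPU is still changing clock speed.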
To make my question more explicit: I'm not really asking for ways to solve the problem, although that would be great in theory. What I really need is a way to isolate and reproduce this problem under Windows without also running Linux. I'm working from the assumption that I have a bad unit -- and I just need a way to prove it. Remember that the machine is crashing whenever I run Linux (excepting the BIOS changes mentioned above), so this can't be solved by simply updating Windows drivers.
In short: Knowing that Linux causes crashes, is there any other stress-test that I can run, in Windows, that might cause the same type of crash? Alternatively, is this a known bug in Linux?
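Since disabling SpeedStep seems to matter, one idea is a test that forces frequent clock-speed transitions instead of sustained full load, which is what Prime95 produces. The sketch below alternates short busy loops with idle periods on one core and should run the same on Windows or Linux. It rests on the unverified assumption that frequency transitions are what trigger the fault; the function name and all durations are arbitrary, and there is no guarantee it reproduces the crash.

```python
import time


def burst_idle_cycles(cycles, busy_seconds, idle_seconds):
    """Alternate busy-looping and sleeping; return the number of completed cycles."""
    completed = 0
    for _ in range(cycles):
        deadline = time.perf_counter() + busy_seconds
        counter = 0
        while time.perf_counter() < deadline:
            counter += 1  # trivial work to keep one core busy
        time.sleep(idle_seconds)  # give the core a chance to clock down
        completed += 1
    return completed


if __name__ == "__main__":
    # Short demonstration run; use a much larger cycle count for a real test.
    print(burst_idle_cycles(cycles=5, busy_seconds=0.1, idle_seconds=0.1))
```

Running one copy per core, with different busy and idle durations, would vary the pattern of transitions across the package.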
Note that my processor is the newish i7-5700HQ (Broadwell microarchitecture).
Also note: I don't believe this is caused by severe overheating. The machine includes an extra fan that can be manually enabled, and the crashes don't seem to correlate with heavy loads.
UPDATE: The problems with running Linux natively have been resolved by installing a BIOS update that became available a few months after I posted the question. I am also now running Ubuntu MATE 15.10, but I don't think that matters since that also failed prior to the BIOS update. I guess the long and short of it is that the system was not compatible with Linux (or vice versa) as they were at the time of release.
I haven't gone back and retested the virtual machine problem since I don't really need that now that I can run Linux natively -- and also I have migrated from Windows 8.1 to Windows 10, so it wouldn't exactly be an apples-to-apples test anyway.
Ok, now I have tested (and crashed) with vanilla Ubuntu 15.04. – nobar – 2015-07-09T02:38:18.303
What's the machine? – Journeyman Geek – 2015-07-09T03:20:25.847
@JourneymanGeek: MSI GE72 APACHE PRO-077 – nobar – 2015-07-09T03:30:39.577
Very similar situation found by searching with linux broadwell speedstep: Working Around The Intel Core i7 5775C Broadwell Stability Issue On Linux. The indicated workaround seems to relate to disabling "down-clocking" in the BIOS. – nobar – 2015-07-09T18:27:56.873
Same laptop with same errors. Is it working for you after all? People still having errors here: http://ubuntuforums.org/showthread.php?t=2284315&page=2 – gabrielhpugliese – 2015-09-08T02:57:50.047
@gabrielhpugliese: Thanks for the link. I still think the fixes that I posted work, but I have been running Windows on this computer for the last couple of months, so I don't have any new data -- other than the observation that Windows still doesn't crash. – nobar – 2015-09-08T05:02:13.213
I'm using VirtualBox to run Ubuntu 14.04.1 with your tips (Virtualization enabled, Fast Boot disabled, SpeedStep disabled and PCI Latency Timer at 64). So far so good, I'll keep the link updated. – gabrielhpugliese – 2015-09-08T13:20:52.603
Early results indicate all better (with SpeedStep enabled) on Ubuntu MATE 15.10 64-bit. Fingers crossed... – nobar – 2015-10-23T17:27:55.153
My previous comment turned out to be false -- it was still failing, at least under some challenging usage scenarios. I just discovered that a new BIOS is available, so I have upgraded the microcode from 0xd to 0x13. After this, I am passing the test that was previously failing... – nobar – 2015-11-01T19:38:56.833
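For anyone wanting to check which microcode revision their own machine is running under Linux, here is a small sketch that parses the `microcode` field of `/proc/cpuinfo`. The function name `microcode_revisions` is made up for illustration, and it assumes an x86 kernel that reports the field as a hexadecimal value.

```python
def microcode_revisions(cpuinfo_text):
    """Return the sorted distinct microcode revisions in /proc/cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(int(value.strip(), 16))  # value looks like "0x13"
    return sorted(revisions)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    print([hex(r) for r in microcode_revisions(text)])
```

Comparing the printed value before and after a BIOS update shows whether the update actually changed the microcode.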