Intermittent hard freeze/crashes, generally while idle. No event viewer details

1

Subject says most of it. My HTPC will occasionally be found in this hard-locked/crashed state when nothing of note was being done. The past couple times, it has been while the screensaver was running over top of the Steam client (not Big Picture, just the normal client) and nothing else of note was running. I've not experienced this crash when watching TV via Windows Media Center, nor when playing Steam games like Dark Souls.

Checking Event Viewer has been fruitless, as no BugChecks or other errors are found, aside from the Kernel Power errors saying the previous shutdown was unexpected. Presumably, whatever happens is happening suddenly enough that the OS doesn't have time to react before it's locked.
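For anyone wanting to pull those Kernel-Power entries without clicking through Event Viewer, here is a minimal sketch that wraps the built-in `wevtutil` tool. It assumes the unexpected-shutdown entries are Event ID 41 from the `Microsoft-Windows-Kernel-Power` provider in the System log. The function names are made up for illustration, and the query only runs on Windows.

```python
import subprocess

# Assumption: unexpected shutdowns are logged as Kernel-Power Event ID 41.
KERNEL_POWER_QUERY = (
    "*[System[Provider[@Name='Microsoft-Windows-Kernel-Power'] and (EventID=41)]]"
)


def kernel_power_command(count=10):
    """Build the wevtutil call that lists the newest matching System events."""
    return [
        "wevtutil", "qe", "System",
        "/q:" + KERNEL_POWER_QUERY,  # XPath filter on provider and event ID
        "/c:" + str(count),          # maximum number of events to return
        "/rd:true",                  # newest events first
        "/f:text",                   # plain-text output
    ]


def recent_unexpected_shutdowns(count=10):
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        kernel_power_command(count), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `print(recent_unexpected_shutdowns(5))` on the HTPC would dump the five most recent entries, which at least gives timestamps to line up against other logs.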

I was also getting random reboots with the same level of mystery and non-existent reproduction steps. I turned off the auto-reboot-on-crash feature, so the next time one of those occurs, I should get a legit BugCheck, but I've not had one since to use for troubleshooting.

Image of what I see when this occurs (apologies for the poor quality, taking photos of a TV with a camera phone)

Picture of fail

Machine details:

OS: Windows 7
CPU: Intel Pentium G645 2.9GHz
Mobo: H77MU3 LGA 1155 mATX
PSU: OCZ StealthXStream OCZ600SXS 600W
RAM: Ballistix Sport 4GB DDR3-1600
GPU: SAPPHIRE 100355OCL Radeon HD 7850 2GB 256-bit GDDR5 (OC version)
OS SSD: SSDNow V200 SV200S37A/64G 64GB SATA
Media HDD: Barracuda 2TB 7,200 RPM SATA
some SATA DVD burner
some cheap media card reader
some cheap USB3.0 hub that slots into PCI

All hardware is factory-default, no overclocking or other mods done.


Steps already taken:

Thorough dust cleaning with compressed air.
Memtest86+ on each stick of RAM individually. Both passed.

My suspicion is either the PSU or the GPU, though both should be pretty damn healthy, as neither is a year old (both bought in April of this year). How would I test them for faults, especially given the anomalous nature of the error? This is eating away at the Spouse Acceptance Factor of my hand-rolled HTPC, so any tips on getting the stability back on track would be great.
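Since the OS logs nothing before the lock-up, one way to at least pin down when each freeze happens is a heartbeat file. The sketch below is only an illustration (the file name, interval, and function names are assumptions): it appends a timestamp every few seconds and forces it to disk, so the last line written before a freeze survives the hard reset.

```python
import os
import time
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def write_heartbeat(path, now=None):
    """Append one timestamp and force it to disk so it survives a hard lock."""
    now = now or datetime.now()
    with open(path, "a", encoding="utf-8") as log:
        log.write(now.strftime(TIMESTAMP_FORMAT) + "\n")
        log.flush()
        os.fsync(log.fileno())  # ask the OS to flush its cache to the drive


def find_gaps(path, max_gap_seconds):
    """Return (last_seen, resumed) pairs where the heartbeat went silent."""
    with open(path, encoding="utf-8") as log:
        stamps = [
            datetime.strptime(line.strip(), TIMESTAMP_FORMAT)
            for line in log
            if line.strip()
        ]
    limit = timedelta(seconds=max_gap_seconds)
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > limit]


def run(path="heartbeat.log", interval_seconds=10):
    """Write heartbeats forever; stop with Ctrl+C."""
    while True:
        write_heartbeat(path)
        time.sleep(interval_seconds)
```

Leave `run()` going in the background. After the next freeze, `find_gaps("heartbeat.log", 60)` lists each silence longer than a minute, and the first timestamp of each pair is roughly when the machine locked up. That time can then be matched against temperature logs or whatever was scheduled to run.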

Andy_Vulhop

Posted 2013-11-16T22:24:46.497

Reputation: 81

Test by disabling any screensaver, because depending on which one it is, it can use a lot of unnecessary GPU, and then the machine goes on standby with a hot GPU. Test by setting the standby timers to much shorter times, and see how it goes into and comes out of standby (because part of the clues you provided sounds like it might not be making it out of standby well). Test by having everything go to sleep, like the hard drive and screen, but not the full computer standby, the so-called "put the computer to sleep". Also make sure you get into the "advanced" areas of the power profile so you know all of it. – Psycogeek – 2013-11-17T01:18:21.437

@Psycogeek This HTPC doesn't go into Standby. It's using a custom high-perf power profile where the display and sleep settings are "never". – Andy_Vulhop – 2013-11-17T19:55:27.613

@Psycogeek Upon further review, however, the hard disk is still set to turn off after 20 minutes of idle. Could be the SSD having issues going in and out of its low-power state. I'll see if setting that and screensaver down to 1-2 minutes causes reproduction of symptoms. – Andy_Vulhop – 2013-11-17T19:57:37.627
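The disk timeout mentioned above can also be changed from a script rather than the power options dialog. This is a rough sketch, assuming the Windows 7 style `powercfg -change -disk-timeout-ac <minutes>` syntax and an elevated prompt; the helper names are hypothetical. The screensaver timeout is a separate setting and is not touched here.

```python
import subprocess


def disk_timeout_command(minutes, on_battery=False):
    """Build the powercfg call that sets the hard disk idle timeout.

    minutes is a whole number; 0 means the disk is never turned off.
    """
    if int(minutes) != minutes or minutes < 0:
        raise ValueError("minutes must be a non-negative whole number")
    setting = "-disk-timeout-dc" if on_battery else "-disk-timeout-ac"
    return ["powercfg", "-change", setting, str(int(minutes))]


def apply(command):
    """Run the command (Windows only); raises if powercfg reports failure."""
    subprocess.run(command, check=True)
```

For the test described above, `apply(disk_timeout_command(2))` drops the disk timeout to 2 minutes on the active plan, and `apply(disk_timeout_command(0))` disables it again afterwards.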

Important to check the GPU temps. There is a cool gadget that does that, if you're into gadgets: "GPU monitor 8.2". They allow these GPUs to get much hotter than I would. Test by temporarily setting the GPU fan higher in manual mode. You might find me re-doing the cooling on a GPU card when I prove that it is at fault. – Psycogeek – 2013-11-18T02:38:19.710

I did a GPU burn-in with FurMark and the GPU topped out in the low 60s. 64 °C, iirc. I keep a Windows gadget for CPU temp on the desktop and I've never seen it out of the low 40s. – Andy_Vulhop – 2013-11-18T15:59:42.293

That's good. Then how about the video RAM? Just cleaning a GPU card for better air passage can help that. There are "artifact" checkers that will rip through the video RAM as a sort of test of that. ATITrayTools (old) had one, and OCCT has one in the GPU tests called "error check". Those can be harsh on the RAM without much stress on the GPU, which can mean that the fan doesn't react as much. – Psycogeek – 2013-11-18T23:10:54.460

Back one: that is one reason for manually running the GPU fan temporarily for testing, for the temps you don't see. If excessive cooling of the whole GPU card somehow stopped the problem from ever happening, it would be easier to lay the blame on it. – Psycogeek – 2013-11-18T23:16:46.670

No answers