Desktop freezing under heavy graphics load

5

2

I'm having some serious trouble diagnosing an issue that I've been having with a box I built a couple of months ago.

Here's the hardware: newegg wish list

After playing a game for 30-60 minutes, the whole computer freezes (crashes?). It just stays stuck on the screen and is completely unresponsive. The only thing I can do is shut it down by holding the power button for a few seconds. I can replicate this more quickly by running Furmark for 3-5 minutes. There's never a BSOD; it's just completely frozen.

Troubleshooting so far: I've had the issue under two separate XP installs and under Win7 Pro, with different driver versions. (I have Arch Linux installed too. It's never had the problem, but I don't play games there.)

At first I assumed that it had to do with heat. I've monitored both the CPU and graphics card temps. I don't think it has anything to do with the CPU because I can run prime95 for at least a couple hours without any problems (haven't tried longer). And I've been able to cause it to freeze under Furmark with the GPU temps ranging from 59C to 80C, so it doesn't seem like heat there either.

Initially I did have the fourth core of the proc unlocked through the BIOS, but everything has since been returned to the stock defaults.

Finally I decided it must be a problem with the graphics card, so it's been RMA'd. In the meantime I'm using an ATI 6750. It has the same problem! So there probably wasn't anything wrong with the other card.

Does anyone have any ideas? Or suggestions? Let me know if I've left out any relevant details.

UPDATE 1: Memtest Passed
UPDATE 2: Computer doesn't freeze with the side of the case removed.

If it's not the graphics card, RAM, CPU, or HDD, and it does seem related to heat, that leaves only the motherboard. Does this kind of thing happen when a motherboard overheats? Is there a specific sensor that I can monitor?

UPDATE 3: I have been using the computer with the side of the case removed for a while and I've yet to have it freeze again. Although I can't use it that way forever. I'm fine with further upgrading the cooling system, but first I need to pinpoint what's overheating. Any suggestions? PLEASE?

UPDATE 4: A couple of weeks ago (before the new graphics card) I was able to take a picture of the temps after it froze. Although it is a camera phone pic, so it's not the best quality.


There are two things that I think might be relevant. First the VIN1 voltage shows a ridiculously large max. However, this doesn't happen every time it freezes and I've seen it report that number when it hasn't frozen. The other strange thing is that under the fans section, there are three fans that always appear right before it freezes. Under normal conditions the fans section only has one entry (CPUFANIN0).
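
A rough sketch of how the readings could be captured without relying on a camera phone: write every sample to disk and force it out of the OS buffers, so the last rows before a hard freeze survive the power-button reset. This is only an illustration in Python. `log_sensors`, `read_sensors`, and `fake_read_sensors` are hypothetical names, and the placeholder reader returns made-up values; how the real numbers are obtained depends on whichever monitoring tool exposes them. One row per sensor per sample is used so that entries which only show up occasionally (like the extra fans) still get recorded.

```python
import csv
import os
import time
from typing import Callable, Dict


def log_sensors(path: str,
                read_sensors: Callable[[], Dict[str, float]],
                samples: int,
                interval_s: float = 1.0) -> None:
    """Append one timestamped row per sensor per sample, synced to disk."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for _ in range(samples):
            now = f"{time.time():.1f}"
            readings = read_sensors()
            for name, value in sorted(readings.items()):
                writer.writerow([now, name, value])
            # Flush Python's buffer, then ask the OS to write to the disk,
            # so a freeze right after this point does not lose the sample.
            f.flush()
            os.fsync(f.fileno())
            time.sleep(interval_s)


def fake_read_sensors() -> Dict[str, float]:
    # Placeholder (assumption): swap in a function that returns the real
    # readings from your monitoring tool. These numbers are made up.
    return {"cpu_temp_c": 45.0, "gpu_temp_c": 72.0}


if __name__ == "__main__":
    log_sensors("sensors.csv", fake_read_sensors, samples=3, interval_s=1.0)
```

After a freeze, the tail of `sensors.csv` shows what each sensor reported in the last sample that made it to disk.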

UPDATE 5: I posted this in a comment "The case actually came with two 120mm fans, both of them pointing inward (intake). I replaced those fans with better ones. I also switched the rear fan to point out as an exhaust. Neither seemed to help."

jtmcn

Posted 2011-04-29T21:47:03.480

Reputation: 217

I would have thought it was thermal, too. Do you have problems with heavy processor load (say, Prime95)? One other thought is to install Wine on your Linux install and see if Furmark does the same thing there. EDIT: Might be RAM, too (I don't trust Corsair, terrible failure rate from them); try something that stresses the RAM to see if you can replicate it. – Shinrai – 2011-04-29T22:00:49.590

No problems under heavy processor load. Testing the memory does sound like a good idea, although I'm not sure if any of the symptoms actually point that way, do they? – jtmcn – 2011-04-29T23:07:28.947

@jtm - hard locks (including mouse, etc.) are usually RAM related. – Ƭᴇcʜιᴇ007 – 2011-04-30T00:18:14.410

possible duplicate of How to diagnose computer lockup/freezing problem – Ƭᴇcʜιᴇ007 – 2011-04-30T00:18:50.493

I just ran memtestx from grub menu. It passed. I don't think it's the RAM. The lockups don't seem to be related to anything that's necessarily RAM heavy. – jtmcn – 2011-04-30T03:07:11.507

Under Windows 7 on a different hard drive, I just ran Prime95 and Furmark for about 4 mins until it froze. Then I took off the side and front panel from the case and ran it again. This time it hit 10 mins without freezing, then I turned it off. This must indicate overheating? But overheating of what? The range of temperatures in the CPU and GPU is too great for it to be those. I don't think it would be the same symptoms if it was the hard drive. So then what? Maybe a sensor on the motherboard is overheating? How can I tell? – jtmcn – 2011-04-30T04:48:05.413
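
One way to approach the "overheating of what?" question, sketched in Python: record the same sensors during a closed-case run and an open-case run, then rank them by how much hotter they average with the case closed. `rank_by_difference` is a hypothetical helper and the numbers in the example are invented purely for illustration. The sensor at the top of the list is only a hint, since the part that is actually overheating may not have a sensor at all.

```python
from statistics import mean
from typing import Dict, List, Tuple


def rank_by_difference(closed: Dict[str, List[float]],
                       opened: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Rank sensors by (closed-case average) minus (open-case average)."""
    # Only sensors present in both runs can be compared.
    shared = closed.keys() & opened.keys()
    diffs = [(name, mean(closed[name]) - mean(opened[name])) for name in shared]
    # Largest increase with the case closed comes first.
    return sorted(diffs, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Invented readings in degrees C, not taken from the machine in question.
    closed_run = {"cpu": [50.0, 52.0], "gpu": [75.0, 79.0], "board": [60.0, 70.0]}
    open_run = {"cpu": [49.0, 51.0], "gpu": [74.0, 78.0], "board": [40.0, 50.0]}
    for name, delta in rank_by_difference(closed_run, open_run):
        print(f"{name}: {delta:+.1f} C with the case closed")
```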

make sure your north bridge and south bridge coolers are attached correctly. I've had several motherboards come with poor or no thermal paste on the north/south bridge chipsets. – Simurr – 2011-05-09T20:05:56.733

How are your case fans arranged? Front fans should suck air in, rear fans should blow air out. – Simurr – 2011-05-09T20:11:12.647

I second Simurr's advice: check your fans. I noticed you only purchased one 120mm fan. Did the case come with one? – Supercereal – 2011-05-09T20:20:26.780

The case actually came with two 120mm fans, both of them pointing inward (intake). I replaced those fans with better ones. I also switched the rear fan to point out as an exhaust. Neither seemed to help.

I'm a little confused about what you mean by "make sure your north bridge and south bridge coolers are attached correctly." The CPU and its cooler sit on the north bridge, right? Does the south bridge typically use separate cooling? It's certainly possible that I incorrectly applied the thermal paste when I upgraded the CPU cooler, but I was also having the issue prior to that upgrade. – jtmcn – 2011-05-09T20:55:36.807

My ThinkPad (T61) has the same type of issue with the video freezing. The CPU was overheating, from a combination of running too many apps at one time and my failure to clear out the vents on the back/underside of the laptop. A can of compressed air cleaned an amazing amount of dust from the fans. After having the problems, I realized I had been spending a large amount of time operating the laptop on a couch, or while resting the laptop on a blanket. – DavidGrove – 2011-05-11T14:52:00.410

I don't understand... this is a desktop. Are you telling him to not use the desktop on a bed or blanket? Also the OP states he's running it with the side of the case open so I doubt the whole side is covered in dust. He also states that he has monitored GPU/CPU temps and they are within bounds. Are you just sharing your experience? – Supercereal – 2011-05-11T15:28:00.533

You have another power supply you could test your rig with? And check the PSU fan. – Simurr – 2011-05-11T17:45:47.540

Answers

1

Sounds like your GPU, CPU, hard drive and RAM are working properly, though you didn't post actual CPU temperatures.

Check your case fan setup. Make sure you are creating air flow, not just blowing air at stuff, i.e. air flows into the case, through the case and then out of the case, not just cycling hot air around inside the case. Generally you want front case fans (if you have them) to suck air in and rear fans to blow air out (that includes any side or top case fans). However, if your case fans were set up incorrectly it should have shown in the CPU and GPU temps, so that's probably not the problem (check it anyway).

WARNING WARRANTY MAY BE VOIDED - please make sure you are not voiding your warranty if you try the following advice and reapply thermal paste to your motherboard heatsinks. It may be best to send the motherboard back for replacement as overheating could have damaged something already.

Check your northbridge and southbridge heatsinks. They are the 2 heatsinks attached to the motherboard: one just below the CPU (northbridge) and the other next to the PCI slots (southbridge). I've had many new motherboards and GPUs come with little or no thermal grease and/or poor thermal pad placement.

You'll have to remove the motherboard to get the plastic clips off to reapply thermal paste. I recommend Arctic Silver 5, very nice stuff, though your Cooler Master heatsink appears to have come with some which would work just fine. It's basically the same process as applying thermal paste to the processor. Make sure both the heatsink and chip surfaces are clean before applying new thermal paste.

Edit -- Do you have another power supply you can test in your rig? Also make sure the power supply fan is working. Power supply issues usually result in power loss, but I've seen some weird stuff happen when the power supply isn't putting out enough power.

Simurr

Posted 2011-04-29T21:47:03.480

Reputation: 618

If the issue is related to the thermal paste on the stock heat sinks, is there something that would indicate that as the problem? Is there a nearby sensor that monitors anything relevant? – jtmcn – 2011-05-10T16:42:55.453

One of the motherboard sensors might be on the northbridge, though probably not. – Simurr – 2011-05-11T17:32:06.433

I don't have an extra PSU, although I'm building a second desktop soon, so I could pick one up now I suppose. (The fan does work though.) While it's probably a good idea to test it anyway, I had kind of ruled out a faulty power supply, because I would assume it would have the same issue with or without the side of the case removed. Does that make sense? – jtmcn – 2011-05-11T18:13:41.510

PSU fans generally suck air in from inside the case and blow it out the back of the PSU. With the case open it is sucking in colder air from outside of the case instead of the hotter air inside the case. – Simurr – 2011-05-11T18:40:51.183

Your problems still sound more like something overheating on the motherboard, but it is possible it's caused by power fluctuations from an overheating PSU. Not likely, but worth testing if you have an extra PSU. – Simurr – 2011-05-11T18:49:34.443