My power supply/motherboard nightmare

8

2

Last summer I bought a broken power supply (without knowing of course) and put it in my computer. Turning my box on made one of my cheaper HDDs catch fire (I believe a capacitor or resistor blew) and I thought I completely killed my system.

My other 4 hard drives survived the surge (Western Digitals, better quality is why I guess they were spared). And I never had any problems with my motherboard/RAM/CPU because of it. I bought a new power supply and left it behind me.

In the past couple months my computer has been randomly crashing for no apparent reason. I'm never able to recreate the crash, as I can never tell when it's going to happen (it doesn't even seem to have anything to do with intense usage). The process goes as follows:

  • The computer fully turns off. Just dies. However, the little light on my motherboard telling me my power supply is on, is still on.

  • Pressing the power button on my computer doesn't have any effect here. To restart I have to manually turn off the killswitch of my power supply. When I do this, the light on my motherboard goes off and my subwoofer makes a sound. Then it comes back on for a second (accompanied by another subwoofer noise). Then it turns off.

  • Turning the killswitch back on, I can reboot and hope it doesn't happen again.

At first I thought overheating (I was running at high Vcore and my CPU temperature was high). I turned on the "Cool n' Quiet" function of my motherboard, and I still got a crash.

Next thing, obviously, is power supply. However, I was wondering if the initial broken power supply I bought damaged my motherboard or something like that. Which could explain why a power supply that was bought from a trusted store and brand new is seemingly "failing".

So basically what should I do? Replace my power supply again? What if the motherboard is breaking my power supply (if that's even possible)? Should I just spend 500$ on a new motherboard + power supply?

EDIT: I've noticed that it crashes mostly when I'm watching a YouTube video or watching an HD movie in VLC. It's odd because a software pattern is seemingly affecting hardware :(

EDIT 2: I now believe this might be a CPU overheating issue. Someone please tell me if CPU overheating could cause what I'm about to explain. I ran Clementine (music player) with full screen visualizations enabled on fast mode. While I was doing this, I was watching the CPU temp go up and up. It finally capped at around 81C, at which point I opened a browser to see how 'hot' that was. The crash happened. Is this definite proof that I'm dealing with an overheating issue?

EDIT 3: My CPU is AMD Athlon(tm) 64 X2 Dual Core Processor 6000+ rated at 3.1GHz (or something around that frequency). I'm not over-clocking at all.

FINAL EDIT: I'm happy to announce that I've solved it with the help of you guys. I bought some Arctic Silver thermal paste for about 8$ and applied it to the CPU/heatsink. Along with a good de-dusting and some better cable management, my computer is running at a cool 35C idle. Thanks to everyone who helped.

n0pe

Posted 2011-04-07T21:28:22.597

Reputation: 14 506

Watching videos puts a lot more load on the CPU than most other tasks, as decoding video is CPU intensive. This could certainly cause CPU temps to rise, and even draw more power. – WolfmanJM – 2011-04-11T01:23:07.890

I was thinking that, but is there a concrete way of knowing if a crash was caused by CPU temp? And it's not for every video, only sometimes. – n0pe – 2011-04-11T01:33:03.260

Install a temp monitor (eg motherboard monitor), and watch the temp. Also sometimes you can disable the temp cutoff in the BIOS. – WolfmanJM – 2011-04-11T02:56:59.317
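One way to get something more concrete than watching a monitor by eye is to log the temperature to disk continuously, so the last reading before a sudden power-off survives the crash. Below is a minimal Python sketch for Linux, assuming the kernel exposes sensors as `temp*_input` files (in millidegrees Celsius) under `/sys/class/hwmon/hwmon*`. On some systems the files live one level deeper, so the path may need adjusting. The function names and log format are made up for illustration.

```python
import glob
import os
import time


def read_temps(hwmon_root="/sys/class/hwmon"):
    """Return {sensor_path: degrees_C} for every temp*_input file found."""
    temps = {}
    pattern = os.path.join(hwmon_root, "hwmon*", "temp*_input")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                # The kernel reports millidegrees Celsius.
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # sensor unreadable right now; skip it
    return temps


def log_temps(log_path, hwmon_root="/sys/class/hwmon", interval=2.0, samples=None):
    """Append one timestamped line per sample and force it onto the disk.

    samples=None means "run until interrupted (or until the machine dies)".
    """
    count = 0
    with open(log_path, "a") as log:
        while samples is None or count < samples:
            temps = read_temps(hwmon_root)
            hottest = max(temps.values()) if temps else float("nan")
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} max={hottest:.1f}C\n")
            log.flush()
            os.fsync(log.fileno())  # make sure the line survives a power cut
            count += 1
            if samples is None or count < samples:
                time.sleep(interval)
```

Leave `log_temps("temps.log")` running in a terminal. After the next crash, the last line of the file shows how hot the hottest sensor was a couple of seconds before the machine died. If that value is consistently near the same high temperature, overheating becomes a much stronger suspect.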

The fact that YouTube and VLC trigger it most often suggests the best places to look are the graphics card and sound card. The sound from the subwoofer makes me want to lean towards the sound card. Try playing a few sound files on a loop, then try a graphics benchmark. – TuxRug – 2011-04-11T04:34:23.033

@TuxRug. I'm running on board sound and I just changed my graphics card to an ATI Radeon HD 5750 so I don't think that's it. I think the sub 'pops' because it gets power, then loses it, then gets it, then loses it again. – n0pe – 2011-04-11T05:08:39.820

@MaxMackie: Was it a Bestec PSU? – paradroid – 2011-04-14T17:18:33.380

@paradriod. Which power supply? The one that crapped out or the one I currently have and might be crapped out? The first one was an Enermax, and I'm not sure what my current one is ... the label is below the PSU so I'd have to take it out. But it's a decent brand I remember – n0pe – 2011-04-14T21:09:58.543

Answers

7

What you described to me looks like some sort of safety system in the power supply.

I recommend that you disassemble the whole computer and check for any burn marks on the motherboard and pretty much everything else. There could be something which causes a short and makes the power supply turn off. Pay close attention to the computer case and check if it's moving or vibrating.

The next step would be to check the cables on your power supply. If you have a multi-rail power supply, you could be hitting the power limit of a single rail. Make sure that the load is distributed among cables and devices. If a cable has several connectors on it, try to use up all other cables first before using the extra connectors. Make sure that you balance the load on cables with several connectors. For example, on a cable going into a hard drive, make sure you only connect low-power consumers, and so on. This in general shouldn't be a problem, but some power supplies can be picky and may have bad load distribution.
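As a rough illustration of the rail-limit point, here is a minimal Python sketch that adds up the current drawn on each 12 V rail and flags any rail that exceeds its limit. The device names, rail labels, amp figures and the 16 A limit are made-up placeholders, not measurements from this system. The real per-rail limits are printed on the power supply label.

```python
def rail_loads(devices):
    """Sum the current per rail. devices is a list of (name, rail, amps)."""
    totals = {}
    for _name, rail, amps in devices:
        totals[rail] = totals.get(rail, 0.0) + amps
    return totals


def overloaded_rails(devices, limit_amps):
    """Return the sorted names of rails whose total draw exceeds limit_amps."""
    loads = rail_loads(devices)
    return sorted(rail for rail, amps in loads.items() if amps > limit_amps)


# Hypothetical layout: everything except the CPU hangs off the second rail.
devices = [
    ("CPU", "12V1", 9.0),
    ("graphics card", "12V2", 8.0),
    ("5 hard drives", "12V2", 10.0),
]
print(overloaded_rails(devices, limit_amps=16.0))  # ['12V2']
```

In this made-up layout the second rail carries 18 A against a 16 A limit, so moving the drives to a cable fed from the other rail would bring both rails under the limit.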

Next, check how different devices are mounted. I once had a similar problem. I bought a new computer and every once in a while it would shut down unexpectedly. In the end it turned out that the incompetent or malicious manufacturer had used screws that were too long on the floppy drive. They penetrated the drive and caused shorts from time to time. Make sure that nothing similar is happening in your case.

AndrejaKo

Posted 2011-04-07T21:28:22.597

Reputation: 16 459

I remember doing exactly that once my HDD blew. I looked at the motherboard extensively and couldn't see any burns (despite the mobo being brown itself). Also, the output of my 'sensors' command says that all my rails are performing at the right voltage. I've remounted the computer a couple times and have done well to my knowledge, so I think I'm okay there. This is really confusing :/ – n0pe – 2011-04-07T22:02:54.857

@MaxMackie If I'm right and there's a part causing a short or the overcurrent protection is being activated, you won't see anything strange with the voltages until maybe a moment before the computer shuts down. The system is constructed so that no parts are exposed to low voltages or excessive currents. – AndrejaKo – 2011-04-07T22:09:41.997

Alright, I plan on taking it completely apart and cleaning everything. When I set things back in, I'll inspect everything. Is there a way to check for shorts (ie multimeter)? – n0pe – 2011-04-07T22:14:30.593

@MaxMackie A multimeter won't help you very much. Since the computer works most of the time, the short, if it is a short, isn't continuous. You can try probing for shorts between the motherboard power connector and the case with the meter in continuity mode, but I doubt you'll get anything. What you should be looking for are loose cables and locations where a solder joint may come in contact with the case. Basically, as soon as the short develops, the supply shuts down, so look for anything in the case that can move. – AndrejaKo – 2011-04-07T23:01:35.703

1@MaxMackie The other equally probable option is that you might be tripping overload on a single cable, so redistribute load on the cables after you finish reassembling the computer. Also if you have enough time, try running computer with only motherboard powered on and see if it will shut down. This will help us localize the problem. – AndrejaKo – 2011-04-07T23:05:02.253

The crashes happen very suddenly. So running motherboard up, I might be waiting for days or weeks for a crash. I can go a couple days or hours without a crash and everything works perfectly. Also, I plan on taking everything apart this weekend (need the computer for some last minute assignments -- finals are soon). – n0pe – 2011-04-08T00:32:59.527

The motherboard may also have a bad capacitor, certainly cheaper motherboards are prone to that, and they degrade over time. The bad PSU may have damaged one or more of the caps and they are now starting to fail. The bad news is it is very hard to find a bad cap. – WolfmanJM – 2011-04-11T01:25:00.867

Could a bad capacitor really create such erratic crashes? It seems to me like a bad capacitor would render the motherboard pretty useless. My board is an M3N78-Pro http://www.asus.com/product.aspx?P_ID=DVvm9CU0G1bCC4gp

– n0pe – 2011-04-11T01:32:14.167

Yes I've heard bad caps cause erratic behavior, and can cause intermittent shorts. Especially the tantalum ones. – WolfmanJM – 2011-04-11T02:55:12.237

1@MaxMackie As far as computer equipment is concerned, the main problem is aluminium electrolytic capacitors, which are much more prone to failure than tantalum capacitors. Go to Wikipedia and search for capacitor plague. If any of the capacitors on the motherboard match the description, they'll need to be replaced. – AndrejaKo – 2011-04-11T10:13:46.727

@AndrejaKo thanks for pointing me in this direction. Checking it out... – n0pe – 2011-04-11T11:07:38.153

4

The more I think about this, the more I think it is the BIOS turning off the system due to the CPU overheating. I had exactly the same problem once.

Look in the BIOS and see if it has a setting that sets the temperature at which it shuts off.

You can force the issue with cpuburn (google it). There are different versions for different CPUs, so choose the one for your CPU. It will cause the CPU to heat up rapidly, and if it triggers the shutdown then there is a good chance that is the problem.

Check the BIOS to see what temperature it is set to shut down at, and check the CPU specs, as it may be set too low. If it isn't, you will need a better CPU cooler, or a bigger fan blowing on the CPU cooler.
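If cpuburn isn't at hand, the same idea (load every core and see whether the shutdown follows) can be roughed out in a few lines of Python. This is only a stand-in sketch, not cpuburn itself: cpuburn uses hand-tuned code that heats the CPU much harder, so a machine surviving this test proves less than a machine surviving cpuburn. The function names here are made up for illustration.

```python
import multiprocessing
import time


def burn(seconds):
    """Spin in a tight arithmetic loop for `seconds`; return the iteration count."""
    deadline = time.monotonic() + seconds
    iterations = 0
    x = 1.0001
    while time.monotonic() < deadline:
        x = (x * x) % 1000.0 + 1.0001  # pointless work to keep the core busy
        iterations += 1
    return iterations


def burn_all_cores(seconds):
    """Run one burn() process per CPU core and wait for them all to finish."""
    workers = [
        multiprocessing.Process(target=burn, args=(seconds,))
        for _ in range(multiprocessing.cpu_count())
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

Calling `burn_all_cores(300)` from a script (inside an `if __name__ == "__main__":` guard) keeps every core busy for five minutes. Watch the CPU temperature while it runs; if the machine powers off as the temperature climbs, overheating is the likely cause.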

WolfmanJM

Posted 2011-04-07T21:28:22.597

Reputation: 844

1I know I need some new thermal paste, but I'll have a look at this too. Thanks – n0pe – 2011-04-11T11:05:28.027

I've added an Edit 2. – n0pe – 2011-04-11T23:14:16.817

1It sounds like it, 82 is around the default cut-off, did you look in the BIOS for the option? It may also be on by default in the BIOS, if it triggers the BIOS just shuts everything down with no warning. – WolfmanJM – 2011-04-12T03:32:18.860

You could temporarily open up the case and put a fan on the heatsink and see if that helps. – WolfmanJM – 2011-04-12T03:32:58.470

I've checked the BIOS and couldn't find the cut off temp. I've read that it's iffy and it'll trip at different temperatures. – n0pe – 2011-04-12T03:36:35.397

3

You might want to consider stress testing your machine with different hardware components attached (i.e., take out all unnecessary components and start with individual RAM chips) and reconnecting your wires. If it's still failing, then you know what's left (CPU, MOBO, PSU).

Also, take a look at your capacitors, especially the power regulating ones by the CPU.

wag2639

Posted 2011-04-07T21:28:22.597

Reputation: 5 568

I've considered this, but seeing as it doesn't always crash and I can't reproduce a crash, I might be running under par for a long time. – n0pe – 2011-04-11T03:54:59.607

3

I vote for CPU overheating. To test this, try opening your internet browser and going to http://www.webkit.org/perf/sunspider/sunspider.html

Hold down the CTRL key and click the "Start SunSpider 0.9.1 now!" link 50 times or so - each time you click, this will open a new tab and start the test in that tab. This is an intense enough test to max out your CPU if you run enough instances.

Note: Use an inefficient but multithreaded browser. Internet Explorer 8 is ideal for this test, Firefox is also good. Don't use Chrome or IE9 because they will finish the test too quickly, not giving your CPU enough time to heat up. Do not use Safari on Windows either as it is single-threaded and won't max out your CPU.

Open Windows Task manager, as well as CoreTemp (install it from here: http://www.alcpu.com/CoreTemp/ ) and run them while you're doing the test.

By the way, on idle your CPU should be anywhere between 20C and 40C depending on what CPU type you have. Notebooks generally run a little hotter on idle but that's not what we're talking about here.

Under load, the MAX a desktop CPU should be getting to should not exceed 75C if you're using stock settings. Again, notebooks run higher.
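To make those rule-of-thumb figures easy to apply to a reading, here is a small Python sketch. The thresholds are simply the numbers quoted in this answer (up to 40C at idle, up to 75C under load for a desktop CPU at stock settings), not official limits for any particular processor, and the function name is made up for illustration.

```python
MAX_IDLE_C = 40.0  # top of the idle range quoted in this answer
MAX_LOAD_C = 75.0  # load ceiling quoted in this answer


def check_temp(temp_c, under_load):
    """Classify a desktop CPU reading against the rule-of-thumb limits above."""
    limit = MAX_LOAD_C if under_load else MAX_IDLE_C
    return "ok" if temp_c <= limit else "too hot"


print(check_temp(81.0, under_load=True))   # too hot
print(check_temp(35.0, under_load=False))  # ok
```

The two example readings are the ones reported in the question: the roughly 81C peak seen just before a crash, and the 35C idle temperature after the thermal paste was replaced.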

Also: Can you share the make / model of your CPU and power supply?

Joshua

Posted 2011-04-07T21:28:22.597

Reputation: 4 290

I edited my question to add CPU info. Also, I'm on linux so I can't run those apps. I'm gonna try that tab thing you suggested. – n0pe – 2011-04-15T20:27:42.013

1If you are on linux use cpuburn, as I suggested in my answer, it has never failed to trigger overheat on my setups :) – WolfmanJM – 2011-04-15T23:46:31.640

So is it safe to say that if it crashes with cpuburn it's an overheating problem? – n0pe – 2011-04-16T18:11:36.733

yes it would be safe to say that – WolfmanJM – 2011-04-17T22:05:03.597

1

If all device temperatures are normal, then I think your MOBO sensor/chip was damaged during the previous power surge, but the damage wasn't apparent until now. Likely your onboard sensor is picking up wrong readings randomly and forcing a safety shutdown.

KoKo

Posted 2011-04-07T21:28:22.597

Reputation: 1 498