I've been looking into MCEs (machine check exceptions) because I'm currently tuning the overclock on my machine. I haven't run into any yet, as I caught all the bad settings with MemTest86+ first, but I know I might, so I have to check for them regularly for the first few months after an overclocking change. They can crop up when the hardware is pushed just barely past its limit, and they are a sign that you overclocked a tad too far.
The following lines are the important ones:
"HARDWARE ERROR."
and "MCA: BUS Level-3 Observed-error-as-third-party Generic Memory-access Request-did-not-timeout Error
Model:Response hard fail"
The other lines say that the registers in the processor are not causing the failure, and give the exact error details and exactly what was affected by the error. You won't need any of that information unless you are a kernel developer or a motherboard developer.
Your error appears to come from the memory.
It is what is commonly referred to as a die-hard fail, because your system still boots and just gets errors.
The following are common causes of problems with the memory, the memory controller, or the bus (in order of how easy each one is to fix):
1. Overclocking issues. (Timings on the RAM are too short, or the RAM bus speed is too high.)
2. Voltage issues. (Voltage to the RAM and/or CPU is set wrong in the BIOS, too low or too high, or the board is designed for RAM of a different voltage, e.g. you put 1.65 volt RAM in a board that takes 1.5 volt RAM.)
3. Overheating issues. (The CPU's RAM controller, the CPU cache, the motherboard and/or the RAM is overheating. This may be related to voltage issues.)
4. Bad power supply. (This is due to big issues.)
5. Bad memory. (Try testing with MemTest86+, including the dreaded bit fade test. Even if errors are detected, the memory itself may not be at fault.)
6. Bad BIOS. (WARNING: it may be dangerous to flash the BIOS while you have bad memory. Check your manufacturer's website to see whether there are known issues causing memory corruption, download and prepare the BIOS image on a different computer, and use the on-boot BIOS flasher to minimize the amount of resources in use, and thus the number of things that can go wrong.)
7. Bad motherboard and/or bad CPU. (I think this is obvious.)
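
If you end up checking for these errors regularly, as I have to after an overclocking change, a small script can pick the two important lines out of saved log text for you. This is only a minimal sketch: the matching rules are my own assumption based on the lines quoted above, not an official log format, and the function names and the filler line in the sample are made up for illustration.

```python
import re

# Assumption: the important lines either contain "HARDWARE ERROR" or carry
# an "MCA:" summary, as in the output quoted above. Real logs may differ.
IMPORTANT = re.compile(r"hardware error|\bMCA:", re.IGNORECASE)


def important_lines(log_text):
    """Return the lines of log_text that look like MCE summary lines."""
    return [line.strip() for line in log_text.splitlines()
            if IMPORTANT.search(line)]


def looks_memory_related(lines):
    """Rough heuristic: does any matched line mention a memory access?"""
    return any("memory-access" in line.lower() for line in lines)


# Sample built from the lines quoted above plus one hypothetical filler line.
SAMPLE = """HARDWARE ERROR.
(some other line with register details)
MCA: BUS Level-3 Observed-error-as-third-party Generic Memory-access Request-did-not-timeout Error
"""

hits = important_lines(SAMPLE)
for hit in hits:
    print(hit)
if looks_memory_related(hits):
    print("At least one error mentions a memory access.")
```

To use it on your own machine, read whatever log file or command output holds your MCE messages into a string and pass it to `important_lines` in place of `SAMPLE`. A hit only tells you an error was logged; working out the cause still means going through the list above.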
I would try running memtest on it. It's got to be available from your distro. – tshepang – 2011-05-20T17:27:00.973
You have a hardware problem, perhaps with the memory, probably with the motherboard. If you're overclocking, stop. If the motherboard is under warranty, try to get it replaced. – Gilles 'SO- stop being evil' – 2011-05-20T22:02:16.953