Syslogd: hardware error


The machine has been sending these messages to the terminal, accompanied by beeps from the motherboard speaker. They appear every 5 minutes, sometimes naming CPU2, sometimes CPU3.

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792043] [Hardware Error]: CPU:2 MC0_STATUS[-|CE|-|-|AddrV|CECC]: 0x9467400000000136

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792059] [Hardware Error]: MC0_ADDR: 0x00000001f5925200

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792065] [Hardware Error]: Data Cache Error: during L1 linefill from L2.

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792073] [Hardware Error]: cache level: L2, tx: DATA, mem-tx: DRD

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792085] [Hardware Error]: CPU:2 MC1_STATUS[-|CE|-|-|AddrV]: 0x9400000000000151

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792093] [Hardware Error]: MC1_ADDR: 0x00000000004aa210

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792098] [Hardware Error]: Instruction Cache Error: Parity error during data load.

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792105] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792115] [Hardware Error]: CPU:2 MC2_STATUS[Over|CE|-|-|AddrV|CECC]: 0xd40041000000010a

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792124] [Hardware Error]: MC2_ADDR: 0x00000001d4fe5200

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792129] [Hardware Error]: Bus Unit Error: GEN parity/ECC error during data access from L2.

Message from syslogd@pc at Mar 25 17:52:20 ... kernel:[ 7200.792137] [Hardware Error]: cache level: L2, tx: GEN, mem-tx: GEN
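
For anyone reading the raw register values: the bracketed flags in each MCn_STATUS line can be recovered from the hex value itself. Below is a minimal Python sketch, assuming the standard x86 MCi_STATUS bit layout (Val = 63, Over = 62, UC = 61, En = 60, MiscV = 59, AddrV = 58, PCC = 57) plus the AMD-specific CECC/UECC bits (46 and 45). The `FLAGS` table and `decode_status` helper are illustrative names, not part of any tool.

```python
# Bit positions in a 64-bit MCi_STATUS value.
# Assumption: standard x86 MCA layout; CECC/UECC are AMD-specific bits.
FLAGS = {
    "Val": 63, "Over": 62, "UC": 61, "En": 60,
    "MiscV": 59, "AddrV": 58, "PCC": 57,
    "CECC": 46, "UECC": 45,
}


def decode_status(status: int) -> list[str]:
    """Return the names of the flags that are set in `status`."""
    return [name for name, bit in FLAGS.items() if (status >> bit) & 1]


# The three status values from the messages above.
for bank, value in [
    (0, 0x9467400000000136),
    (1, 0x9400000000000151),
    (2, 0xD40041000000010A),
]:
    print(f"MC{bank}_STATUS: {', '.join(decode_status(value))}")
```

With UC and PCC clear in all three values, this agrees with the kernel's "CE" (corrected error) label, and the Over bit in MC2 matches the "Over" flag the kernel printed for that bank.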

The system has an AMD Phenom II X4 955 running at stock clock speed. The BIOS is up to date (apart from a beta version). The system runs the newest version of Linux Mint Debian Edition. Temperatures are on the high end, but still acceptable (~45 degrees idle).

I have tested the system with memtest86+ for 15 hours (5 passes), and with prime95 for over 24 hours in total. Neither reported any errors, and the system is stable. Strangely enough, no syslogd messages appeared while prime95 was running. Windows does not report errors in its event log, but I have not spent enough time in Windows to be sure of this. I understand that CPUs rarely break, but maybe this is one of the rare cases? Is there any harm in simply disabling the messages from syslogd, given that I cannot detect any problems? If there is, what should I try next?

SillySyslogd

Posted 2014-03-25T17:21:25.100

Reputation: 11

Try running cpuburn. Perhaps that will help to narrow the problem down. Also install mcelog to help read and decode machine check exception events. – jpe – 2014-03-25T17:28:37.817

I am having some problems getting mcelog to work. I started the daemon and the errors keep occurring every 5 minutes, yet mcelog --client does not report anything at all. /var/log/mcelog is practically empty, except for a message about a failed prefill of the DIMM database (which is not a problem according to mcelog's FAQ) and a message that the daemon is already running. – SillySyslogd – 2014-03-25T18:10:02.340
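
While mcelog stays silent, the kernel messages themselves can still be summarised. Here is a rough Python sketch that counts `[Hardware Error]` status lines per CPU and bank, assuming the line format shown in the question. The `tally` function and the `sample` text are illustrative only; in practice the input would be saved kernel log text, for example the output of `dmesg`.

```python
import re
from collections import Counter

# Matches status lines such as:
#   [Hardware Error]: CPU:2 MC0_STATUS[-|CE|-|-|AddrV|CECC]: 0x...
PATTERN = re.compile(r"\[Hardware Error\]: CPU:(\d+) MC(\d+)_STATUS")


def tally(log_text: str) -> Counter:
    """Count status lines per (cpu, bank) pair in the given log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            cpu, bank = int(match.group(1)), int(match.group(2))
            counts[(cpu, bank)] += 1
    return counts


# Sample input copied from the messages in the question.
sample = """\
kernel:[ 7200.792043] [Hardware Error]: CPU:2 MC0_STATUS[-|CE|-|-|AddrV|CECC]: 0x9467400000000136
kernel:[ 7200.792059] [Hardware Error]: MC0_ADDR: 0x00000001f5925200
kernel:[ 7200.792085] [Hardware Error]: CPU:2 MC1_STATUS[-|CE|-|-|AddrV]: 0x9400000000000151
kernel:[ 7200.792115] [Hardware Error]: CPU:2 MC2_STATUS[Over|CE|-|-|AddrV|CECC]: 0xd40041000000010a
"""

for (cpu, bank), count in sorted(tally(sample).items()):
    print(f"CPU {cpu}, bank MC{bank}: {count} event(s)")
```

Run over a longer log, a count like this would show whether the errors really are confined to CPU2 and CPU3 and which banks are involved.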

It's reporting a faulty CPU; try a replacement/known-good CPU. – Ƭᴇcʜιᴇ007 – 2014-03-25T18:11:25.910

@techie007 Thanks for the response, but I knew this already. The main question was whether it can do harm to simply disable the messages as there are no problems with stability. – SillySyslogd – 2014-03-25T18:54:42.113

If you trust it, then disable them. Personally I wouldn't trust it until I saw a fresh copy of the same OS running on a known-good CPU do the same thing. :) – Ƭᴇcʜιᴇ007 – 2014-03-25T19:07:31.740

No answers