I have an Inspur SA5212SC server that is reporting memory errors. The dmesg output contains the following:
{16}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
{16}[Hardware Error]: APEI generic hardware error status
{16}[Hardware Error]: severity: 2, corrected
{16}[Hardware Error]: section: 0, severity: 2, corrected
{16}[Hardware Error]: flags: 0x01
{16}[Hardware Error]: primary
{16}[Hardware Error]: fru_text: CorrectedErr
{16}[Hardware Error]: section_type: memory error
{16}[Hardware Error]: node: 0
{16}[Hardware Error]: device: 0
{16}[Hardware Error]: error_type: 2, single-bit ECC
Then I ran mcelog and got the following:
CPU 0 BANK 7
MISC 15038d086 ADDR 4d44e0a80
TIME 1450362165 Thu Dec 17 22:22:45 2015
MCG status:
MCi status:
Corrected error
MCi_MISC register valid
MCi_ADDR register valid
MCA: MEMORY CONTROLLER RD_CHANNEL3_ERR
Transaction: Memory read error
STATUS 8c00004000010093 MCGSTATUS 0
MCGCAP 1000c17 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 62
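For reference, the STATUS value above can be decoded by hand. Below is a minimal Python sketch, assuming the standard IA32_MCi_STATUS bit layout and the memory controller error code format (0000 0000 1MMM CCCC) described in the Intel SDM. The function name decode_mci_status and the dictionary keys are hypothetical and are not part of mcelog.

```python
# Transaction types for memory controller errors (MMM field), per the
# assumed Intel SDM encoding.
MEM_TRANSACTIONS = {
    0b000: "generic undefined request",
    0b001: "memory read",
    0b010: "memory write",
    0b011: "address/command",
    0b100: "memory scrubbing",
}


def decode_mci_status(status):
    """Decode selected fields of an IA32_MCi_STATUS value (assumed layout)."""
    result = {
        "valid": bool((status >> 63) & 1),        # VAL, bit 63
        "overflow": bool((status >> 62) & 1),     # OVER, bit 62
        "uncorrected": bool((status >> 61) & 1),  # UC, bit 61
        "misc_valid": bool((status >> 59) & 1),   # MISCV, bit 59
        "addr_valid": bool((status >> 58) & 1),   # ADDRV, bit 58
        # Corrected error count, bits 52:38 (meaningful when CMCI is supported)
        "corrected_error_count": (status >> 38) & 0x7FFF,
        "mscod": (status >> 16) & 0xFFFF,         # model-specific code, bits 31:16
        "mcacod": status & 0xFFFF,                # architectural code, bits 15:0
    }
    mcacod = result["mcacod"]
    # Memory controller errors have the form 0000 0000 1MMM CCCC
    if (mcacod & 0xFF80) == 0x0080:
        mmm = (mcacod >> 4) & 0x7
        cccc = mcacod & 0xF
        result["transaction"] = MEM_TRANSACTIONS.get(mmm, "reserved")
        # CCCC == 1111 means the channel is not specified
        result["channel"] = None if cccc == 0xF else cccc
    return result


# STATUS value taken from the mcelog output above
decoded = decode_mci_status(0x8C00004000010093)
for key, value in decoded.items():
    print(key, "=", value)
```

For this value the sketch reports a corrected memory read error on channel 3 with a corrected error count of 1, which agrees with mcelog's RD_CHANNEL3_ERR line. It identifies only the channel as the memory controller numbers it; how that channel maps to a slot label such as P1-DIMMA1 is board-specific and is not determined here.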
My dmidecode -t 17 output looks like this:
Locator: P1-DIMMA1
Bank Locator: P0_Node0_Channel0_Dimm0
Here is the OS information:
version: rsyslog-5.8.10-8.el6.x86_64
kernel: 2.6.32-431.el6.x86_64
OS version: Red Hat Enterprise Linux Server release 6.5
How can I confirm which memory module is faulty? I need to know its slot number. Thanks a lot.