Running omreport chassis results in:
Health
Main System Chassis
SEVERITY : COMPONENT
Ok : Fans
Ok : Intrusion
Critical : Memory
Ok : Power Management
Ok : Processors
Ok : Temperatures
Ok : Voltages
Ok : Hardware Log
Ok : Batteries
For further help, type the command followed by -?
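The health table above can also be checked programmatically. The following is a minimal sketch, assuming the report text has the same "SEVERITY : COMPONENT" layout as shown above. The function name non_ok_components and the SAMPLE string are hypothetical helpers for illustration, not part of OpenManage.

```python
def non_ok_components(report: str) -> list[tuple[str, str]]:
    """Return (severity, component) pairs whose severity is not Ok.

    Assumes the 'SEVERITY : COMPONENT' table layout printed by
    'omreport chassis' in the output above.
    """
    results = []
    in_table = False
    for line in report.splitlines():
        if ":" not in line:
            continue  # headings and help hints carry no colon
        left, right = (part.strip() for part in line.split(":", 1))
        if (left, right) == ("SEVERITY", "COMPONENT"):
            in_table = True  # rows after this header are table rows
            continue
        if in_table and left and right and left.lower() != "ok":
            results.append((left, right))
    return results


# Hypothetical sample, shortened from the output above.
SAMPLE = """Health
Main System Chassis
SEVERITY : COMPONENT
Ok : Fans
Critical : Memory
Ok : Batteries
For further help, type the command followed by -?
"""

if __name__ == "__main__":
    for severity, component in non_ok_components(SAMPLE):
        print(f"{component}: {severity}")
```

In practice the report text would come from running omreport chassis and capturing its standard output. The sample string is used here only so the sketch runs without OpenManage installed.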
Running dcicfg command=clearmemfailures in order to clear the single-bit error (SBE) failures resulted in:
Clearing failures using mask: 31
DIMM_X1 : failed status: 270
Based on this message, the assumption was that the command must be issued against the specific memory module that is causing the issue.
Consulting the help by executing dcicfg command=clearmemfailures -? resulted in:
Dell(R) Data Engine Data Engine Configuration Utility 7.4.0 (BLD_1)
Copyright (C) Dell Inc. 1995-2013
Usage: dcicfg command=COMMAND [PARAMETERS...] [OPTIONS...]
COMMAND:
clearmemfailures Clear memory device failure mode
PARAMETERS:
listonly=BOOLN (opt.) list all occupied memory connectors
connectors=STRING (opt.) memory device connector name (default=all)
failures=STRING (opt.) failure type to clear (default=all)
Running omreport chassis memory indicates which memory module is causing the issue:
Index : 3
Status : Critical
Connector Name : DIMM_Y1
Type : DDRY - Synchronous Unregistered (Unbuffered)
Size : Y MB
However, issuing dcicfg command=clearmemfailures connectors=DIMM_Y1 reported that the memory connector could not be found:
Clearing failures using mask: 31
failed to find any memory connector based on the names provided
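The connector name passed to connectors= above was copied by hand from the omreport chassis memory listing. A minimal sketch of extracting it programmatically follows, assuming each device block starts with an "Index" line and uses the "Status" and "Connector Name" keys shown above. The function name failing_connectors and the sample text are hypothetical. This only reads the report; it does not explain why dcicfg rejects the name.

```python
def failing_connectors(report: str) -> list[str]:
    """Return connector names of memory devices whose Status is not Ok.

    Assumes 'Key : Value' lines grouped into blocks that each begin
    with an 'Index' line, as in the listing above.
    """
    devices = []
    current = {}  # lines seen before the first Index are collected but ignored
    for line in report.splitlines():
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Index":
            current = {}  # a new device block begins
            devices.append(current)
        current[key] = value
    return [
        device.get("Connector Name", "?")
        for device in devices
        if device.get("Status", "Ok") != "Ok"
    ]


# Hypothetical two-device sample in the layout shown above.
MEMORY_SAMPLE = """Index : 2
Status : Ok
Connector Name : DIMM_X1
Index : 3
Status : Critical
Connector Name : DIMM_Y1
"""

if __name__ == "__main__":
    print(failing_connectors(MEMORY_SAMPLE))
```

Comparing the names returned here with what dcicfg itself lists (the help mentions a listonly parameter for occupied connectors) may show whether the two tools use different connector names, although that is an untested assumption.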
Running omreport chassis memory index=3 indicates that the memory module has thrown SBEs:
Memory Device Information
Health : Critical
Status : Critical
Device Name : DIMM_Y1
Size : Y MB
Type : DDRY Synchronous Unregistered (Unbuffered)
Speed : Y ns
Rank : Dual
Failures : Single-bit warning error rate exceeded.
Single-bit failure error rate exceeded.
Questions
- What does the failed status 270 mean?
- Why can the memory connector not be found even though it has been specified and it exists?
- How to clear SBEs?
Attempts to solve the issue
The following commands from this Q&A:
- sudo omconfig system esmlog action=clear
- sudo omconfig system alertlog action=clear
were issued to clear the SBEs, but the Critical memory status persists.