The LCD on one of our Dell PowerEdge servers was showing "CPU 2 machine check error", but I couldn't find anything in the logs regarding MCE or "Hardware Error." I cleared the message, but I wanted to put the machine under heavy load to see if I could make the error reappear.
I ran 64 copies of an infinite-loop bash script (one per core) for a few minutes. Then I used a program called "stress" to do the same thing with CPU and memory. My question is: how long should a stress test like this run before it's generally OK to say, "okay, this machine is good to go"? A few minutes? An hour? As long as CPU temps remain OK?
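
For reference, a minimal sketch of that kind of per-core busy-loop test. This is not the exact script that was run: the function name `burn_cpus` and the durations shown are illustrative assumptions, and the `stress` values in the comment are examples rather than the settings actually used.

```shell
#!/usr/bin/env bash
# Sketch: start one busy loop per core, let them spin for a fixed time,
# then stop them all. burn_cpus is a hypothetical helper name.

burn_cpus() {
    local cores="$1" seconds="$2" pids=() i
    for ((i = 0; i < cores; i++)); do
        # Each background subshell spins at 100% CPU until it is killed.
        ( while :; do :; done ) &
        pids+=("$!")
    done
    sleep "$seconds"
    kill "${pids[@]}" 2>/dev/null
    # wait returns non-zero for killed jobs, so ignore its exit status.
    wait "${pids[@]}" 2>/dev/null || true
    echo "ran ${#pids[@]} busy loops for ${seconds}s"
}

# Example invocations (commented out so sourcing this file does nothing):
#   burn_cpus "$(nproc)" 300
# Roughly comparable CPU + memory load with stress (values are illustrative):
#   stress --cpu 64 --vm 4 --vm-bytes 1G --timeout 300s
```

Running `burn_cpus 64 300` would keep 64 loops busy for five minutes. The bash loops only exercise the CPU, while the `stress` line also adds memory pressure through its `--vm` workers.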