I am building a high-spec workstation on an X99 chipset, and I found that 64GB of DDR4 (non-ECC) RAM is quite affordable.
This got me wondering, because data integrity is important in my workloads. Specifically: what is the expected frequency of memory errors and data corruption, and at what capacity does ECC memory start to make sense?
There are several variables to balance here:
- system stability/data integrity/data corruption rate (affected by not only the type, but the quantity/density of the RAM)
- cost
- speed
The configuration options include:
- No ECC, high-end i7 CPU, overclocked, with somewhat faster RAM as well. This is cheaper.
- Xeon CPU, no overclocking, ECC RAM supported, many more RAM options available (registered/buffered), and much higher RAM capacities possible. This is more expensive.
This is somewhat related to this question, but I wanted to ask more specifically how I should balance these factors. Sometimes more speed at significantly lower cost, with slightly weaker data integrity guarantees, can still be a win, especially when it's not really clear which side of the server/workstation line we're on.
There are other factors too. With a non-ECC "consumer" system, you can "screen" it for stability by running memtest, which can be a significant time investment. The cost of that downtime and the effort involved should also be factored in.
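To make the capacity part of the question concrete, here is a rough sketch of how I would expect the error count to scale. It assumes errors are independent and that the rate is proportional to capacity and time (a simple Poisson model), which may not hold for real DIMMs. The function names and `PLACEHOLDER_RATE` are made up for illustration; the rate is not a measured figure, and a realistic value is exactly what I am asking for.

```python
import math


def expected_errors(capacity_gb, errors_per_gb_per_year, years=1.0):
    """Expected error count, assuming errors scale linearly with capacity and time."""
    return capacity_gb * errors_per_gb_per_year * years


def prob_at_least_one(capacity_gb, errors_per_gb_per_year, years=1.0):
    """Probability of at least one error under a Poisson model with the mean above."""
    mean = expected_errors(capacity_gb, errors_per_gb_per_year, years)
    return 1.0 - math.exp(-mean)


if __name__ == "__main__":
    # Hypothetical placeholder, NOT a measured value: substitute a real rate here.
    PLACEHOLDER_RATE = 0.01  # errors per GB per year

    for gb in (16, 32, 64, 128):
        mean = expected_errors(gb, PLACEHOLDER_RATE)
        p = prob_at_least_one(gb, PLACEHOLDER_RATE)
        print(f"{gb:>4} GB: expected errors/year = {mean:.2f}, P(>=1 error) = {p:.2%}")
```

Under this model, doubling the RAM doubles the expected number of errors, so the capacity at which ECC "makes sense" depends entirely on the real per-GB rate and on how much one corrupted result costs me.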