Is ECC RAM recommended for use in workstations, or is it something that only gets used in servers? If non-ECC RAM works in PCs, why would we need ECC RAM at all?
As stuff is stored into, left, and eventually pulled out of RAM, some corruption naturally occurs (theories vary, but the one with the most weight right now is EMI from the computer itself). ECC is a feature of RAM and motherboards that allows detection and correction of this corruption.
The corruption is usually pretty minor (typical ECC corrects a single flipped bit and detects two flipped bits per 64-bit "word", and that's way beyond the typical error rates), but it increases in frequency with the density of the RAM. Your average workstation/PC will never notice it. On a server where you're running high-density RAM 24/7 in a high-demand environment serving critical services, you take every step you possibly can to prevent stuff from breaking.
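To put a number on the cost of that protection, here is a minimal sketch that counts the check bits a Hamming code extended with one overall parity bit (SEC-DED) needs for a given word width. The function name `secded_check_bits` is made up for illustration. Real memory controllers may use other codes, so treat this as arithmetic under that assumption, not a description of any particular chipset.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits for a Hamming code plus one overall parity bit (SEC-DED)."""
    r = 0
    # Single-error correction needs 2**r >= data_bits + r + 1, so that the
    # syndrome can name every bit position plus the "no error" case.
    while 2 ** r < data_bits + r + 1:
        r += 1
    # One extra overall parity bit lets the decoder also detect double errors.
    return r + 1


for width in (8, 16, 32, 64):
    check = secded_check_bits(width)
    print(f"{width:>2} data bits -> {check} check bits "
          f"({check / width:.1%} extra storage)")
```

For a 64-bit word this comes out at 8 check bits, which matches the 72-bit width of common ECC DIMMs.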
Also note that ECC RAM must be supported by your motherboard, and the average workstation/PC does not support it.
ECC RAM is more expensive than non-ECC, is much more sensitive to clock speeds, and can incur a small (1-2%) performance hit. If it helps, an analogy that works is RAM to RAID controllers. On your PC, that hardware-assisted software RAID built into your chipset is great protection against single disk failures. On a server, that would never be enough. You need high-end, battery-backed fully hardware RAID with onboard RAM to ensure that you don't lose data due to a power outage, disk failure, or whatever.
So no, you don't really need ECC RAM in your workstation. The benefit simply will not justify the price.
- As also pointed out in [Basil Bourque's answer](http://serverfault.com/a/441076/58408), the prices have pretty much converged these days (when I looked most recently, the price difference was the ~10% you'd expect from the additional chip area, for the same usable amount of RAM). It might be worth revisiting particularly the last sentence in light of this. – user Oct 18 '15 at 13:07
If this article is anything to go by, then you should use ECC RAM.
It's not just a matter of "I don't run a server, so I don't need it". It depends how much you value your data. It's not just a matter of occasional crashes - the problem is you could get corruption and have no way of knowing that it's going on.
- From the article: "[...] 4 GB of RAM has a 96% percent chance of having a bit error in three days without ECC RAM". This sounds like computers should be crashing constantly and data should become corrupted all the time. Yet everyone seems to be doing pretty fine without ECC... why? – Calimo Jun 08 '15 at 14:56
- That's because that article is false when it comes to the error rate. The actual error rate is lower by many orders of magnitude. See the relevant reddit thread https://www.reddit.com/r/programming/comments/ayleb/got_4gb_ram_no_ecc_then_you_have_95_probability/ – mimrock Aug 11 '15 at 11:32
- Whatever the error rate is, it also depends what is affected. Chances are it's not something that causes a system crash. – sudo Jun 17 '18 at 01:43
ECC RAM gets more interesting as memory sizes grow. The probability of a single bit error in a machine with 8GB of RAM is quite a lot higher than it was in the days of a 640K PC/XT, simply due to the larger number of bits. On a database server where that RAM might be in a disk buffer, a bit error can corrupt disk storage as well. Generally you would expect to use ECC memory on a server.
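The scaling argument can be sketched numerically. If every bit independently has some small probability p of flipping in a given interval, the chance of at least one flip among n bits is 1 - (1 - p)^n. The value of `ASSUMED_P` below is purely hypothetical and is not a measured error rate; only the ratio between the two machines is meaningful here.

```python
import math


def p_at_least_one_error(total_bits: int, per_bit_probability: float) -> float:
    """1 - (1 - p)**n, computed with log1p/expm1 to stay accurate for tiny p."""
    return -math.expm1(total_bits * math.log1p(-per_bit_probability))


ASSUMED_P = 1e-15  # hypothetical per-bit flip probability, NOT a measured rate

xt_bits = 640 * 1024 * 8        # 640K PC/XT
modern_bits = 8 * 1024**3 * 8   # 8GB machine

p_xt = p_at_least_one_error(xt_bits, ASSUMED_P)
p_modern = p_at_least_one_error(modern_bits, ASSUMED_P)

print(f"640K machine: {p_xt:.3e}")
print(f"8GB machine:  {p_modern:.3e}")
print(f"ratio:        {p_modern / p_xt:,.0f}x")
```

As long as p is tiny, the ratio is close to the ratio of memory sizes (roughly 13,000 to 1), whatever the real per-bit rate turns out to be.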
Some workstations (particularly those with Xeon or Opteron CPUs) take registered memory, which pretty much only comes in ECC flavours anyway. On a desktop PC you may view it as overkill.
If you want a reliable workstation then you want ECC RAM for it. It will crash less often, and the work done on it and the documents cached in RAM will be far less likely to be randomly corrupted.
- It seems like an immeasurably small chance of improved stability. The only RAM-related crashes I'm aware of on workstations are due to bad RAM or bad applications, never something that ECC would have prevented. It makes some (read: still only a tiny bit) of sense on servers where you're crunching terabytes of data constantly, but on workstations maybe the only thing that gets close is high-end graphics rendering or video processing. In short, I think you can get a completely reliable workstation without ECC RAM. – Chris Thorpe Feb 08 '10 at 04:55
- I ran memtest86 several times overnight without any error. That's how often memory flip occurs... If lives depend on it, that would justify using ECC, otherwise I don't think this is a real issue 99.9% of the cases. It is very unlikely that 1 random bit a month will hit something critical in terabytes of data. – inf3rno May 07 '17 at 19:57
- @inf3rno That argument? Overnight is nothing. Altitude matters. I live in Colorado and on ECC system see several correctable errors each month. Come back with logs from an ECC system or you have no information. Also read Google's report on ECC errors. – Zan Lynx May 07 '17 at 20:04
- @ZanLynx How many errors do you see per month? What would be the impact on your system? – inf3rno May 07 '17 at 20:16
- @ZanLynx I found something better: https://en.wikipedia.org/wiki/Row_hammer – inf3rno May 08 '17 at 23:21
- @ZanLynx I read your article, it is nice. I will sell my non-ECC RAM and buy ECC ones for my new microserver. It is important because servers boot rarely and without booting there is no frequent RAM check, so a hard error can stay in the system undetected for a long time and can corrupt data. – inf3rno May 08 '17 at 23:29
An additional benefit of ECC over what was mentioned above is that you can detect bad RAM. While running a long memtest86 session will usually find any problems, there may be very specific problems with the RAM which only show up rarely and in certain use cases. This can still happen much more frequently than the corruption that perfectly good ECC RAM is designed to protect against -- maybe once every month. So if you install monitoring software, you can be sure that your RAM is good, or replace bad chips. Still a marginal benefit, but as ECC memory is not much more expensive than normal RAM, it may be worth it.
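As a rough idea of what such monitoring can look like, here is a minimal sketch that assumes a Linux machine with the kernel's EDAC driver loaded, which exposes per-memory-controller counters of corrected (`ce_count`) and uncorrected (`ue_count`) errors under `/sys/devices/system/edac/mc`. The helper name `read_edac_counts` is my own, and other operating systems or platforms without an EDAC driver will need a different tool.

```python
from pathlib import Path

# Assumption: Linux with an EDAC driver loaded for the memory controller.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root: Path = EDAC_ROOT) -> dict:
    """Return {controller: {"ce_count": n, "ue_count": m}} for each controller."""
    counts = {}
    if not root.is_dir():
        return counts  # no EDAC support visible on this machine
    for controller in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((controller / name).read_text().strip())
            except (OSError, ValueError):
                continue  # counter missing or unreadable, skip it
        if entry:
            counts[controller.name] = entry
    return counts


found = read_edac_counts()
if not found:
    print("No EDAC memory controllers found.")
for controller, entry in found.items():
    print(f"{controller}: corrected={entry.get('ce_count')} "
          f"uncorrected={entry.get('ue_count')}")
```

Run from cron or a monitoring agent, a steadily climbing corrected-error count on one controller is the hint that a stick is going bad.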
ECC RAM is designed to aid in detecting and fixing memory-based errors, usually using some sort of Hamming code or modular redundancy. This is very useful in servers that contain important data or need high availability, but it comes at a cost.
Whilst it's probably worth paying the extra for your important servers, do you really want to do so for your desktop machine? Does it matter if there is the occasional memory error? Sure, it matters if your SQL database drops some data during a transaction, but do you care if your Word document is affected by a slight memory blip?
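For anyone curious how a Hamming code fixes a flipped bit, here is a toy sketch: a Hamming(7,4) code with one extra overall parity bit, protecting 4 data bits with 4 check bits. This is a scaled-down illustration of the idea only. Real ECC memory works on much wider words and the exact code is up to the memory controller; the names `encode` and `decode` are mine.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits into 8 bits: index 0 is overall parity, 1..7 Hamming."""
    code = [0] * 8
    # Data bits live at the positions that are not powers of two.
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions with bit 0 set
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions with bit 1 set
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions with bit 2 set
    code[0] = sum(code[1:]) % 2             # overall parity over positions 1..7
    return code


def decode(received: List[int]) -> Tuple[Optional[int], str]:
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(received)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of the positions of all set bits
    parity_odd = sum(code) % 2 == 1
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # Odd number of flips: assume exactly one. The syndrome is its position
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return None, "uncorrectable"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
word[5] ^= 1                                # simulate a single bit flip
print(decode(word))
word[2] ^= 1                                # a second flip in the same word
print(decode(word))
```

The first call recovers the original value and reports a correction; the second reports that the word cannot be trusted, which is exactly the "correct one, detect two" behaviour people mean when they talk about ECC.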
ECC memory now costs about the same as non-ECC memory, as prices have dropped. So check prices; if prices are anywhere close, buy ECC if your workstation accommodates it.
I think there may be some confusion just based on the title of the question.
If you just mean the average desktop PC, then that is usually based on a platform that doesn't even have ECC support.
If you mean a workstation class computer, then it quite likely comes with ECC memory whether you care about it or not.
Overall, the workstation class is typically based on essentially server hardware but with proper graphics and packaged in a different form-factor.
The expected workload is also more taxing than that of the desktop PC, so if you acknowledge that ECC makes sense for servers, then I think it's not much of a stretch that ECC also makes sense for workstations.
For Desktop PCs, there's some debate whether ECC would make sense or not. It can absolutely be argued that everything ought to have ECC but, right now, it's not practical as the industry has decided to make ECC a feature to differentiate higher end hardware.
According to the article Zan Lynx linked in the comments, "DRAM Errors in the Wild: A Large-Scale Field Study", memory errors are dominated by hard errors from failing hardware, while random soft errors appear rarely in a system. The incidence is probably a few per year, but it depends on the usage.
So in a server environment the random soft errors might not be that important, but you boot server machines rarely, so hard errors caused by failing RAM can sit there undetected for a while, corrupting your data. I think that's the main reason why servers need ECC. Workstations boot frequently and check RAM on each boot, so hardware failures can be detected at every reboot. If that frequency is sufficient for your business, then I think you won't need ECC RAM in your workstation.
If we are talking about memory errors, it is better to version the important documents on the server. So if the workstation reads and modifies something, then the original content should not be overwritten on the server. Regular backups can do the same for you.
Another aspect of this question is security. If your workstation is connected to any untrusted network, then it might be vulnerable to the row hammer attack, which exploits a DRAM-related phenomenon. So from a security perspective it is better to use ECC RAM, which makes that kind of attack harder.
I would use ECC everywhere, always, and I'm upset and disappointed I cannot get it on my MacBook Pro laptop, because otherwise it could completely replace my desktop server. In my desktop server, over 10 years I have had to replace 2 memory sticks due to persistent errors which would have just been random crashes and data corruption if it were not for ECC detecting and correcting them. Not counting all the errors in those failed sticks, I would estimate seeing about 8 errors per year. Of course, this is a tiny sample and anecdotal evidence; a much better resource is Google's study which showed that about 1/3 of their servers experienced at least one memory error in a given year.
I agree with Linus Torvalds (creator of Linux) that the real driving force behind keeping ECC unavailable is Intel:
Torvalds takes the bold position that the lack of ECC RAM in consumer technology is Intel's fault due to the company's policy of artificial market segmentation. Intel has a vested interest in pushing deeper-pocketed businesses toward its more expensive—and profitable—server-grade CPUs rather than letting those entities effectively use the necessarily lower-margin consumer parts.
Removing support for ECC RAM from CPUs that aren't targeted directly at the server world is one of the ways Intel has kept those markets strongly segmented. Torvalds' argument here is that Intel's refusal to support ECC RAM in its consumer-targeted parts—along with its de facto near-monopoly in that space—is the real reason that ECC is nearly unavailable outside the server space.