
I need a home computer for a simple backup task (just a cron job on Linux that will run once per day):

  1. Download a file from my production server (in a datacenter; it's a good server with Xeons, ECC RAM, etc.) to this home computer and verify checksums (a rough sketch of this step follows the list).
  2. Burn it to DVD-RW (later I will buy a Blu-ray drive for this; later still, though I'm not sure since the price is too high for me right now, I will buy a tape drive and write backups to LTO tapes).
  3. After burning, read the disc back and verify checksums again.
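
Roughly what I have in mind for step 1, as a minimal sketch (the hostname, paths, and tool choices are placeholders, not a fixed design); step 3 would repeat the same md5sum -c check against data read back from the disc:

    #!/bin/sh
    # Daily cron job, step 1: fetch the backup and verify its checksum.
    set -e

    REMOTE=user@production.example.com   # placeholder hostname
    DEST=/home/me/backups                # placeholder local directory

    # Download the image plus the checksum computed on the production server.
    rsync -a "$REMOTE:/var/backups/backup.iso" \
             "$REMOTE:/var/backups/backup.iso.md5" "$DEST/"

    # Verify the download against the server-side checksum
    # (assumes backup.iso.md5 references the file by its relative name).
    cd "$DEST" && md5sum -c backup.iso.md5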

Is it safe to use non-ECC RAM for that task (a cold backup server)?

Because I'm going to use rewritable discs, it will be free to repeat the operation on error, so it's acceptable to me if someday I need to spend some extra time on this operation.

But I'm not sure: can file verification save me from memory errors?

So do I need to buy a separate server with ECC RAM for my home for this task, or can I just use my old home PC (RAM without ECC) and not spend money at all? [I understand I could buy a server, it's all cheap today, but I prefer not to spend money if I can avoid it; a server would also use more energy, take up my time on administration, and I'd need to find space for it in my room.]

  • Find space for it in your room? What does that mean? – Michael Hampton Nov 28 '15 at 06:03
  • @MichaelHampton I will use it at home. I'm going to download backups over the Internet from the production server to a home computer and write them to DVD-RW... I would like to save some space in my room for books and not buy an additional PC if possible; it would take up space. – Alexander Ovchinnikov Nov 28 '15 at 06:08
  • This question is being voted for closure because it prominently involves a scenario in a home or home office environment which is both off topic on ServerFault and may itself be causing the problem, or actively hindering a repeatable and consistent solution. – Wesley Nov 28 '15 at 06:22
  • @Wesley It's a good idea to keep backups in two separate locations (apart from the server's datacenter). I use the Hetzner datacenter (Germany), I save backups to Amazon (USA), and I'm also going to write backups to external storage media (DVD, Blu-ray, and LTO tapes) at home (Moscow). Of course I'm going to encrypt my backups, and of course I'm not going to use my home computer as the primary backup solution. So I disagree with your opinion. – Alexander Ovchinnikov Nov 28 '15 at 06:29
  • @AlexanderOvchinnikov Any time a business solution ever includes personal property it ceases to be a business solution. Furthermore, it ceases to be on topic for ServerFault. – Wesley Nov 28 '15 at 06:40
  • @Wesley, I'm self-employed and work at home. I would like to add a new backup location for my personal startup project, of which I'm the 100% owner. Why can't I just use my personal PC at home for that? :-) In this question I'm just trying to understand whether file checksum verification will save me from memory errors or not. – Alexander Ovchinnikov Nov 28 '15 at 06:55
  • As another consultant sysadmin whose main office is a room in his home, I can understand how the lines get blurred between personal and work equipment, so I think this question's still on-topic - its approach seems professional, and that'll do for me. That said, when you say "*cold backup server*" do you mean "*backup server*" (ie, a server to do the backups)? Because a *cold backup server* to me is a unit that sits (powered off, on a shelf) ready to be brought into service if a main unit fails, and my answer is different depending on which of these two functions you have in mind. – MadHatter Nov 28 '15 at 08:02
  • @MadHatter I think "cold backup server" in this case means "cold backup generating server", so the backups it generates are "cold" akin to "cold storage", where there has to be some sort of process to "thaw" the stored goods prior to use, and not that the "backup server" will be used as a drop-in-place server to spin up in the event of loss of the original. – austinian Nov 28 '15 at 10:57
  • Me too, but I'd like the OP to confirm. – MadHatter Nov 28 '15 at 11:24
  • This guy seems to think so: http://blog.codinghorror.com/to-ecc-or-not-to-ecc/ – Rob Nov 28 '15 at 14:17
  • Yawn... this is a trivial question. – ewwhite Nov 28 '15 at 14:33
  • @MadHatter Yes, just a server for burning backups to DVD (4.7 GB/disc) / Blu-ray (25/50/100 GB per disc, good for backups, though 100 GB discs aren't available in Moscow) / LTO (not sure about LTO; even 25 GB on a Blu-ray disc is more than I really need now)... This server will use RAID-1 storage (the motherboard supports it) or a Btrfs filesystem in RAID-1 mode on 2 or 4 disks (going to buy 2-4 TB HGST HDDs for that) to hold files until burning and for about a week afterwards (it will save me some time). So the main goal is a cold-backup-generating server... plus a small cache of the last week's backups. – Alexander Ovchinnikov Nov 28 '15 at 15:30
  • @Rob Pfft, just some random on the Internet... – womble Nov 29 '15 at 01:38

4 Answers

6

You don't need ECC memory for this. What you need is end-to-end verification of the data integrity.


If you use both ECC memory and end-to-end integrity checks, then the ECC memory is just one of several intermediate stores your data travels through. All of them are covered by the end-to-end checks, so any corruption not caught by the ECC memory is treated just the same as corruption happening anywhere else in the chain.

If you were to use ECC memory with no end-to-end integrity checks, then the ECC memory could save you from some corruption that would otherwise go undetected. But without end-to-end integrity, corruption could still happen at other points along the way, and if it happens anywhere other than the ECC memory, there is nothing ECC can do to save you from it.

So ECC memory is neither sufficient nor necessary for the data integrity you need in your case, which is why I started this answer by saying that you don't need it.


One way to do end-to-end integrity checking is to produce an ISO image on the server itself and store a checksum of it (MD5 would be sufficient, since it is there to guard against data corruption from random bit flips, not against malicious activity).

After the image has been written to the final storage, the receiving machine reads the data back from that storage and computes a checksum, which it sends back to the server for verification. It is important that the checksum is computed by reading the data back from the final media; otherwise it would not be end-to-end integrity.
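
As a rough sketch of that read-back step (the device name, paths, and the use of growisofs here are illustrative assumptions, not part of the scheme itself):

    #!/bin/sh
    IMAGE=/var/backups/backup.iso   # image produced on the server (placeholder path)
    DEV=/dev/sr0                    # DVD burner (placeholder device)

    # Burn the image to the disc.
    growisofs -dvd-compat -Z "$DEV"="$IMAGE" || exit 1

    # Read back exactly as many 2048-byte blocks as the image contains.
    # Checksumming the drive, not the ISO file, is what makes this end to end.
    BLOCKS=$(( $(stat -c %s "$IMAGE") / 2048 ))
    dd if="$DEV" bs=2048 count="$BLOCKS" | md5sum

    # Send the printed checksum back to the server and compare it with the
    # one computed there; on a mismatch, re-burn the disc and verify again.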

If the comparison of checksums on the server detects corruption, you have to start the backup over again. If an extra run is needed too frequently, you can start investigating which part of the chain is introducing the corruption and look into improving the reliability of that part. At that point, upgrading the memory from non-ECC to ECC could be one possible solution.

This way ECC memory is not needed for data integrity, though it might improve the throughput of the full chain by reducing the number of retries.

kasperd
  • There's always a small chance that data will be corrupted before the first checksum is calculated. Also, you can detect faulty memory much faster and not waste time testing it if you suspect something is wrong. – DukeLion Nov 29 '15 at 16:39
  • @DukeLion True. The possibility exists to checksum the data as soon as it is produced in the first place and have checksums overlapping along the path the data takes such that you get end-to-end integrity without necessarily having a single end-to-end checksum. I felt that was a bit outside the scope of this question, so I didn't include it in my answer. – kasperd Nov 29 '15 at 23:33
  • OK, a real-world example: you take user input from a web form, strip the HTML tags, and save it to a database. What if a memory error hits you before the tag stripping is finished? – DukeLion Nov 30 '15 at 01:39
  • @DukeLion In that case the data is covered by a checksum as it leaves the user's computer. If it is using HTTP, there is only the TCP checksum. If it is using HTTPS, there is a MAC at the SSL layer. You could send the data with the checksum from the user to two different computers which both verify the checksum and manipulate the data as needed. One of the two computers sends the data to the database, the other sends a checksum of the data. If you consider this approach to be overkill, you can also just use ECC memory on the server side. – kasperd Nov 30 '15 at 10:23
  • I mean that the data you receive over TCP in your application is not exactly the same as what you store in the database. It may be altered before storing, and a memory bit flip can ruin it before you have a chance to compute a checksum. I agree that ECC is not needed in most cases, but if you run a bank, you absolutely need it. – DukeLion Dec 01 '15 at 10:55
  • @DukeLion I addressed that exact scenario in my [comment](http://serverfault.com/questions/739440/is-it-safe-to-use-not-ecc-ram-for-cold-backup-server/739566?noredirect=1#comment924827_739566) above. – kasperd Dec 01 '15 at 12:13
3

Statistically speaking, you're safe with non-ECC RAM in all situations. I buy ECC RAM so that when my number is up, I don't have to lie awake at night wondering if it was my fault or if it was truly unavoidable.

It's an expensive way to protect against corruption. Early-detection software and a well-organized, tested backup solution are much cheaper than outfitting every server with ECC RAM (where n > 1), and you should have those regardless of ECC. However, ECC RAM is super cheap as cover-your-butt insurance, whether it's your boss you have to face or your own thoughts.

Neil
2

If your MD5 is computed on your production server, you are safe, because any error introduced along the way will be caught by your final checksum verification.

Memory is not the only component that can corrupt data: network transmission and the DVDs/disks themselves can also introduce errors.

An end-to-end checksum will catch all of these errors (though it won't correct them).
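
For instance, with standard tools (file names here are placeholders):

    # On the production server: record a checksum next to the image.
    md5sum backup.iso > backup.iso.md5

    # On the backup machine, after the transfer, and again after reading
    # the burned disc back:
    md5sum -c backup.iso.md5    # prints "backup.iso: OK" or "... FAILED"

    # A FAILED result only tells you the copy is bad; the checksum carries
    # no redundancy, so the remedy is to transfer or burn again.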

Xavier Nicollet
  • Most common layer-2 protocols (Wi-Fi, Ethernet) do have checksumming, so I don't agree that data modification will go unnoticed if it happens on the wire. – DukeLion Nov 29 '15 at 11:02
  • MD5 will catch more errors: it is stronger and is calculated over the whole file, making it extremely unlikely to miss an unintentional error. Extra-careful people can also consider using sha256sum. – Xavier Nicollet Nov 29 '15 at 15:45
0

can file verification save me from memory errors?

File verification is a good idea, but it cannot compensate for memory errors, because the operating system may crash or hang if a memory error occurs.

Greg Askew