103

Lots of different programs, such as Darik's Boot and Nuke, let you write over a hard drive multiple times under the guise of it being more secure than just doing it once. Why?

Gilles 'SO- stop being evil'
Tom Marthenal

4 Answers

113

Summary: it was marginally better on older drives, but doesn't matter now. Multiple passes erase a tree with overkill but miss the rest of the forest. Use encryption.

The origin lies in work by Peter Gutmann, who showed that there is some memory in a disk bit: a zero that's been overwritten with a zero can be distinguished from a one that's been overwritten with a zero, with a probability higher than 1/2. However, Gutmann's work has been somewhat overhyped, and does not extend to modern disks. “The urban legend of multipass hard disk overwrite and DoD 5220-22-M” by Brian Smithson has a good overview of the topic.

The article that started it is “Secure Deletion of Data from Magnetic and Solid-State Memory” by Peter Gutmann, presented at USENIX in 1996. He measured data remanence after repeated wipes, and saw that after 31 passes, he was unable (with expensive equipment) to distinguish a multiply-overwritten one from a multiply-overwritten zero. Hence he proposed a 35-pass wipe as an overkill measure.

Note that this attack assumes an attacker with physical access to the disk and somewhat expensive equipment. It is rather unrealistic to assume that an attacker with such means will choose this method of attack rather than, say, lead pipe cryptography.

Gutmann's findings do not extend to modern disk technologies, which pack data more and more densely. “Overwriting Hard Drive Data: The Great Wiping Controversy” by Craig Wright, Dave Kleiman and Shyaam Sundhar is a more recent article on the topic; they were unable to replicate Gutmann's recovery with recent drives. They also note that the chances of recovering successive bits are not strongly correlated, meaning that an attacker is very unlikely to recover, say, a full secret key or even a byte. Overwriting with zeroes is slightly less destructive than overwriting with random data, but even a single pass with zeroes makes the probability of any useful recovery very low. Gutmann somewhat contests the article; however, he agrees with the conclusion that his recovery techniques are not applicable to modern disks:

Any modern drive will most likely be a hopeless task, what with ultra-high densities and use of perpendicular recording I don't see how MFM would even get a usable image, and then the use of EPRML will mean that even if you could magically transfer some sort of image into a file, the ability to decode that to recover the original data would be quite challenging.
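To put that "very unlikely" in perspective, here is a rough back-of-the-envelope illustration. It assumes, purely for illustration, that each bit is recovered independently with probability p, with p = 0.56 being the per-bit figure often quoted for used modern drives:

```python
# Illustrative only: assumes each overwritten bit is recovered independently
# with probability p; real recovery is messier, but not better-correlated.
p_bit = 0.56            # per-bit recovery probability often quoted for a used drive

p_byte = p_bit ** 8     # one full byte = 8 consecutive correct guesses
p_key  = p_bit ** 128   # a 128-bit secret key

print(f"P(recover one byte)     ~ {p_byte:.4%}")   # ~0.97%
print(f"P(recover 128-bit key)  ~ {p_key:.1e}")    # ~6e-33, effectively zero
```

Even with a per-bit success rate well above a coin flip, the chance of stringing together anything useful collapses exponentially.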

Gutmann later studied flash technologies, which show more remanence.

If you're worried about an attacker with physical possession of the disk and expensive equipment, the quality of the overwrite is not what you should worry about. Disks reallocate sectors: if a sector is detected as defective, then the disk will not make it accessible to software ever again, but the data that was stored there may be recovered by the attacker. This phenomenon is worse on SSD due to their wear leveling.

Some storage media have a secure erase command (ATA Secure Erase). UCSD CMRR provides a DOS utility to perform this command; under Linux you can use hdparm --security-erase. Note that this command may not have gone through extensive testing, and you will not be able to perform it if the disk died because of fried electronics, a failed motor, or crashed heads (unless you repair the damage, which would cost more than a new disk).
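For completeness, here is a minimal sketch of driving that command from Linux, wrapped in Python. The device path and password are placeholders, the flag names are taken from common hdparm usage (check man hdparm for your version), and the whole thing must run as root against a drive that is not security-frozen; it irreversibly erases the drive.

```python
import subprocess

DEV = "/dev/sdX"        # placeholder: the target drive (everything on it is destroyed)
PASSWORD = "erase-me"   # temporary password required by the ATA security feature set

def run(*args):
    """Run a command, echo it, and fail loudly on a non-zero exit code."""
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# 1. Check whether the drive reports support for ATA Security / Enhanced Erase.
run("hdparm", "-I", DEV)

# 2. Set a temporary user password (a prerequisite for issuing a security erase).
run("hdparm", "--user-master", "u", "--security-set-pass", PASSWORD, DEV)

# 3. Issue the erase itself; this can take hours and wipes the whole drive.
run("hdparm", "--user-master", "u", "--security-erase", PASSWORD, DEV)
```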

If you're concerned about an attacker getting hold of the disk, don't put any confidential data on it. Or if you do, encrypt it. Encryption is cheap and reliable (well, as reliable as your password choice and system integrity).
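(The paragraph above has full-disk encryption in mind, e.g. LUKS or BitLocker. Purely as a small illustration of the "encrypt it before it ever reaches the disk" idea, here is a sketch using the third-party Python cryptography package, which is an assumption; any authenticated encryption would do. Destroying the key then makes whatever copies linger on the platters useless, which is why erasure-by-encryption is attractive.)

```python
# pip install cryptography   (third-party package, assumed available)
from cryptography.fernet import Fernet

key = Fernet.generate_key()         # keep this key OFF the disk you may later abandon
cipher = Fernet(key)

plaintext = b"confidential notes"
with open("notes.enc", "wb") as f:  # only ciphertext ever touches the disk
    f.write(cipher.encrypt(plaintext))

with open("notes.enc", "rb") as f:  # normal use: decrypt with the key...
    assert cipher.decrypt(f.read()) == plaintext
# ...and destroying `key` is effectively a secure erase of notes.enc, no matter
# how many remapped sectors or backups still hold the ciphertext.
```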

Gilles 'SO- stop being evil'
  • 21
    I should point out that the US Government, when it wants to wipe any storage media that contains sensitive documents, will destroy the media itself; in the case of an HDD it's placed in a furnace. In the case of disc media (CDR, DVDR) a shredder does a wonderful job. – Ramhound Jan 09 '12 at 14:25
  • 1
    Indeed, Gutmann's findings do not extend to modern disk technologies, but that's not where research ended. Also, companies like Heise Security showed it is possible with modern (read: current) discs under the correct conditions. Physically destroying the media (as @Ramhound correctly stated) is probably the safest way to do it... yet, even those media will be multi-written and then low-level formatted before they are passed on for destruction. –  Jan 10 '12 at 00:53
  • 1
    It should be noted that the CMRR utility fails for modern drives too simply because it's a DOS tool; and DOS does not understand SATA or other kinds of recent HDD controller systems. – Billy ONeal Apr 18 '12 at 17:41
  • If a drive decides that a sector seems dubious and redirects all future writes somewhere else, will most secure-erase programs *ever* overwrite the original, or will they simply keep hitting the remapped sector? – supercat Feb 15 '16 at 18:57
  • @supercat I don't know. In principle, they should ensure that all sectors are physically unreadable, but that relies on an optimal implementation. Secure erase firmware is often very opaque. It's hard to test independently because the recovery attacks on logically-but-not-physically-erased sectors aren't cheap. – Gilles 'SO- stop being evil' Feb 15 '16 at 19:00
  • 4
    The purpose of Gutmann's 35 passes was to provide a "universal" wiping pattern that would work on any drive in use at the time, without the user needing to know what encoding system the drive used. It consisted of five patterns each done twice to cover MFM (the low storage density of MFM drives made them particularly vulnerable to data remnance), 15 patterns to cover (2,7)RLL, 18 patterns to cover (1,7)RLL, and eight passes with random data to try to deal with PRML. Since there's some overlap between patterns, this totals up to 35 passes. – Mark Nov 11 '17 at 00:03
  • Guttman is a real madlad... – NotStanding with GoGotaHome May 25 '22 at 15:55
36

There is a well-known reference article by Peter Gutmann on the subject. However, that article is a bit old (15 years) and newer hard disks might not operate as described.

Some data may fail to be totally obliterated by a single write due to two phenomena:

  • We want to write a bit (0 or 1) but the physical signal is analog. Data is stored by manipulating the orientation of groups of atoms within the ferromagnetic medium; when read back, the head yields an analog signal, which is then decoded with a threshold: e.g., if the signal goes above 3.2 (fictitious unit), it is a 1, otherwise, it is a 0. But the medium may have some remanence: possibly, writing a 1 over what was previously a 0 yields 4.5, while writing a 1 over what was already a 1 pumps up the signal to 4.8. By opening the disk and using a more precise sensor, it is conceivable that the difference could be measured with enough reliability to recover the old data.

  • Data is organized by tracks on the disk. When writing over existing data, the head is roughly positioned over the previous track, but almost never exactly over that track. Each write operation may have a bit of "lateral jitter". Hence, part of the previous data could possibly still be readable "on the side".

Multiple writes with various patterns aim at counterbalancing these two effects.
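As a toy illustration of the first effect (not a model of any real drive; all levels and noise figures below are invented), the "analog" read-back level can depend slightly on the previous bit, so a sufficiently precise sensor can guess the old value better than chance, while the drive's own threshold only ever sees the new value:

```python
import random

def read_level(old_bit, new_bit, noise=0.2):
    """Invented analog model: the new bit dominates, the old bit leaves a faint offset."""
    base = 4.5 if new_bit else 1.5         # nominal level for the freshly written bit
    remanence = 0.3 if old_bit else 0.0    # faint trace of the previous bit
    return base + remanence + random.gauss(0, noise)

random.seed(0)
trials, correct = 10_000, 0
for _ in range(trials):
    old, new = random.getrandbits(1), random.getrandbits(1)
    level = read_level(old, new)
    # The drive's threshold (around 3.0 here) only recovers the NEW bit.
    # A precise lab sensor instead compares the level against the expected
    # "clean" value and attributes any residual offset to the OLD bit.
    expected_clean = 4.5 if new else 1.5
    guessed_old = 1 if (level - expected_clean) > 0.15 else 0
    correct += (guessed_old == old)

print(f"old-bit guesses correct: {correct / trials:.1%}")  # well above 50%, far below 100%
```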

Modern hard disks achieve a very high data density. It makes sense that the higher the data density, the harder it becomes to recover traces of old overwritten data. It is plausible that recovering overwritten data is no longer possible with today's technology. At least, nobody is currently positively advertising such a service (but this does not mean that it cannot be done...).

Note that when a disk detects a damaged sector (checksum failure upon reading), the next write operation over that sector will be silently remapped to a spare sector. This means that the damaged sector (which has at least one wrong bit, but not necessarily more than one) will remain untouched forever after that event, and no amount of rewriting can change that (the disk's electronics will refuse to use that sector ever again). If you want to be sure to erase data, it is much better to never let it reach the disk in the first place: use full-disk encryption.

Tom Leek
  • 3
    +1 for this answer for going over the details of the science. As a computer scientist and full stack software engineer, this is the kind of answer that actually answered the question for me. – But I'm Not A Wrapper Class Nov 05 '14 at 16:05
6

The answers provided so far are informative but incomplete. Data are stored on a (magnetic) hard disk using Manchester coding: it's not whether the magnetic domain points up or down that encodes a one or a zero; it's the transitions between up and down that encode the bits.

Manchester coding usually starts with a little bit of nonsense data suitable for defining the 'rhythm' of the signal. It's possible to imagine that if your attempt to overwrite the data with all zeroes once wasn't exactly in phase with the timing under which the original data were stored, it'd still be super-easy to detect the original rhythm and edges, and reconstruct all of the data.
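To make the "transitions carry the bits" idea concrete, here is a tiny Manchester encoder/decoder. The convention chosen here (a 1 encoded as the half-bit pair (0, 1), a 0 as (1, 0)) is one of the two in common use, and real drives use more elaborate RLL/PRML codes, so this only illustrates the principle and the phase sensitivity described above:

```python
def manchester_encode(bits):
    """Each data bit becomes two half-bits; the mid-bit transition carries the data.
    Convention used here: 1 -> (0, 1), 0 -> (1, 0)."""
    out = []
    for b in bits:
        out.extend((0, 1) if b else (1, 0))
    return out

def manchester_decode(halves):
    """Recover the data bits from each half-bit pair's transition."""
    bits = []
    for i in range(0, len(halves), 2):
        pair = (halves[i], halves[i + 1])
        if pair == (0, 1):
            bits.append(1)
        elif pair == (1, 0):
            bits.append(0)
        else:
            raise ValueError("no mid-bit transition: stream is out of phase or corrupted")
    return bits

data = [1, 0, 1, 1, 0, 0, 1, 0]
line = manchester_encode(data)
assert manchester_decode(line) == data

# Decoding the same signal shifted by one half-bit fails immediately,
# which is the "rhythm"/phase issue mentioned above.
try:
    manchester_decode(line[1:] + [line[0]])
except ValueError as err:
    print("out-of-phase decode failed:", err)
```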

  • That's a good point, but there's a long way from “it's possible to imagine” to “you can actually recover information”. As far as I understand Wright et al's work, they did take this alignment issue into account. – Gilles 'SO- stop being evil' Jan 08 '12 at 18:59
  • Graham - it used to be relatively easy. It isn't so much these days - tolerances are tighter, and transitions are much smaller. – Rory Alsop Jan 11 '12 at 21:12
4

A nice question that takes a two-part answer:

  1. When talking about "ye average hard disc", there's a first reason to overwrite multiple times.

    In short: HDs are magnetic discs. There's a chance of remaining "shadow bytes" that could be recovered. That's the reason why the NSA and co have been using multiple overwrites for ages. These "shadow bytes" are nothing more than "ghostly" remnants of formerly deleted and overwritten data. HD recovery companies actually use those to recover critical data when it's lost.

    Another way would be to use strong magnets to kill your HD, but that way, chances are high you'll either not use a strong enough magnetic field, or you'll destroy more of your HD than you'd want to. So that's not really an option.

  2. You'll probably also want to check some of the data-erasure standards for more detailed information ( http://en.wikipedia.org/wiki/Data_erasure#Standards ) and notice that the standards which only overwrite once always make data-erasure verification mandatory (a sketch of that overwrite-and-verify idea follows this list).

    On "some operating systems" (note that I'm not starting a political discussion about the best or worst operating system here), this erasure-verification is not as solid as it should be... which means their "erasure-verification" is actually unsafe enough to worry about than even starting to think about potential shadow-byte remnants which could be recovered.

UPDATE

Since this is such a much-discussed subject, with many people arguing pro and contra while quoting the "Secure Deletion of Data from Magnetic and Solid-State Memory" paper, published by Peter Gutmann in 1996 ( http://www.usenix.org/publications/library/proceedings/sec96/full_papers/gutmann/index.html ), the following should be noted:

...Security researchers from Heise Security, who have reviewed the paper presented at last year's edition of the International Conference on Information Systems Security (ICISS), explain that a single byte of data can be recovered with a 56 percent probability, but only if the head is positioned precisely eight times, which in itself has a probability of occurring of only 0.97%. “Recovering anything beyond a single byte is even less likely,” the researchers conclude...

( source: http://news.softpedia.com/news/Data-Wiping-Myth-Put-to-Rest-102376.shtml )

On the other hand, there are military and governmental data-erasure standards that define multiple overwrites as a must. (I will refrain from judging that, but you should think about the "why" of this fact.)

In the end, it's a question of "what data" you want to "protect" by erasing it, and how important it is that the erased data will be unrecoverable in any potential case. Let's put it this way: if you want to delete your personal diary, you probably don't need to overwrite each and every free sector... but if you're working on plans for a nuclear power plant or some "secret project" for your government, you won't want to leave a single byte as is.

When people ask me personally, I always answer: "better safe than sorry. There are solid standards defining multiple-overwrites of freed sectors as a must. There is a reason for that. Scientific papers (even recent ones) show that recovery of erased data is possible using different means. Even when the chance is small, I wouldn't take the risk and I think it would be unprofessional to advise anyone not to think about that risk."

I guess that wraps it up best.

WRAPPING IT UP

So, to correctly answer your question:

Lots of different programs, such as Darik's Boot and Nuke, let you write over a hard drive multiple times under the guise of it being more secure than just doing it once. Why?

They do that, based on Gutmann's paper and the existing standards used by governmental institutions.

  • Just guessing, but it seems to me that writing random instead of zero bytes would be better as far as hiding the shadow goes... – 700 Software Jan 08 '12 at 00:05
  • 1
    @GeorgeBailey Very slightly better, but not so much that overwriting with zero bytes isn't effective; see http://grot.com/wordpress/?p=154 and the Wright, Kleiman and Sundhar paper or my answer. – Gilles 'SO- stop being evil' Jan 08 '12 at 01:28
  • 4
    @e-sushi, Can you cite a source on *HD recovery companies actually use those to recover critical data when it's lost*? – 700 Software Jan 08 '12 at 22:27
  • Source of that? A friend of mine runs a data-recovery company. So there's no "link" I can provide. Anyway, I've updated my answer with some more information to give you some of the useful details you seem to be missing. –  Jan 10 '12 at 00:46
  • 4
    "*NSA [...] have been using multiple-overwrites for ages*" Any source for this? My source say they have used physical destruction and not multiple-overwrites. Would love if you had a solid reference. – Nicolas Raoul Jul 30 '15 at 08:22
  • @NicolasRaoul solid references about a top sekret government agency in the states? – NotStanding with GoGotaHome May 25 '22 at 15:57