Often when I see people talking about secure deletion, I see lots of advice saying you should overwrite with random values. My question is: why is that better or more effective than filling all bytes with 0xFF, which turns all bits on?
3 Answers
Multiple overwrite passes with random values were recommended in the 1980s and 1990s, as described in the history recounted by @FlorianBidabe. Back when data was encoded on disk platters that had physically wide read/write heads and space between the tracks to ensure data isolation, a write head might erase 80% of the magnetic media holding the bit, ensuring the drive's crude electronics couldn't read it. But these loose tolerances left 20% of the magnetic domain behind, which was exploitable by someone looking to recover data from the disk. Writing random data instead of all zeroes helped mask the original value of the bit. It was discovered that multiple passes of writes increased the chances that more of that remaining 20% would be overwritten, giving people more confidence their data was truly unrecoverable.
This recommendation is no longer applicable to modern disk hardware. The reason is that hard drive manufacturers have been improving manufacturing tolerances to increase the data density on disk drives. That 80% sloppy write head of yesterday was replaced with more accurate motors, smaller read/write heads, and completely new technology such as vertical recording of magnetic fields. Today, an encoded field occupies a tiny space of only a few atoms on the platter, which is how manufacturers can get data densities in the range of 2.5 TB/square inch or more.
But no matter how tiny the magnetic fields on the platters have become, drive manufacturers have been incorporating sophisticated computing techniques to improve drive performance and reliability. These techniques are meant to preserve data, and they can actually prevent the secure deletion of data by overwriting. For example, a drive reserves a fraction of unused spare blocks, and if part of the platter becomes unreliable the firmware automatically remaps the data from the failing sectors to the spares. That means it no longer matters whether you overwrite the disk with zeros, 0xFFs, or random numbers: if a sector that went bad used to hold a secret, that secret will never be overwritten by any pattern of wiping.
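To make the remapping problem concrete, here is a minimal, purely illustrative Python sketch of a toy drive whose firmware silently retires a failing sector. The class and method names are hypothetical, not any real drive's interface:

```python
# Illustrative toy model of firmware-level sector remapping, showing why
# a full overwrite from the host can miss data. Hypothetical interface.

class ToyDrive:
    def __init__(self, num_sectors):
        self.sectors = {i: b"\x00" * 512 for i in range(num_sectors)}
        self.spares = {}       # physical spare sectors, invisible to the host
        self.remapped = set()  # logical sectors the firmware has retired

    def write(self, lba, data):
        if lba in self.remapped:
            self.spares[lba] = data   # host writes land on the spare copy
        else:
            self.sectors[lba] = data

    def firmware_retire(self, lba):
        # Firmware decides sector `lba` is failing: it copies the data to
        # a spare and stops using the original. The host is never told.
        self.spares[lba] = self.sectors[lba]
        self.remapped.add(lba)

drive = ToyDrive(num_sectors=4)
drive.write(2, b"top secret".ljust(512, b"\x00"))
drive.firmware_retire(2)   # sector 2 goes bad; data moves to a spare

# The host "securely wipes" every logical sector it can see...
for lba in range(4):
    drive.write(lba, b"\xff" * 512)

# ...but the retired physical sector still holds the secret.
print(b"top secret" in drive.sectors[2])  # True
```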
These continual improvements of hard drive technologies have created the need for more reliable techniques to sanitize disk drives. Today, cryptographic wiping is recommended. The data on the drive can be encrypted with a random key stored only in a flash chip on the drive. To destroy the data, the flash memory is erased and the only copy of the key is wiped out of existence. With no decryption key available, the encrypted data is securely and instantly destroyed.
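As a rough illustration of the principle (not any vendor's actual firmware), here is a sketch using the third-party Python `cryptography` package: once the only copy of the key is discarded, the ciphertext that remains on the medium is useless:

```python
# Minimal sketch of the crypto-erase idea (pip install cryptography).
# Real drives do this in firmware with a key held in a flash chip;
# this only illustrates the principle.
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # stands in for the key in the drive's flash
cipher = Fernet(key)

stored = cipher.encrypt(b"sensitive data")  # everything on "disk" is ciphertext
print(cipher.decrypt(stored))               # b'sensitive data' -- key still present

# Crypto-erase: destroy the only copy of the key. The ciphertext is still
# physically present but is now computationally unrecoverable, regardless
# of any remapped sectors still holding old copies of it.
key = None
cipher = None
```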
For more information on acceptable disk sanitization techniques, I recommend reading NIST Special Publication 800-88, Revision 1.
Data remanence is the residual representation of digital data that remains even after attempts have been made to remove or erase the data. This residue may result from data being left intact by a nominal file deletion operation, by reformatting of storage media that does not remove data previously written to the media, or through physical properties of the storage media that allow previously written data to be recovered.
The simplest overwrite technique writes the same data everywhere—often just a pattern of all zeros. At a minimum, this will prevent the data from being retrieved simply by reading from the media again using standard system functions.
In an attempt to counter more advanced data recovery techniques, specific overwrite patterns and multiple passes have often been prescribed. These may be generic patterns intended to eradicate any trace signatures, for example, the seven-pass pattern: 0xF6, 0x00, 0xFF, random, 0x00, 0xFF, random; sometimes erroneously attributed to the US standard DOD 5220.22-M.
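For illustration, here is a minimal Python sketch of that kind of multi-pass overwrite applied to a single file. Note that on modern filesystems and SSDs this offers no real guarantee: journaling, copy-on-write, wear leveling, and sector remapping can all leave old copies behind.

```python
# Illustrative only: file-level overwriting implementing the seven-pass
# pattern quoted above. None means a pass of random bytes.
import os

PASSES = [b"\xf6", b"\x00", b"\xff", None, b"\x00", b"\xff", None]

def overwrite_file(path, passes=PASSES, chunk=1 << 16):
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pattern in passes:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk, remaining)
                f.write(os.urandom(n) if pattern is None else pattern * n)
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # push each pass out of the OS cache

# Usage: overwrite_file("secret.bin")
```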
Feasibility of recovering overwritten data
Peter Gutmann investigated data recovery from nominally overwritten media in the mid-1990s. He suggested magnetic force microscopy may be able to recover such data, and developed specific patterns, for specific drive technologies, designed to counter such recovery.[2] These patterns have come to be known as the Gutmann method.
Daniel Feenberg, an economist at the private National Bureau of Economic Research, claims that the chances of overwritten data being recovered from a modern hard drive amount to "urban legend".[3] He also points to the "18½ minute gap" Rose Mary Woods created on a tape of Richard Nixon discussing the Watergate break-in. Erased information in the gap has not been recovered, and Feenberg claims doing so would be an easy task compared to recovery of a modern high density digital signal.
As of November 2007, the United States Department of Defense considers overwriting acceptable for clearing magnetic media within the same security area/zone, but not as a sanitization method. Only degaussing or physical destruction is acceptable for the latter.[4]
On the other hand, according to the 2006 NIST Special Publication 800-88 (p. 7): "Studies have shown that most of today’s media can be effectively cleared by one overwrite" and "for ATA disk drives manufactured after 2001 (over 15 GB) the terms clearing and purging have converged."[5] An analysis by Wright et al. of recovery techniques, including magnetic force microscopy, also concludes that a single wipe is all that is required for modern drives. They point out that the long time required for multiple wipes "has created a situation where many organisations ignore the issue all together – resulting in data leaks and loss."[6]
So, not too long ago, this used to be the recommendation. The reason was that Peter Gutmann devised a way to recover (with better than 50% odds) the previous state of a bit. Overwriting with a single known value would essentially make it possible to reverse the write, whereas random data masks the previous state. The accepted answer on this question has more details.
The origin lies in work by Peter Gutmann, who showed that there is some memory in a disk bit: a zero that's been overwritten with a zero can be distinguished from a one that's been overwritten with a zero, with a probability higher than 1/2. However, Gutmann's work has been somewhat overhyped, and does not extend to modern disks.
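A quick back-of-the-envelope calculation shows why per-bit odds only slightly above 1/2 do not translate into usable data; the 0.56 figure below is an assumed illustrative value, not Gutmann's measurement:

```python
# Per-bit recovery odds compound: every bit must be guessed correctly.
p_bit = 0.56                   # assumed, illustrative per-bit probability
p_byte = p_bit ** 8            # all 8 bits of a byte must be right
p_sector = p_bit ** (512 * 8)  # a full 512-byte sector

print(f"{p_byte:.4f}")    # ~0.0097: fewer than 1 byte in 100 comes back intact
print(f"{p_sector:.3e}")  # underflows to 0: effectively no chance for a sector
```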
It's less of a concern these days.