zero fill vs random fill

24

8

Many tutorials suggest that I should fill a disk with /dev/urandom instead of /dev/zero if I want it to be unrecoverable. But I don't quite get it: how can a disk still be recoverable after being zero-filled? And is this just very specialized people (read government agencies) who can recover a zero-filled disk, or something your average geek can do?

PS: I'm not THAT worried about my data. I sell used computers from time to time, and I'd rather the average Joe buyers not recover anything funny from them.

Waleed Hamra

Posted 2012-12-21T20:15:37.390

Reputation: 504

2I'm adding this as a comment because I'm not 100% certain myself, but from using various programs to do this, there's often the option to set how many passes of zero-writes you want to do, which leads me to believe that even writing once doesn't totally obliterate the data. Writing random characters would make it harder for baddies to tell which layers or whatever were good data, and which are just random gibberish. – Chris Stauffer – 2012-12-21T20:21:27.623

I think this program is all you'll ever need for this type of thing (DBAN), the documentation extensively covers this topic.

– Breakthrough – 2012-12-21T23:05:00.600

Answers

14

Filling a disk with /dev/zero zeroes it out, and most (currently available) recovery software cannot recover files after even a single pass. More passes make the erase more secure, but take more time.

/dev/urandom is considered more secure, because it fills the disk with random data (from the Linux kernel's entropy pools), making it harder for recovery software to find any meaningful data (it also takes longer).

In short, a moderate number of passes from /dev/urandom is safer if you are trying to securely erase data, but for most casual applications, a few passes from /dev/zero will suffice.

I usually use the two in combination when erasing disks (always erase before reselling or recycling your computer!).
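As a minimal sketch of combining the two sources, the following runs one pseudo-random pass and then one zero pass. It deliberately targets a scratch file created with mktemp, not a disk; the variable names and the 4 MiB size are assumptions made for the demo. For a real wipe the of= argument would be the block device, which is destructive and needs to be double-checked before running.

```shell
#!/bin/sh
# Hypothetical demo target: a scratch file, NOT a real disk.
target=$(mktemp)
size_mib=4

# Pass 1: pseudo-random data from /dev/urandom.
dd if=/dev/urandom of="$target" bs=1048576 count="$size_mib" conv=notrunc 2>/dev/null

# Pass 2: zeros from /dev/zero, written over the same byte range.
dd if=/dev/zero of="$target" bs=1048576 count="$size_mib" conv=notrunc 2>/dev/null

# Verify: after the final pass, no non-zero byte should remain.
nonzero=$(tr -d '\0' < "$target" | wc -c | tr -d ' ')
size=$(wc -c < "$target" | tr -d ' ')
echo "size: $size bytes, non-zero bytes remaining: $nonzero"
```

Note that on a regular file the filesystem decides where the bytes land, so this only illustrates the order of the passes, not a guarantee about what happens on the physical medium.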

neersighted

Posted 2012-12-21T20:15:37.390

Reputation: 1 206

Ah, I wasn't very far off with my comment. Thank you for posting this answer! – Chris Stauffer – 2012-12-21T20:23:10.620

2@DavidSchwartz I always like to be safe, instead of sorry. Future proofing and paranoia are generally good things when dealing with potentially sensitive data. The US DoD's standard is for three passes for all HDDs, even modern ones. There is also no way of knowing the recency of the Asker's HDD, or if it is an SSD, in which case erasing data is near impossible without filling the whole thing with random data (the urandom method), which will cause massive wear on the SSD, potentially rendering it unusable. – neersighted – 2012-12-21T20:34:30.767

care to explain why wiping an SSD is near impossible? – Waleed Hamra – 2012-12-21T20:37:54.330

3

@WaleedHamra The comments section is much too short, please read this excellent article at Ars Technica. The technology has improved a bit since then (2011), but you should not assume an SSD (or any solid-state device) will reliably delete data when trying to erase sensitive documents.

– neersighted – 2012-12-21T20:41:08.603

6Regarding SSDs: The article is mostly concerned with the deletion of single files, not the entire drive. Regarding HDDs: With modern HDDs, there's no way to recover even a single bit with decent probability. Saying that most recovery software can still recover files from a single pass is simply wrong. – Dennis – 2012-12-21T20:45:39.850

very interesting, actually, a linked article provides better explanation as to why. thanks @neersighted

– Waleed Hamra – 2012-12-21T20:49:12.177

@Dennis The issue is filling the SSD with random data can wear it out rapidly (zeroing is not quite as bad, but still damaging). Also, David Schwartz's statement was that no (currently available) recovery software could recover from a single pass (on a modern HDD), not the inverse. – neersighted – 2012-12-21T20:49:45.957

@DavidSchwartz That was an oversight, fixed. I do deal with a large assortment of HDDs at work, some of them quite old. With those, even 3 passes may not be safe. It really depends on the quality and recency of the HDD, but unless a TLA (three-letter-agency) is after you or your company, a pass or two should be fine. – neersighted – 2012-12-21T21:09:10.567

22

Many tutorials suggest that I should fill a disk with /dev/urandom instead of /dev/zero if I want it to be unrecoverable.

Whatever you do, do not use /dev/urandom.

On my i7-3770, /dev/urandom produces an astonishing 1 GB of pseudo-randomly generated data per minute. For a 4 TB hard drive, a single wipe with /dev/urandom would take over 66 hours!
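The 66-hour figure is just the drive size divided by the generation rate. A quick sketch of that arithmetic, assuming decimal units (4 TB = 4000 GB) and the 1 GB per minute rate quoted above; the variable names are only for illustration:

```shell
#!/bin/sh
# Estimate how long one /dev/urandom pass takes at the quoted rate.
disk_gb=4000          # 4 TB drive, decimal units (assumption)
rate_gb_per_min=1     # /dev/urandom throughput quoted in the answer

hours=$(LC_ALL=C awk -v d="$disk_gb" -v r="$rate_gb_per_min" \
  'BEGIN { printf "%.1f", d / r / 60 }')
echo "estimated time for one pass: $hours hours"
```

Plugging in a different drive size or a measured rate from your own machine gives the corresponding estimate.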

If you absolutely must use pseudo-randomly generated data (more on that below), at least use a decently fast way of generating it. For example

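openssl enc -aes-128-ctr -pass file:/dev/random -in /dev/zero 2>/dev/null | tail -c+17

prints an infinite stream of bytes. It encrypts an endless run of zeros from /dev/zero using AES in CTR mode with a password read from /dev/random (tail -c+17 drops the 16-byte salt header that openssl writes first), so it's cryptographically secure for any hard drive smaller than 1,000,000 TB.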

It's also fast. Very fast. On the same machine, it managed to generate 1.5 GB per second, so it's 90 times faster than /dev/urandom. That's more than any consumer-level hard drive can handle.
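A hedged sketch of trying this generator without touching a disk: it writes 1 MiB of the stream into a scratch file. Several details here are my assumptions for the demo and not part of the answer: the input is taken from /dev/zero so the cipher has something to encrypt, the password is a random hex string drawn from /dev/urandom so the demo cannot block or hit an empty password, and head -c caps the otherwise endless stream.

```shell
#!/bin/sh
# Hypothetical demo: capture 1 MiB of AES-CTR keystream in a scratch file.
out=$(mktemp)
bytes=1048576

# Random 128-bit password as hex (demo choice; the answer reads /dev/random).
pass=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')

# Encrypt zeros, drop the 16-byte "Salted__" header, stop after $bytes bytes.
openssl enc -aes-128-ctr -pass "pass:$pass" -in /dev/zero 2>/dev/null \
  | tail -c +17 \
  | head -c "$bytes" > "$out"

written=$(wc -c < "$out" | tr -d ' ')
nonzero=$(tr -d '\0' < "$out" | wc -c | tr -d ' ')
echo "wrote $written bytes, $nonzero of them non-zero"
```

For an actual wipe the stream would be piped to the block device instead of head and a file, and the drive's own write speed then becomes the limit.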

[I]s this just very specialized people (read government agencies) who can recover a zero-filled disk, or something your average geek can do?

In Overwriting Hard Drive Data: The Great Wiping Controversy, the authors conclude that overwriting a pristine drive (only used for the test) once with non-random data lowers the probability of recovering a single bit correctly to 92%. This means that a single byte (one ASCII character) can be recovered with only 51% probability; and there's no way of telling if the byte has been recovered correctly or not.

In real-world scenarios (slightly used drive), the probability drops to 56% for a single bit and to just under 1% for a single byte (0.56^8 ≈ 0.0097).
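The per-byte figures follow from the per-bit ones if each of the eight bits is assumed to be recovered independently: the byte probability is the bit probability raised to the eighth power. A small sketch of that calculation (the function name is made up for illustration):

```shell
#!/bin/sh
# Probability of recovering a whole byte, assuming 8 independent bits.
byte_prob() {
  LC_ALL=C awk -v p="$1" 'BEGIN { printf "%.4f", p ^ 8 }'
}

pristine=$(byte_prob 0.92)   # pristine drive, single overwrite
used=$(byte_prob 0.56)       # slightly used drive, single overwrite

echo "pristine drive: per-bit 0.92 -> per-byte $pristine"
echo "used drive:     per-bit 0.56 -> per-byte $used"
```

Under that independence assumption the pristine case comes out at about 51% per byte and the used-drive case at just under 1%, and the odds of recovering a whole word or sentence shrink accordingly.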

They took a new drive, wiped it three times to simulate short-term usage, wrote a short text to it and wiped the drive once with non-random data. These were the results:

Original text:

Secure deletion of data - Peter Gutmann - 1996
Abstract
With the use of increasingly sophisticated encryption systems, an attacker wishing to gain access to sensitive data is forced to look elsewhere for information. One avenue of attack is the recovery of supposedly erased data from magnetic media or random-access memory.

Recovered text:

¡ÄuÜtÞdM@ª""îFnFã:à•ÅÒ̾‘¨L‘¿ôPÙ!#¯ -×LˆÙÆ!mC 
2´³„‡·}NŽýñêZØ^›l©þì®·äÖŒv¿^œº0TÏ[ªHÝBš¸ð 
7zô|»òëÖ/""º[ýÀ†,kR¿xt¸÷\Í2$Iå""•ÑU%TóÁ’ØoxÈ$i 
Wï^™oËS²Œ,Ê%ñ ÖeS» eüB®Èk‹|YrÍȶ=ÏÌSáöp¥D 
ôÈŽ"|ûÚA6¸œ÷U•$µM¢;Òæe•ÏÏMÀùœç]#•Q
                                                          Á¹Ù""—OX“h 
ÍýïÉûË Ã""W$5Ä=rB+5•ö–GßÜä9ïõNë-ߨYa“–ì%×Ó¿Ô[Mãü 
·†Î‚ƒ‚…[Ä‚KDnFJˆ·×ÅŒ¿êäd¬sPÖí8'v0æ#!)YÐúÆ© 
k-‹HÈø$°•Ø°Ïm/Wîc@Û»Ì"„zbíþ00000000000000000

Dennis

Posted 2012-12-21T20:15:37.390

Reputation: 42 934

@David That's a matter of perspective. I see that advantage as so huge that wiping once is clearly not enough. For example you can check if a given hard disk contained known files. – CodesInChaos – 2014-11-24T15:43:57.810

Thanks, I was looking for a document just like this the other day. – David – 2012-12-21T21:17:02.400

13Note that that 56% per bit is only slightly better than chance. – Daniel R Hicks – 2012-12-21T22:06:15.463

11

At the microscopic level, a hard drive bit is neither a "1" nor a "0" but a magnetic charge. There is a threshold above which the charge is considered a "1". Likewise, the bit's geometric location is not precise, but falls within a given space.

The theory is that a tiny bit of the previous charge is still present in a newly written bit, so if you just zero the disk it might be possible for someone to set a new much lower threshold for what is considered a 1, and still recover the data. Writing random data makes this much harder.

The theory behind multiple passes has to do with the geometric location of the bit on the disk. If the current pass is a little further ahead or behind, then a remnant of the old bit might be peeking out from beside the new bit. Two or three passes (especially of random data) make it much less likely that a previous bit would be identifiable.

As others have already said, these fears are mostly overblown. The biggest risk is data that is only deleted by the OS, or not deleted at all.

Joshua Clayton

Posted 2012-12-21T20:15:37.390

Reputation: 211

"there is a threshold above which the charge is considered a "1" -- Not quite, a flux reversal is required to flip the bit value. You seem to be describing something like TTL logic, i.e. voltage thresholds. – sawdust – 2014-11-22T05:02:50.070

@sawdust, you're right, but I've left the answer as is, since I could not think of a clear way to phrase it. "magnetic field orientation??!?!" cringe – Joshua Clayton – 2017-12-25T01:07:24.270

4

BTW, in many of the newer disks there is now an internal hardware disk command that will logically shred your disk. However, this is not implemented in any disk controller or driver software that I have ever seen.

Also, what you are asking has been the subject of considerable debate over the years, with varying methods and procedures being proposed to defeat any type of hardware data recovery. So much so that in many of the "wipe" utilities you will notice a plethora of available wipe algorithms.

What I do is really to destroy the disk manually, and never worry about any possible later disclosure. I guess it is easy for me to do this at home, but for work it's a different situation.

mdpc

Posted 2012-12-21T20:15:37.390

Reputation: 4 176

3@neersighted OP asked in the context of re-selling a computer. Melting down the hard drive would probably bring down the computer's resale value. – chiliNUT – 2014-11-24T17:57:09.810

+1. The best way to ensure data is not recovered is to physically destroy the HDD. Drilling holes has proved to be unreliable, so I usually bring it to a recycling center to be smashed and melted. A metal-shredder also works well. There are also companies that specialize in HDD destruction. – neersighted – 2012-12-21T20:47:11.900

@mdpc thanks for your answer. As explained though, I'm not THAT paranoid over the data... Just want some sane precaution, while maintaining a healthy budget :) – Waleed Hamra – 2012-12-21T20:51:48.350

4

I can't point to any articles, but I've read several that indicated that in real life (outside of black helicopter establishments) the chance of recovering any amount of meaningful data after a single "wipe" with random data is vanishingly small.

The real risk is probably with various forms of "smart" drives (especially SSDs) that may not write the new data where the old data was, at least for "edge" conditions. (This could also occur, to a more limited degree, with older drives that do sector relocation for error recovery.) This creates the possibility that a few tracks or sectors are pristine, even after a wipe. A clever hacker could probably figure out how to access these areas.

But, realistically speaking, this is not a big hazard if you're an ordinary joe smith selling to an ordinary jack jones -- you have nothing of real value on the drives and the buyer is unlikely to spend more than a few fruitless minutes trying to find stuff. Even if a sector sneaks through here and there it's not at all likely to be the one with your credit card info on it. The bigger hazard is if you've got nuclear secrets on the drive and the buyer is a spy for The Bad Guys -- then even a tiny risk of a tiny leak is too much.

Daniel R Hicks

Posted 2012-12-21T20:15:37.390

Reputation: 5 783

3

While a zero-fill can be enough for an HDD, an SSD may need random bytes to fill its sectors, because an SSD can mark a sector as zeroed just by setting a single flag bit for that sector, without actually filling the data area with zeroes.

pbies

Posted 2012-12-21T20:15:37.390

Reputation: 1 633

This wouldn't really help for an SSD, as SSDs can't zero individual bits (only entire blocks of between 256 KiB and 4 MiB). If a file is "modified", it is actually relocated and the old location marked as dirty. To make things harder still, mainstream operating systems don't comprehend the way SSDs work, so a flash translation layer (FTL), firmware running on the SSD, obscures these copy-on-modify operations from the OS, hiding them behind commands designed for older spinning-platter HDDs. The good news is that most SSD manufacturers include a special secure erase command. – Joshua Clayton – 2017-12-25T01:26:50.783
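For reference, on Linux the ATA secure erase command is commonly issued with hdparm. Nobody in this thread names a tool, so treat the following as an assumption-laden sketch: it is a dry run that only prints the usual sequence, and /dev/sdX and the throwaway password "p" are placeholders. Whether a given drive supports the feature, or is in a "frozen" state that blocks it, has to be checked in the output of the first command.

```shell
#!/bin/sh
# Dry-run sketch: prints the hdparm commands instead of executing them.
# The device path and password passed in are placeholders (assumptions).
print_secure_erase_steps() {
  dev=$1
  pass=$2
  echo "hdparm -I $dev"                                          # inspect security support and frozen state
  echo "hdparm --user-master u --security-set-pass $pass $dev"   # set a temporary password
  echo "hdparm --user-master u --security-erase $pass $dev"      # issue the secure erase
}

steps=$(print_secure_erase_steps /dev/sdX p)
echo "$steps"
```

Running the printed commands for real erases the whole drive, so the device name deserves a second look first.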

1

Here is another angle: Some methods for disk encryption try to provide deniable encryption by making the encrypted disks look like random garbage. If your unused disks have been filled with (actual) random garbage, it adds to the plausibility of the denial.

Thomas Padron-McCarthy

Posted 2012-12-21T20:15:37.390

Reputation: 281