Is there a faster way to verify that a drive has been fully zeroed?

In the coming months, I'm going to need to zero out a lot of disks. After wiping each drive, I need a quick way of making sure that the drive has been completely filled with zeroes.

I could open each one in a hex editor, but that only lets me spot-check that certain parts of it have been zeroed, which becomes increasingly pointless the bigger a drive gets, since it doesn't prove that no non-zero bytes exist anywhere on it.

I decided to run some benchmarks to test a few tools that I came across. I timed each tool in a series of 3 separate runs verifying the wipe of the same 1TB disk, with each run executing overnight at the same system load. To deal with caching, each run executed the tools in a randomised order, with a sleep of at least 500 seconds between each tool.

Below is each tool's average run across the 3 tests, sorted from slowest to fastest.

From myself:

time hexdump /dev/sda

0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
e8e0db6000

real    284m35.474s
user    223m4.261s
sys     2m49.729s

From Gordon Davisson:

time od /dev/sda

0000000 000000 000000 000000 000000 000000 000000 000000 000000
*
16434066660000

real    148m34.707s
user    77m10.749s
sys     2m54.611s

From Neal:

time cmp /dev/zero /dev/sda 

cmp: EOF on /dev/sda

real    137m55.505s
user    8m9.031s
sys     3m53.127s

From Beardy:

time badblocks -sv -t 0x00 /dev/sda

Checking blocks 0 to 976762583
Checking for bad blocks in read-only mode
Testing with pattern 0x00: done
Pass completed, 0 bad blocks found. (0/0/0 errors)

real    137m50.213s
user    5m19.287s
sys     4m49.803s

From Hennes:

time dd if=/dev/sda status=progress bs=4M | tr --squeeze-repeats "\000" "D"

1000156954624 bytes (1.0 TB, 931 GiB) copied, 8269.01 s, 121 MB/s
238467+1 records in
238467+1 records out
1000204886016 bytes (1.0 TB, 932 GiB) copied, 8269.65 s, 121 MB/s
D
real    137m49.868s
user    27m5.841s
sys     28m3.609s

From Bob1:

time iszero < /dev/sda

1000204886016 bytes processed
0 nonzero characters encountered.

real    137m49.400s
user    15m9.189s
sys     3m28.042s

Even the fastest of the tools tested seem to cap out at the 137-minute mark, which is 2 hours and 17 minutes, whereas a full wipe of the disk averages just 2 hours and 30 minutes.
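As a rough sanity check on the figures above (a sketch, not part of the original benchmarks), the average read rate can be derived from the byte count and elapsed time that dd reported. The helper name `throughput_mb` is made up for illustration, and MB here means 10^6 bytes, as in dd's own output.

```shell
# throughput_mb BYTES SECONDS: print the average rate in MB/s (10^6 bytes/s),
# rounded to the nearest integer. Hypothetical helper, for illustration only.
throughput_mb() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b / s / 1e6 }'
}

# Usage, with the figures from the dd run above:
#   throughput_mb 1000204886016 8269.65
# For the wipe, 2 hours 30 minutes is 9000 seconds:
#   throughput_mb 1000204886016 9000
```

With dd's figures this comes to about 121 MB/s, which matches what dd printed, while the 150-minute wipe works out to about 111 MB/s over the same number of bytes. The two rates are close, which suggests that all of the faster tools are waiting on the disk rather than on the CPU.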

This is what prompted me to ask this question - it seems like it should be possible for such a tool to take at most half the time it takes to wipe a drive, given that the disk only needs to be read from and not written to.

Does an alternative, faster solution to the above exist?

In an ideal world the solution I'm looking for would read the entire disk and print any non-zero characters it finds, just like Bob's C++ program. This would allow me to go back and selectively wipe any non-zero bytes rather than the entire disk. However, this wouldn't be a strict requirement if the tool was very fast at reading the disk.
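For illustration, here is a minimal sketch of that "report, then selectively wipe" idea using only cmp and dd. The function names `list_nonzero` and `zero_byte` are hypothetical, and this is not Bob's program. It relies on `cmp -l`, which lists every differing byte with its 1-based offset, so it should not be expected to run any faster than the plain cmp run timed above.

```shell
# list_nonzero FILE: print "offset value" for every non-zero byte in FILE.
# Offsets are 1-based decimal; values are octal, as printed by cmp -l.
list_nonzero() {
    # The first input is /dev/zero, so cmp's middle column is always 0 and
    # is dropped. The "EOF on FILE" notice goes to stderr and is discarded.
    cmp -l /dev/zero "$1" 2>/dev/null | awk '{ print $1, $3 }'
}

# zero_byte FILE OFFSET: overwrite the single byte at 1-based OFFSET with 0.
zero_byte() {
    # dd's seek= counts from 0, hence the -1. conv=notrunc keeps the rest of
    # a regular file intact; it makes no difference on a block device.
    dd if=/dev/zero of="$1" bs=1 seek=$(( $2 - 1 )) count=1 conv=notrunc 2>/dev/null
}

# Usage (needs root for a real device):
#   list_nonzero /dev/sda
#   zero_byte /dev/sda 101
```

On a drive that was never wiped, `list_nonzero` would print a line for almost every byte, so it only makes sense as a check after a wipe has already run. Zeroing one byte at a time is also only practical when very few stray bytes are found.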


1. This is a C++ program written by Bob, with the buffer size increased to 4194304 (4 MiB) and compiled with:

g++ -Wl,--stack,16777216 -O3 -march=native -o iszero iszero.cpp

Hashim

Posted 2019-12-15T20:52:48.193

Reputation: 6 967

Why do you think reading is faster than writing? – Daniel B – 2019-12-15T20:55:39.483

@DanielB Because that's usually the case for storage hardware, whether it's disk or RAM. Reading from is always faster than writing to. – Hashim – 2019-12-15T20:59:07.080

I'm going to need to zero out a lot of disks – The best way to make this process fast is to work with as many disks as possible in parallel. – I need a quick way of making sure that the drive has been completely filled with zeroes – After you write zeros? Then I understand you don't trust the firmware/hardware/software/OS. Normally when you write zeros and there is no error, you do write zeros. Can you elaborate? – Kamil Maciorowski – 2019-12-15T21:13:56.043

I'm unsure why this question is generating so much confusion. I understand how the process of wiping drives works and I'm aware of how to make that process faster. My current hardware limits me to connecting one drive at a time, upgrading is out of the question, and most importantly, that's not what this question is asking. Verification that a drive was properly wiped is a standard stage of wiping for both security and regulation purposes, because there are many things that can potentially go wrong with software/hardware/firmware, and that is the question being asked here. – Hashim – 2019-12-15T21:23:35.913

The action of getting data from a device where mechanical parts are involved will always be comparatively slow (e.g. SSD is a lot faster). Your 137 minutes for reading with dd is probably among the fastest speeds you can get; it might depend to some extent on which type of disk you are accessing (e.g. 5k, 7k and 10k rpm disks are likely to differ somewhat, as are 3 and 6 Gb/s SATA), all of this assuming the disks involved can SUSTAIN the indicated speed, not just deliver correctly predicted buffer contents in bursts. – Hannu – 2019-12-15T23:07:03.000

What are the specs of the disks, and what kind of connection is used? It looks like your transfer speed is 121 MByte per second, this is most likely a hardware limit, and your choice of tool (among the faster ones) can't make a difference. – Hans-Martin Mosner – 2019-12-16T06:47:00.170

@grawity I'm curious, how come you deleted your answer? It seemed closer to answering the question than anything anyone else had posted. – Hashim – 2019-12-16T17:09:56.520

@Hans-MartinMosner It's via SATA 3 Gb/s (i.e. a 375 MB/s theoretical limit), so I doubt the interface is the issue. – Hashim – 2019-12-16T17:11:59.377

You're reading a full 1TB of data (every byte) across a SATA 3Gb/s bus, which has its own signaling overhead so you're not going to get 3Gb/s of data. And if they're mechanical hard drives you probably can't even saturate the data bus due to real-world physical limits. Since you really want to read every byte rather than use a statistical sampling of bytes, then the 180 Mb/s you seem to be getting is around the limit for older mechanical drives (which I assume these are because you suggest you're taking them out of service). – simpleuser – 2019-12-17T00:30:24.037

Use dd without count, which lets dd write until the disk is full, e.g. "sudo dd if=/dev/zero of=/dev/hda1 bs=1024k", and use killall to show the progress, e.g. "sudo watch -n 5 killall -USR1 dd". When dd completes, check that it ends with a "disk full"-style message and that the progress output triggered by killall reports about the size of the drive. With all this information, the chance that the disk is not fully zeroed is almost zero. There is no need to read it back at all. If you really care about security that much, destroy the disk and buy a new one, which is very cheap. – jw_ – 2019-12-18T02:53:49.887

Answers

The read and write speeds of magnetic hard disks are approximately the same. The same is true of tape drives, RAM, CD-/DVD-/BD-R, and even floppy disks. With spinning media, it's mainly a function of how fast the data moves under the heads (or laser assemblies for optical drives). If read and write didn't go at the same speed, you'd have to spin up (or down) the media to change from read to write and back.

Significantly faster read than write is a flash memory thing.

derobert

Posted 2019-12-15T20:52:48.193

Reputation: 3 366

This seems to be the gist of what I found from my research, and based on that I don't think I'll be able to find a tool to do this any faster than 137 mins. Thanks for confirming this by putting it into clear terms. – Hashim – 2019-12-20T02:28:36.000