Can I pause dd writing, remove the medium, mount it again and continue without causing I/O error?



I'm creating a compressed image of a laptop's internal drive on an external USB disk using a LiveCD GNU/Linux system. I also use pv to monitor the progress. Here's what the command looks like:

dd if=/dev/sda bs=8M conv=noerror,sync | pv -pterab -s 298G \
| gzip -c --fast > /mnt/backup/

The USB drive is ADATA HD710. I know these drives tend to have a very loose USB plug - it can easily disconnect the drive if I move it. The process seems to be taking longer than I expected and I'm afraid I'll have to move the laptop and the drive. I'm afraid this will disconnect the drive, crash the duplication process and force me to do it all over again.

I've already checked whether I can unmount the drive after pausing the process (with the Ctrl+Z hotkey). It can only be done in "lazy" mode, which means the drive isn't really unmounted, only detached from its mount directory. I waited until the drive finished working (its LED stopped flashing) and unplugged it. I then plugged it in again, mounted it on the same directory as before and resumed the process with the command

fg 1

Gzip has quit with I/O error.

Why? I couldn't unmount the drive even though dd was stopped, because the kernel knew there was an open file on that disk. After I removed the drive, the open file must have been closed, but it wasn't reopened for the Gzip process after I re-mounted the drive. So when Gzip tried to continue writing, it got an I/O error, because the kernel denied its access to the file (which was unexpected).

I also tried stopping the process, adding skip=X (where X is the number of records dd reports to have written after it quits) and appending the rest of the disk image to the dd.gz file. However gunzip quits with an error saying:

gzip: data.dd.gz: invalid compressed data--format violated

I guess it could work if the image wasn't compressed.

Is it possible to stop the process (with the Ctrl+C hotkey) and continue it by appending to the previously created file? How do I match the file positions? Will the two concatenated Gzip archives extract properly to restore the disk from its compressed image?


As pointed out, you might want to use ddrescue.

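However, you can simply append to a gzip file. That is, dd count=8 | gzip > foo.gz followed by dd skip=8 | gzip >> foo.gz simply works, as long as each gzip is allowed to finish on its own. A gzip file may consist of several concatenated members, and gunzip extracts them one after another, so the concatenated file extracts to the original input.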
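
The append approach can be tried on a scratch file before trusting it with a real disk. This is a minimal sketch, assuming dd, gzip and cmp are available; input.img and foo.gz are hypothetical names standing in for /dev/sda and the real backup file, and the block size and counts are arbitrary:

```shell
# Scratch directory and a 1 MiB test "disk" (16 blocks of 64 KiB).
tmp=$(mktemp -d)
head -c 1048576 /dev/urandom > "$tmp/input.img"

# First run: compress blocks 0-7 and let gzip finish normally.
dd if="$tmp/input.img" bs=64k count=8 2>/dev/null | gzip -c --fast >  "$tmp/foo.gz"

# Second run: skip the 8 blocks already saved and append a second gzip member.
dd if="$tmp/input.img" bs=64k skip=8 2>/dev/null | gzip -c --fast >> "$tmp/foo.gz"

# Decompress the concatenated file and compare it with the original.
result=different
gunzip -c "$tmp/foo.gz" | cmp -s - "$tmp/input.img" && result=identical
echo "$result"
```

The positions match only because the first dd stopped at a known block boundary (count=8) and the second one starts at the same boundary (skip=8) with the same bs.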

For the first case you're describing: generally Linux makes an effort not to disturb running processes by keeping a disk mounted as long as it is in use. However, if you forcibly unplug the disk, the correct response is to signal an I/O error to the application. The traditional Unix way to associate an open file with its on-disk counterpart is not by its path, but by its inode number. After the disk is reattached, this inode may have been reused by a completely different file, or the inode may not be a file at all. The very idea of persisting open files past a detachment of the backing unit is questionable: the files may have been changed, so the state that was persisted may be nonsensical. On the other hand, there's a good case to 'fail fast' and give the application notice of the failed disk, so it may take action, which in this case only amounts to stopping the compression process.

The second case concerns the question of when you stop gzip. The working example above lets gzip write as much data to the file as it wants to. On the other hand, when you suspend the process or interrupt it, the resulting file is never a valid gzip file. This is due to the fact that the last thing gzip writes to the file is a trailer containing the CRC-32 checksum and the length of the uncompressed data. While this aids in detecting errors in a file, it doesn't help in making gzip interruptible.
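
The effect of a missing trailer can be reproduced without unplugging anything. This is a rough sketch, assuming gzip, seq, head and wc are available; full.gz and cut.gz are hypothetical file names, and cutting off the last 8 bytes only imitates an interrupted gzip (a real interruption usually loses more than the trailer):

```shell
# Build a small, complete gzip file from some sample text.
tmp=$(mktemp -d)
seq 1 100000 | gzip -c --fast > "$tmp/full.gz"

# Drop the final 8 bytes, i.e. the trailer with the checksum and length.
size=$(wc -c < "$tmp/full.gz")
head -c "$((size - 8))" "$tmp/full.gz" > "$tmp/cut.gz"

# gzip -t tests integrity without writing any output.
if gzip -t "$tmp/cut.gz" 2>/dev/null; then
    verdict=accepted
else
    verdict=rejected
fi
echo "truncated file: $verdict"
```

Appending a second gzip member to such a truncated file does not repair it, which is consistent with the "format violated" error in the question.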

TL;DR: No, it's not possible.


