Is there a method of getting a percentage on a DD in linux?

41

14

So here's what's happening.

I started a backup of a drive on my server through a Linux live USB. I started copying the first drive with plain dd, just sudo dd if=/dev/sda of=/dev/sdc1, and then I remembered that this leaves the console blank until it finishes.

I needed to run a different backup to the same drive anyway, so I started that one as well with sudo dd if=/dev/sdb of=/dev/sdc3 status=progress and then I got a line of text that shows the current rate of transfer as well as the progress in bytes.

I was hoping for a method that shows a percentage of the backup, so I don't have to do the math of how many bytes are backed up out of 1.8 TB. Is there an easier way to do this than status=progress?

user865814

Posted 2018-01-30T22:46:54.207

Answers

68

See answers from this question [1]

pv

For example you can use pv before you start

sudo apt-get install pv    # if you do not have it
pv < /dev/sda > /dev/sdc3   # it is reported to be faster
pv /dev/sda > /dev/sdc3     # it seems to have the same speed as the previous one
# or
sudo dd if=/dev/sda | pv -s 1844G | sudo dd of=/dev/sdc3  # Maybe slower

Output [2]:

440MB 0:00:38 [11.6MB/s] [======>                             ] 21% ETA 0:02:19

Notes:
Especially for large transfers you may want to read man dd and set the options that speed things up on your hardware, e.g. bs=100M to set the block size, oflag=sync so that the count reflects the bytes actually written, maybe direct...
The option -s only takes integer parameters, so 1.8T becomes 1844G.
As you can see from the first lines, you do not need dd at all.
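In the piped variant pv cannot see how big the source is, which is why the size is typed in by hand with -s. A minimal sketch of looking it up instead, assuming a Linux system with blockdev (util-linux) and GNU stat. The function name source_size is made up for this example and is not part of pv or dd:

```shell
# source_size PATH: print the size of PATH in bytes.
# Hypothetical helper for illustration only.
source_size() {
    if [ -b "$1" ]; then
        # Block device: ask the kernel for its size (usually needs root).
        blockdev --getsize64 "$1"
    else
        # Regular file: GNU stat prints the size in bytes.
        stat -c %s "$1"
    fi
}

# Intended use from a root shell (not run here, it overwrites /dev/sdc3):
#   dd if=/dev/sda bs=1M | pv -s "$(source_size /dev/sda)" | dd of=/dev/sdc3 bs=1M
```

pv -s accepts a plain byte count, so the result can be passed straight through without rounding to whole gigabytes.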


kill -USR1 pid

If you have already launched the dd command, once you have found its PID (Ctrl-Z + bg and you read it, or pgrep ^dd ...) you can send it the signal USR1 (also written SIGUSR1; for SIGINFO see below) and read the output.
If the PID of the program is 1234, then with

kill -USR1 1234

dd will print something like this on the terminal attached to its stderr:

4+1 records in
4+0 records out
41943040 bytes (42 MB) copied, 2.90588 s, 14.4 MB/s

Warning: under OpenBSD, check the behaviour of kill in advance [3] and use
kill -SIGINFO 1234 instead.
There the signal that makes dd report its status is SIGINFO; SIGUSR1 would terminate the program (dd)...
Under Ubuntu use -SIGUSR1 (10).
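Since the wrong signal can kill the copy, one option is to pick the signal name from the OS before sending anything. This is only a sketch: dd_progress_signal is a hypothetical helper name, and the mapping covers just Linux (USR1) and the BSD family including macOS (INFO). Anything else is refused so you go and read the man page:

```shell
# dd_progress_signal [OS]: print the signal name that asks dd for statistics.
# OS defaults to the output of uname -s. Hypothetical helper for illustration.
dd_progress_signal() {
    os=${1:-$(uname -s)}
    case "$os" in
        Linux)       echo USR1 ;;   # GNU dd reports on SIGUSR1
        *BSD|Darwin) echo INFO ;;   # BSD dd reports on SIGINFO
        *)
            echo "unknown OS: $os, check the dd man page first" >&2
            return 1
            ;;
    esac
}

# Example (1234 is a placeholder PID):
#   sig=$(dd_progress_signal) && kill -s "$sig" 1234
```

It is still worth trying this on a throwaway dd first, as suggested above.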

Hastur

Posted 2018-01-30T22:46:54.207

Reputation: 15 043

Thank you for the help! This will definitely help with the whole process. I will try the pv < /dev/sda > /dev/sdc3 method and hope that it's as fast as reported. I had to cancel the last run of this and turn the server back on today because everyone in my office had been complaining, but this will help by giving me a definite percentage to fall back on when I am not sure how much time is left to tell them. I'm interested to see the ETA when I get it going again this Friday! hahaha. – None – 2018-01-30T23:14:16.663

9 You'll almost certainly find that using 'bs' on the dd command hugely speeds it up. Like dd if=/dev/blah of=/tmp/blah bs=100M to transfer 100M blocks at a time – Sirex – 2018-01-31T01:49:15.330

1 @Sirex Of course you have to set bs to optimize the transfer rate for your hardware... The answer just repeats the OP's command line. :-) – Hastur – 2018-01-31T08:05:27.830

Note that you can also just do pv /dev/sda directly if you want. Also, on the output, you may want to add oflag=sync, otherwise the command completes really quickly, and then sits there silently flushing for ages. The sync flag makes it wait for the data to actually write to disk. – MathematicalOrchid – 2018-01-31T09:15:13.173

Excellent answer. Do note that signalling USR1 to dd can take a while to process. I've done it writing to a USB drive, and the answer only appeared after the writes were finished. – Criggie – 2018-02-01T01:25:12.923

@Hastur bs doesn’t matter in OP’s case, only if e.g. he is piping his dd B(uffer)S(size) to something data-mangling, like gz, xz, gpg, foo... bs is internally 64k and there’ll be no extra love for making bs bigger when just writing to a similar drive - if anything, there will be a delay in between reads and writes. – user2497 – 2018-02-01T06:03:17.670

@Sirex: 100M is way too large, especially if writing to a pipe. Pipe buffers are much smaller than 100MB, so there's no point making a write() system call with that size; it will return early. bs=1M is ok. I often use bs=128k, which is half of L2 cache size on my CPU; it's a tradeoff between more system calls and reading memory that's still hot in cache from being written. – Peter Cordes – 2018-02-01T10:53:18.697

3 @Criggie: that's maybe because dd had already finished all the write() system calls, and fsync or close was blocked waiting for the writes to reach disk. With a slow USB stick, the default Linux I/O buffer thresholds for how large dirty write-buffers can be lead to qualitatively different behaviour than with big files on fast disks, because the buffers are as big as what you're copying and it still takes noticeable time. – Peter Cordes – 2018-02-01T11:00:42.013

5 Great answer. However, I do want to note that in OpenBSD the right kill signal is SIGINFO, not SIGUSR1. Using -USR1 in OpenBSD will just kill dd. So before you try this out in a new environment, on a transfer that you don't want to interrupt, you may want to familiarize yourself with how the environment acts (on a safer test). – TOOGAM – 2018-02-02T05:17:06.217

1 The signals advice for dd is really great info, especially for servers where you can't/don't want to install pv – mike – 2018-02-03T11:48:04.767

38

My go-to tool for this kind of stuff is progress:

This tool can be described as a Tiny, Dirty, Linux-and-OSX-Only C command that looks for coreutils basic commands (cp, mv, dd, tar, gzip/gunzip, cat, etc.) currently running on your system and displays the percentage of copied data. It can also show estimated time and throughput, and provides a "top-like" mode (monitoring).

[Screenshot: progress in action]

It simply scans /proc for interesting commands, and then looks at directories fd and fdinfo to find opened files and seek positions, and reports status for the largest file.
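To see what that looks like by hand, here is a rough shell sketch of the same idea. It is not how progress itself is implemented (that is a C program); the function names fdinfo_pos and fd_percent are invented for this example, and it assumes Linux /proc, GNU stat, readlink -f and blockdev:

```shell
# fdinfo_pos FILE: print the "pos:" field of a /proc/PID/fdinfo/FD file.
fdinfo_pos() {
    awk '$1 == "pos:" { print $2 }' "$1"
}

# fd_percent PID FD: how far PID has got through the file open on FD, in percent.
fd_percent() {
    pos=$(fdinfo_pos "/proc/$1/fdinfo/$2") || return 1
    # /proc/PID/fd/FD is a symlink to the file or device that is open.
    target=$(readlink -f "/proc/$1/fd/$2") || return 1
    if [ -b "$target" ]; then
        size=$(blockdev --getsize64 "$target") || return 1
    else
        size=$(stat -c %s "$target") || return 1
    fi
    # Avoid dividing by zero for empty files.
    [ "$size" -gt 0 ] || return 1
    awk -v p="$pos" -v s="$size" 'BEGIN { printf "%.1f\n", 100 * p / s }'
}

# Example (needs root for other users' processes; for dd, fd 0 is the input):
#   fd_percent "$(pidof dd)" 0
```

This only works when a single dd is running, since pidof prints every matching PID.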

It's very light, and compatible with virtually any command.

I find it particularly useful because:

  • compared to pv in pipe or dcfldd, I don't have to remember to run a different command when I start the operation, I can monitor stuff after the fact;
  • compared to kill -USR1, it works on virtually any command, I don't have to always double-check the manpage to make sure I'm not accidentally killing the copy; also, it's nice that, when invoked without parameters, it shows the progress for any common "data transfer" command currently running, so I don't even have to look up the PID;
  • compared to pv -d, again I don't need to look up the PID.

Matteo Italia

Posted 2018-01-30T22:46:54.207

Reputation: 1 490

1 Note: You can monitor more than just coreutils processes. Simply specify the name of the command with --command <command-name>. – jpaugh – 2018-02-01T15:19:50.300

1 This.Is.AWESOME! – Floris – 2018-02-03T19:03:27.573

25

Run dd, then, in a separate shell, invoke the following command:

pv -d $(pidof dd) # root may be required

This will make pv obtain statistics on all the open file descriptors of the dd process. It will show you the current position of both the read side and the write side.

sleblanc

Posted 2018-01-30T22:46:54.207

Reputation: 409

2 Works after the fact!? Amazing!! – jpaugh – 2018-01-31T21:16:51.340

3 That's very cool. It avoids the memory-bandwidth + context-switch overhead of actually piping all the data through 3 processes! @jpaugh: I guess it just looks at /proc/$PID/fdinfo for file positions, and at /proc/$PID/fd to see which files (and thus the sizes). So yes, very cool, and good idea for a feature, but I wouldn't call it "amazing" because there are Linux APIs that let it poll the file positions of another process. – Peter Cordes – 2018-02-01T10:56:46.187

@PeterCordes I didn't realize file-position was exposed by the kernel. (I've been spending my life carefully preparing pv pipelines in advance.) Of course, I assumed as much once I saw that this does work. – jpaugh – 2018-02-01T15:05:10.653

9

There's an alternative to dd : dcfldd.

dcfldd is an enhanced version of GNU dd with features useful for forensics and security.

Status output - dcfldd can update the user of its progress in terms of the amount of data transferred and how much longer operation will take.

dcfldd if=/dev/zero of=out bs=2G count=1 # test file
dcfldd if=out of=out2 sizeprobe=if
[80% of 2047Mb] 52736 blocks (1648Mb) written. 00:00:01 remaining.

http://dcfldd.sourceforge.net/
https://linux.die.net/man/1/dcfldd

Antonin Décimo

Posted 2018-01-30T22:46:54.207

Reputation: 199

It's a longer command name... clearly, it is inferior. (+1) – jpaugh – 2018-02-01T15:12:28.163

6

As a percentage you'd have to do some maths, but you can get the progress of a dd in human readable form, even after already starting, by doing kill -USR1 $(pidof dd)

The running dd process will then print something similar to:

11117279 bytes (11 MB, 11 MiB) copied, 13.715 s, 811 kB/s
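The maths itself is small enough to script. A minimal sketch, assuming you already know the total size in bytes (for a disk, something like blockdev --getsize64 would give it); dd_percent is a made-up helper name that reads dd's status line on stdin:

```shell
# dd_percent TOTAL_BYTES: read dd status output on stdin, print percent done.
# Only lines containing "bytes" are used; "records in/out" lines are ignored.
dd_percent() {
    awk -v total="$1" '/bytes/ { printf "%.1f%%\n", 100 * $1 / total }'
}

# Example with the line above and a pretend total of 22234558 bytes:
echo '11117279 bytes (11 MB, 11 MiB) copied, 13.715 s, 811 kB/s' | dd_percent 22234558
# prints 50.0%
```

Because dd writes its status to stderr, you would have to redirect that somewhere (for instance 2>dd.log when starting dd) to feed it into the function.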

Sirex

Posted 2018-01-30T22:46:54.207

Reputation: 10 321

4 That's basically the same thing that status=progress gives – rakslice – 2018-01-30T23:01:12.547

1 I was actually about to say that's the exact same thing that status=progress gives. – None – 2018-01-30T23:02:55.140