8

I'm doing some data recovery from a hard disk. The disk has about 300GB of data on it, but I lack another hard drive that has 300GB of free space.

I have three HDs, with 150, 40, and 120 GB free.

For the creation of the first chunk, I was considering doing the following:

sudo dd if=/dev/sdf1 bs=4096 count=150G | gzip > img1.gz

What should I run on the other machines to "pick up where I left off"?

lfaraone
    You don't want to use '150G' for count. Count is the number of 'bs'-sized blocks you want to use. To get 150GB, a better way would be to use 'bs=1GB count=150'. – Christopher Cashell Nov 24 '09 at 16:17

6 Answers

11

This is the command line I use:

dd if=/dev/sda bs=4M | gzip -c | split -b 2G - /mnt/backup_sda.img.gz

It will create 2GB files in this fashion:

backup_sda.img.gz.aa
backup_sda.img.gz.ab
backup_sda.img.gz.ac

Restore:

cat /mnt/UDISK1T/backup_sda.img.gz.* | gzip -dc | dd of=/dev/sda bs=4M

Hope it helps.

Aaron
8

You probably want to consider using tar, as KPWINC says, but to answer your question directly, you want to use dd's "skip" option.

If your first command, as stated, is:

sudo dd if=/dev/sdf1 bs=4096 count=150GB | gzip > img1.gz

Then your second would be:

sudo dd if=/dev/sdf1 bs=4096 skip=150GB count=40GB | gzip > img2.gz

and third:

sudo dd if=/dev/sdf1 bs=4096 skip=190GB count=120GB | gzip > img3.gz

That said, I'm not sure that the "GB" suffix does what you're intending. I believe it just multiplies the number it follows (so count=150GB means 150 billion blocks, not 150GB of data), rather than working out how many bs-sized blocks add up to that many gigabytes. I would do something like this:

dd if=/dev/sdf1 bs=`expr 10 \* 1024 \* 1024` count=`expr 15 \* 1024`

just to be sure of the math.
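If you want to double-check that bs times count really comes to 150GiB before letting dd run for hours, a quick shell-arithmetic sanity check (a sketch, assuming a bash-compatible shell) is:

# 10MiB blocks times 15360 blocks should print 161061273600, i.e. 150GiB
echo $(( 10 * 1024 * 1024 * 15 * 1024 ))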

Oh, and make sure that your device isn't changing underneath you as you copy it. That would be bad.

wfaulk
  • Yeah, you do not want to use GB in the manner the original poster listed. A better way is to set your blocksize to a larger and more reasonable number (unless you have a good reason for going smaller, I'd always go at least 1MB). To get 150GB as he listed, I'd do 'bs=1GB count=150'. – Christopher Cashell Nov 24 '09 at 16:19
8

A simple solution might be to just use "/usr/bin/split". It just breaks files up into pieces. You can use "-" as the input file name to read from standard input. The nice thing about split is that it is simple, it doesn't affect the toolchain in any real way, and you can "join" the files again just by using "cat" on a glob of them to put them back together (or pipe them to another app).
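For the disk in the question, that might look something like this (a sketch only; the 10GB chunk size, the output prefix, and the /dev/sdX restore target are illustrative, and you would move the finished pieces onto whichever drive has room):

# Image the partition, compress the stream, and let split cut it into 10GB pieces
sudo dd if=/dev/sdf1 bs=4M | gzip -c | split -b 10G - img.gz.part_

# Restore later by concatenating the pieces in order and decompressing
cat img.gz.part_* | gunzip -c | sudo dd of=/dev/sdX bs=4M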

1

tar

tar might solve your issue. It has the ability to break up files into multiple volumes.

Check out this link:

http://paulbradley.tv/44/

From the page:

The two extra command line options you need to use over and above the standard syntax are -M (--multi-volume), which tells Tar you want to split the archive over multiple media disks. You then need to tell Tar how big that media is, so that it can create files of the correct size. To do this you use the --tape-length option, where the value you pass, multiplied by 1024 bytes, gives the media size.

The example below shows the syntax used. Let's say largefile.tgz is 150 Meg and we need to fit the file on two 100 Meg Zip drives.

tar -c -M --tape-length=102400 --file=disk1.tar largefile.tgz

The value 102400 is 1024 x 100, which will create a 100 Meg file called disk1.tar and then Tar will prompt for volume 2 like below :-

Prepare volume #2 for disk1.tar and hit return:

In the time of tape drives you would have taken the first tape out of the machine and inserted a new tape, and pressed return to continue. As we want Tar to create the remaining 50 Meg in a separate file, we issue the following command :-

n disk2.tar

This instructs Tar to continue writing the remaining 50 Meg of largefile.tgz to a file named disk2.tar. You will then be prompted with the line below, and you can now hit return to continue.

Prepare volume #2 for disk2.tar and hit return:

You would repeat this process until your large file has been completely processed, increasing the disk number in the filename each time you are prompted.
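Extraction works the same way in reverse (a sketch: GNU tar prompts for each subsequent volume, and you answer with "n disk2.tar" and so on, just as during creation):

# Extract a multi-volume archive; tar will ask for the next volume when it needs it
tar -x -M --file=disk1.tar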

KPWINC
    The answer is correct; I might add that he wants to go with 10GB files, as 10 is the greatest common divisor of 150, 40, and 120. – asdmin Nov 20 '09 at 23:32
  • How would I resume the tarring if I had to switch to different computers (removing /dev/sdf1 and reconnecting it to the destination)? – lfaraone Nov 21 '09 at 00:41
1

A couple of things here. First, the command you listed probably won't work as you're expecting. It looks like you're trying to hit 150GB, but you need to factor in both the block size and the count (count is the number of blocks of block size). So if you want 150GB, you might do bs=1GB count=150. You could then pick up where you left off by adding a skip=150 to skip 150 blocks (each of 1GB) on your second run. Also, to have gzip send its output to standard out, you need to pass it the -c option.

However, before you do that, a couple of other questions. Are you using dd here because the filesystem is corrupted/damaged/etc and you need a bit-for-bit exact disk image copy of it? Or are you just trying to get the data off? A filesystem-aware tool might be more effective. Particularly if the source filesystem isn't full. Options include tar, or you might want to look into something like Clonezilla, Partclone, Partimage, or for a more Windows-specific option to directly access a Windows filesystem, Linux-NTFS (note the previously mentioned tools can handle Windows filesystems to various degrees, too).
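For example, if the filesystem still mounts cleanly and you only need the files, a file-level copy piped through split avoids imaging the free space at all (a rough sketch; /mnt/rescue and /mnt/dest are hypothetical mount points, and the 10GB chunk size is illustrative):

# Archive the files rather than the raw partition, compress, and cut into 10GB pieces
tar -C /mnt/rescue -cf - . | gzip -c | split -b 10G - /mnt/dest/rescue.tar.gz.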

If you are set on operating on the partition with a non-filesystem aware program, then using your dd line (as modified above to be correct) will likely work. It's hard to say how well it will compress, but it should be smaller than the original filesystem. If you have read-write access to the original filesystem, it would be worth filling up the free space with a file written from /dev/zero to zero out the unused space before saving it with dd. This will enhance gzip's ability to compress the free space in the disk image.
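A minimal sketch of that zero-fill step, assuming the filesystem is mounted read-write at /mnt/rescue (a hypothetical mount point):

# Fill the free space with zeros so gzip can squash it, then remove the filler file;
# dd will stop with a "No space left on device" error, which is expected here
sudo dd if=/dev/zero of=/mnt/rescue/zero.fill bs=1M
sync
sudo rm /mnt/rescue/zero.fill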

To operate on the second chunk, just add a skip=XXX bit to your second dd invocation, where 'XXX' is equal to the count= value you gave it the first time. If you wanted to do 150GB on your first one and 40 on your second, you might do:

sudo dd if=/dev/sdf1 bs=1GB count=150 | gzip -c > img1.gz

followed by:

sudo dd if=/dev/sdf1 bs=1GB skip=150 count=40 | gzip -c > img2.gz
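Continuing the same pattern, the third chunk and an eventual restore might look like this (a sketch; /dev/sdX is a placeholder for wherever the image gets written back):

sudo dd if=/dev/sdf1 bs=1GB skip=190 count=120 | gzip -c > img3.gz

# Restore: each chunk is decompressed and written back at the offset it came from
zcat img1.gz | sudo dd of=/dev/sdX bs=1GB
zcat img2.gz | sudo dd of=/dev/sdX bs=1GB seek=150
zcat img3.gz | sudo dd of=/dev/sdX bs=1GB seek=190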

Christopher Cashell
  • 8,999
  • 2
  • 31
  • 43
0

Coming late to the party, but I had to write a script like this to back up an unknown file system on a FAT32 stick.

This simple script works for me:

#!/usr/bin/env bash
# Set pipefail so a failed dd makes the whole pipeline return non-zero
set -o pipefail

dd_result=0
split=0          # how many 2GB blocks have been copied so far (used as dd's skip)
filenameSplit=0  # index used in the chunk's filename
while [ $dd_result == 0 ]
do
    # Copy one 2GB block, starting $split blocks into the device, and compress it
    # into its own chunk file; the loop ends when dd fails (e.g. once skip runs
    # past the end of the device)
    cmd="dd if=/dev/sdX bs=2G count=1 skip=$split | pigz --fast > \"2G.$filenameSplit.myharddisk.gz\""
    echo $cmd
    eval $cmd
    dd_result=$?
    split=$((split + 1))
    filenameSplit=$split
done
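A matching restore could decompress each chunk back to its 2GB offset (a sketch, assuming the chunks were named 2G.<index>.myharddisk.gz by the loop above and /dev/sdX is the target device):

for f in 2G.*.myharddisk.gz; do
    i=${f#2G.}; i=${i%%.*}   # recover the chunk index from the filename
    pigz -dc "$f" | dd of=/dev/sdX bs=2G seek="$i"
done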

EDIT: I have uploaded a basic backup/restore script that uses dd with pigz or gzip compression and splits the image into chunks whose size can be set on the command line. The script can be found here: github.com/deajan/linuxscripts

Orsiris de Jong