
I know that doing a dd if=/dev/hda of=/dev/hdb does a deep hard drive copy. I've heard that people have been able to speed up the process by increasing the number of bytes that are read and written at a time (default: 512) with the bs option.

My question is:

  • What determines the ideal byte size for copying from a hard drive?

and

  • Why does that determine the ideal byte size?
James T
  • I thought that it would have to divide perfectly evenly into the size of the drive or partition: hence I have tried to determine the exact byte size of the target partition, obtain the prime factors, and determine a reasonably large block that is a multiple of several of those prime factors... – PP. Jun 04 '10 at 08:23
  • http://serverfault.com/questions/147935/how-to-determine-the-best-byte-size-for-the-dd-command || http://unix.stackexchange.com/questions/9432/is-there-a-way-to-determine-the-optimal-value-for-the-bs-parameter-to-dd || http://superuser.com/questions/234199/good-block-size-for-disk-cloning-with-diskdump-dd – Ciro Santilli OurBigBook.com Aug 24 '15 at 15:53
  • `cat` or `cp` could actually be quite a bit faster; consider this: https://tecmint.com/backup-or-clone-linux-partitions-using-cat-command – Cadoiz Jan 19 '21 at 05:49

3 Answers


As Chris S wrote in this answer, the optimum block size is hardware dependent. In my experience it is always greater than the default 512 bytes. If you're working with raw devices then the overlying file system geometry has no effect. I've used the script below to help 'optimize' the block size of dd.

#!/bin/bash
#
# create a test file to work with (~574 MiB at dd's default 512-byte block size)
#
echo "creating a file to work with"
dd if=/dev/zero of=/var/tmp/infile count=1175000

for bs in 1k 2k 4k 8k 16k 32k 64k 128k 256k 512k 1M 2M 4M 8M
do
        echo "Testing block size = $bs"
        dd if=/var/tmp/infile of=/var/tmp/outfile bs=$bs
        echo ""
done
rm /var/tmp/infile /var/tmp/outfile
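Following up on the comment below about `time`: a minimal variant of the script above that times each pass against a throwaway file created with `mktemp`, so nothing touches a real disk. The file paths, the 64 MiB test size, and the three candidate block sizes are my own choices for the sketch, not from the answer.

```shell
#!/bin/bash
# Sketch: time a copy at a few block sizes using bash's built-in `time`.
# Uses temp files instead of real devices so it is safe to run anywhere.
infile=$(mktemp)
outfile=$(mktemp)

# 64 MiB of zeroes as test input (size is arbitrary for the sketch)
dd if=/dev/zero of="$infile" bs=1M count=64 2>/dev/null

for bs in 4k 64k 1M; do
    echo "bs=$bs"
    # `time` reports real/user/sys, showing how much is I/O wait vs. work
    time dd if="$infile" of="$outfile" bs="$bs" 2>/dev/null
    echo ""
done

rm -f "$infile" "$outfile"
```

On a cached file like this the differences will be smaller than on a real device, so treat the numbers as relative, not absolute.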
user9517
    In OS X, "1M...8M" should be lowercased to "1m...8m". – Matt Beckman Apr 13 '12 at 08:22
  • You could add `time` in front of dd, to get additional info on how much time was spent waiting for I/O and how much is real work... – Olivier Dulac Apr 03 '13 at 12:42
  • @lain `/dev/zero` is generated, no need to read it from the disk; you could `dd if=/dev/zero ibs=1M count=32 obs=$bs of=/var/tmp/outfile` @olivier-dulac well, `dd` already prints the speed; I don't think `time` (or `/usr/bin/time`) adds anything worthwhile. – bufh Jun 03 '14 at 15:54
  • @bufh About the latter: what is displayed by dd depends on your system. I want to mention `status=progress` here, which is useful for monitoring, alternatives to that see here: https://unix.stackexchange.com/a/144178/318461 – Cadoiz Jan 19 '21 at 05:59

Unfortunately the perfect size will depend on your system bus, hard drive controller, the particular drive itself, and the drivers for each of those. The only way to find the perfect size is to keep trying different sizes. Fair warning that some devices only support one block size, though this is rare, and usually drivers make up the difference anyway.

I find that block sizes of 2^15 or 2^16 work best for my WDC (8 MB cache) SATA drives connected to an Adaptec SAS RAID controller, 4x PCIe, 64-bit FreeBSD 8.0-STABLE. But for my cheap old thumb drive, sizes of 2^10 seem to be fastest.

The "perfect size" is almost always a power of two.
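To make the power-of-two range concrete, a one-liner that prints the candidate sizes from dd's 512-byte default (2^9) up to 4 MiB (2^22); 2^15 and 2^16 are the 32 KiB and 64 KiB values this answer found fastest. The range bounds are my own illustrative choice.

```shell
# Print power-of-two block sizes; these are the natural bs= candidates
for exp in $(seq 9 22); do
    bytes=$((1 << exp))
    echo "2^$exp = $bytes bytes"
done
```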

Chris S

I can vouch for the merit of measuring with a test run on the actual device before wasting any time. I stupidly didn't bother; after measuring and then adjusting my block size, I cut the dd duration of a 590 GB transfer in half. The same bs value would have only reduced the time by 20% with a different caddy / drive combo.
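A hedged sketch of that pre-flight measurement: copy a short slice at each candidate block size and compare the throughput dd reports on its last status line, before committing to the full transfer. A temp file stands in for the real source device here, and the block sizes are arbitrary examples; the exact format of dd's summary line varies by platform (this assumes GNU dd).

```shell
#!/bin/bash
# Sketch: benchmark a short slice before the big copy.
# In real use the source would be the device, e.g. if=/dev/sdX with a
# small count= so only a slice is read; a temp file keeps this runnable.
src=$(mktemp)
dd if=/dev/zero of="$src" bs=1M count=16 2>/dev/null

for bs in 64k 1M; do
    printf 'bs=%s: ' "$bs"
    # dd's stats go to stderr; the last line includes the transfer rate
    dd if="$src" of=/dev/null bs="$bs" 2>&1 | tail -n1
done

rm -f "$src"
```

Pick the bs with the best reported rate, then run the full transfer with it.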