Why is piping 'dd' through gzip so much faster than a direct copy?

81

12

I wanted to back up a path from one computer on my network to another computer on the same network over a 100 Mbit/s line. For this I did

dd if=/local/path of=/remote/path/in/local/network/backup.img

which gave me a very low network transfer speed of about 50 to 100 kB/s, which would have taken forever. So I stopped it and decided to gzip it on the fly so there would be less data to transfer:

dd if=/local/path | gzip > /remote/path/in/local/network/backup.img.gz

But now I get a network transfer speed of about 1 MB/s, a factor of 10 to 20 faster. After noticing this, I tested several paths and files, and it was always the same.

Why does piping dd through gzip also increase the transfer rate by a large factor, instead of only reducing the byte length of the stream? If anything, I had expected a small decrease in transfer rate due to the higher CPU consumption while compressing, but instead I get a double win. Not that I'm unhappy, but I am just wondering. ;)

Foo Bar

Posted 2014-05-29T08:35:55.707

Reputation: 1 270

The simple answer is that dd is outputting at 1MB/s... right into the waiting gzip pipe. It's got very little to do with block size. – Tullo_x86 – 2016-10-21T04:59:32.823

512 bytes was the standard block size for file storage in early Unix. Since everything is a file in Unix/Linux, it became the default for just about everything. Newer versions of most utilities have increased that, but not dd. – DocSalvager – 2014-06-05T20:04:41.660

Answers

100

By default, dd uses a very small block size: 512 bytes (!!). That is, it does a lot of small reads and writes. It seems that dd, used naively in your first example, was generating a great number of network packets with a very small payload, thus reducing throughput.

On the other hand, gzip is smart enough to do I/O with larger buffers. That is, it makes a smaller number of big writes over the network.

Can you try dd again with a larger bs= parameter and see if it works better this time?
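For example, a sketch with a 1 MiB block size, using the same paths as in the question (and, with a reasonably recent GNU dd, status=progress to watch the rate directly):

dd if=/local/path of=/remote/path/in/local/network/backup.img bs=1M status=progress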

user319088

Posted 2014-05-29T08:35:55.707

Reputation:

@CongMa you can also try using pigz instead of gzip; it will work even faster – GioMac – 2016-01-28T14:19:47.227
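For example, a sketch of the same pipeline with pigz dropped in (assuming pigz is installed; it compresses on all cores, so the CPU is less likely to become the new bottleneck):

dd if=/local/path bs=1M | pigz > /remote/path/in/local/network/backup.img.gz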

Never mind the fact that the transfer rate is being reported by dd, which is no longer responsible for the network bottleneck. – Tullo_x86 – 2016-10-21T04:58:28.943

Thanks, I tried a direct copy without gzip and a block size of bs=10M -> fast network transfer of about 3 or 4 MB/s. A higher block size + gzip did not change anything compared to a small block size + gzip. – Foo Bar – 2014-05-29T14:27:27.497

If you want to see what high block sizes do, try another dd after the gzip, as sketched below. – Joshua – 2014-05-29T16:05:38.763
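That would look something like this sketch; obs= rather than bs= is used on the final dd because reads from a pipe can return short, and obs= guarantees full-sized output blocks:

dd if=/local/path | gzip | dd obs=1M of=/remote/path/in/local/network/backup.img.gz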

Is gzip doing its own output buffering, or does it just use stdio? – Barmar – 2014-05-30T19:42:18.930

@Barmar If I'm reading the source correctly, it simply write(3)s to the buffer. – None – 2014-06-03T12:28:21.213

4

A bit late to this, but might I add...

In an interview I was once asked what would be the quickest possible method for cloning data bit-for-bit, and of course responded with the use of dd or dc3dd (DoD funded). The interviewer confirmed that piping dd to dd is more efficient, as this simply permits simultaneous read/write, or in programmer terms stdin/stdout, thus ultimately doubling write speeds and halving transfer time.

dc3dd verb=on if=/media/backup.img | dc3dd of=/dev/sdb

Sadik Tekin

Posted 2014-05-29T08:35:55.707

Reputation: 41

I don't think that's true. I just tried it now. dd status=progress if=/dev/zero count=100000 bs=1M of=/dev/null was 22.5GB/s; dd status=progress if=/dev/zero count=100000 bs=1M | dd of=/dev/null bs=1M was 2.7GB/s. So the pipe makes it slower. – falsePockets – 2019-02-25T00:27:19.167

0

I assume here that the "transfer speed" you're referring to is the one reported by dd. That does make sense, because dd really is transferring 10x the amount of data per second! However, dd is not transferring it over the network; that job is being handled by the gzip process.

Some context: gzip will consume data from its input pipe as fast as it can clear its internal buffer. The speed at which gzip's buffer empties depends on a few factors:

  • The I/O write bandwidth (which is bottlenecked by the network, and has remained constant)
  • The I/O read bandwidth (which is going to be far higher than 1MB/s reading from a local disk on a modern machine, thus is not a likely bottleneck)
  • Its compression ratio (which, judging by your 10x speedup, I will assume to be around 10%, indicating that you're compressing some kind of highly repetitive text like a log file or some XML)

So in this case, the network can handle 100 kB/s, and gzip is compressing the data around 10:1 (and isn't being bottlenecked by the CPU). This means that while it is outputting 100 kB/s, gzip can consume 1 MB/s, and that rate of consumption is what dd sees.
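You can make both rates visible at once with a sketch using pv (pipe viewer; an extra tool, assuming it is installed). The first gauge shows the uncompressed rate gzip consumes at, the second the compressed rate headed for the network:

dd if=/local/path bs=1M | pv -cN raw | gzip | pv -cN compressed > /remote/path/in/local/network/backup.img.gz

On a stream that compresses around 10:1, the raw gauge should read roughly ten times the compressed one.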

Tullo_x86

Posted 2014-05-29T08:35:55.707

Reputation: 140

0

Cong is correct. You are streaming the blocks off the disk uncompressed to a remote host. Your network interface, the network, and your remote server are the limitation. First you need to get dd's performance up. Specifying a bs= parameter that aligns with the disk's buffer memory will get the most performance from the disk, say bs=32M for instance. This will then fill gzip's buffer at SATA or SAS line rate straight from the drive's buffer, and the disk will be more inclined to do sequential transfers, giving better throughput.

Gzip will compress the data in the stream and send it to your location. If you are using NFS, the compression keeps the NFS traffic to a minimum. If you are using SSH, you incur the SSH encapsulation and encryption overhead. If you use netcat, you have no encryption overhead.
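As a rough sketch of the netcat route (the host name receiver.local and port 1234 are placeholders; some netcat flavors want nc -l -p 1234 instead of nc -l 1234):

# on the remote machine, start the listener first
nc -l 1234 > backup.img.gz

# on the local machine, stream, compress, and send
dd if=/local/path bs=32M | gzip | nc receiver.local 1234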

Robert

Posted 2014-05-29T08:35:55.707

Reputation: 1