Why is tar|tar so much faster than cp?

For recursively copying a directory, using tar to pack up a directory and then piping the output to another tar to unpack seems to be much faster than using cp -r (or cp -a).

Why is this? And why can't cp be made faster by doing it the same way under the hood?

Edit: I noticed this difference when trying to copy a huge directory structure containing tens of thousands of files and folders, deeply nested, but totalling only about 50MB. Not sure if that's relevant.
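
For concreteness, the two forms I'm comparing look roughly like this (exact flags aside; the destination directory has to exist already for the tar version):

    # plain recursive copy
    cp -a srcdir dstdir

    # pack with one tar, pipe into another tar that unpacks
    mkdir dstdir
    (cd srcdir && tar cf - .) | (cd dstdir && tar xf -)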

callum

Posted 2014-07-26T18:34:34.880

Reputation: 1,013

That's an interesting question. You can find some answers here: http://stackoverflow.com/questions/316078/ and here: http://unix.stackexchange.com/questions/66647/

– Teresa e Junior – 2014-07-26T19:00:44.510

Answers

cp does open-read-close, open-write-close in a loop over all the files, so reading from one place and writing to another are fully interleaved. tar|tar does the reading and the writing in separate processes, and in addition tar uses multiple threads to read (and write) several files 'at once', effectively allowing the disk controller to fetch, buffer and store many blocks of data at once. All in all, tar lets each component work efficiently, while cp breaks the problem down into disparate, inefficiently small chunks.
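
A rough way to check this on your own tree is simply to time both forms (directory names here are placeholders; run each against a cold cache, or at least alternate the order, so the page cache doesn't skew the comparison):

    mkdir dst_cp dst_tar
    time cp -a src/. dst_cp/
    time sh -c '(cd src && tar cf - .) | (cd dst_tar && tar xf -)'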

Pum Walters

Posted 2014-07-26T18:34:34.880

Reputation: 81

Can we really say that's true of all cp implementations? How do we know that's true? And why would cp be written in such an inefficient way? Any textbook implementation of a file copy reads a buffer of n bytes at a time, and writes them to disk before reading another n bytes. But you're saying cp always reads the whole file before writing the whole copy? – LarsH – 2017-03-13T02:54:52.600
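
For what it's worth, on a Linux box you can inspect the syscall pattern of your particular cp with strace and see for yourself how it interleaves buffered reads and writes on a single file, e.g.:

    strace -e trace=openat,read,write,close cp bigfile bigfile.copy 2>&1 | less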