Optimal compression parameters to improve unzipping speed?

-1

Which parameters for compression into a ZIP-archive will result in optimal unzipping speed?

The size of the archive does not matter a lot in this case.

I was going to use 7-Zip. Maybe other tools can optimize for the decompression?

Andrej

Posted 2016-09-22T12:08:55.467

Reputation: 119

1Not using compression at all is the fastest, naturally. That is if we disregard where the “compressed” data comes from and where the uncompressed data goes to. Also, size always matters. So perhaps you should expand a little on your use case. – Daniel B – 2016-09-22T12:36:35.013

In my case the archive is a container for a lot of files (thousands) that have to be retrieved over HTTP in a local network. Getting files one by one over HTTP is slow because of the overhead for each file. – Andrej – 2016-09-23T12:00:24.077

Answers

1

So you’re transferring over the network.

When it comes to decompression, you need to consider all the variables:

  • Where does the data come from?
  • Where does the data go to?
  • How fast is your CPU? Is it typically busy?

If you disregard everything, an uncompressed stream (like tar) will always be the fastest.
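
For the ZIP container from the question, the equivalent of an uncompressed stream is the "store" method. Here is a minimal sketch with Python's standard zipfile module; the function name and paths are invented for illustration, and the archive is assumed to be written outside the source directory:

```python
import zipfile
from pathlib import Path


def build_stored_zip(src_dir, archive_path):
    """Pack every file under src_dir into a ZIP without compressing it."""
    src = Path(src_dir)
    # ZIP_STORED writes the bytes as-is, so extraction is essentially a copy.
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir, with forward slashes.
                zf.write(path, arcname=path.relative_to(src).as_posix())
    return archive_path
```

With 7-Zip, `7z a -tzip -mx=0 archive.zip <files>` should give the same kind of archive, since compression level 0 is its copy mode.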

However, the uncompressed data needs to go somewhere. How fast can it get there? Even a local storage device has throughput limits that are easy to reach. The overall goal would typically be to saturate the available I/O bandwidth to the destination.

The data has to be retrieved. If it’s a fast network connection (1+ Gbit/s), a fast compression algorithm like LZ4 or LZO is most likely still worth it. As the connection gets slower, the bias shifts to more CPU-expensive algorithms. This is more of a sending-side concern, though: decompression is almost always much faster than compression.
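
The trade-off can be put into a back-of-the-envelope model. This is only a sketch: the function and all figures in the usage note are made-up placeholders, not measurements, and it pessimistically assumes that unpacking starts only after the download has finished.

```python
def total_seconds(size_mb, ratio, link_mb_s, decompress_mb_s=None):
    """Estimate transfer-plus-unpack time for size_mb of original data.

    ratio            compressed size / original size (1.0 = uncompressed)
    link_mb_s        network throughput in MB/s
    decompress_mb_s  decompressor output rate in MB/s, None if not compressed
    """
    transfer = size_mb * ratio / link_mb_s
    unpack = 0.0 if decompress_mb_s is None else size_mb / decompress_mb_s
    # Worst case: no overlap between downloading and unpacking.
    return transfer + unpack
```

For 1000 MB of data, compare a hypothetical fast codec (ratio 0.5, 2000 MB/s) with a hypothetical dense but slow one (ratio 0.3, 100 MB/s). On a 125 MB/s link the fast codec gives the lowest estimate, ahead of both no compression and the dense codec. On a 10 MB/s link the dense codec comes out ahead. Real numbers have to be measured on your own data and hardware.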

The data has to be decompressed. That requires CPU time and sometimes also considerable amounts of memory. How much exactly depends on the algorithm and on the CPU model. Yet another concern is of course not disrupting other services.
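
The decompression cost is easy to measure rather than guess. The sketch below uses zlib from the Python standard library, which implements the Deflate method that ZIP archives normally use; the helper name and the sample payload are placeholders, so substitute a representative chunk of your own files.

```python
import time
import zlib


def time_decompress(payload, level, repeats=5):
    """Return (compressed_size, best_seconds) for zlib at the given level."""
    blob = zlib.compress(payload, level)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        restored = zlib.decompress(blob)
        best = min(best, time.perf_counter() - start)
    assert restored == payload  # sanity check: the round trip is lossless
    return len(blob), best


if __name__ == "__main__":
    # Stand-in data; highly repetitive, so it compresses unrealistically well.
    sample = b"GET /files/item-0001.json HTTP/1.1\r\n" * 50_000
    for level in (1, 6, 9):
        size, seconds = time_decompress(sample, level)
        print(f"level {level}: {size} bytes, {seconds * 1000:.2f} ms to decompress")
```

Run it on the machine that will do the unpacking, ideally while it carries its usual load, because that is where the CPU time is actually spent.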

tl;dr: On a fast network connection, ZIP with its usual Deflate compression is not a suitable choice. Try LZ4.

Daniel B

Posted 2016-09-22T12:08:55.467

Reputation: 40 502