
(note, this is not a duplicate of Creating a tar file with checksums included)

I'm familiar with using tar + gzip to create a compressed tar file (tar cf - files | gzip > something.tar.gz), and gzip does add a master checksum so it will be apparent if the file gets corrupted. This is nearly the behaviour I want.

However, I have a computer with a (really) slow processor but a fast network card. With tar plus gzip plus socat, my network transfer runs at about one tenth of the speed (100Mbps) that I get when I leave gzip out of the pipeline (950Mbps).

Some archive utilities, like 7Zip and Zip, support an option for zero compression. I don't see that gzip or bzip2 have such an option. But Zip and 7Zip don't support proper streaming the way gzip and bzip2 do (I know that 7z can read/write the plaintext via stdin/stdout, but it won't write the compressed file to stdout). I must have a proper streaming "compression" program because I'll be using socat to ship the archive to a remote host.

So the question is: is there a way to create a tar archive while wrapping the output in an archive-like format, without using gzip or bzip2? Or is there some way to tickle gzip or bzip2 into using "no compression"? Or is there a dirt-simple, ultra-fast streaming compression utility, perhaps one that uses only RLE?

Caveats: it needs to be fully streaming so I can use socat; the solution must be CPU-light; and it must use parts available in the Cygwin and Debian repositories.

William
  • Gzip supports variable levels of compression, with corresponding demands on processor usage. [link](https://stackoverflow.com/questions/28452429/does-gzip-compression-level-have-any-impact-on-decompression) – Don Simon Mar 04 '19 at 18:26
  • gzip at its fastest (using -1 arg) is still intolerably slow. That's why I'm looking for alternatives. – William Mar 06 '19 at 16:50

1 Answer


You can use lz4 as a drop-in replacement for gzip. The compression ratio is not as good, but it is usually blazing fast compared to gzip.

It may still be too slow for what you're trying to do, but it is probably worth a look.
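As a rough sketch of what the drop-in swap could look like, here is a sender/receiver pair. The function names pack_lz4 and unpack_lz4 are made up for illustration, and the socat address in the comment is a placeholder, not something from the question. It assumes GNU tar and the lz4 command-line tool are installed on both ends.

```shell
# Sender side: tar to stdout, then LZ4 at its fastest level (-1)
# with 64 KB blocks (-B4); -c forces the output onto stdout.
pack_lz4() {
    tar cf - "$@" | lz4 -1 -B4 -c
}

# Receiver side: decompress stdin and unpack into the directory given as $1.
unpack_lz4() {
    lz4 -d -c | tar xf - -C "$1"
}

# Hypothetical usage (host and port are placeholders):
#   pack_lz4 files | socat - TCP:remote-host:9000
```

As far as I know, the lz4 tool writes a frame checksum by default, so corruption in transit should show up as a decompression error on the receiving side, much like with gzip.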

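You can also tee the output of tar into two streams and feed one of them to md5sum, sha1sum or sha256sum using bash process substitution: tar cf - files | tee >(md5sum >&2) | whatever-command. The >&2 sends the checksum to stderr; without it, md5sum inherits tee's stdout and its output line ends up mixed into the data piped to whatever-command.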
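If you want the checksum in a file rather than on the terminal, a variant of the tee idea can be sketched with a named pipe instead of process substitution, so it also works in plain sh and the checksum file is complete by the time the function returns. The helper name stream_with_checksum and the file names in the usage comment are hypothetical; the sketch assumes mkfifo, mktemp and sha256sum are available.

```shell
# Stream a tar archive to stdout while writing its SHA-256 to a side file.
# $1 = file to receive the checksum, remaining args = paths to archive.
stream_with_checksum() {
    sumfile=$1
    shift
    fifo_dir=$(mktemp -d) || return 1
    fifo=$fifo_dir/tar.fifo
    mkfifo "$fifo" || return 1

    # Background reader: hashes everything that tee copies into the FIFO.
    sha256sum < "$fifo" > "$sumfile" &
    sum_pid=$!

    # tee duplicates the archive: one copy to the FIFO, one to stdout.
    tar cf - "$@" | tee "$fifo"

    # Wait for the hash to finish before cleaning up.
    wait "$sum_pid"
    rm -rf "$fifo_dir"
}

# Hypothetical usage (host and port are placeholders):
#   stream_with_checksum files.tar.sha256 files | socat - TCP:remote-host:9000
```

The receiver can then run sha256sum over the bytes it received and compare the result with the side file, which has to be sent separately.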

Andreas Rogge
  • The LZ4 idea looks to be a viable solution. It's not zero compression, but it is much much faster than gzip or bzip2. – William Mar 06 '19 at 17:02
  • If you add a -B4 argument to LZ4, it makes it faster, resulting in a 15% to 25% increase in network throughput (which is a good thing). – William Mar 06 '19 at 17:02