6

I store a very large amount (terabytes) of logs. These logs will rarely be extracted, and when they are, only a single file will be needed.

Could you recommend an ultra-efficient and extremely stable compression algorithm that's considerably better than bzip2?

Adam Matan

1 Answer

15

lzma (as used by xz) should compress notably better than bzip2, but will take a bit longer.
paq (tested here with zp) will compress quite a bit better still, but will take ages to compress and just as long to decompress.

Both are available for Windows and *nix environments (most *nix systems have packages available).

A quick test on a smartd log:

Compressor      Size    Time
Original       3900K
GZip            208K    0.11s
BZip2            71K    3.07s
XZ               13K    1.76s*
ZP                6K   25.68s*

*I've got -O3 compiled ports for xz and zp. The gzip and bzip2 binaries were precompiled with no optimization.
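
For anyone who wants to repeat a comparison like this on their own logs, here is a minimal sketch using only the gzip, bz2 and lzma modules from the Python standard library. It is not the method used for the numbers above: the sample log is synthetic, the helper names `make_sample_log` and `compare_compressors` are made up for illustration, each codec runs at its library default level, and paq/zp is left out because the standard library has no binding for it.

```python
import bz2
import gzip
import lzma
import time


def make_sample_log(lines: int = 20000) -> bytes:
    """Build a synthetic, repetitive log (hypothetical data, not a real smartd log)."""
    return "".join(
        f"Jun 12 19:{i % 60:02d}:00 host smartd[1234]: Device: /dev/sda, "
        f"SMART attribute {i % 200} changed\n"
        for i in range(lines)
    ).encode()


def compare_compressors(data: bytes) -> dict:
    """Return {codec name: (compressed size in bytes, seconds taken)}."""
    codecs = {
        "gzip": gzip.compress,   # DEFLATE
        "bzip2": bz2.compress,   # Burrows-Wheeler based
        "xz": lzma.compress,     # LZMA2 in an .xz container
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        results[name] = (len(packed), time.perf_counter() - start)
    return results


if __name__ == "__main__":
    sample = make_sample_log()
    print(f"{'original':<10}{len(sample):>10}")
    for name, (size, secs) in compare_compressors(sample).items():
        print(f"{name:<10}{size:>10}{secs:>9.2f}s")
```

To test real data, replace `make_sample_log()` with the bytes of an actual log file. Sizes and timings depend heavily on the input and on the compression level, so treat the output as a rough guide only.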

Chris S
  • 2
    See also http://www.maximumcompression.com/data/log.php – jftuga Jun 12 '11 at 19:11
  • xz is faster than bz2. Even your quick test shows it. – Brendan Long Jun 12 '11 at 20:22
  • 3
    On Linux, lzma is also supported by `tar` http://www.gnu.org/software/tar/manual/html_section/Compression.html – leonbloy Jun 12 '11 at 20:27
  • @Brendan, just tested `-Os` compiled versions of `xz` and `bzip2` on 100MB of random data. bzip2 took 45.39s while `xz` took 1:25.77; bzip2 should be faster for any sizable dataset and similar compiler optimizations. @leonbloy, tar also supports the use of external compressors if they can work on stdin to stdout, which all of these compressors do. – Chris S Jun 13 '11 at 01:25
  • @ChrisS but the original 7z doesn't, also a built in alias of `-J` sure gives *something* – Hubert Kario Jan 15 '12 at 20:56
  • @HubertKario The original 7z doesn't what?? Alias what?? – Chris S Jan 16 '12 at 01:00
  • @ChrisS The original 7zip can't work on standard input and standard output. Also, the fact that you can easily use xz compression just by adding `-J` to `tar` instead of `-j` or `-z` sure is handy, no? – Hubert Kario Jan 16 '12 at 09:43
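
The tar plus xz combination from the comments can also be driven from a script. The sketch below uses Python's standard `tarfile` module, whose `"w:xz"` and `"r:xz"` modes play roughly the role of `tar -J`. The function names `pack_logs` and `read_one_log` and the file names are hypothetical, chosen only for illustration.

```python
import io
import os
import tarfile
import tempfile


def pack_logs(archive_path: str, logs: dict) -> None:
    """Write {member name: bytes} into an xz-compressed tar archive."""
    with tarfile.open(archive_path, "w:xz") as tar:
        for name, payload in logs.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


def read_one_log(archive_path: str, name: str) -> bytes:
    """Return the contents of a single member of the archive."""
    with tarfile.open(archive_path, "r:xz") as tar:
        member = tar.extractfile(name)  # raises KeyError if the name is absent
        if member is None:
            raise ValueError(f"{name} is not a regular file")
        return member.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "logs.tar.xz")
        pack_logs(path, {"a.log": b"first log\n", "b.log": b"second log\n"})
        print(read_one_log(path, "b.log"))
```

One design point matters for the question's "only a single file" requirement: a `.tar.xz` is a single compressed stream, so reading one member means decompressing everything stored before it. Compressing each log file separately makes single-file retrieval cheaper, at some cost in overall compression ratio.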