I often have to gather log files and upload them to a central server (owned by another company). The central server has a file size limit, so I am trying to create the smallest file possible that is still in the zip format.
What are the best settings to use when compressing a text file to the zip format when my only requirement is a small file size?
I've done the obvious and chosen ultra compression, and I have noticed that LZMA does a better job than deflate, but there are far too many other permutations of options for me to test them all.
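One way to narrow down the permutations is to script the comparison rather than test by hand. The following is only a rough sketch, assuming Python 3.7+ and its standard `zipfile` module; the entry name `app.log`, the helper names, and the sample log line are made up for illustration, and the result will depend entirely on the real log contents.

```python
import io
import zipfile


def zip_size(data: bytes, method: int, level=None) -> int:
    """Return the size in bytes of a single-entry zip archive built in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method, compresslevel=level) as zf:
        zf.writestr("app.log", data)  # hypothetical entry name
    return len(buf.getvalue())


def compare_methods(data: bytes) -> dict:
    """Compress the same data with several zip methods and report archive sizes."""
    candidates = {
        "stored": (zipfile.ZIP_STORED, None),
        "deflate-1": (zipfile.ZIP_DEFLATED, 1),
        "deflate-9": (zipfile.ZIP_DEFLATED, 9),
        "bzip2-9": (zipfile.ZIP_BZIP2, 9),
        "lzma": (zipfile.ZIP_LZMA, None),  # compresslevel has no effect for LZMA
    }
    return {name: zip_size(data, m, lvl) for name, (m, lvl) in candidates.items()}


if __name__ == "__main__":
    # Made-up sample data; substitute the bytes of a real log file.
    sample = b"2011-05-10 14:21:29 INFO request handled in 12 ms\n" * 20000
    for name, size in sorted(compare_methods(sample).items(), key=lambda kv: kv[1]):
        print(f"{name:10s} {size:>10d} bytes")
```

Only the stored and Deflate results are relevant if the archive has to open in every zip tool; the bzip2 and LZMA rows are there to show how much is being given up.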
Define "normal zip tools". Most "normal zip tools" nowadays, like 7z and WinRAR, can extract 7z files. – phuclv – 2016-05-29T11:04:28.873
Is splitting the zip into multiple files an option? – JaredMcAteer – 2011-05-10T14:21:29.297
@Original, I don't think so. (Is that what the 'split to volumes' option is for?) I'd rather keep it simple and have just 1 file. If I really need to, I can split the original file (which I have done in the past), but my goal is to keep it in one file. – jjnguy – 2011-05-10T14:30:45.587
Oh, and I saw this question: http://superuser.com/questions/178111/what-settings-to-use-when-making-7zip-files-in-order-to-get-maximum-compression-w But it really doesn't answer my question at all. – jjnguy – 2011-05-10T14:32:11.233
I think the exact question you asked isn't answerable. Some text files compress better with different algorithms. Sometimes zip is better, sometimes gzip; sometimes, compression level makes a difference, and sometimes not. It all depends on the file. Therefore, instead of answering the precise question, I've addressed the motivating example, which deals with maximum allowed sizes. Even if you have the best possible algorithm, you're still limited by size, and a particularly large log might not be able to be compressed below that threshold, so you'll need splitting anyway. – Rob Kennedy – 2011-05-10T14:42:19.707
@Rob, ok. Makes sense. I know that the input data is very important in determining the size of a resulting zip file. I wasn't sure if there was a canonical set of settings that usually work best. – jjnguy – 2011-05-10T14:44:22.090
As soon as you pick anything but the Deflate format, it's not a "normal" .zip file anymore, but an "extended" zip file, pioneered by WinZip. They originally kept the extension as .zip, to much consternation (since most normal zip-handling tools can't deal with them), but most archivers use .zipx now to distinguish them from traditional .zip files. If you can use LZMA, switch to .7z and pick PPMd -- it should compress better (and faster!) for text files. – afrazier – 2011-05-20T16:04:40.747
@afra, hmmmm. Thanks for the info. I need to keep it in a format that most normal zip tools can unzip. Otherwise I'd be using the 7z format already. – jjnguy – 2011-05-20T16:52:59.613
@Justin: That sucks. Can you use a self-extracting archive? – afrazier – 2011-05-20T18:55:25.887
@afrazier, I'm sending these files to a 3rd party vendor, and they expect to get 'regular' zip files. (Or files they can unzip using the 'standard' method.) – jjnguy – 2011-05-20T19:48:03.437
@afrazier: "The .ZIP File Format Specification documents the following compression methods: stored (no compression), Shrunk, Reduced (methods 1-4), Imploded, Tokenizing, Deflated, Deflate64, bzip2, LZMA (EFS), WavPack, PPMd." https://en.wikipedia.org/wiki/Zip_%28file_format%29#Compression_methods – endolith – 2013-12-13T22:26:29.843
@endolith: bzip2, lzma, wv, and ppmd are all very recent additions to the file format. It's not even safe to assume that your recipient can handle deflate64, much less anything newer. – afrazier – 2013-12-13T22:33:39.190
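Given the compatibility concern above, it may be worth checking which method an archive actually uses before sending it. This is a minimal sketch, assuming Python's standard `zipfile` module; the function names and the `app.log` entry are hypothetical, and treating only stored and Deflate entries as "portable" is an assumption drawn from the comments here, not a rule from the specification.

```python
import io
import zipfile


def non_deflate_entries(zip_source) -> list:
    """List (name, method id) for entries that are neither stored nor Deflate."""
    portable = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}  # assumed safe set
    with zipfile.ZipFile(zip_source) as zf:
        return [
            (info.filename, info.compress_type)
            for info in zf.infolist()
            if info.compress_type not in portable
        ]


def make_archive(method: int) -> io.BytesIO:
    """Build a small in-memory archive with one entry (for demonstration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        zf.writestr("app.log", b"INFO started\n" * 1000)
    return buf


if __name__ == "__main__":
    for label, method in [("deflate", zipfile.ZIP_DEFLATED), ("lzma", zipfile.ZIP_LZMA)]:
        flagged = non_deflate_entries(make_archive(method))
        print(label, "->", flagged or "all entries are stored/Deflate")
```

`non_deflate_entries` also accepts a path to a file on disk, so it can be pointed at the archive that is about to be uploaded.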