How to specify level of compression when using tar -zcvf?

149

68

I gzip directories very often at work. What I normally do is

tar -zcvf file.tar.gz /path/to/directory

Is there a way to specify the compression level here? I want to use the best compression possible even if it takes more time to compress.

Lazer

Posted 2011-07-01T18:00:22.243

Reputation: 13 841

Answers

146

GZIP=-9 tar cvzf file.tar.gz /path/to/directory

assuming you're using bash (any POSIX-style shell accepts the same syntax). In general, set the GZIP environment variable to "-9" and run tar normally.

Also - if you really want best compression, don't use gzip. Use lzma or 7z.

And when using gzip (which is a good idea for various reasons anyway), consider using the pigz program instead of gzip itself.
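A minimal sketch of the pigz suggestion, assuming pigz may or may not be installed: it uses pigz when it is on the PATH and falls back to plain gzip otherwise. The directory and file names are made up for the demo.

```shell
#!/bin/sh
# Sketch: prefer pigz (parallel gzip) when available, else fall back to gzip.
# All paths below are throwaway demo names, not taken from the post above.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/data"
seq 1 5000 > "$workdir/data/numbers.txt"

if command -v pigz >/dev/null 2>&1; then
    compressor=pigz
else
    compressor=gzip
fi

# Both tools read stdin and write stdout when given no file names,
# and both accept -9 for the highest compression level.
tar -cf - -C "$workdir" data | "$compressor" -9 > "$workdir/data.tar.gz"

# pigz writes ordinary gzip streams, so gzip can test the result either way.
gzip -t "$workdir/data.tar.gz" && echo "archive OK (compressed with $compressor)"
```

Piping keeps tar itself unaware of which compressor ran, so the same line works with either tool.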

user7385

Posted 2011-07-01T18:00:22.243

Reputation:

3FYI, for .bz2 format, use: BZIP2=-9 tar cvjf file.tar.bz2 /path/to/directory – Tomofumi – 2017-03-02T02:59:46.520

3The environment variable seems to now be GZIP_OPT, the usage should be the same. – Seer – 2017-12-05T11:59:30.863

3From the man page on Ubuntu 16.04 for gzip: "On Vax/VMS, the name of the environment variable is GZIP_OPT, to avoid a conflict with the symbol set for invocation of the program." For sh, csh, and MSDOS it should still just be GZIP – Ponyboy47 – 2018-06-02T16:13:19.823

This is what I get when I try to set GZIP environment variable to -9: gzip: warning: GZIP environment variable is deprecated; use an alias or script – patryk.beza – 2019-10-31T10:43:16.080

15pigz is "parallel gzip", which uses all your cores for gzip compression. You can watch top and see it using anywhere between 200% and 400% CPU. – Felipe Alvarez – 2013-12-09T02:01:03.163

69

Instead of using tar's gzip flag, gzip the archive yourself after tar has finished; that way you can pass the compression level directly to the gzip program:

tar -cvf files.tar /path/to/file0 /path/to/file1 && gzip -9 files.tar

Or you could use:

tar cvf - /path/to/file0 /path/to/file1 | gzip -9 - > files.tar.gz

The -9 in the gzip command line tells gzip to use the maximum possible compression level (default is -6).

Edit: Fixed pipe command line based on @depesz comment.

Matrix Mole

Posted 2011-07-01T18:00:22.243

Reputation: 3 303

5Using pipes should be done with: tar cvf - /path/to/directory | gzip -9 - > file.tar.gz – None – 2011-07-01T18:40:08.460

1The first example should end with file.tar, since gzip adds the ".gz" extension. – bonsaiviking – 2013-02-04T18:20:24.727

4why don't you skip f -? if there is no file, then it is stdin/out – akostadinov – 2013-09-19T18:52:52.720

In addition to the previous comment: from the Environment section of "man tar": TAPE: Device or file to use for the archive if --file is not specified. If this environment variable is unset, use stdin or stdout instead. – Mikl – 2013-09-24T17:08:40.957

2And we can shorten "gzip -9 -" to "gzip -9". From the Description section of "man gzip": If no files are specified, or if a file name is "-", the standard input is compressed to the standard output. – Mikl – 2013-09-24T17:18:14.407

55

Modern versions of tar support the xz archive format (GNU tar, since 1.22 in 2009, Busybox since 1.17.0 in 2010).

It's based on LZMA2, kind of like a 7-Zip version of gz. This gives better compression, provided you are OK with requiring xz support wherever the archive is unpacked.

tar -Jcvf file.tar.xz /path/to/directory

I just found out here (basically a dupe of this question, but on the Unix Stack Exchange) that there is also an XZ_OPT environment variable that controls the xz compression level, similar to the GZIP one in the other post.

XZ_OPT=-9 tar -Jcvf file.tar.xz /path/to/directory

David C. Bishop

Posted 2011-07-01T18:00:22.243

Reputation: 1 184

9The trade-off is speed. XZ is significantly slower. – Bell – 2017-04-06T23:02:48.193

2

+1 xz is far better than both bzip2 and gzip. Here's a comparison: http://tukaani.org/lzma/benchmarks.html

– User1 – 2012-12-25T15:44:38.497

34

tar cv /path/to/directory | gzip --best > file.tar.gz

This is Matrix Mole's second solution, but slightly shortened:

When calling tar, option f names the archive file. Setting it to - makes tar write the archive to stdout, which is usually also what happens when f is omitted altogether (GNU tar then falls back to the TAPE environment variable or a compiled-in default, commonly stdout).

And as stated by the gzip man page, if no files are specified gzip will compress from standard input. There is no need for - in the gzip call.

Option --best (equivalent to -9) sets the highest compression level.
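If you want to confirm that the piped archive came out intact, something along these lines should do. It is only a sketch with made-up paths, and it spells out -f - explicitly because the default output when f is omitted depends on how your tar was built.

```shell
#!/bin/sh
# Sketch: create an archive through the pipe, then verify it (demo paths only).
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/project"
echo "hello" > "$workdir/project/readme.txt"

# -C changes into the parent directory first, so member names stay relative.
tar -cf - -C "$workdir" project | gzip --best > "$workdir/project.tar.gz"

# gzip -t checks the compressed stream; tar -tzf lists the archive members.
gzip -t "$workdir/project.tar.gz"
listing=$(tar -tzf "$workdir/project.tar.gz")
echo "$listing"
```

Note that in a plain sh pipeline the exit status is gzip's, so a failing tar can go unnoticed; listing the archive afterwards is a cheap way to catch that.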

carlito

Posted 2011-07-01T18:00:22.243

Reputation: 521

This works with xz and pixz too. It is a great way to control the number of threads used for parallel compressing without having to create an intermediate .tar file. Like so tar -cv /path/to/dir | pixz -p4 > output.tpxz – joelostblom – 2015-02-12T21:52:34.147

1This works beautifully. Also if you run as root, permissions & owners are preserved too. Otherwise you must specify. Also if it wasn't obvious "-9" is best compression and "-1" is fastest compression. "-1" still takes a looong time if you have lots of files ;-) – PJ Brunet – 2013-12-12T04:04:08.647

10

There is also the option to specify the compression program using -I. This can include the compression level option.

tar -I 'gzip -9' -cvf file.tar.gz /path/to/directory

Chris Gibson

Posted 2011-07-01T18:00:22.243

Reputation: 101

2Older versions of tar, such as those provided in CentOS 6 & 7, do not support passing arguments in the -I value; they try to treat the whole string as a program name to exec, and thus fail. At least as of tar 1.29 in Debian Stretch, this does work. – Cheetah – 2017-12-07T00:30:52.240

3

And of course macOS's BSD-derived tar has to be different:

tar -czf file.tar.gz --options gzip:compression-level=9 /path/to/directory
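If the same script has to run on both kinds of system, one possible approach (a sketch, not a definitive recipe) is to look at the first line of tar --version and pick the matching spelling, with the plain pipe as a fallback. The paths are invented for the demo, and the GNU branch assumes a tar recent enough to accept arguments inside -I.

```shell
#!/bin/sh
# Sketch: choose the level-9 spelling that matches the installed tar.
# Demo paths only; the GNU branch assumes -I accepts arguments (newer GNU tar).
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/project"
echo "sample" > "$workdir/project/notes.txt"
out="$workdir/project.tar.gz"

# The first line of `tar --version` names the implementation.
flavour=$(tar --version 2>/dev/null | head -n 1 || true)

case "$flavour" in
    *bsdtar*)
        # libarchive's tar (macOS, FreeBSD): the level goes in --options.
        tar -czf "$out" --options gzip:compression-level=9 -C "$workdir" project
        ;;
    *GNU*)
        # GNU tar: pass the compressor and its level through -I.
        tar -I 'gzip -9' -cf "$out" -C "$workdir" project
        ;;
    *)
        # Unknown tar: the plain pipe needs no special flags.
        tar -cf - -C "$workdir" project | gzip -9 > "$out"
        ;;
esac

gzip -t "$out" && echo "created $out"
```

If you would rather not branch at all, the pipe in the last case works with every tar mentioned in this thread.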

rfay

Posted 2011-07-01T18:00:22.243

Reputation: 131