On Linux/Unix, does .tar.gz versus .zip matter?

71

20

Cross-platform programs are sometimes distributed as .tar.gz for the Unix version and .zip for the Windows version. This makes sense when the contents of each must be different.

If, however, the contents are going to be the same, it would be simpler to just have one download. Windows prefers .zip because that's the format it can handle out of the box. Does it matter on Unix? That is, I tried today unzipping a file on Ubuntu Linux, and it worked fine; is there any problem with this on any current Unix-like operating system, or is it okay to just provide a .zip file across the board?

rwallace

Posted 2010-05-29T18:59:49.773

Reputation: 2 021

Answers

33

Necromancing.
Yes, it matters.
Actually, it depends.

tar.gz

  • Stores unix file attributes: uid, gid, permissions (most notably executable). The default may depend on your distribution, and can be toggled with options.
  • Consolidates all files to be archived in one file ("Tape ARchive").
  • Actual compression is done by GZIP, on the one .tar file

zip

  • Stores MSDOS attributes. (Archive, Readonly, Hidden, System)
  • Compresses each file individually, then consolidates the individually compressed files in one file
  • Includes a file table at the end of the file

Because zip compresses the files individually, a zip archive will most likely be larger (especially with many small files, such as config files).
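A rough way to see the size effect described above is to build both kinds of archive from the same set of small, similar files and compare the results. This is only a sketch using Python's standard `tarfile` and `zipfile` modules; the file names and contents are made-up sample data, and the exact byte counts will vary.

```python
import io
import tarfile
import zipfile


def build_archives(files):
    """Return (tar_gz_bytes, zip_bytes) for a dict of {name: content bytes}."""
    # tar.gz: all members go into one tar stream, which gzip compresses as a whole.
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))

    # zip: every member is deflated on its own, then listed in a central directory.
    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)

    return tar_buf.getvalue(), zip_buf.getvalue()


# Hypothetical sample data: 200 small, nearly identical config files.
files = {
    f"conf/app{i:03d}.ini": f"[server]\nhost = example.org\nport = {8000 + i}\n".encode()
    for i in range(200)
}

tar_gz, zipped = build_archives(files)
print("tar.gz bytes:", len(tar_gz))
print("zip bytes:   ", len(zipped))
```

With a handful of large files instead of many small ones, the gap should shrink, which matches the "negligible in most cases" remark in this answer.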

So you see, apart from file size, if you zip a bunch of files on Linux/Unix and then unzip them, the file attributes will be gone (at the very least those not supported by MS-DOS; it depends on which ZIP software you use). This may or may not matter to you. If it doesn't, either format will do, because the file-size difference is negligible in most cases.

Quandary

Posted 2010-05-29T18:59:49.773

Reputation: 1 433

10the standard distro of zip on unix-like systems (info-zip) also stores unix file attributes. – Erik Aronesty – 2018-04-24T21:26:10.507

35

tar.gz is better for Linux/Unix as it retains permissions, such as "executable" on scripts.
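To make the permissions point concrete, here is a small sketch of where each format keeps a Unix mode, using Python's standard `tarfile` and `zipfile` modules. It assumes the convention used by Unix-aware zip tools such as Info-ZIP, where the mode sits in the upper 16 bits of the entry's external attributes. The file name `run.sh` is just an example. Whether the mode is actually restored on extraction still depends on the tool doing the unpacking.

```python
import io
import stat
import tarfile
import zipfile

script = b"#!/bin/sh\necho hello\n"

# tar: the mode is an ordinary field in each member's header.
tar_buf = io.BytesIO()
with tarfile.open(fileobj=tar_buf, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="run.sh")
    info.size = len(script)
    info.mode = 0o755
    tar.addfile(info, io.BytesIO(script))

tar_buf.seek(0)
with tarfile.open(fileobj=tar_buf, mode="r:gz") as tar:
    tar_mode = tar.getmember("run.sh").mode

# zip: Unix-aware writers put the mode in the upper 16 bits of external_attr.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zinfo = zipfile.ZipInfo("run.sh")
    zinfo.create_system = 3  # 3 means "Unix" in the zip format
    zinfo.external_attr = (stat.S_IFREG | 0o755) << 16
    zf.writestr(zinfo, script)

with zipfile.ZipFile(zip_buf) as zf:
    zip_mode = (zf.getinfo("run.sh").external_attr >> 16) & 0o777

print("tar mode:", oct(tar_mode))
print("zip mode:", oct(zip_mode))
```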

Zam

Posted 2010-05-29T18:59:49.773

Reputation: 359

4

Standard zip/unzip tools (Info-ZIP) retain permissions on Linux, and timestamps on Windows. See https://en.wikipedia.org/wiki/Info-ZIP for typical capabilities, which overcome the permissions issues and file size limitations while retaining desirable random access and editable archive properties.

– Erik Aronesty – 2018-04-24T21:23:42.020

8OS X's Archive Utility and zip / unzip preserve permissions, but there might be other utilities that don't. – Lri – 2013-01-19T15:33:01.733

33

Most popular Linux distros these days are by default equipped with zip compatibility. But as stated by nc3b, tar and gzip are more common on Linux/Unix systems. If you need 95% compatibility on these systems, consider using tar and gzip. If you need only 85%, zip will do fine.

BloodPhilia

Posted 2010-05-29T18:59:49.773

Reputation: 27 374

@BloodPhilia Actually gzip does care about the suffix. If you try to ungzip a file which doesn't end in .tgz or .gz it will give the error gzip: npm-debug.log.zip: unknown suffix -- ignored. – mtak – 2014-07-31T14:43:05.617

2Okay, 95% is better than 85% :-) A very minor question, does it matter at all if the file extension is .tgz instead of .tar.gz? – rwallace – 2010-05-29T19:34:42.837

8Extension doesn't matter at all; it's just used for reference by users and programs. If the extension is .XXX and you know the file is a tar archive, you can still use tar to untar it. .tgz and .tar.gz in fact denote the same format; .tgz is just a shorter spelling. – BloodPhilia – 2010-05-29T19:43:10.810

1@mtak you can always just use gunzip --suffix .zip npm-debug.log.zip or gunzip -c < npm-debug.log.zip > npm-debug.log – Iwan Aucamp – 2017-01-19T13:14:55.547

2On the other hand, for 100% compatibility on Windows you would need to use cab. – kinokijuf – 2011-11-15T17:39:46.910

3tar will store uid, gid and permissions, such as +x on unix systems. zip stores archive, readonly, hidden and system on windows systems. – Andrew De Andrade – 2013-10-30T21:43:01.610

@BloodPhilia, So does that mean that we can GZip a file and rename it as .zip and it will correctly unzip? – Pacerier – 2014-04-24T11:33:23.177

@Pacerier Yes, as long as you use gzip to unzip it. – BloodPhilia – 2014-04-26T19:40:31.040
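The point in this comment thread, that the content decides the format rather than the name, can be sketched by looking at the first bytes of the data. This assumes the usual signatures: gzip streams start with `1f 8b`, and a non-empty zip archive starts with `PK\x03\x04`. The helper name `sniff_format` is made up for this example.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"
ZIP_MAGIC = b"PK\x03\x04"  # local file header of the first entry in a non-empty zip


def sniff_format(data: bytes) -> str:
    """Guess the container format from the leading bytes, ignoring any file name."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


gz_data = gzip.compress(b"log line\n")

zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zf.writestr("npm-debug.log", b"log line\n")
zip_data = zip_buf.getvalue()

print(sniff_format(gz_data))
print(sniff_format(zip_data))

# gzip data decompresses the same way whatever the file happened to be called.
print(gzip.decompress(gz_data))
```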

19

tar/gzip is a pretty crappy format since the archive cannot be randomly accessed, updated, verified or even appended to... without having to decompress the entire archive.

zip is much better in that regard.... you can quickly obtain the contents of a zip file, append to it without recompressing the first part, etc.
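As a sketch of the random-access and append behaviour described here, Python's standard `zipfile` module can add a member to an existing archive and then read a single member by name. The member names and contents are invented for illustration.

```python
import io
import zipfile

buf = io.BytesIO()

# Create an archive with a small file and a comparatively large one.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("docs/readme.txt", b"first file\n")
    zf.writestr("data/big.bin", bytes(100_000))

# Append a new member; the existing compressed entries are left as they are.
with zipfile.ZipFile(buf, "a", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("docs/changelog.txt", b"added later\n")

# List the contents and read one member by name via the file table.
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
    readme = zf.read("docs/readme.txt")

print(names)
print(readme)
```

A .tar.gz has no such file table, so a reader has to decompress the stream from the start until it reaches the member it wants.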

zip has some size limitations, depending on the version of "zip" that you use, and these can be a problem. But the standard Info-ZIP tool that comes with most Linux-like OSes has no size limitations and preserves file permissions just fine.

see: https://en.wikipedia.org/wiki/Info-ZIP for capabilities

Erik Aronesty

Posted 2010-05-29T18:59:49.773

Reputation: 394

edited and provided a link – Erik Aronesty – 2018-04-24T21:29:24.920

What kind of limitations are you talking about? – Pacerier – 2014-04-24T11:34:14.410

9

Barebones Unix installs (e.g. server installs) don't contain unzip, but they always contain tar and gzip. If your audience is servers, I'd go for tar and gzip.

Also gzip has greater compression than zip, so the file will be smaller.

Rwky

Posted 2010-05-29T18:59:49.773

Reputation: 648

1I wouldn't say gzip compresses better than ZIP. Both use the same DEFLATE algorithm, and all comparisons I've done give similar results in file size. – user1686 – 2010-05-29T21:57:26.867

4Well, tar.gz will compress the whole file in one go, whereas zip compresses files individually. For many small files, the first approach will usually generate noticeably smaller files, because redundancies can be used across files. The difference is not huge though. – sleske – 2010-06-24T16:09:01.227

3

Yes, it matters. Tar is an archiver. And in tar.gz, we compress that archive.

Zip is both an archiver and compressor.

If you compare compression, from my experience, gzip is much better than zip.

The other significant difference is mentioned in another answer. If you have a very big archive and want to extract one small file, zip allows you to do that. But with tar.gz, you need to extract the entire archive.

Rakesh Reddy

Posted 2010-05-29T18:59:49.773

Reputation: 31

Not an archive of gzipped files but a gzip of archived files. That's why you have to extract the whole archive. – m93a – 2015-02-15T16:24:58.327

2

The decision basically comes down to these:

  • tar.gz keeps Unix file permissions, such as whether a file is allowed to execute.

  • On the other hand ZIP works out of the box in Windows.

Alberto Salvia Novella

Posted 2010-05-29T18:59:49.773

Reputation: 176

1

tar and gzip are a lot more common on *nix-es than unzip. For instance, at the moment on my arch-2009.08 there is no unzip.

nc3b

Posted 2010-05-29T18:59:49.773

Reputation: 1 094

6But there is bsdtar (part of libarchive), which handles ZIP fine. – user1686 – 2010-05-29T19:19:40.827

Oops :"> Didn't know about that. Thanks! :-) – nc3b – 2010-05-30T07:22:06.093