What is the reason that .tar.gz files have two extensions instead of one?
From what I understand, this type of file is an archive, just like RAR files.
Not quite. Back in ye days of yore, when CPU time was very expensive, archives were often left uncompressed or relied on the compression hardware of a tape drive. The tool usually used for this was tar, the tape archiver.
Unlike what the name seems to suggest, it can write the archive to any file (and on Un*x a tape drive is just another file), so creating an uncompressed archive is easy.
If you want to compress that, you first create the archive (with all its files and folders) and then compress your archive.tar with something like compress. Compress adds a .Z suffix, so you end up with a file with two extensions (archive.tar.Z).
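A minimal sketch of that two-step workflow (the directory name project/ is only an example):

    # Step 1: pack the files and folders into one uncompressed archive
    tar -cf archive.tar project/

    # Step 2: compress the archive; compress(1) replaces it with archive.tar.Z
    compress archive.tar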
In time, compress was replaced by gzip (and the .gz extension), and many versions of tar can now do this compression themselves in a single step.
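With GNU tar, for instance, the -z flag is shorthand for piping the archive through gzip (filenames again illustrative):

    # One-step form: tar compresses while it archives
    tar -czf archive.tar.gz project/

    # Equivalent pipeline: tar writes the archive to stdout,
    # and gzip compresses that single stream
    tar -cf - project/ | gzip > archive.tar.gz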
RAR (and ARJ, LHA, ZIP and others) are far more modern and compress each file as it is added to the archive. In that case you use one single program and you get one single extension. But for tar, the two extensions and the create-the-whole-archive-then-compress-it workflow are a result of history.
So RAR or ZIP archives aren't also sticking the files together and then compressing the resulting mega-file? Do they work in a different way? – yoyo_fun – 2016-11-08T23:28:20.790
Most of them compress each file and store that file in the archive. Tar stores all files in one file, and that single stream gets compressed. This has consequences when you want to extract one part of an archive: if the compression is per file, then you only need to read the data for that file and decompress it. With .tar.Z/.tar.gz/.tar.bz2/... you might need all the information up to the file you are extracting. – Hennes – 2016-11-08T23:34:12.863
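To make that difference concrete (a sketch; the archive and member names are hypothetical):

    # ZIP: each member is compressed on its own, so one file can be
    # decompressed without reading the others
    unzip archive.zip docs/readme.txt

    # tar.gz: gzip must decompress the stream from the start until
    # the requested member is reached
    tar -xzf archive.tar.gz docs/readme.txt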
I understand the advantage of compressing first and archiving later. However, is there any benefit to archiving first and compressing later? I am thinking that having a larger file to compress may be much more efficient than compressing single files. I am thinking of Huffman compression and, if I remember correctly, it is more efficient to compress larger files or documents or... more data rather than less data. I am NOT sure though; I might be wrong. – yoyo_fun – 2016-11-08T23:55:15.620
That is also my understanding; having a larger dataset to compress should allow better total compression. If I understood that correctly, then it is still a trade-off between easier single-file extraction and a slightly smaller archive. – Hennes – 2016-11-09T00:05:26.840
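One way to see that trade-off yourself (a hedged sketch: project/ is a hypothetical directory of text files, and -k, which keeps the originals, is a GNU gzip option):

    # Per-file compression: each file is compressed with its own state
    gzip -k project/*.txt
    du -ch project/*.txt.gz | tail -1

    # Solid compression: one stream, so redundancy shared between
    # files also gets squeezed out
    tar -czf project.tar.gz project/
    du -h project.tar.gz

On a set of many small, similar files the solid archive usually wins; on a few large, unrelated files the difference shrinks.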