tar doubled the size of the directory

0

Under Solaris 10 I have a directory called apps. I go into that directory and enter:

-bash-3.00# du -hs
7.2G   .

Then I entered:

tar -cvf /var/run/apps.tar apps

The result:

-bash-3.00# ls -lad apps.tar 
-rw-r--r--   1 root     root     14174289408 Mar  5 16:05 apps.tar

How can a 7GB directory become a 14GB tar?

Thomas

Posted 2012-03-05T15:25:17.897

Reputation: 123

And you're sure you didn't pack the tar into itself? – Der Hochstapler – 2012-03-05T15:36:59.197

I pasted the commands I used here, so yes, I'm pretty sure. Does tar follow symlinks by default? I really don't get it... – Thomas – 2012-03-05T15:39:06.683

It's not really apparent what your current working directory is. That could be relevant. – Der Hochstapler – 2012-03-05T15:47:53.223

In this question/answers, people discuss how badly Solaris treats links. – woliveirajr – 2012-03-05T15:48:06.997

Answers

1

tar by itself isn't particularly efficient at storing small files. It uses fixed-size 512-byte blocks: every file gets a header block, and its data is padded up to a multiple of 512 bytes. The metadata is also stored less compactly than on the file system. The discrepancy you observe might then be due to a significant number of small files in your apps directory.
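To get a feel for how much this padding costs, here is a rough sketch that estimates the uncompressed archive size for a tree. The function name `estimate_tar_size` is made up for illustration, and the arithmetic rests on assumptions: one 512-byte header per entry, file data rounded up to 512 bytes, 1024 bytes of end-of-archive marker, no extra headers for long names, no rounding to the tar blocking factor, and file names without embedded newlines. It needs a shell with `$(( ))` arithmetic, such as the bash shown in the question.

```shell
# Rough estimate of the size of an uncompressed tar archive of a directory.
# Assumptions (not exact for every tar implementation):
#   - every entry (file, directory, symlink) costs one 512-byte header
#   - regular file data is padded up to a multiple of 512 bytes
#   - the archive ends with two zero-filled 512-byte blocks
estimate_tar_size() {
    dir=$1
    # Number of entries of any type, one header block each.
    entries=$(find "$dir" | wc -l)
    # Padded data size of regular files; field 5 of `ls -ln` is the byte count.
    data=$(find "$dir" -type f -exec ls -ln {} + |
        awk '{ total += int(($5 + 511) / 512) * 512 } END { printf "%.0f\n", total }')
    echo $(( entries * 512 + data + 1024 ))
}
```

Running `estimate_tar_size apps` from the parent directory prints an estimate in bytes that can be compared with the size `ls -l` reports for apps.tar. Under these assumptions a 100-byte file already takes 1024 bytes in the archive (512 for the header, 512 for the padded data).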

To mitigate this effect, I would suggest compressing the tar output this way:

  tar cvf - apps | gzip -v > /var/run/apps.tgz

If you happen to use GNU tar, that would be:

  gtar czvf /var/run/apps.tgz apps

jlliagre

Posted 2012-03-05T15:25:17.897

Reputation: 12 469

You're right, I do have a lot of small files in there. Is there a way I can make sure that it's because of the small files and not some symlink mess? – Thomas – 2012-03-05T21:31:30.427

Solaris tar doesn't follow symlinks with the options you use. – jlliagre – 2012-03-05T21:42:57.190
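One way to check this on the tree itself is to count what is in there. The sketch below uses a hypothetical helper called `summarize_tree`; it relies only on standard `find` predicates and assumes file names without embedded newlines.

```shell
# Count regular files, symlinks and small files under a directory.
summarize_tree() {
    dir=$1
    files=$(find "$dir" -type f | wc -l)
    links=$(find "$dir" -type l | wc -l)
    # -size -2 matches files that fit in fewer than two 512-byte blocks,
    # i.e. files of at most 512 bytes.
    small=$(find "$dir" -type f -size -2 | wc -l)
    # $(( )) strips the padding some wc implementations add.
    echo "files=$((files)) symlinks=$((links)) small_files=$((small))"
}
```

If `summarize_tree apps` reports that most regular files are small and there are few symlinks, the padding explanation becomes more plausible than a symlink problem.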

1

Tar does not perform any compression on files, and will always produce a resulting tar-file that is larger than the sum of all of the files that went into it.

The precise reason why the tar is so much larger could be any one of a number of factors, such as large numbers of small files not being stored efficiently, hard links being followed (leading to duplication in the directory tree), or sparse files being flattened into substantially larger non-sparse files within the tar archive.
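To see whether hard links are even a candidate, it may help to list the regular files that have more than one link. This is a minimal sketch with a made-up function name, `list_hard_linked`; it does not detect sparse files, and whether a given tar stores the extra links as data or as link entries depends on the implementation.

```shell
# Print regular files under a directory whose link count is greater than one.
list_hard_linked() {
    find "$1" -type f -links +1
}
```

If `list_hard_linked apps` prints nothing, hard links can be ruled out as the cause.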

If you want to create a small file for backup or transfer to another system, I suggest you compress the tar file with gzip (to a .tar.gz file) or another compression tool to produce a much smaller file. Tar files tend to compress well, so this should produce a file substantially below the 14GB you cited.

SecurityMatt

Posted 2012-03-05T15:25:17.897

Reputation: 2 857