Is there a way to determine the decompressed size of a .bz2 file?

35

3

Is there a way to print the decompressed size of a .bz2 file without actually decompressing the entire thing?

endolith

Posted 2009-10-11T18:33:27.023

Reputation: 6 626

So there is no metadata about the original file in the bzip output? >:( – endolith – 2009-10-11T22:30:36.017

not that i've seen reference to. :/ – quack quixote – 2009-10-11T22:35:08.847

Answers

37

As noted by others, the bzip2 format doesn't record the decompressed size, so there is nothing to read it from. But this technique works -- you will have to decompress the file, but you won't have to write the decompressed data to disk, which may be a "good enough" solution for you:

$ ls -l foo.bz2
-rw-r--r-- 1 quack quack 2364418 Jul  4 11:15 foo.bz2

$ bzcat foo.bz2 | wc -c         # bzcat decompresses to stdout, wc -c counts bytes
2928640                         # number of bytes of decompressed data
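
If you have several archives to check, the same pipeline can be wrapped in a small function. This is only a sketch: bz2_sizes is a made-up name, not something that ships with bzip2, and every file is still decompressed as a stream, so it costs the same CPU time as running bzcat by hand.

```shell
# Hypothetical helper (the name bz2_sizes is an assumption, not a bzip2 tool).
# Prints "<decompressed bytes><TAB><file>" for each argument.
# The body runs in a subshell ( ... ) so f and size don't leak to the caller.
bz2_sizes() (
    for f in "$@"; do
        # Decompress to stdout and count the bytes; nothing is written to disk.
        size=$(bzcat < "$f" | wc -c)
        # $((size)) strips the padding some wc implementations add.
        printf '%d\t%s\n' "$((size))" "$f"
    done
)
```

Call it as `bz2_sizes *.bz2` to get one line per archive.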

You can pipe that output into something else to give you a human-readable form:

$ ls -lh foo.bz2
-rw-r--r-- 1 quack quack 2.3M Jul  4 11:15 foo.bz2

$ bzcat foo.bz2 | wc -c | perl -lne 'printf("%.2fM\n", $_/1024/1024)'
2.79M
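
If perl isn't handy, roughly the same formatting can be done with awk. The sketch below assumes a POSIX awk; human_size is a name made up for this example. Unlike the one-liner above, it picks the unit by itself instead of always printing megabytes.

```shell
# Hypothetical helper: format a byte count with a binary (1024-based) suffix.
human_size() {
    awk -v b="$1" 'BEGIN {
        split("B K M G T", unit, " ")
        i = 1
        # Divide by 1024 until the value fits the current unit.
        while (b >= 1024 && i < 5) { b /= 1024; i++ }
        printf("%.2f%s\n", b, unit[i])
    }'
}

# Usage with the pipeline above:
#   human_size "$(bzcat foo.bz2 | wc -c)"
# For the 2928640-byte example this prints 2.79M.
```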

quack quixote

Posted 2009-10-11T18:33:27.023

Reputation: 37 382

Well, that only took five minutes of 100% CPU to calculate. – endolith – 2009-10-11T22:39:18.400

only? AND it would fill up a disk? i've got a compressed tarball of an old linux install that's only 407meg yet took my poor ancient server 30-45 minutes to extract. that included writing to disk, tho, i'll have to run that script to time it. get back to ya in half an hour... :) – quack quixote – 2009-10-11T23:12:19.380

I picked the smallest file for the first test, of course. 140 MB compressed --> 3 GB uncompressed. The larger files are 5 GB compressed... – endolith – 2009-10-12T04:50:39.673

heh .. lemme know how big the 5GBs turn out to be... and how long it takes to figure it out via this XD – quack quixote – 2009-10-12T05:25:58.200

-3

To read a .bz2 text file without unzipping it:

bzcat dbtax_ext_en.ttl.bz2 |zless

Shashank Motepalli

Posted 2009-10-11T18:33:27.023

Reputation: 1

bzcat and zless don't work together like this. Use "bzcat file.bz2 | less" or "bzless file.bz2", or if you have a gzipped file, "zcat file.gz | less" or "zless file.gz". In fact, the man page for zless notes that "Zless does not work with compressed data that is piped to it via standard input; it requires that input files be specified as arguments." – Nick Russo – 2018-04-21T23:28:10.313