How is PNG lossless given that it has a compression parameter?

161

29

PNG files are said to use lossless compression. However, whenever I am in an image editor such as GIMP and try to save an image as a PNG file, it asks for a compression parameter, which ranges between 0 and 9. If there is a compression parameter that affects the visual precision of the compressed image, how can PNG be lossless?

Do I get lossless behavior only when I set the compression parameter to 9?

pkout

Posted 2014-11-26T18:11:37.767

Reputation: 1 761

42Most lossless compression algorithms have tunables (like dictionary size) which are generalized in a “how much effort should be made in minimizing the output size” slider. This is valid for ZIP, GZip, BZip2, LZMA, ... – Daniel B – 2014-11-27T10:12:43.480

21The question could be stated differently. If no quality is lost from the compression, then why not always use the compression producing the smallest size? The answer then would be, because it requires more RAM and more CPU time to compress and decompress. Sometimes you want faster compression and don't care as much about compression ratio. – kasperd – 2014-11-27T12:12:50.770

14PNG compression is almost identical to ZIPping files. You can compress them more or less but you get the exact file back when it decompresses -- that's what makes it lossless. – mikebabcock – 2014-11-27T14:31:30.723

13Most compression software such as Zip and Rar allow you to enter "compression level" which allow you to choose between smaller file <--> shorter time. It does not mean these software discard data during compression. This setting (in GIMP, pngcrush, etc) is similar. – Salman A – 2014-11-27T17:50:36.873

Also, from an HTG article about this post, I was reminded of a great, slightly related video from Veritasium + Vsauce about randomness, entropy, and how they relate to data compression: https://www.youtube.com/watch?v=sMb00lz-IfE

– cregox – 2014-11-30T20:54:34.560

You should probably be aware that there are some caveats to how "lossless" PNG really is: https://hsivonen.fi/png-gamma/

– n611x007 – 2014-12-02T16:33:19.980

I recommend reading a book like Managing Gigabytes for those who want to learn more about compression (it is, or was, required reading for Google engineers, I believe)

– nothingisnecessary – 2014-12-06T05:56:12.777

2@naxa: There are no caveats to how lossless png really is. It is always 100% lossless. The article only warns you about bugs that some old browsers had in their PNG implementation for handling gamma correction. And that is only meaningful if you need to match the color with CSS colors (which are not gamma corrected). – Pauli L – 2014-12-06T15:58:00.113

@PauliL thanks! I read it a long time ago; I think I remembered the culprit wrong. Your comment brings the relevant information onsite! – n611x007 – 2014-12-08T09:04:21.593

There is a difference between lossless compression (like zip) and lossy compression (like mp3). You can recreate the source from lossless, but not from lossy. – chiliNUT – 2014-12-12T03:13:29.027

Answers

186

PNG is lossless. GIMP is most likely just not using the best word in this case. Think of it as "quality of compression", or in other words, "level of compression". With lower compression, you get a bigger file, but it takes less time to produce, whereas with higher compression, you get a smaller file that takes longer to produce. Typically you get diminishing returns (i.e., not as much decrease in size compared to the increase in time it takes) when going up to the highest compression levels, but it's up to you.
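As a quick illustration, here is a minimal sketch using the Pillow library (the file names are made up): the same image saved at two different compression levels decodes to exactly the same pixels.

from PIL import Image  # assumes Pillow is installed

im = Image.open("photo.png")            # hypothetical input file
im.save("fast.png", compress_level=1)   # bigger file, written quickly
im.save("small.png", compress_level=9)  # smaller file, takes longer to write

# Both files decode to bit-identical pixel data.
assert Image.open("fast.png").tobytes() == Image.open("small.png").tobytes()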

jjlin

Posted 2014-11-26T18:11:37.767

Reputation: 12 964

42Also, PNG compression actually has many tunable parameters where adjustments in either direction can shrink output size depending on the contents of the source - it's far more complex than a simple "better" and "worse" slider. For general purposes, it's not too important, but if you want the absolute smallest then use a tool like pngcrush that can compare many variations for the smallest possible. – Bob – 2014-11-27T03:32:16.830

4A higher compression level increases compression time, but does it also affect decompression as well? – Nolonar – 2014-11-28T08:06:46.940

10@Nolonar Generally no; if anything a higher compression level usually decreases decompression time because there's less data for it to have to read and process. The longer compression time is due to doing a more thorough job of finding patterns to compress (oversimplifying). – fluffy – 2014-11-29T07:15:08.470

1@fluffy LordNeckbeard's answer had the highest compression take 5x longer to decode than the lowest. – André Chalella – 2014-12-02T02:07:15.500

@AndréNeves That's a bit surprising. I wonder if there was some cache line issue or something. – fluffy – 2014-12-02T19:46:27.177

@fluffy That was from very crappy, unscientific tests on a single source. I'll remove it until I can perform some better, reproducible examples which may or may not support my previous observations. – llogan – 2014-12-02T20:39:28.463

@Nolonar: for PNG's zlib compression? Indeed, no. For other types of compression which might need larger buffers to decompress, it could cause slowdowns if those buffers get large enough that they thrash the cache/RAM on the machine doing the decompression, but e.g. 7zip's compression technique requires FAR bigger buffers during compression than during decompression, so buffer size during decompression is probably only a big deal on very low-memory systems or if you have many decompressions running at once... – SamB – 2014-12-04T19:44:02.423

1For PNG, it is quite common to have longer decompression time for better-compressed files. The problem is that with PNG, one possible trick is to apply the compression algorithm over and over as long as the file gets smaller. Once the size increases, you stop applying it. So it's pretty possible that you apply the compression algorithm 5 or 6 times, which means that you have to decompress the file 5 or 6 times to display the image. – yo' – 2014-12-05T11:16:26.647

"Typically you get diminishing returns (i.e., not as much decrease in size compared to the increase in time it takes) when going up to the highest compression levels, but it's up to you." Actually, it is up to the circumstances. If you're saving a file here or there, you may as well use the maximum compression, since it's irrelevant if it takes an extra few seconds. However, if you're constantly saving thousands (e.g., on a web server), then that is where the compression level matters. It's not so much a question of when to use more (that's clearly what's desired), but rather when to use less. – Synetech – 2015-08-23T18:54:34.557

This question/answer was quoted by a full blog-post. Congrats! Rip-off is the highest form of flattery! :-)

– JosiahYoder-deactive except.. – 2019-04-30T14:34:34.343

215

PNG is compressed, but lossless

The compression level is a tradeoff between file size and encoding/decoding speed. To overly generalize, even non-image formats, such as FLAC, have similar concepts.

Different compression levels, same decoded output

Although the file sizes are different, due to the different compression levels, the actual decoded output will be identical.

You can compare the MD5 hashes of the decoded outputs with ffmpeg using the MD5 muxer.

This is best shown with some examples:

Create PNG files:

$ ffmpeg -i input -vframes 1 -compression_level 0 0.png
$ ffmpeg -i input -vframes 1 -compression_level 100 100.png

By default, ffmpeg will use -compression_level 100 for PNG output.

Compare file size:

$ du -h *.png
  228K    0.png
  4.0K    100.png

Decode the PNG files and show MD5 hashes:

$ ffmpeg -loglevel error -i 0.png -f md5 -
3d3fbccf770a51f9d81725d4e0539f83

$ ffmpeg -loglevel error -i 100.png -f md5 -
3d3fbccf770a51f9d81725d4e0539f83

Since both hashes are the same, you can be assured that the decoded outputs (the uncompressed, raw video) are exactly the same.

llogan

Posted 2014-11-26T18:11:37.767

Reputation: 31 929

27+1 did not know that ffmpeg could handle pngs. – Lekensteyn – 2014-11-27T21:49:09.823

21

@Lekensteyn It's great for making screenshots. Example to skip 30 seconds and take screenshot: ffmpeg -ss 30 -i input -vframes 1 output.png Also good for making videos out of images and vice versa.

– llogan – 2014-11-27T23:16:31.733

Does it mean that the PNG needs to be decompressed every time it has to be rendered? Because if that is true, we must be – akshay2000 – 2014-11-28T09:25:28.397

If you reread the file from disk or cache, yes, it has to be decompressed. Inside the same page the cache can probably reuse the decompressed version though. – David Mårtensson – 2014-11-28T09:56:16.747

1@akshay2000 Depends on how the program works which renders the PNG. Usually the file is read from disk, decompressed and buffered in the RAM. So as long as it's buffered in the RAM it won't need to decompress the image again. – xZise – 2014-11-29T02:33:16.627

Yeah, most progams would keep around the decompressed pixmap of current images, ready to be fed to whatever graphics API might tell them part of their window needs a redraw. An image viewer program might also pre-decode the next image, so there's no delay when you flip to it. I'm not sure what web browsers do with tabs you haven't even had visible for a long time. But they do always tend to expand to fill all available RAM... – Peter Cordes – 2014-12-02T12:56:40.160

24

PNG compression happens in two stages.

  1. Pre-compression re-arranges the image data so that it will be more compressible by a general purpose compression algorithm.
  2. The actual compression is done by DEFLATE, which searches for, and eliminates duplicate byte-sequences by replacing them with short tokens.

Since step 2 is a very time/resource intensive task, the underlying zlib library (an encapsulation of raw DEFLATE) takes a compression parameter ranging from 0 to 9, where 1 = fastest compression, 9 = best compression, and 0 = no compression. That's where the 0-9 range comes from, and GIMP simply passes that parameter down to zlib. Note that at level 0 your PNG will actually be slightly larger than the equivalent bitmap.
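A minimal sketch of the same idea at the zlib level (plain Python, with made-up sample data): every level gives back exactly the same bytes on decompression; only the compressed size and the time spent differ.

import zlib

data = bytes(range(256)) * 4096  # stand-in for raw image data

for level in (0, 1, 9):
    compressed = zlib.compress(data, level)
    # The round trip is bit-exact at every level; only the size changes.
    assert zlib.decompress(compressed) == data
    print(level, len(compressed))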

However, level 9 is only the "best" that zlib will attempt, and is still very much a compromise solution.
To really get a feel for this: if you're willing to spend roughly 1000x more processing power on an exhaustive search, you can gain 3-8% higher data density by using zopfli instead of zlib.
The compression is still lossless; it is just a more optimal DEFLATE representation of the data. This approaches the limit of what zlib-compatible libraries can achieve, and is therefore the true "best" compression that it's possible to achieve using PNG.

Adria

Posted 2014-11-26T18:11:37.767

Reputation: 341

2Note: Decompression time is the same regardless of the compression level, or iteration count when using zopflipng. – Adria – 2014-11-28T10:05:00.543

16

A primary motivation for the PNG format was to create a replacement for GIF that was not only free but also an improvement over it in essentially all respects. As a result, PNG compression is completely lossless - that is, the original image data can be reconstructed exactly, bit for bit - just as in GIF and most forms of TIFF.

PNG uses a 2-stage compression process:

  1. Pre-compression: filtering (prediction)
  2. Compression: DEFLATE (see wikipedia)

The precompression step is called filtering, which is a method of reversibly transforming the image data so that the main compression engine can operate more efficiently.

As a simple example, consider a sequence of bytes increasing uniformly from 1 to 255:

1, 2, 3, 4, 5, .... 255

Since there is no repetition in the sequence, it compresses either very poorly or not at all. But a trivial modification of the sequence - namely, leaving the first byte alone but replacing each subsequent byte by the difference between it and its predecessor - transforms the sequence into an extremely compressible one:

1, 1, 1, 1, 1, .... 1

The above transformation is lossless, since no bytes were omitted, and is entirely reversible. The compressed size of this series will be much reduced, but the original series can still be perfectly reconstituted.
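Here is a small sketch of that transformation in Python (illustrative only; PNG's actual Sub filter works the same way, byte-wise modulo 256):

import zlib

original = bytes(range(1, 256))                  # 1, 2, 3, ..., 255

# "Filter": keep the first byte, store each later byte as the difference
# from its predecessor (modulo 256).
deltas = bytes([original[0]] + [(original[i] - original[i - 1]) % 256
                                for i in range(1, len(original))])

# The filtered version compresses far better than the raw sequence.
print(len(zlib.compress(original, 9)), len(zlib.compress(deltas, 9)))

# The transform is perfectly reversible, so nothing is lost.
restored = bytearray([deltas[0]])
for d in deltas[1:]:
    restored.append((restored[-1] + d) % 256)
assert bytes(restored) == original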

Actual image data is rarely that perfect, but filtering does improve compression in grayscale and truecolor images, and it can help on some palette images as well. PNG supports five types of filters, and an encoder may choose to use a different filter for each row of pixels in the image:

[Image: the five PNG filter types and their definitions]

The algorithm works on bytes, but for large pixels (e.g., 24-bit RGB or 64-bit RGBA) only corresponding bytes are compared, meaning the red components of the pixel-colors are handled separately from the green and blue pixel-components.

To choose the best filter for each row, an encoder would need to test all possible combinations. This is clearly impractical: even a 20-row image would require testing over 95 trillion combinations, where "testing" would involve filtering and compressing the entire image.

Compression levels are normally defined as numbers between 0 (none) and 9 (best). These refer to tradeoffs between speed and size, and relate to how many combinations of row filters are to be tried. There is no standard for these compression levels, so every image editor may have its own algorithm for how many filters to try when optimizing the image size.

Compression level 0 means that filters are not used at all, which is fast but wasteful. Higher levels mean that more and more combinations are tried on image-rows and only the best ones are retained.

I would guess that the simplest approach to the best compression is to incrementally test-compress each row with each filter, save the smallest result, and repeat for the next row. This amounts to filtering and compressing the entire image five times, which may be a reasonable trade-off for an image that will be transmitted and decoded many times. Lower compression values will do less, at the discretion of the tool's developer.
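A rough sketch of that greedy strategy in Python (illustrative only, using 1-byte greyscale pixels; real encoders such as libpng typically use cheaper heuristics like minimum sum of absolute differences rather than test-compressing every row):

import zlib

def paeth(a, b, c):
    # Paeth predictor from the PNG spec: the neighbour closest to a + b - c.
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c

def filter_row(ftype, row, prev):
    # Greyscale, 1 byte per pixel, to keep the sketch short; real PNG
    # filtering works per byte at an offset of bytes-per-pixel.
    out = bytearray()
    for i, x in enumerate(row):
        a = row[i - 1] if i else 0     # left neighbour
        b = prev[i]                    # pixel above
        c = prev[i - 1] if i else 0    # upper-left neighbour
        predictor = (0, a, b, (a + b) // 2, paeth(a, b, c))[ftype]
        out.append((x - predictor) % 256)
    return bytes(out)

def encode(rows):
    # Greedy per-row choice: test-compress each candidate filter and keep
    # the one that yields the smallest output for that row.
    prev, filtered = bytes(len(rows[0])), bytearray()
    for row in rows:
        best = min(range(5),
                   key=lambda f: len(zlib.compress(filter_row(f, row, prev), 9)))
        filtered += bytes([best]) + filter_row(best, row, prev)
        prev = row
    return zlib.compress(bytes(filtered), 9)

# Example: a flat row compresses best unfiltered, a gradient row prefers Sub.
print(len(encode([bytes([10] * 64), bytes(range(64))])))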

In addition to filters, the compression level might also affect the zlib compression level, which is a number between 0 (no Deflate) and 9 (maximum Deflate). How the specified 0-9 levels affect the use of filters, which are the main optimization feature of PNG, is still up to the tool's developer.

The conclusion is that PNG has a compression parameter that can reduce the file-size very significantly, all without the loss of even a single pixel.

Sources:

Wikipedia Portable Network Graphics
libpng documentation Chapter 9 - Compression and Filtering

harrymc

Posted 2014-11-26T18:11:37.767

Reputation: 306 093

1I don't think the compression level setting changes the use of filters. The level 1-9 setting probably just chooses the zlib compression level 1-9, and level 0 means that the deflate algorithm is not used at all. Most implementations probably do not change the filters per row, but just use the Paeth filter all the time. – Pauli L – 2014-11-30T13:20:15.850

@PauliL: I don't agree, because in all comparisons of PNG compression software, there are very large differences between the sizes of the generated images. If all products used the same parameters for the same library, then all the sizes should have been the same, as well as the speed. – harrymc – 2014-11-30T14:22:33.637

Do you have any links to such comparisons? – Pauli L – 2014-11-30T15:10:20.963

@PauliL: A quick search came up with this comparison.

– harrymc – 2014-11-30T16:13:41.997

@PauliL: You are probably right that the zlib compression levels are affected by the compression levels of PNG. I have modified my answer accordingly, although no compression tool documents what they do exactly. Perhaps the explanation for the tools with the worst size results is that they use no filters at all, only zlib compression. – harrymc – 2014-12-01T12:12:28.000

Am I correct in my speculation, based on your description, that decompression speed will not be meaningfully impacted by the compression choice? – Brad Werth – 2014-12-01T22:29:26.607

Note that PNGs can be palettized, although this is not default behavior in any graphics editor I know of. Palettized images are lossy if the original had more than 256 distinct colors. – Russell at ISC – 2014-12-01T21:03:20.370

@BradWerth: Undoing a filter involves arithmetic operations over pixels, so it does add some processing time, which depends on how efficiently the decoder is programmed. – harrymc – 2014-12-02T07:54:14.703

png seems to introduce a new problem regarding gamma correction: https://hsivonen.fi/png-gamma/

– n611x007 – 2014-12-02T16:35:52.500

@harrymc: The comparison you linked is about PNG re-compressors such as pngcrush, not about image editors. And there is no mention about compression levels. At least pngcrush does not have compression levels, instead, it tries out multiple different filter strategies and then picks the one that gives best result. – Pauli L – 2014-12-06T15:49:54.150

@PauliL: This post is about PNG compression. And unfortunately, most of these products keep their compression algorithms as a big undocumented secret. – harrymc – 2014-12-06T16:01:16.963

5

OK, I am too late for the bounty, but here is my answer anyway.

PNG is always lossless. It uses the Deflate/Inflate algorithm, similar to those used in zip programs.

The Deflate algorithm searches for repeated sequences of bytes and replaces them with tags. The compression level setting specifies how much effort the program puts into finding the optimal combination of byte sequences, and how much memory is reserved for that. It is a compromise between time and memory usage versus compressed file size. However, modern computers are so fast and have enough memory that there is rarely a need to use anything other than the highest compression setting.

Many PNG implementations use the zlib library for compression. Zlib has nine compression levels, 1-9. I don't know the internals of GIMP, but since it has compression level settings 0-9 (0 = no compression), I would assume this setting simply selects the compression level of zlib.

The Deflate algorithm is a general-purpose compression algorithm; it was not designed for compressing pictures. However, unlike most other lossless image file formats, PNG is not limited to plain Deflate: its compression takes advantage of the knowledge that we are compressing a 2D image. This is achieved by so-called filters.

(Filter is actually a somewhat misleading term here. It does not change the image contents; it just encodes them differently. A more accurate name would be delta encoder.)

The PNG specification defines 5 different filters (including 0 = none). A filter replaces absolute pixel values with the difference from a previous pixel: to the left, above, diagonally, or a combination of those. This may significantly improve the compression ratio. Each scan line of the image can use a different filter, so the encoder can optimize the compression by choosing the best filter for each line.

For details of PNG file format, see PNG Specification.

Since there is a virtually infinite number of combinations, it is not possible to try them all. Therefore, different kinds of strategies have been developed for finding an effective combination. Most image editors probably do not even try to optimize the filters line by line, but instead just use a fixed filter (most likely Paeth).

The command line program pngcrush tries several strategies to find the best result. It can significantly reduce the size of PNG files created by other programs, but it may take quite a bit of time on larger images. See SourceForge - pngcrush.

Pauli L

Posted 2014-11-26T18:11:37.767

Reputation: 181

3

Compression level in lossless stuff is always just trading encode resources (usually time, sometimes also RAM) vs. bitrate. Quality is always 100%.

Of course, lossless compressors can NEVER guarantee any actual compression. Random data is incompressible, there's no pattern to find and no similarity. Shannon information theory and all that. The whole point of lossless data compression is that humans usually work with highly non-random data, but for transmission and storage, we can compress it down into as few bits as possible. Hopefully down to as close as possible to the Kolmogorov complexity of the original.

Whether it's zip or 7z generic data, png images, flac audio, or h.264 (in lossless mode) video, it's the same thing. With some compression algorithms, like lzma (7zip) and bzip2, cranking up the compression setting will increase the DECODER's CPU time (bzip2) or more often just the amount of RAM needed (lzma and bzip2, and h.264 with more reference frames). Often the decoder has to save more decoded output in RAM because decoding the next byte could refer to a byte decoded many megabytes ago (e.g. a video frame that is most similar to one from half a second ago would get encoded with references to 12 frames back). Same thing with bzip2 and choosing a large block size, but that also decompresses slower. lzma has a variable size dictionary, and you could make files that would require 1.5GB of RAM to decode.
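As an illustration of the memory side of that tradeoff, here is a sketch using Python's lzma module (the sizes are made up for illustration): the dictionary size chosen at compression time dictates roughly how much RAM the decompressor will need, yet the result stays perfectly lossless.

import lzma

data = bytes(range(256)) * 10_000

# Custom LZMA2 filter chain with a 64 MiB dictionary (illustrative size);
# whoever decompresses this stream needs roughly that much RAM for the window.
filters = [{"id": lzma.FILTER_LZMA2, "dict_size": 64 * 1024 * 1024}]
blob = lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)

# Still bit-exact on the way back, just more demanding to decode.
assert lzma.decompress(blob) == data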

Peter Cordes

Posted 2014-11-26T18:11:37.767

Reputation: 3 141

Hmmm I saw an implementation to yank control of the drive stepper motor and head directly to provide guaranteed lossless compression. Manchester encoding is easily beaten if you have a high-res clock source. – Joshua – 2014-12-03T02:23:47.477

@Joshua: Using a higher-density physical storage format is not the same as data compression ... – SamB – 2014-12-19T21:09:22.213

0

Firstly, PNG is always lossless. The apparent paradox is due to the fact that there are two different kinds of compression possible (for any kind of data): lossy and lossless.

Lossless compression squeezes down the data (i.e., the file size) using various tricks, keeping everything and without making any approximation. As a result, it is possible that lossless compression will not actually be able to compress things at all. (Technically, data with high entropy can be very hard or even impossible to compress losslessly.) Lossy compression approximates the real data; the approximation is imperfect, but this "throwing away" of precision typically allows better compression.

Here is a trivial example of lossless compression: if you have an image made of 1,000 black pixels, instead of storing the value for black 1,000 times, you can store a count (1000) and a value (black), thus compressing a 1,000-pixel "image" into just two numbers. (This is a crude form of a lossless compression method called run-length encoding.)
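A toy version of that idea in Python (just a sketch, not any real format's run-length encoding):

def rle_encode(pixels):
    # Collapse runs of identical values into [count, value] pairs.
    runs = []
    for p in pixels:
        if runs and runs[-1][1] == p:
            runs[-1][0] += 1
        else:
            runs.append([1, p])
    return runs

def rle_decode(runs):
    # Expand the pairs back out; nothing was lost.
    return [p for count, p in runs for _ in range(count)]

image = ["black"] * 1000
print(rle_encode(image))                        # [[1000, 'black']]
assert rle_decode(rle_encode(image)) == image   # lossless round trip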

GregD

Posted 2014-11-26T18:11:37.767

Reputation: 121