Create the least compressible picture



"A picture is worth a thousand words"—so the old saying goes. The average word is about four characters long, so a picture conveys 4kB of information. But how much entropy, rather than information, can a picture convey?

Your task is to generate an image, exactly 4,000 bytes in size, with the highest entropy possible. You may use any language, libraries, or image format you choose, and you may output to the console or to a file so long as you upload your image here.


Your score is the compression ratio (4000 ÷ compressed size) when your image is compressed with GNU tar version 1.28 and gzip version 1.6, using the DEFLATE algorithm and default settings — specifically, the command tar -czvf out.tar.gz image. The smallest compression ratio wins.
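For reference, the scoring formula can be sketched in a few lines of Python. The function name is made up for illustration, and the 4204-byte figure is simply the archive size reported in one of the answers below.

```python
def compression_ratio(compressed_size: int, original_size: int = 4000) -> float:
    """Score as defined in the challenge: original size divided by compressed size."""
    return original_size / compressed_size


if __name__ == "__main__":
    # A 4204-byte out.tar.gz scores a little above 0.95; lower is better.
    print(f"{compression_ratio(4204):.10f}")
```

Because the archive is larger than the image, every realistic score is below 1.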

Purple P

Posted 2019-08-13T16:49:48.573

Reputation: 919

tar includes metadata, including mtime, in output files by default. This affects the final compressed file size - some mtimes compress better than others. Changing the command to gzip -n image would make the output size deterministic regardless of mtime (and input file name). – Nnnes – 2019-08-13T22:34:57.337

Never mind, gzip -n image cannot produce a file larger than 4023 bytes given a 4000-byte input. It needs 10 bytes for the header, 8 for the footer, 1 for the DEFLATE block header and padding, and 4 for the DEFLATE block size; the rest is just stored as uncompressed bytes. Most files composed of random bits are stored uncompressed, as they should be. – Nnnes – 2019-08-13T23:40:34.123

I can confirm that. Instead of compression getting worse with increasing entropy, it just flags that the data should be stored uncompressed. Deflate adds 2 bytes to the data. Gzip adds the rest. – Mark Jeronimus – 2019-08-16T21:26:08.613
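The byte accounting in the comments above can be checked with a rough Python sketch. It uses Python's gzip module (backed by zlib) rather than the gzip 1.6 binary, so it only illustrates the bound; the constant and function names are hypothetical, and mtime=0 is used as a stand-in for gzip -n.

```python
import gzip
import random

# Overhead breakdown quoted in the comment above.
GZIP_HEADER = 10          # magic, method, flags, mtime, xfl, os
GZIP_TRAILER = 8          # CRC-32 and input size
STORED_BLOCK_HEADER = 1   # 3 header bits plus padding to a byte boundary
STORED_BLOCK_LENGTHS = 4  # LEN and NLEN fields of a stored block
OVERHEAD = GZIP_HEADER + GZIP_TRAILER + STORED_BLOCK_HEADER + STORED_BLOCK_LENGTHS


def gzip_size_of_random(n: int, seed: int = 0) -> int:
    """Gzip n pseudo-random bytes with no timestamp and return the output size."""
    rng = random.Random(seed)
    data = bytes(rng.getrandbits(8) for _ in range(n))
    return len(gzip.compress(data, mtime=0))


if __name__ == "__main__":
    print(OVERHEAD, gzip_size_of_random(4000))
```

The point is that bare gzip output is capped at input size plus 23 bytes, so anything beyond that in out.tar.gz comes from the tar wrapper.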



0.9514747859 (4204-byte output)


Note: the image above is not the actual file I used, but it is the image.

Here is a hexdump of the file:

The file is in the netpbm format, and can be generated with this C code:

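#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "Please pass in seed.\n");
        return EXIT_FAILURE;
    }
    /* Seed the generator from the first argument. */
    srand(atoi(argv[1]));
    /* Binary mode, since the pixel data is raw bytes. */
    FILE *fp = fopen("image.pgm", "wb");
    if (fp == NULL) {
        perror("image.pgm");
        return EXIT_FAILURE;
    }
    int width = 2, height = 1993;
    /* 14-byte header + 2 * 1993 pixel bytes = 4000 bytes in total. */
    fprintf(fp, "P5 %d %d 255 ", width, height);
    for (int i = 0; i < width * height; i++) {
        fputc(rand() & 0xFF, fp);
    }
    fclose(fp);
    return 0;
}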

The random seed must be passed into the program. After trying some seeds, I got one which produced a 4204 byte gzipped file. As Nnnes pointed out, tar will include metadata in the file, so your results might differ from mine.

netpbm isn't supported everywhere, but it works with ImageMagick's convert (so just run convert image.pgm image.png to turn it into a PNG).

Why this image/format?

A file consisting entirely of random bytes is very hard to compress (in fact, on average no compression algorithm can do better on random files than not compressing at all). The actual file is just the 14-byte header P5 2 1993 255 (including its trailing space) followed by 3986 random bytes, which is why gzip has such a hard time compressing it.
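The file layout described above can be sketched in Python as follows. This is only an illustration of the structure: Python's random module is not C's rand(), so the bytes will not match the C program's output for any seed, and make_pgm is a made-up name.

```python
import random


def make_pgm(seed: int, width: int = 2, height: int = 1993) -> bytes:
    """Build a binary PGM (P5): a short ASCII header followed by one byte per pixel."""
    header = f"P5 {width} {height} 255 ".encode("ascii")
    rng = random.Random(seed)
    pixels = bytes(rng.getrandbits(8) for _ in range(width * height))
    return header + pixels


if __name__ == "__main__":
    image = make_pgm(seed=1)
    header_len = len(b"P5 2 1993 255 ")
    print(header_len, len(image) - header_len, len(image))
```

With the default dimensions the header takes 14 bytes and the pixels 3986, which is how the file lands on exactly 4000 bytes.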

Leo Tenenbaum

Posted 2019-08-13T16:49:48.573

Reputation: 2 655

For some reason I can't go more than 4201 on my cluster. Out of curiosity, what was the seed you used? – Krzysztof Szewczyk – 2019-08-13T18:53:34.193

@KrzysztofSzewczyk it’s not about the seed, it’s about the image format. Your PNGs’ headers include a lot of deterministic bytes, so they compress easily. – Grimmy – 2019-08-13T19:11:06.313

@Grimy hm, alright, I'll switch tomorrow to the RAW format. I didn't expect around eight bytes to have such a big impact on the image. – Krzysztof Szewczyk – 2019-08-13T19:13:30.857


@KrzysztofSzewczyk There's potentially much more than 8 bytes of non-random data in a random PNG file. The bare minimum is to include the file signature followed by the IHDR, IDAT and IEND chunks, but most PNG generators will include a couple of optional chunks that are likely to compress pretty well -- as Grimy said -- except maybe the CRCs, which can be assumed to be pretty random.

– Arnauld – 2019-08-14T11:53:49.517
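To put a number on the "bare minimum" mentioned in this comment, here is a small Python sketch that assembles a minimal 8-bit grayscale PNG by hand and counts the bytes outside the compressed IDAT payload. The helper names are hypothetical, and this is not how any particular PNG encoder works.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind: bytes, data: bytes) -> bytes:
    """One PNG chunk: length (4) + type (4) + data + CRC-32 of type and data (4)."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def minimal_gray_png(width: int, height: int, pixels: bytes):
    """Return (png_bytes, framing_bytes), where framing is everything but the IDAT payload."""
    if len(pixels) != width * height:
        raise ValueError("expected one byte per pixel")
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Every scanline is prefixed with filter type 0 (None).
    raw = b"".join(
        b"\x00" + pixels[row * width:(row + 1) * width] for row in range(height)
    )
    payload = zlib.compress(raw)
    png = (
        PNG_SIGNATURE
        + chunk(b"IHDR", ihdr)
        + chunk(b"IDAT", payload)
        + chunk(b"IEND", b"")
    )
    return png, len(png) - len(payload)


if __name__ == "__main__":
    png, framing = minimal_gray_png(2, 2, bytes([0, 255, 255, 0]))
    print(framing)
```

The framing comes to 8 (signature) + 25 (IHDR) + 12 (IDAT length, type, CRC) + 12 (IEND) bytes, and that is before the zlib wrapper inside IDAT and the per-scanline filter bytes are counted.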

Yeah, I'll consider this. Honestly I went for trial and error here, and it's kinda my mistake. – Krzysztof Szewczyk – 2019-08-14T11:58:07.660

The file you generated has 4KB of entropy, but the image doesn't. (Even the PNG has 785 bytes!) You're filling it with 3990 random bytes, but it's only displaying 3990 random bits, so it can be reduced to this 542-byte PBM file: - Rows in PBM are padded to the end of the byte, so there are a bunch of 0 bits every 14th (though still not enough for gzip to pick up on them, as it works on the byte level). 336 x 95 works for 3990 * 8 bits. [Also see my comment on the main post - the different sizes are not a result of different seeds.]

– Nnnes – 2019-08-15T00:00:05.280
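The row-padding arithmetic in this comment can be sketched in Python. It assumes the raw PBM (P4) layout, where each row is packed 8 pixels per byte and padded up to a whole byte; the function names are illustrative only.

```python
def pbm_row_bytes(width: int) -> int:
    """Bytes per row in raw PBM: pixels packed 8 per byte, rounded up."""
    return (width + 7) // 8


def pbm_data_bytes(width: int, height: int) -> int:
    """Total size of the pixel data, excluding the header."""
    return pbm_row_bytes(width) * height


def pbm_padding_bits(width: int, height: int) -> int:
    """Bits in the pixel data that are padding rather than pixels."""
    return (pbm_row_bytes(width) * 8 - width) * height


if __name__ == "__main__":
    for width, height in [(336, 95), (2, 1993)]:
        print(width, height, pbm_data_bytes(width, height), pbm_padding_bits(width, height))
```

At 336 x 95 every row fills exactly 42 bytes, so all 3990 data bytes carry pixels. A very narrow image, by contrast, wastes most of each row byte on padding.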

@Nnnes Good point! I switched to using a pgm file. The png is now >5000 bytes. If it's not the seed that's changing the size of the output, I'm not sure what in the metadata is doing that, because the file name was the same every time... – Leo Tenenbaum – 2019-08-15T00:37:14.503


Brainfuck, 4201 bytes compressed.

Image format used is PNG. I'm pretty sure the challenge is over because I'm leaving 4 instances of a modified script running overnight.


So how does it work?

Using a Java program, I generate a JPG file. It is then compressed and its size is checked, and the script asks me whether to keep it. I ran this script for a while and it produced a few tar.gz files of varying sizes. Whenever a new winner is found, the Brainfuck code is regenerated.

Bash script used:



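# Assumed setup (not shown in the original post): archive name and best size so far.
filename=out.tar.gz
max=0

while true; do
    java Start
    tar -czf "$filename" target.png
    size="$(wc -c <"$filename")"
    printf "%s/%s " "$size" "$max"

    if [ "$max" -lt "$size" ]; then
        read -p "Keep? " -n 1 -r
        if [[ $REPLY =~ ^[Yy]$ ]]; then
            max="$size"  # assumed: remember the new best size
            java -jar out.jar out.tar.gz > "out/sub$"
        fi
    else
        echo "Crappy result, skipping."
    fi
done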

Screenshot of the program running:


It could be fully automated by removing the read and keeping every improvement implicitly, but I wanted to keep control over it.
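A fully automated version of this kind of search might look like the Python sketch below. It rests on several assumptions: random 4000-byte payloads stand in for the Java image generator, Python's tarfile and gzip modules stand in for GNU tar 1.28 and gzip 1.6 (so sizes will not match the real command byte for byte), and all function names are hypothetical.

```python
import gzip
import io
import random
import tarfile


def candidate(seed: int, size: int = 4000) -> bytes:
    """Stand-in for the image generator: size pseudo-random bytes."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def targz_size(payload: bytes, name: str = "image") -> int:
    """Approximate the size of `tar -czf` output for a single in-memory file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        info.mtime = 0  # fixed timestamp so repeated runs give the same size
        tar.addfile(info, io.BytesIO(payload))
    # Level 6 mirrors the gzip command's default; mtime=0 keeps the header fixed.
    return len(gzip.compress(buf.getvalue(), compresslevel=6, mtime=0))


def search(seeds):
    """Keep the seed whose archive is largest, with no prompt in between."""
    best_seed, best_size = None, -1
    for seed in seeds:
        size = targz_size(candidate(seed))
        if size > best_size:
            best_seed, best_size = seed, size
    return best_seed, best_size


if __name__ == "__main__":
    print(search(range(20)))
```

Fixing the mtime removes the metadata variation mentioned in the comments on the question, so any difference between runs comes from the payload alone.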

The code


Krzysztof Szewczyk

Posted 2019-08-13T16:49:48.573

Reputation: 3 819

The submission is the image itself, not the code – MilkyWay90 – 2019-08-13T21:18:46.317

Can you edit the header to remove the unnecessary brainfuck part, and update your score to the compression ratio? – Jo King – 2019-08-15T01:10:57.837