Compress an image to a 4 KiB preview

In this challenge you will create an image preview compression algorithm. Its goal is to reduce an arbitrary image file to a 4 KiB preview image that can be used to quickly identify images with very little bandwidth.

You must write two programs (or one combined program): a compressor and a decompressor. Both must take a file or stdin as input, and output to a file or stdout. The compressor must accept one image in a mainstream lossless image format of your choice (e.g. PNG, BMP, PPM), and output a file of at most 4096 bytes. The decompressor must accept any file generated by the compressor, and output an image as close as possible to the input. Note that there is no source code size limit for the encoder/decoder, so you can be creative in your algorithm.

Restrictions:

  • No 'cheating'. Your programs may not use hidden inputs, store data on the internet, etc. You are also forbidden from including features / data pertaining only to the set of scoring images.

  • For libraries / tools / built-ins you are allowed to use generic image processing operations (scaling, blurring, color space transformation, etc), but not image decoding / encoding / compression operations (except for compressor input and decompressor output). Generic compression / decompression is also disallowed. It is intended that you implement your own compression for this challenge.

  • The dimensions of the image output by the decompressor must exactly match those of the original file given to the compressor. You may assume that the image dimensions do not exceed 2^16 in either direction.

  • Your compressor must run on an average consumer PC in under 5 minutes, and the decompressor must run in under 10 seconds for any image in the set below.
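The dimension restriction above is cheap to satisfy: since dimensions are capped at 2^16, width and height each fit in two bytes, so a full dimension header costs only 4 of the 4096 bytes. A minimal sketch using Python's standard struct module (the big-endian layout is an arbitrary choice here, not part of the rules):

```python
import struct

def pack_dims(w, h):
    # Two big-endian unsigned 16-bit values: 4 bytes of the 4096-byte budget.
    # Note ">H" holds 0..65535, so a dimension of exactly 2^16 would need care.
    return struct.pack(">HH", w, h)

def unpack_dims(data):
    # Recover (width, height) from the first 4 bytes of a compressed file.
    return struct.unpack(">HH", data[:4])

header = pack_dims(1920, 1080)
assert len(header) == 4
assert unpack_dims(header) == (1920, 1080)
```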

Scoring

To allow quick verification and visual comparison, please include a lossless image album of the test corpus after compression with your answer.

Your compressor will be tested using the following corpus of images:

starry, source, room, rainbow, margin, llama, kid, julia

You can download all images in a zip file here.

Your score will be the average structural similarity index for your compressor on all the images. We will be using the open source dssim for this challenge. It is easily built from source, or if you're on Ubuntu it also has a PPA. It is preferred if you score your own answer, but if you do not know how to build C applications and you do not run Debian/Ubuntu, you can let someone else score for you. dssim expects input/output in PNG, so convert your output to PNG first if you output in a different format.

To make scoring painless, here's a quick helper Python script, usage python score.py corpus_dir compressed_dir:

import sys, os, subprocess

scores = []
for img in sorted(os.listdir(sys.argv[1])):
    ref, preview = (os.path.join(sys.argv[i], img) for i in (1, 2))
    sys.stdout.write("Comparing {} to {}... ".format(ref, preview))
    out = subprocess.check_output(["dssim", ref, preview]).decode("utf-8").split()[0]
    print(out)
    scores.append(float(out))

print("Average score: {:.6f}".format(sum(scores) / len(scores)))

Lowest score wins.

orlp

Posted 2016-01-28T07:51:13.320

Reputation: 37 067

does the compressed picture have to be viewable? – Eumel – 2016-01-28T14:51:55.460

2@Eumel You can consider the decompressor as a viewer. So no, your compressed format can be arbitrary and is entirely up to you. Only after decompression does a viewable image have to come out. – orlp – 2016-01-28T15:23:26.433

7You may assume that the image dimensions do not exceed 2^32 in either direction. Isn't this a little excessive? This means I have to use up 16 bytes to store a pair of (x,y) coordinates. Few image files have dimensions of more than 2^16 (65536) pixels in either direction, and 2^11 is sufficient for all the images in the corpus. – Peter Olson – 2016-01-29T02:22:30.260

@PeterOlson I'll change it to 2^16. – orlp – 2016-01-29T13:28:37.860

@orlp The rules state that the decompressor must take less than ten seconds to decode an image. The idea I have will potentially take a few minutes to generate reference files that will be used in subsequent calls to the decompressor. i.e. it is a once off activity similar to "installing" an application. Would this solution be disqualified? – Moogie – 2016-02-10T03:40:02.980

@Moogie That is allowed iff those reference files are unrelated to any of the image files in the corpus. – orlp – 2016-02-10T09:19:17.000

@orlp well yes, they will be related... i.e. it would create data used for the compression of the files, however they will not be derived from the files at all... (if i could generate the image files then i would do so directly :P ) – Moogie – 2016-02-10T10:54:21.490

Is the 5-minute rule for the compression for any single image, or for the entire corpus? – Mego – 2016-02-10T19:57:28.303

@Mego Single image. – orlp – 2016-02-10T20:58:16.593

To be a bit pedantic, DSSIM actually measures dissimilarity: 1/SSIM - 1, but as you can see it is closely related to SSIM. For a good match, SSIM values should be close to 1 and DSSIM should instead be close to 0. However, dissimilarity does not have any upper bound, as SSIM can get arbitrarily close to 0. – mathreadler – 2016-02-22T15:28:48.153
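The relation above (DSSIM = 1/SSIM - 1) can be checked numerically: a perfect match (SSIM = 1) yields DSSIM = 0, and DSSIM grows without bound as SSIM falls toward 0:

```python
def dssim_from_ssim(ssim):
    # DSSIM = 1/SSIM - 1, as described in the comment above.
    return 1.0 / ssim - 1.0

assert dssim_from_ssim(1.0) == 0.0   # perfect match
assert dssim_from_ssim(0.5) == 1.0   # worse match, larger DSSIM
```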

Am I allowed to make a 1x1 pixel preview image? – 12Me21 – 2017-01-25T12:55:37.133

@12Me21 No. "The dimensions of the image output by the decompressor must exactly match those of the original file given to the compressor. You may assume that the image dimensions do not exceed 2^16 in either direction." – orlp – 2017-01-25T13:08:49.607

Answers

Python with PIL, score 0.094218

Compressor:

#!/usr/bin/env python
from __future__ import division
import sys, traceback, os
from PIL import Image
from fractions import Fraction
import time, io

def image_bytes(img, scale):
    w,h = [int(dim*scale) for dim in img.size]
    bio = io.BytesIO()
    img.resize((w,h), Image.LANCZOS).save(bio, format='PPM')
    return len(bio.getvalue())

def compress(img):
    w,h = img.size
    w1,w2 = w // 256, w % 256
    h1,h2 = h // 256, h % 256
    total_size = 4*1024 - 8 #4 KiB minus 8 bytes for
                            # original and new sizes
    #beginning guess for the optimal scaling
    scale = Fraction(total_size, image_bytes(img, 1))
    #now we do a binary search for the optimal dimensions,
    # with the restriction that we maintain the scale factor
    low,high = Fraction(0),Fraction(1)
    best = None
    start_time = time.time()
    iter_count = 0
    while iter_count < 100: #scientifically chosen, not arbitrary at all
        #make sure we don't take longer than 5 minutes for the whole program
        #10 seconds is more than reasonable for the loading/saving
        if time.time() - start_time >= 5*60-10:
            break
        size = image_bytes(img, scale)
        if size > total_size:
            high = scale
        elif size < total_size:
            low = scale
            if best is None or total_size-size < best[1]:
                best = (scale, total_size-size)
        else:
            break
        scale = (low+high)/2
        iter_count += 1
    w_new, h_new = [int(dim*best[0]) for dim in (w,h)]
    wn1,wn2 = w_new // 256, w_new % 256
    hn1, hn2 = h_new // 256, h_new % 256
    i_new = img.resize((w_new, h_new), Image.LANCZOS)
    bio = io.BytesIO()
    i_new.save(bio, format='PPM')
    # bytearray keeps this working under both Python 2 and 3
    return bytes(bytearray((w1, w2, h1, h2, wn1, wn2, hn1, hn2))) + bio.getvalue()

if __name__ == '__main__':
    for f in sorted(os.listdir(sys.argv[1])):
        try:
            print("Compressing {}".format(f))
            with Image.open(os.path.join(sys.argv[1],f)) as img:
                with open(os.path.join(sys.argv[2], f), 'wb') as out:
                    out.write(compress(img.convert(mode='RGB')))
        except:
            print("Exception with {}".format(f))
            traceback.print_exc()
            continue

Decompressor:

#!/usr/bin/env python
from __future__ import division
import sys, traceback, os
from PIL import Image
import io

def decompress(compressed):
    # bytearray yields ints under both Python 2 and 3
    w1,w2,h1,h2,wn1,wn2,hn1,hn2 = bytearray(compressed[:8])
    w,h = (w1*256+w2, h1*256+h2)
    wc, hc = (wn1*256+wn2, hn1*256+hn2)
    img_bytes = compressed[8:]
    bio = io.BytesIO(img_bytes)
    img = Image.open(bio)
    return img.resize((w,h), Image.LANCZOS)


if __name__ == '__main__':
    for f in sorted(os.listdir(sys.argv[1])):
        try:
            print("Decompressing {}".format(f))
            with open(os.path.join(sys.argv[1],f), 'rb') as img:
                decompress(img.read()).save(os.path.join(sys.argv[2],f))
        except:
            print("Exception with {}".format(f))
            traceback.print_exc()
            continue

Both scripts take input via command line arguments, as two directories (input and output), and convert all images in the input directory.

The idea is to find a size that fits in under 4 KiB while maintaining the original aspect ratio, and to use a Lanczos filter to get as much quality as possible out of the downsampled image.
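Since a binary PPM body costs exactly 3 bytes per pixel plus a short header, a rough upper bound on the pixel count (and hence a starting scale for the search) can be computed directly. A sketch, assuming a typical 15-byte 'P6' header (the exact header length varies slightly with the number of dimension digits):

```python
def max_pixels(budget=4096 - 8, header=15):
    # P6 PPM layout: "P6\n<w> <h>\n255\n" header, then 3 RGB bytes per pixel.
    # The budget is 4 KiB minus the 8 dimension bytes the compressor prepends.
    return (budget - header) // 3

def start_scale(w, h):
    # Uniform scale factor that preserves the aspect ratio.
    return (max_pixels() / (w * h)) ** 0.5

assert max_pixels() == 1357                      # roughly a 42x32 preview
assert 0.02 < start_scale(1920, 1080) < 0.03
```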

Imgur album of compressed images, after resizing to original dimensions

Scoring script output:

Comparing corpus/1 - starry.png to test/1 - starry.png... 0.159444
Comparing corpus/2 - source.png to test/2 - source.png... 0.103666
Comparing corpus/3 - room.png to test/3 - room.png... 0.065547
Comparing corpus/4 - rainbow.png to test/4 - rainbow.png... 0.001020
Comparing corpus/5 - margin.png to test/5 - margin.png... 0.282746
Comparing corpus/6 - llama.png to test/6 - llama.png... 0.057997
Comparing corpus/7 - kid.png to test/7 - kid.png... 0.061476
Comparing corpus/8 - julia.png to test/8 - julia.png... 0.021848
Average score: 0.094218

Mego

Reputation: 32 998

I just realized that your solution uses WebP, which is disallowed. Your solution is invalid. – orlp – 2017-01-25T12:19:56.923

@orlp It's fixed now to use PPM, which is an uncompressed format. – Mego – 2017-01-25T12:45:56.803

Alright. This challenge does expose quite a bit of weakness in DSSIM, though. I would argue that most of Moogie's images look substantially better. – orlp – 2017-01-25T12:57:36.810

@orlp They look good as thumbnails. When blown up to their original dimensions (using Lanczos), they look about the same quality or worse. I'm working on getting thumbnails of my output uploaded. – Mego – 2017-01-25T12:59:16.007

Java (vanilla, should work with Java 1.5+), score 0.672

It does not generate particularly good dssim scores but, to my eye, it produces more human-friendly images...

Album: http://imgur.com/a/yL31U

Scoring script output:

Comparing corpus/1 - starry.png to test/1 - starry.png... 2.3521
Comparing corpus/2 - source.png to test/2 - source.png... 1.738
Comparing corpus/3 - room.png to test/3 - room.png... 0.1829
Comparing corpus/4 - rainbow.png to test/4 - rainbow.png... 0.0633
Comparing corpus/5 - margin.png to test/5 - margin.png... 0.4224
Comparing corpus/6 - llama.png to test/6 - llama.png... 0.204
Comparing corpus/7 - kid.png to test/7 - kid.png... 0.36335
Comparing corpus/8 - julia.png to test/8 - julia.png... 0.05
Average score: 0.672

The compression chain:

1. If filter data has already been generated, go to step 6.
2. Create sample images from random shapes and colours.
3. Take sample patches from these sample images.
4. Perform k-clustering of the patches based on similarity of luminosity and chrominance.
5. Generate a similarity-ordered list for each patch as compared to the other patches.
6. Read in the image.
7. Reduce the image size to the current multiplier * block size.
8. Iterate over the image, comparing each block against the clustered luminosity patches and chrominance patches and selecting the closest match.
9. Output the index of the closest match in the similarity-ordered list of the previous block (repeat step 8 for chrominance).
10. Perform entropy encoding using deflate.
11. If the output size is < 4096 bytes, increment the current multiplier and repeat from step 7.
12. Write the previous output to disk.

To decompress, inflate and then read the block indexes and output the corresponding patch to the output file, then resize to the original dimensions.
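The chain above is, at its core, vector quantization against a precomputed patch codebook. A toy sketch of the encode/decode round trip, with a hypothetical 4-entry codebook of flattened 2x2 greyscale patches (ignoring the similarity-ordered index trick and the deflate stage):

```python
# Toy vector quantization: each image block is replaced by the index of
# its nearest codebook patch; decoding simply looks the patches back up.
CODEBOOK = [                     # hypothetical 2x2 greyscale patches, flattened
    (0, 0, 0, 0),                # black
    (255, 255, 255, 255),        # white
    (0, 255, 0, 255),            # vertical stripes
    (128, 128, 128, 128),        # mid grey
]

def nearest(block):
    # Index of the codebook patch with the smallest squared distance.
    return min(range(len(CODEBOOK)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(CODEBOOK[i], block)))

def encode(blocks):
    return [nearest(b) for b in blocks]

def decode(indices):
    return [CODEBOOK[i] for i in indices]

blocks = [(10, 5, 0, 3), (250, 251, 255, 248)]
assert decode(encode(blocks)) == [(0, 0, 0, 0), (255, 255, 255, 255)]
```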

Unfortunately the code is too large to include in this answer, so it can be found at https://gist.github.com/anonymous/989ab8a1bb6ec14f6ea9

To run:

Usage: 
       For single image compression: java CompressAnImageToA4kibPreview -c <INPUT IMAGE> [<COMPRESSED IMAGE>]
       For multiple image compression: java CompressAnImageToA4kibPreview -c <INPUT IMAGES DIR> [<COMPRESSED IMAGE DIR>]
       For single image decompression: java CompressAnImageToA4kibPreview -d <COMPRESSED IMAGE> [<DECOMPRESSED IMAGE>]
       For multiple image decompression: java CompressAnImageToA4kibPreview -d <COMPRESSED IMAGE DIR> [<DECOMPRESSED IMAGES DIR>]

If the optional parameters are not set, defaults will be used:
       For single image compression, the compressed image will be created in the same directory as the input image, with a '.compressed' file extension.
       For multiple image compression, compressed images will be created in a new 'out' subdirectory of <INPUT IMAGES DIR>, with '.compressed' file extensions.
       For single image decompression, the decompressed image will be created in the same directory as the input image, with a '.out.png' file extension.
       For multiple image decompression, decompressed images will be created in a new 'out' subdirectory of <COMPRESSED IMAGE DIR>, with '.png' file extensions.

The first time this application is run, the required files will be generated and saved in a directory relative to the working directory. This may take a few minutes. Subsequent executions will not need to perform this step.

Moogie

Reputation: 1 505

This looks amazing. Just to confirm, steps 1-6 do not rely on the corpus at all? Also, would you mind if I rehost your code on gist.github.com instead? – orlp – 2016-02-13T11:08:35.950

Correct, it does not use any of the corpus files as input. You can see the images it produces to generate the patches by changing the constant "OUTPUT_SAMPLE_IMAGES" to true. It will output these images to the working folder: data/images/working/ – Moogie – 2016-02-13T11:12:06.900

@orlp now using gist.github – Moogie – 2016-02-13T11:12:28.067

The results are stunning, but doesn't using deflate/inflate break the rule about disallowing generic compression/decompression? – algmyr – 2018-07-30T04:36:38.337

@algmyr It has been a while, but I think I had interpreted the no generic compression rule as meaning no generic 'image' compression, i.e. JPEG, etc. But I may have interpreted that incorrectly, in which case, yes, this submission should be disqualified. – Moogie – 2018-07-31T04:06:22.357