Sort all the pixels of an image by the number of occurrences

8

1

Input

The name of a file in the raster graphics format of your choice. The chosen format must support at least 8 bits per channel and 3 channels.

Output

A file in the same format, with the same dimensions and pixels as the first, but whose pixels are grouped in descending order of the number of times they occur, sorted from left-to-right, top-to-bottom.

  • If certain colours of pixels appear the same number of times, their order is unspecified.
  • You must not overwrite the input file (use a different file name for the output).
  • Any and all third-party image processing libraries are permitted.

Example

panda

Will give output similar to:

sorted pixels from panda image

Especially in the lower parts of the image some variation may occur, due to different tie breaking between colours of equal frequency.

EMBLEM

Posted 2015-03-14T23:17:52.767

Reputation: 2 179

1Can we use other file formats, like PNM, for the output? – FUZxxl – 2015-03-14T23:19:40.003

@EMBLEM I'm not sure what you mean with “forego grayscale.” – FUZxxl – 2015-03-14T23:25:56.583

@MartinBüttner Good point, I've changed it to input and output being the same formats. – EMBLEM – 2015-03-14T23:26:30.420

@FUZxxl doesn't matter anymore, I've changed it to input and output being the same formats. – EMBLEM – 2015-03-14T23:26:59.170

Can we provide input using the C preprocessor? – FUZxxl – 2015-03-14T23:28:56.333

@MartinBüttner No, and by votes. – EMBLEM – 2015-03-14T23:29:20.743

1@EMBLEM I think he meant “ties” as in “two pixels occur with the same frequency.” Can I assume that their order is unspecified in this case? – FUZxxl – 2015-03-14T23:30:12.540

@MartinBüttner Unspecified. – EMBLEM – 2015-03-14T23:31:32.537

@FUZxxl Sure, C Preprocessor is fine. – EMBLEM – 2015-03-14T23:32:32.473

1To be clear on input/output... can we use a string (file name), a stream of bytes, or a complex File-type object...? That would make a large difference in some languages. – Geobits – 2015-03-14T23:34:56.467

@Bigtoes limiting to filename. – EMBLEM – 2015-03-14T23:36:15.393

@MartinBüttner Agreed. Sorry. :( – EMBLEM – 2015-03-14T23:39:24.117

If using PPM, do we have to allo for comments or not? – Maltysen – 2015-03-15T00:38:52.380

@MartinBüttner Because it couldn't handle transparency. – EMBLEM – 2015-03-21T21:17:24.987

@MartinBüttner Oops. I'd intended that to be a requirement, but it never got in there. Accepted. – EMBLEM – 2015-03-21T22:28:03.033

Answers

4

J, 94 81 bytes

A function taking the name of a PNG file (with no transparency channel) and writing the result in the input filename prepended by "o".

f=.3 :0
load'graphics/png'
(($a)$#/|:\:~(#,{.)/.~,a=.readpng y)writepng'o',y
)

input output

Method

  • Create a list of every color-number and the number of it's appearances
  • Sort list by number of it's appearances
  • Multiplicate every color-number equal it's appearances
  • Reshape the new list to the shape of the original image

randomra

Posted 2015-03-14T23:17:52.767

Reputation: 19 909

you can save one character if you make the function dyadic and save the sorted PNG into x. – FUZxxl – 2015-03-15T12:02:01.307

@FUZxxl It would actually save 3 chars but I don't think that's in line with the specification. I could also save the image to a 1-letter extensionless file but that's not too elegant. – randomra – 2015-03-15T13:07:46.337

@FUZxxl Well, turned out there was quite some extra chars in the body of the function... – randomra – 2015-03-17T10:59:57.590

I don't speak J, but in the specification there is a string as input. Here the input file name is hardcoded in the program. – edc65 – 2015-03-17T11:01:23.860

@edc65 In J if you define a function it's argument is always accessed by the y character. (If you define a function with two arguments those are accessed by x and y and you cannot define functions with more arguments.) – randomra – 2015-03-17T11:04:58.160

6

Mathematica, 125 123 bytes

Export["a"<>#,i=ImageData@Import@#;Image@ArrayReshape[Join@@ConstantArray@@@SortBy[Tally[Join@@i],-Last@#&],Dimensions@i]]&

This defines an unnamed function which takes the file name in any common image format and writes the result to the a file with the same name but prepended with a. The result looks a bit different from the OP's, since Mathematica's SortBy breaks ties by default sort order, so the bottom bits where loads of ties occur look a bit neater:

Not a panda.

The implementation itself is really straightforward:

  • ImageData to get a grid of colour values.
  • Join to flatten the array.
  • Tally to count the occurrence of each colour.
  • SortBy[...,-Last@#&] to sort by frequencies from highest to lowest.
  • ConstantArray and Join to expand the tallies again.
  • ArrayReshape to recover the original image shape (obtained with Dimensions).
  • Image to convert the data back to an image object.

FYI, 22 bytes are used on file I/O. An equivalent version that takes and returns an image object comes in at 103 bytes:

(i=ImageData@#;Image@ArrayReshape[Join@@ConstantArray@@@SortBy[Tally[Join@@i],-Last@#&],Dimensions@i])&

Martin Ender

Posted 2015-03-14T23:17:52.767

Reputation: 184 808

4

Python2/PIL, 244 226 225 223 222 202 186 182 170 159

from PIL.Image import*
s=raw_input();i=open(s);g=list(i.getdata());i.putdata(sum([[c[1]]*-c[0]for c in sorted((-g.count(r),r)for r in set(g))],[]));i.save(2*s)

Changelog

<p><b>Edit 1:</b> saved 18 bytes with new algorithm.</p>
<p><b>Edit 2:</b> saved 1 byte with remembering that Python 2 has int division without <code>from __future__ import division</code>. Whoops.</p>
<p><b>Edit 3:</b> saved 2 bytes by swapping <code>cmp</code> for <code>key</code>.</p>
<p><b>Edit 4:</b> saved 1 byte by clever unpacking. <code>w=i.size[0]</code> -> <code>w,h=i.size</code></p>
<p><b>Edit 5:</b> moved <code>list()</code> to an earlier position, drastically improving runtime. No length change.</p>
<p><b>Edit 6:</b> saved 20 bytes by remembering that there is also sequence sorting in Python. Probably slower, but also produces images more like the the Mathematica one.</p>
<p><b>Edit 7:</b> saved 16 bytes by changing <code>putpixel</code> to a list and <code>putdata</code>. Under 200!</p>
<p><b>Edit 8:</b> saved 4 bytes by changing <code>range(...)</code> to <code>...*[0]</code> since we don't need the values.</p>
<p><b>Edit 9:</b> saved 12 bytes by converting the <code>for</code> loop to a <code>sum</code>.</p>
<p><b>Edit 10:</b> saved 11 bytes by noticing that the image size doesn't matter anymore (since edit 7). Whoops v2.0.</p>

Shorter version by stokastic, 123

from PIL.Image import*
s=raw_input();i=open(s);d=list(i.getdata());i.putdata(sorted(d,key=lambda D:d.count(D)));i.save(2*s)

Well, let's at least try, even though it's already beaten.

It's extremely slow, Panda processed for several minutes on my laptop.

Saves with a filename with the original filename repeated twice.

panda

PurkkaKoodari

Posted 2015-03-14T23:17:52.767

Reputation: 16 699

you can save 32 bytes by using i=open(raw_input());d=list(i.getdata());i.putdata(sorted(d,key=lambda D:d.count(D)));i.save('o.png'), although it is very very slow for large images (calls list.count on each pixel). – stokastic – 2015-03-16T17:44:45.020

@stokastic Well, I'll add that but credit you for it. I prefer reading the name to a variable (what if the user inputs an image called o.png?) – PurkkaKoodari – 2015-03-17T06:52:58.117

fair enough! :) – stokastic – 2015-03-17T19:17:25.787

2

Python, 1197 bytes

# -*- coding: utf-8 -*-
from __future__ import division
from collections import Counter
import png
from sys import argv,exit
script,file_name,output_file_name=argv
def group(s,n):
    return zip(*[iter(s)]*n)
def flatten(list_of_lists):
    result=[]
    for lst in list_of_lists:
        result.extend(lst)
    return result
def group_list(old_list,tuples_per_list):
    new_list=[]
    i=1
    appended_row=[]
    for item in old_list:
        appended_row.extend(flatten([item]))
        if i==tuples_per_list:
            new_list.append(appended_row)
            i=1
            appended_row=[]
        else:
            i+=1
    return new_list
input_image=png.Reader(file_name)
image_data=input_image.read()
width=image_data[0]
height=image_data[1]
pixels=list(image_data[2])
if image_data[3]["alpha"]:
    ints_per_colour=4
elif image_data[3]["greyscale"]:
    ints_per_colour=2
else:
    ints_per_colour=3
colours=Counter(colour for row in pixels for colour in group(row,ints_per_colour)).most_common()
pixel_list=flatten([element]*count for element,count in colours)
ordered_rows=group_list(pixel_list,width)
f=open(output_file_name,"wb")
png_object=png.Writer(width,height,greyscale=image_data[3]["greyscale"],alpha=image_data[3]["alpha"])
png_object.write(f,ordered_rows)
f.close()

The png module I used.

EMBLEM

Posted 2015-03-14T23:17:52.767

Reputation: 2 179

2Thank you for providing a reference solution! We usually put these into the challenge post itself so people can find them more easily, but posting them as an answer is okay, too. – FUZxxl – 2015-03-14T23:20:43.000

1

C# 413

Complete program. Pass the file name on command line, the output is saved in the same format as file "o".

Not using some neat features of linq, like SelectMany and Enumerable.Range, as the program would be cleaner but longer.

using System.Collections.Generic;using System.Linq;using System.Drawing;
class P{
static void Main(string[]a){
var d=new Dictionary<Color,int>();var b=new Bitmap(a[0]);
int n,x,y,w=b.Width,h=b.Height;
for (x=w;x-->0;)for(y=h;y-->0;){var p=b.GetPixel(x,y);d[p]=d.ContainsKey(p)?d[p]+1:1;}
y=h;foreach(var q in d.OrderBy(v=>v.Value)){for(n=q.Value;n-->0;){
if(x<=0){x=w;--y;}b.SetPixel(--x, y, q.Key);}}b.Save("o");
}}

Readable formatting courtesy of VS2010

using System.Collections.Generic;
using System.Linq;
using System.Drawing;

class P
{
    static void Main(string[] a)
    {
        var d = new Dictionary<Color, int>();
        var b = new Bitmap(a[0]);
        int n,x,y,w = b.Width, h=b.Height;

        for (x = w; x-- > 0;)    
            for (y = h; y-- > 0;)
            {
                var p = b.GetPixel(x, y);
                d[p] = d.ContainsKey(p) ? d[p]+1 : 1;
            }
        y = h;
        foreach (var q in d.OrderBy(v => v.Value))
        {
            for (n = q.Value; n-- > 0; )
            {
                if (x <= 0)
                {
                    x = w;
                    --y;
                 }
                 b.SetPixel(--x, y, q.Key);
            }
        }
        b.Save(a[0]+".");
    }
}

edc65

Posted 2015-03-14T23:17:52.767

Reputation: 31 086

1

Python 2 : 191 bytes

Here's my attempt. I figured I might be able to save some space by using Counter, but it didn't end up as small as Pietu1998's answer.

from collections import Counter
from PIL import Image
i=Image.open(raw_input())
s=reduce(lambda x,y:x+y,map(lambda(a,b):[a]*b,Counter(i.getdata()).most_common()))
i.putdata(s)
i.save("o.png")

Output from Panda

Output from Panda

danmcardle

Posted 2015-03-14T23:17:52.767

Reputation: 695