Build a Nonographic Magnitude Optimizer™

12

A nonogram is a Japanese puzzle game in which the goal is to draw a black-and-white picture according to a list of contiguous regions, like so:

An example nonogram with a "lambda".

Define the nonographic magnitude of a row or column to be the number of contiguous black regions in that row or column. For example, the top row has a nonographic magnitude of 1, because there is one region of 2 squares in that row. The 8th row has a nonographic magnitude of 3 because it has 2, 2, 1.

An empty row or column has a nonographic magnitude of 0.


Your task is to write a program that takes a solution grid for a nonogram, and produces a solution grid with as few filled-in squares as possible where every row and column has the same nonographic magnutide as the given solution grid.

For example, a nonogram grid with all the squares filled in has a nonographic magnitude of 1 on every row or column:

A 10x10 nonogram where every square is filled in.

The same nonographic magnitude could be achieved just by having a diagonal stripe through the grid, reducing the number of filled-in squares dramatically:

A 10x10 nonogram with the same nonographic magnitude as the above.


Your program will receive an input consisting of 50,000 lines from this file (1.32 MB tar.gz text file; 2.15 MB unzipped), each representing a single 16×16 nonogram solution grid with randomly (80% black) filled-in squares, and output another 50,000 lines, each containing the optimized solution grid for the corresponding input grid.

Each grid is represented as a base64 string with 43 characters (encoding squares from left to right, then top to bottom), and your program will need to return its output in the same format. For example, the first grid in the file is E/lu/+7/f/3rp//f799xn/9//2mv//nvj/bt/yc9/40=, and renders as:

first example nonogram

The grid starts with E which maps to 000100, so the first six cells in the top row are all white except the fourth one. The next character is / which maps to 111111, so the next 6 cells are all black — and so on.


Your program must actually return a solution grid with the correct nonographic magnitudes for each of the 50,000 test cases. It is allowed to return the same grid as the input if nothing better was found.

The program to return the fewest total filled-in squares (in any language) is the winner, with shorter code being the tiebreaker.


Current scoreboard:

  1. 3,637,260 — Sleafar, Java
  2. 7,270,894 — flawr, Matlab
  3. 10,239,288 — Joe Z., Bash

Joe Z.

Posted 2016-01-15T19:15:46.197

Reputation: 30 589

1I do not really see the point of the base 64 encoding and working with that is a real pain. Wouldn't it be easier to just make lines of ones and zeros? Or encode the whole thing as a bitmap? – flawr – 2016-01-15T20:32:01.103

@flawr: It reduces the filesize, mainly (by a factor of 6 compared to just 1's and 0's). Also, bitmaps would be even harder to work with. – Joe Z. – 2016-01-15T20:33:21.347

You could just make a black and white image, easy to read/write and same size as b64 encoding. – flawr – 2016-01-15T20:45:45.423

Could you please post e.g. the first example as an image, such that we can see if our base 64 implementation does work correctly? – flawr – 2016-01-15T21:24:44.800

@flawr Alright, done. – Joe Z. – 2016-01-16T00:07:17.987

2also not a fan of the b64 encoding, for input and/or output. why not just let the i/o be in any convenient format? – Sparr – 2016-01-16T05:28:51.550

@Sparr: I could relax the requirement on output, but the input file is what it is — you'd have to re-encode the input for your program yourself. – Joe Z. – 2016-01-16T15:37:58.973

Can you confirm that these are the correct magnitudes for the given inputs (each pair of lines is 16 numbers for the 16 row magnitudes and then 16 numbers for the 16 column magnitudes): https://drive.google.com/file/d/0B-0NVE9E8UJiNFdmYlk1aV9aYzQ/view?usp=sharing

– quintopia – 2016-01-17T18:01:04.477

@quintopia: The first one is correct. Let me check the second one by generating it... – Joe Z. – 2016-01-18T00:50:59.557

1Assuming it is, I should have an optimal solution by this time tomorrow. – quintopia – 2016-01-18T04:50:56.400

Answers

7

Python 2 & PuLP — 2,644,688 squares (optimally minimized); 10,753,553 squares (optimally maximized)

Minimally golfed to 1152 bytes

from pulp import*
x=0
f=open("c","r")
g=open("s","w")
for k,m in enumerate(f):
 if k%2:
    b=map(int,m.split())
    p=LpProblem("Nn",LpMinimize)
    q=map(str,range(18))
    ir=q[1:18]
    e=LpVariable.dicts("c",(q,q),0,1,LpInteger)
    rs=LpVariable.dicts("rs",(ir,ir),0,1,LpInteger)
    cs=LpVariable.dicts("cs",(ir,ir),0,1,LpInteger)
    p+=sum(e[r][c] for r in q for c in q),""
    for i in q:p+=e["0"][i]==0,"";p+=e[i]["0"]==0,"";p+=e["17"][i]==0,"";p+=e[i]["17"]==0,""
    for o in range(289):i=o/17+1;j=o%17+1;si=str(i);sj=str(j);l=e[si][str(j-1)];ls=rs[si][sj];p+=e[si][sj]<=l+ls,"";p+=e[si][sj]>=l-ls,"";p+=e[si][sj]>=ls-l,"";p+=e[si][sj]<=2-ls-l,"";l=e[str(i-1)][sj];ls=cs[si][sj];p+=e[si][sj]<=l+ls,"";p+=e[si][sj]>=l-ls,"";p+=e[si][sj]>=ls-l,"";p+=e[si][sj]<=2-ls-l,""
    for r,z in enumerate(a):p+=lpSum([rs[str(r+1)][c] for c in ir])==2*z,""
    for c,z in enumerate(b):p+=lpSum([cs[r][str(c+1)] for r in ir])==2*z,""
    p.solve()
    for r in ir:
     for c in ir:g.write(str(int(e[r][c].value()))+" ")
     g.write('\n')
    g.write('%d:%d\n\n'%(-~k/2,value(p.objective)))
    x+=value(p.objective)
 else:a=map(int,m.split())
print x

(NB: the heavily indented lines start with tabs, not spaces.)

Example output: https://drive.google.com/file/d/0B-0NVE9E8UJiX3IyQkJZVk82Vkk/view?usp=sharing

It turns out that problems like are readily convertible to Integer Linear Programs, and I needed a basic problem to learn how to use PuLP—a python interface for a variety of LP solvers—for a project of my own. It also turns out that PuLP is extremely easy to use, and the ungolfed LP builder worked perfectly the first time I tried it.

The two nice things about employing a branch-and-bound IP solver to do the hard work of solving this for me (other than not having to implement a branch and bound solver) are that

  • Purpose-built solvers are really fast. This program solves all 50000 problems in about 17 hours on my relatively low-end home PC. Each instance took from 1-1.5 seconds to solve.
  • They produce guaranteed optimal solutions (or tell you that they failed to do so). Thus, I can be confident that no one will beat my score in squares (although someone might tie it and beat me on the golfing part).

How to use this program

First, you'll need to install PuLP. pip install pulp should do the trick if you have pip installed.

Then, you'll need to put the following in a file called "c": https://drive.google.com/file/d/0B-0NVE9E8UJiNFdmYlk1aV9aYzQ/view?usp=sharing

Then, run this program in any late Python 2 build from the same directory. In less than a day, you'll have a file called "s" which contains 50,000 solved nonogram grids (in readable format), each with the total number of filled squares listed below it.

If you'd like to maximize the number of filled squares instead, change the LpMinimize on line 8 to LpMaximize instead. You will get output very much like this: https://drive.google.com/file/d/0B-0NVE9E8UJiYjJ2bzlvZ0RXcUU/view?usp=sharing

Input format

This program uses a modified input format, since Joe Z. said that we would be allowed to re-encode the input format if we like in a comment on the OP. Click the link above to see what it looks like. It consists of 10000 lines, each containing 16 numbers. The even numbered lines are the magnitudes for the rows of a given instance, while the odd numbered lines are the magnitudes for the columns of the same instance as the line above them. This file was generated by the following program:

from bitqueue import *

with open("nonograms_b64.txt","r") as f:
    with open("nonogram_clues.txt","w") as g:
        for line in f:
            q = BitQueue(line.decode('base64'))
            nonogram = []
            for i in range(256):
                if not i%16: row = []
                row.append(q.nextBit())
                if not -~i%16: nonogram.append(row)
            s=""
            for row in nonogram:
                blocks=0                         #magnitude counter
                for i in range(16):
                    if row[i]==1 and (i==0 or row[i-1]==0): blocks+=1
                s+=str(blocks)+" "
            print >>g, s
            nonogram = map(list, zip(*nonogram)) #transpose the array to make columns rows
            s=""
            for row in nonogram:
                blocks=0
                for i in range(16):
                    if row[i]==1 and (i==0 or row[i-1]==0): blocks+=1
                s+=str(blocks)+" "
            print >>g, s

(This re-encoding program also gave me an extra opportunity to test my custom BitQueue class I created for the same project mentioned above. It is simply a queue to which data can be pushed as sequences of bits OR bytes, and from which data can be popped either a bit or a byte at a time. In this instance, it worked perfectly.)

I re-encoded the input for the specific reason that to build an ILP, the extra information about the grids that were used to generate the magnitudes is perfectly useless. The magnitudes are the only constraints, and so the magnitudes are all I needed access to.

Ungolfed ILP builder

from pulp import *
total = 0
with open("nonogram_clues.txt","r") as f:
    with open("solutions.txt","w") as g:
        for k,line in enumerate(f):
            if k%2:
                colclues=map(int,line.split())
                prob = LpProblem("Nonogram",LpMinimize)
                seq = map(str,range(18))
                rows = seq
                cols = seq
                irows = seq[1:18]
                icols = seq[1:18]
                cells = LpVariable.dicts("cell",(rows,cols),0,1,LpInteger)
                rowseps = LpVariable.dicts("rowsep",(irows,icols),0,1,LpInteger)
                colseps = LpVariable.dicts("colsep",(irows,icols),0,1,LpInteger)
                prob += sum(cells[r][c] for r in rows for c in cols),""
                for i in rows:
                    prob += cells["0"][i] == 0,""
                    prob += cells[i]["0"] == 0,""
                    prob += cells["17"][i] == 0,""
                    prob += cells[i]["17"] == 0,""
                for i in range(1,18):
                    for j in range(1,18):
                        si = str(i); sj = str(j)
                        l = cells[si][str(j-1)]; ls = rowseps[si][sj]
                        prob += cells[si][sj] <= l + ls,""
                        prob += cells[si][sj] >= l - ls,""
                        prob += cells[si][sj] >= ls - l,""
                        prob += cells[si][sj] <= 2 - ls - l,""
                        l = cells[str(i-1)][sj]; ls = colseps[si][sj]
                        prob += cells[si][sj] <= l + ls,""
                        prob += cells[si][sj] >= l - ls,""
                        prob += cells[si][sj] >= ls - l,""
                        prob += cells[si][sj] <= 2 - ls - l,""
                for r,clue in enumerate(rowclues):
                    prob += lpSum([rowseps[str(r+1)][c] for c in icols]) == 2 * clue,""
                for c,clue in enumerate(colclues):
                    prob += lpSum([colseps[r][str(c+1)] for r in irows]) == 2 * clue,""
                prob.solve()
                print "Status for problem %d: "%(-~k/2),LpStatus[prob.status]
                for r in rows[1:18]:
                    for c in cols[1:18]:
                        g.write(str(int(cells[r][c].value()))+" ")
                    g.write('\n')
                g.write('Filled squares for %d: %d\n\n'%(-~k/2,value(prob.objective)))
                total += value(prob.objective)
            else:
                rowclues=map(int,line.split())
print "Total number of filled squares: %d"%total

This is the program that actually produced the "example output" linked above. Hence the extra long strings at the end of each grid, which I truncated when golfing it. (The golfed version should produce identical output, minus the words "Filled squares for ")

How it Works

cells = LpVariable.dicts("cell",(rows,cols),0,1,LpInteger)
rowseps = LpVariable.dicts("rowsep",(irows,icols),0,1,LpInteger)
colseps = LpVariable.dicts("colsep",(irows,icols),0,1,LpInteger)

I use an 18x18 grid, with the center 16x16 part being the actual puzzle solution. cells is this grid. The first line creates 324 binary variables: "cell_0_0", "cell_0_1", and so on. I also create grids of the "spaces" between and around the cells in the solution part of the grid. rowseps points to the 289 variables which symbolize the spaces that separate cells horizontally, while colseps similarly points to variables that mark the spaces that separate cells vertically. Here's a unicode diagram:

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|□|0
  - - - - - - - - - - - - - - - - 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The 0s and s are the binary values tracked by the cell variables, the |s are the binary values tracked by the rowsep variables, and the -s are the binary values tracked by the colsep variables.

prob += sum(cells[r][c] for r in rows for c in cols),""

This is the objective function. Just the sum of all the cell variables. Since these are binary variables, this is just exactly the number of filled squares in the solution.

for i in rows:
    prob += cells["0"][i] == 0,""
    prob += cells[i]["0"] == 0,""
    prob += cells["17"][i] == 0,""
    prob += cells[i]["17"] == 0,""

This just sets the cells around the outer edge of the grid to zero (which is why I represented them as zeroes above). This is the most expedient way to track how many "blocks" of cells are filled, since it ensures that every change from unfilled to filled (moving across a column or row) is matched by a corresponding change from filled to unfilled (and vice versa), even if the first or last cell in the row is filled. This is the sole reason for using an 18x18 grid in the first place. It's not the only way to count blocks, but I think it is the simplest.

for i in range(1,18):
    for j in range(1,18):
        si = str(i); sj = str(j)
        l = cells[si][str(j-1)]; ls = rowseps[si][sj]
        prob += cells[si][sj] <= l + ls,""
        prob += cells[si][sj] >= l - ls,""
        prob += cells[si][sj] >= ls - l,""
        prob += cells[si][sj] <= 2 - ls - l,""
        l = cells[str(i-1)][sj]; ls = colseps[si][sj]
        prob += cells[si][sj] <= l + ls,""
        prob += cells[si][sj] >= l - ls,""
        prob += cells[si][sj] >= ls - l,""
        prob += cells[si][sj] <= 2 - ls - l,""

This is the real meat of the logic of the ILP. Basically it requires that each cell (other than those in the first row and column) be the logical xor of the cell and separator directly to its left in its row and directly above it in its column. I got the constraints that simulate an xor within a {0,1} integer program from this wonderful answer: https://cs.stackexchange.com/a/12118/44289

To explain a bit more: this xor constraint makes it so that the separators can be 1 if and only if they lie between cells which are 0 and 1 (marking a change from unfilled to filled or vice versa). Thus, there will be exactly twice as many 1-valued separators in a row or column as the number of blocks in that row or column. In other words, the sum of the separators on a given row or column is exactly twice the magnitude of that row/column. Hence the following constraints:

for r,clue in enumerate(rowclues):
    prob += lpSum([rowseps[str(r+1)][c] for c in icols]) == 2 * clue,""
for c,clue in enumerate(colclues):
    prob += lpSum([colseps[r][str(c+1)] for r in irows]) == 2 * clue,""

And that's pretty much it. The rest just asks the default solver to solve the ILP, then formats the resulting solution as it writes it to the file.

quintopia

Posted 2016-01-15T19:15:46.197

Reputation: 3 899

Really good answer. Make me wanna learn about LP solvers. Do you think it could be used to solve flood it puzzle (link) for a 19x19, 6 colors board (regarding time to compute a solution) ? I have already answered that contest (and won it) however my method (A* search algorithm) only give non optimal solutions.

– tigrou – 2017-02-10T20:24:58.353

@tigrou Thanks. I'm not sure the flood problem is linear enough to admit such a solution. I certainly can't see how to model it this way. – quintopia – 2017-02-10T22:36:32.590

It seems somebody already tried it : https://kunigami.blog/2012/09/16/flood-it-an-exact-approach/ However they couldn't optimal solutions in feasible time for a 14x14 board.

– tigrou – 2017-02-10T22:42:14.567

3

Java, 6,093,092 4,332,656 3,637,260 squares (minimized), 10,567,550 10,567,691 10,568,746 squares (maximized)

Both variants of the program repeatedly perform operations on the source grid, without changing the magnitude.

Minimizer

shrink()

enter image description here

If a black square has 2 white neighbors and 2 black neighbors in a 90° angle, it can be replaced by a white square.

moveLine()

enter image description here enter image description here

In configurations like above the black line can be moved to the right. This is done repeatedly for all 4 line directions clockwise and counterclockwise, to open up new shrink possibilities.

Maximizer

Uncomment the line in main() and comment out the line above it for this version.

grow()

enter image description here

If a white square has 2 white neighbors and 2 black neighbors in a 90° angle, it can be replaced by a black square.

moveLine()

Same as in Minimizer.

Source

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.util.Arrays;
import java.util.Base64;
import java.util.function.Function;

public class Main {
    private static final int SIZE = 16;
    private static final int SIZE_4 = SIZE + 4;
    private static final int E = 0;
    private static final int N = 1;
    private static final int W = 2;
    private static final int S = 3;

    private static final Base64.Decoder decoder = Base64.getMimeDecoder();
    private static final Base64.Encoder encoder = Base64.getMimeEncoder();
    private static int sourceBlack = 0;
    private static int targetBlack = 0;

    private static class Nonogram {
        private final boolean[] cells = new boolean[SIZE_4 * SIZE_4];
        private final int[] magnitudes;

        public Nonogram(String encoded) {
            super();
            byte[] decoded = decoder.decode(encoded);
            for (int i = 0; i < decoded.length; ++ i) {
                for (int j = 0; j < 8; ++ j) {
                    if ((decoded[i] & (1 << (7 - j))) != 0) {
                        int k = i * 8 + j;
                        cells[getPos(k / SIZE, k % SIZE)] = true;
                        ++ sourceBlack;
                    }
                }
            }
            magnitudes = calcMagnitudes();
        }

        private int getPos(int row, int col) {
            return (row + 2) * SIZE_4 + col + 2;
        }

        private int move(int pos, int dir, int count) {
            switch (dir) {
                case E: return pos + count;
                case N: return pos - count * SIZE_4;
                case W: return pos - count;
                case S: return pos + count * SIZE_4;
                default: return pos;
            }
        }

        private int move(int pos, int dir) {
            return move(pos, dir, 1);
        }

        private int[] calcMagnitudes() {
            int[] result = new int[SIZE * 2];
            for (int row = 0; row < SIZE; ++ row) {
                for (int col = 0; col < SIZE; ++ col) {
                    int pos = getPos(row, col);
                    if (cells[pos]) {
                        if (!cells[move(pos, W)]) {
                            ++ result[row + SIZE];
                        }
                        if (!cells[move(pos, N)]) {
                            ++ result[col];
                        }
                    }
                }
            }
            return result;
        }

        private boolean isBlack(int pos) {
            return cells[pos];
        }

        private boolean isWhite(int pos) {
            return !cells[pos];
        }

        private boolean allBlack(int pos, int dir, int count) {
            int p = pos;
            for (int i = 0; i < count; ++ i) {
                if (isWhite(p)) {
                    return false;
                }
                p = move(p, dir);
            }
            return true;
        }

        private boolean allWhite(int pos, int dir, int count) {
            int p = pos;
            for (int i = 0; i < count; ++ i) {
                if (isBlack(p)) {
                    return false;
                }
                p = move(p, dir);
            }
            return true;
        }

        private int findWhite(int pos, int dir) {
            int count = 0;
            int p = pos;
            while (cells[p]) {
                ++ count;
                p = move(p, dir);
            }
            return count;
        }

        @SafeVarargs
        private final void forEach(Function<Integer, Boolean>... processors) {
            outer:
            for (;;) {
                for (Function<Integer, Boolean> processor : processors) {
                    for (int row = 0; row < SIZE; ++ row) {
                        for (int col = 0; col < SIZE; ++ col) {
                            if (processor.apply(getPos(row, col))) {
                                continue outer;
                            }
                        }
                    }
                }
                return;
            }
        }

        private boolean shrink(int pos) {
            if (cells[pos] && cells[move(pos, W)] != cells[move(pos, E)] &&
                    cells[move(pos, N)] != cells[move(pos, S)]) {
                cells[pos] = false;
                return true;
            }
            return false;
        }

        private boolean grow(int pos) {
            if (!cells[pos] && cells[move(pos, W)] != cells[move(pos, E)] &&
                    cells[move(pos, N)] != cells[move(pos, S)]) {
                cells[pos] = true;
                return true;
            }
            return false;
        }

        private boolean moveLine(boolean clockwise, int dir, int sourcePos) {
            int from = (dir + (clockwise ? 1 : 3)) % 4;
            int to = (dir + (clockwise ? 3 : 1)) % 4;
            int opp = (dir + 2) % 4;
            if (isBlack(sourcePos) && isWhite(move(sourcePos, from)) && isWhite(move(sourcePos, dir))) {
                int toCount = findWhite(move(move(sourcePos, dir), to), to) + 1;
                if (allWhite(move(sourcePos, to), to, toCount + 1)) {
                    int lineCount = 1;
                    int tmpPos = move(sourcePos, opp);
                    while (isBlack(tmpPos) && isWhite(move(tmpPos, from)) && allWhite(move(tmpPos, to),  to, toCount + 1)) {
                        ++ lineCount;
                        tmpPos = move(tmpPos, opp);
                    }
                    if (allBlack(tmpPos, to, toCount + 1)) {
                        tmpPos = sourcePos;
                        for (int j = 0; j < lineCount; ++ j) {
                            cells[tmpPos] = false;
                            cells[move(tmpPos, to, toCount)] = true;
                            tmpPos = move(tmpPos, opp);
                        }
                        return true;
                    }
                }
            }
            return false;
        }

        public Nonogram minimize() {
            for (int i = 0; i < 5; ++ i) {
                forEach(pos -> shrink(pos), pos -> moveLine(true, E, pos), pos -> moveLine(true, N, pos),
                        pos -> moveLine(true, W, pos), pos -> moveLine(true, S, pos));
                forEach(pos -> shrink(pos), pos -> moveLine(false, E, pos), pos -> moveLine(false, N, pos),
                        pos -> moveLine(false, W, pos), pos -> moveLine(false, S, pos));
            }
            return this;
        }

        public Nonogram maximize() {
            for (int i = 0; i < 5; ++ i) {
                forEach(pos -> grow(pos), pos -> moveLine(true, E, pos), pos -> moveLine(true, N, pos),
                        pos -> moveLine(true, W, pos), pos -> moveLine(true, S, pos));
                forEach(pos -> grow(pos), pos -> moveLine(false, E, pos), pos -> moveLine(false, N, pos),
                        pos -> moveLine(false, W, pos), pos -> moveLine(false, S, pos));
            }
            return this;
        }

        public String toBase64() {
            if (!Arrays.equals(magnitudes, calcMagnitudes())) {
                throw new RuntimeException("Something went wrong!");
            }
            byte[] decoded = new byte[SIZE * SIZE / 8];
            for (int i = 0; i < decoded.length; ++ i) {
                for (int j = 0; j < 8; ++ j) {
                    int k = i * 8 + j;
                    if (cells[getPos(k / SIZE, k % SIZE)]) {
                        decoded[i] |= 1 << (7 - j);
                        ++ targetBlack;
                    }
                }
            }
            return encoder.encodeToString(decoded);
        }

        @Override
        public String toString() {
            StringBuilder b = new StringBuilder();
            for (int row = 0; row < SIZE; ++ row) {
                for (int col = 0; col < SIZE; ++ col) {
                    b.append(cells[getPos(row, col)] ? '#' : ' ');
                }
                b.append('\n');
            }
            return b.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        try (BufferedWriter writer = new BufferedWriter(new FileWriter("solutions_b64.txt"));
                BufferedReader reader = new BufferedReader(new FileReader("nonograms_b64.txt"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(new Nonogram(line).minimize().toBase64() + "\n");
                //writer.write(new Nonogram(line).maximize().toBase64() + "\n");
            }
        }
        System.out.printf("%d -> %d", sourceBlack, targetBlack);
    }
}

Sleafar

Posted 2016-01-15T19:15:46.197

Reputation: 2 722

1

Bash — 10,239,288 squares

As a last-place reference solution:

cp nonograms_b64.txt solutions_b64.txt

Since your program is allowed to return the same grid if it can't find a better solution, printing out the whole file verbatim is also valid.

There are a total of 10,239,288 black squares in the test file, which is pretty close to the 10,240,000 you'd expect from 80% of squares filled in out of 50,000 grids with 256 squares each. As usual with my test-battery questions, I've selected the number of test cases with the expectation that the optimal scores will be around the 2-million range, although I suspect the scores will be closer to 4 or 5 million this time around.


If anyone can create a solution that maximizes the black squares rather than minimizing them and manages to get over 10,240,000, I might consider giving it a bounty.

Joe Z.

Posted 2016-01-15T19:15:46.197

Reputation: 30 589

1

Matlab, 7,270,894 squares (~71% of original)

Idea is a simple repeated greedy search: For every black square, try if you can set it to white without changing the nonographic magnitudes. Repeat this twice. (You could achieve way better results with more repetitions, but not for free: It results in an correspondingly longer runtime. Now it was about 80 minutes. I would do that, if we would not have to compute all 50k testcases...)

Here the code (each function in a separate file, as usual.)

function D = b64decode(E)
% accepts a string of base 64 encoded data, and returns a array of zeros
% and ones
F = E;
assert( mod(numel(E),4)==0 && 0 <= sum(E=='=') && sum(E=='=') <= 2,'flawed base 64 code')

F('A' <= E & E<= 'Z') = F('A' <= E & E<= 'Z') -'A';       %upper case
F('a' <= E & E<= 'z') = F('a' <= E & E<= 'z') -'a' + 26;  %lower case
F('0'<= E & E <= '9') = F('0'<= E & E <= '9') -'0' + 52;  %digits
F( E == '+') = 62;
F( E == '/') = 63;
F( E == '=') = 0;

D=zeros(1,numel(E)*3*8/4);

for k=1:numel(E);
    D(6*(k-1)+1 + (0:5)) = dec2bin(F(k),6)-'0';
end

if E(end) == '=';
    D(end-7:end) = [];
    if E(end-1) == '=';
        D(end-7:end) = [];
    end
end
end

function E = b64encode(D)
assert(mod(numel(D),8)==0,'flawed byte code');
N=0;
while mod(numel(D),6) ~= 0;
    D = [D,zeros(1,8)];
    N = N+1;
end
dict = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

E=zeros(1,numel(D)/6)+'=';
for k=0:numel(E)-N-1;
    E(k+1) = dict(bin2dec( [D(6*k+(1:6))+'0',''] ) + 1);
end

E = [E,''];
end


function [B,T,O] = reduce_greedy(N)
% reduce greedily
NM = nomographic_magnitude(N);
B = N; %current best
M = N;
T = nnz(N); %current number of filled squares
O = T;

for r=1:2;  %how many repetitions
    I = find(B);
    for k=1:numel(I);
        M = B;
        M( I(k) ) = 0;
        %check whether we have a valid solution
        if all(NM == nomographic_magnitude(M))
            if T > nnz(M); %did we actually reduce the number of filled squares?
                B = M;
                T = nnz(M);
            end
        end
    end
end


%% main file
string_in = fileread('nonograms_b64.txt');
string_out = string_in;
tic
total_new = 0;  %total number of black squares
total_old = 0;
M = 50000;
for k=1:M;
    disp(k/M); %display the progress
    line = string_in(45*(k-1)+(1:44));
    decoded = b64decode(line);        
    nonogram = reshape(decoded,16,[]) ;%store nonogram as a matrix
    [B,T,O] = reduce_greedy(nonogram);
    disp([nnz(B),nnz(nonogram)])
    total_new = total_new + T;
    total_old = total_old + O;
    string_in(45*(k-1)+(1:44)) = b64encode(B(:).');
end
toc
F = fopen('nonograms_b64_out.txt','w');
fprintf(F,string_out);
%%
disp([total_new,total_old])

flawr

Posted 2016-01-15T19:15:46.197

Reputation: 40 560