Build a minimum-clue Sudoku unsolver

16

3

My attempt at stating this question, but with a more objective solving criterion.

Your task is to build a program or function that takes a solved Sudoku grid S in the format of your choice and attempts to generate a problem grid with as few clues as possible that has S as its unique solution. (It doesn't matter what method S is the unique solution by, including brute force, as long as the solution is provably unique.)


Your program will be scored by running it through a set of 100,000 solution grids found in this file (7.82 MB download), and adding up the number of clues in all 100,000 problem grids that your solution produces.

The Sudoku solutions in the above test file are expressed as an 81-character string, from left-to-right, then top-to-bottom. The code required to turn the input in the test file into a usable solution will not count towards your solution's byte count.

As in my Flood Paint challenge, your program must actually produce a valid output for all 100,000 puzzles to be considered a valid solution. The program that outputs the fewest total clues for all 100,000 test cases is the winner, with shorter code breaking a tie.


Current scoreboard:

  1. 2,361,024 - nutki, C
  2. 2,580,210 - es1024, PHP
  3. 6,000,000 - CarpetPython, Python 2
  4. 7,200,000 - Joe Z., Python

Joe Z.

Posted 2015-04-07T00:51:27.053

Reputation: 30 589

Also, you can be sure that any solution claiming less than 1,700,000 solutions is a phony, but I want to see just how low these can go. – Joe Z. – 2015-04-07T07:51:52.577

Answers

8

C - 2,361,024 2,509,949 clues

Remove clues starting from the last cell if a brute force solver finds only one unique solution.

Second try: use heuristic to decide in which order to remove clues instead of starting from the last. This makes the code run much slower (20 minutes instead of 2 to calculate the result). I could make the solver faster, to experiment with different heuristics, but for now it will do.

#include <stdio.h>
#include <string.h>
char ll[100];
short b[81];
char m[81];
char idx[81][24];
int s;
char lg[513];
void pri2() {
    int i;
    for(i=0;i<81;i++) putchar(lg[b[i]]);
    putchar('\n');
}
void solve(pos){
int i,p;
if (s > 1) return;
if (pos == 81) { s++; return; }
if (b[pos]) return solve(pos+1);
for (p=i=0;i<24;i++) p |= b[idx[pos][i]];
for (i = 0; i < 9; i++) if (!(p&(1<<i))) {
    b[pos] = 1 << i;
    solve(pos + 1);
}
b[pos] = 0;
}
int main() {
    int i,j,t;
    for(i=0;i<9;i++) lg[1<<i]='1'+i;
    lg[0] = '.';
    for(i=0;i<81;i++) {
    t = 0;
    for(j=0;j<9;j++) if(i/9*9 + j != i) idx[i][t++] = i/9*9 + j;
    for(j=0;j<9;j++) if(i%9 + j*9 != i) idx[i][t++] = i%9 + j*9;
    for(j=0;j<81;j++) if(j/27 == i/27 && i%9/3 == j%9/3 && i!=j) idx[i][t++] = j;
    }
    while(scanf("%s ",ll)>0) {
    memset(m, 0, sizeof(m));
    for(i=0;i<81;i++) b[i] = 1 << (ll[i]-'1');
    for(i=0;i<81;i++) {
    int j,k,l = 99;
    for(k=0;k<81;k++) if (m[k] <= l) l = m[k], j = k;
    m[j] = 24;
    t = b[j]; b[j] = 0;
    s = 0; solve(0);
    if (s > 1) b[j] = t;
    else for(k=0;k<24;k++) m[idx[j][k]]++;
    }
    pri2();
    }
    return 0;
}

nutki

Posted 2015-04-07T00:51:27.053

Reputation: 3 634

1

Python — 7,200,000 clues

As usual, here is a last-place reference solution:

def f(x): return x[:72] + "." * 9

Removing the bottom row of numbers provably leaves the puzzle solvable in all cases, as each column still has 8 of 9 numbers filled, and each number in the bottom row is simply the ninth number left over in the column.

If any serious contender manages to legally score worse than this one, I will be astonished.

Joe Z.

Posted 2015-04-07T00:51:27.053

Reputation: 30 589

I mean, you could remove just the last one. – seequ – 2015-04-07T06:55:15.517

You could also just leave the whole thing solved, but neither of those would be a serious contender. – Joe Z. – 2015-04-07T07:11:57.293

So why is this a serious contender? – theonlygusti – 2015-04-07T22:34:23.937

It's not. That's why I said I'd be astonished if any serious contender managed to score worse than this non-serious contender. – Joe Z. – 2015-04-07T22:41:26.980

1

Python 2 - 6,000,000 clues

A simple solution that uses the 3 common methods of solving these puzzles:

def f(x): 
    return ''.join('.' if i<9 or i%9==0 or (i+23)%27 in (0,3) else c 
        for i,c in enumerate(x))

This function produces clue formats like this:

.........
.dddddddd
.dddddddd
.ddd.dd.d
.dddddddd
.dddddddd
.ddd.dd.d
.dddddddd
.dddddddd

This can always be solved. The 4 3x3 parts are solved first, then the 8 columns, then the 9 rows.

Logic Knight

Posted 2015-04-07T00:51:27.053

Reputation: 6 622

1

PHP — 2,580,210 clues

This first removes the last row and column, and bottom right corner of every box. It then tries to clear out each cell, running the board through a simple solver after each change to ensure the board is still unambiguously solvable.

Much of the code below was modified from one of my old answers. printBoard uses 0s for empty cells.

<?php
// checks each row/col/block and removes impossible candidates
function reduce($cand){
    do{
        $old = $cand;
        for($r = 0; $r < 9; ++$r){
        for($c = 0; $c < 9; ++$c){
            if(count($cand[$r][$c]) == 1){ // if filled in
                // remove values from row and col and block
                $remove = $cand[$r][$c];
                for($i = 0; $i < 9; ++$i){
                    $cand[$r][$i] = array_diff($cand[$r][$i],$remove);
                    $cand[$i][$c] = array_diff($cand[$i][$c],$remove);
                    $br = floor($r/3)*3+$i/3;
                    $bc = floor($c/3)*3+$i%3;
                    $cand[$br][$bc] = array_diff($cand[$br][$bc],$remove);
                }
                $cand[$r][$c] = $remove;
            }
        }}
    }while($old != $cand);
    return $cand;
}

// checks candidate list for completion
function done($cand){
    for($r = 0; $r < 9; ++$r){
    for($c = 0; $c < 9; ++$c){
        if(count($cand[$r][$c]) != 1)
            return false;
    }}
    return true;
}

// board format: [[1,2,0,3,..],[..],..], $b[$row][$col]
function solve($board){
    $cand = [[],[],[],[],[],[],[],[],[]];
    for($r = 0; $r < 9; ++$r){
    for($c = 0; $c < 9; ++$c){
        if($board[$r][$c]){ // if filled in
            $cand[$r][$c] = [$board[$r][$c]];
        }else{
            $cand[$r][$c] = range(1, 9);
        }
    }}
    $cand = reduce($cand);

    if(done($cand))  // goto not really necessary
        goto end;    // but it feels good to use it 
    else return false;

    end:
    // back to board format
    $b = [];
    for($r = 0; $r < 9; ++$r){
        $b[$r] = [];
        for($c = 0; $c < 9; ++$c){
            if(count($cand[$r][$c]) == 1)
                $b[$r][$c] = array_pop($cand[$r][$c]);
            else 
                $b[$r][$c] = 0;
        }
    }
    return $b;
}

function add_zeros($board, $ind){
    for($r = 0; $r < 9; ++$r){
    for($c = 0; $c < 9; ++$c){
        $R = ($r + (int)($ind/9)) % 9;
        $C = ($c + (int)($ind%9)) % 9;
        if($board[$R][$C]){
            $tmp = $board[$R][$C];
            $board[$R][$C] = 0;
            if(!solve($board))
                $board[$R][$C] = $tmp;
        }   
    }}
    return $board;
}

function generate($board, $ind){
    // remove last row+col
    $board[8] = [0,0,0,0,0,0,0,0,0];
    foreach($board as &$j) $j[8] = 0;

    // remove bottom corner of each box
    $board[2][2] = $board[2][5] = $board[5][2] = $board[5][5] = 0;

    $board = add_zeros($board, $ind);

    return $board;    
}
function countClues($board){
    $str = implode(array_map('implode', $board));
    return 81 - substr_count($str, '0');
}

function generateBoard($board){
    return generate($board, 0);
}

function printBoard($board){
    for($i = 0; $i < 9; ++$i){
        echo implode(' ', $board[$i]) . PHP_EOL;
    }
    flush();
}
function readBoard($str){
    $tmp = str_split($str, 9);
    $board = [];
    for($i = 0; $i < 9; ++$i)
        $board[] = str_split($tmp[$i], 1);
    return $board;
}
// testing
$n = 0;
$f = fopen('ppcg_sudoku_testing.txt', 'r');
while(($l = fgets($f)) !== false){
    $board = readBoard(trim($l));
    $n += countClues(generateBoard($board));
}
echo $n;

es1024

Posted 2015-04-07T00:51:27.053

Reputation: 8 953