Sudoku Compression

35

5

Your job is to write a program (or two separate programs) in any language that:

  1. Can take a completed Sudoku board as input (in any logical format) and compress it into a string of characters
  2. Can take the compressed string as input and decompress it to get the exact same completed Sudoku board (output in any logical format of 9 rows)

Note: Use the rules of Sudoku to your advantage; that is the idea behind this challenge.
Sudoku rules on Wikipedia

Rules

  • Only printable ASCII characters (32 - 126) are allowed in the compressed output (eg. no multibyte characters).
  • You can assume that the input is a valid 3x3 Sudoku board (normal rules, no variations).
  • I won't impose a time-limit, but do not create a brute-force algorithm. Or, submitters should be able to test their submissions before posting (Thanks Jan Dvorak).

If you have any questions or concerns, you can ask for clarification or make suggestions in the comments.

Winning Conditions

Score = sum of the number of characters from all ten test cases

Lowest score wins.

Test Cases

You may use these to test how well your program works.

9 7 3 5 8 1 4 2 6
5 2 6 4 7 3 1 9 8
1 8 4 2 9 6 7 5 3
2 4 7 8 6 5 3 1 9
3 9 8 1 2 4 6 7 5
6 5 1 7 3 9 8 4 2
8 1 9 3 4 2 5 6 7
7 6 5 9 1 8 2 3 4
4 3 2 6 5 7 9 8 1

7 2 4 8 6 5 1 9 3
1 6 9 2 4 3 8 7 5
3 8 5 1 9 7 2 4 6
8 9 6 7 2 4 3 5 1
2 7 3 9 5 1 6 8 4
4 5 1 3 8 6 9 2 7
5 4 2 6 3 9 7 1 8
6 1 8 5 7 2 4 3 9
9 3 7 4 1 8 5 6 2

1 5 7 6 8 2 3 4 9
4 3 2 5 1 9 6 8 7
6 9 8 3 4 7 2 5 1
8 2 5 4 7 6 1 9 3
7 1 3 9 2 8 4 6 5
9 6 4 1 3 5 7 2 8
5 4 1 2 9 3 8 7 6
2 8 9 7 6 1 5 3 4
3 7 6 8 5 4 9 1 2

8 3 5 4 1 6 9 2 7
2 9 6 8 5 7 4 3 1
4 1 7 2 9 3 6 5 8
5 6 9 1 3 4 7 8 2
1 2 3 6 7 8 5 4 9
7 4 8 5 2 9 1 6 3
6 5 2 7 8 1 3 9 4
9 8 1 3 4 5 2 7 6
3 7 4 9 6 2 8 1 5

6 2 8 4 5 1 7 9 3
5 9 4 7 3 2 6 8 1
7 1 3 6 8 9 5 4 2
2 4 7 3 1 5 8 6 9
9 6 1 8 2 7 3 5 4
3 8 5 9 6 4 2 1 7
1 5 6 2 4 3 9 7 8
4 3 9 5 7 8 1 2 6
8 7 2 1 9 6 4 3 5

1 2 3 4 5 6 7 8 9
4 5 6 7 8 9 1 2 3
7 8 9 1 2 3 4 5 6
2 1 4 3 6 5 8 9 7
3 6 5 8 9 7 2 1 4
8 9 7 2 1 4 3 6 5
5 3 1 6 4 8 9 7 2
6 4 8 9 7 2 5 3 1
9 7 2 5 3 1 6 4 8

1 4 5 7 9 2 8 3 6
3 7 6 5 8 4 1 9 2
2 9 8 3 6 1 7 5 4
7 3 1 9 2 8 6 4 5
8 5 9 6 4 7 3 2 1
4 6 2 1 3 5 9 8 7
6 2 4 8 7 3 5 1 9
5 8 7 4 1 9 2 6 3
9 1 3 2 5 6 4 7 8

5 2 7 4 1 6 9 3 8
8 6 4 3 2 9 1 5 7
1 3 9 5 7 8 6 4 2
2 9 1 8 5 4 3 7 6
3 4 8 6 9 7 5 2 1
6 7 5 1 3 2 4 8 9
7 1 2 9 4 5 8 6 3
4 8 3 2 6 1 7 9 5
9 5 6 7 8 3 2 1 4

2 4 6 7 1 3 9 8 5
1 8 5 4 9 6 7 3 2
9 3 7 8 2 5 1 4 6
6 7 8 5 4 2 3 9 1
4 9 3 1 6 8 2 5 7
5 1 2 3 7 9 4 6 8
8 2 4 9 5 7 6 1 3
7 5 9 6 3 1 8 2 4
3 6 1 2 8 4 5 7 9

8 6 1 2 9 4 5 7 3
4 7 5 3 1 8 6 9 2
3 9 2 5 6 7 8 1 4
2 3 6 4 5 9 7 8 1
1 5 4 7 8 3 2 6 9
9 8 7 6 2 1 3 4 5
5 2 9 1 7 6 4 3 8
6 4 8 9 3 2 1 5 7
7 1 3 8 4 5 9 2 6

Credit to http://www.opensky.ca/~jdhildeb/software/sudokugen/ for some of these

If you find any issues with the test cases, please tell me.

kukac67

Posted 2014-11-15T00:26:23.360

Reputation: 2 159

Which bytes are allowed should be more clearly stated: anything (0-255), ASCII (0-127) or printable ASCII (32-126), etc. – feersum – 2014-11-15T00:39:26.620

@MartinBüttner You're right. I will add more test cases. – kukac67 – 2014-11-15T00:39:45.050

@feersum Printable ASCII (so, 32-126). I will update the rules. – kukac67 – 2014-11-15T00:43:04.130

5

Also, there should be a time limit, to prevent a solution that enumerates every board configuration and checks whether it is one of the 6670903752021072936960 possible solved Sudoku grids.

– feersum – 2014-11-15T00:44:00.633

@feersum I didn't want to impose a time-limit, though I suppose I should. However, I don't want to do a hard cut-off, just prevent brute-forcing. – kukac67 – 2014-11-15T00:45:51.137

@feersum let's assume the usual "submitters should be able to test their submissions before submitting them" – John Dvorak – 2014-11-15T00:47:19.623

@JanDvorak I haven't heard about this one...and often see untestable programs. – feersum – 2014-11-15T00:48:40.493

3You may want to change the scoring. As it stands there is nothing stopping me from hardcoding the test cases to 1-char codes and just use 81-char codes for everything else – TwiNight – 2014-11-15T05:11:16.080

4@TwiNight apart from it being a standard loophole, you mean? – John Dvorak – 2014-11-15T05:42:27.177

4Despite my answer below, I think the best way to solve this would be to write a sudoku solver, then remove the maximum number of digits from the grid such that the puzzle is still soluble (that should be all but four or five numbers). Then compress that. The decompressor also contains the solver. – abligh – 2014-11-15T13:36:16.190

@abligh Yes. A Sudoku to be solved is a compressed Sudoku solution. 95 characters for the compression is too generous to non-Sudoku solutions, so those have been run with. They'll be cut to pieces later. The question gives a strong hint. Also, I haven't tried, but even with the number solutions, why not replace the 9 by zero before making the big numbers? – Bill Woodger – 2014-11-15T16:15:31.787

@BillWoodger most of the solutions (incl mine) subtract one from each digit so they are in fact 0..8 (base 9). – abligh – 2014-11-15T17:10:01.293

@JanDvorak Declaring it to be a standard loophole won't matter much since there is really no way to draw a line between the obvious hardcoding and then simply writing code which is only optimized for the test cases. It would be much better to use the standard solution and score according to the sum of decompression code and compressed data. (And here it is important that it is the decompression code and not the compression code, which is included in the scoring rule.) – kasperd – 2014-11-16T01:24:41.313

@kasperd that would be too big of a change at this point, however. Might be good enough for a separate challenge, though – John Dvorak – 2014-11-16T02:11:43.000

4@kasperd it is indeed difficult to draw the line (see the fudge subroutine in my second answer that gains 12 points). A fairer test would be to require that (a) the test solutions work, (b) score on 1,000 randomly generated Sudoku grids and divide the answer by 100. I believe the best one can do with random data is about 110, based on 10 x log-base-95(6670903752021072936960) – abligh – 2014-11-16T11:52:39.390

Is there a deadline for submitting answers? When will the winner be declared? I have a plan, but didn't get to implementing it yet. Well, if the one with the hardcoded test examples counts, the race is run. My goal is the best generic solution. :) – Reto Koradi – 2014-11-18T08:16:44.463

@RetoKoradi There's still time :) At least 3 more days, most likely more. – kukac67 – 2014-11-19T01:17:02.910

There is another fundamental issue with this challenge. Most, if not all of the solutions are taking advantage of information beyond the allowed byte values 32..126. In particular, they depend on knowing the length of the string or file. If there are multiple boards in a file, you would need a delimiter to end each board, which would increase the number of characters per board by one. Or you would need to use some other scheme to encode the end of the board in the compressed data. – Mark Adler – 2014-11-22T07:47:06.083

Another more subtle issue with this challenge is the 6th test case. It is easy to see that this is a very special board, in the sense that the lowest values are selected for each cell, except for one, assuming some ordering of filling in those cells and ordering of the digits. That selects for solutions that code in that order, which can provide a very small integer for that board, and a so a single character compression. This can be seen in Jan Dvorak's and averykhoo's solutions, both of which code that board to ")". So the winning solutions are fudged, even if not intentionally. – Mark Adler – 2014-11-22T09:46:43.007

Answers

26

Haskell, 107 points

import Control.Monad
import Data.List

type Elem = Char
type Board = [[Elem]]
type Constraints = ([Elem],[Elem],[Elem])

digits :: [Elem]
digits = "123456789"
noCons :: Constraints
noCons = ([],[],[])
disjointCons :: Constraints
disjointCons = ("123","456","789") -- constraints from a single block - up to isomorphism
triples :: [a] -> [[a]]
triples [a,b,c,d,e,f,g,h,i] = [[a,b,c],[d,e,f],[g,h,i]]
(+++) :: Constraints -> Constraints -> Constraints
(a,b,c) +++ (d,e,f) = (a++d,b++e,c++f)

maxB = 12096 
-- length $ assignments noCons disjointCons
maxC = 216 -- worst case: rows can be assigned independently
maxD = maxB
maxE = 448
-- foldl1' max [length $ assignments disjointCons colCons
--             | (_, colCons) <- map constraints $ assignments ([],[1],[1]) ([],[1],[1]),
--               let ([a,d,g],[b,e,h],[c,f,i]) = colCons,
--               a < d, d < g, b < e, e < h, c < f, f < i]
maxF = 2 ^ 3 -- for each row the relevant column constraints can be in the same column (no assignment), 
             -- or in two or three columns (two assignments)
maxG = maxC
maxH = maxF

-- constraints -> list of block solutions
assignments :: Constraints -> Constraints -> [[Elem]]
assignments (r1,r2,r3) (c1,c2,c3) = do
    a <- digits  \\ (r1 ++ c1); let digits1 = digits  \\ [a]
    b <- digits1 \\ (r1 ++ c2); let digits2 = digits1 \\ [b]
    c <- digits2 \\ (r1 ++ c3); let digits3 = digits2 \\ [c]
    d <- digits3 \\ (r2 ++ c1); let digits4 = digits3 \\ [d]
    e <- digits4 \\ (r2 ++ c2); let digits5 = digits4 \\ [e]
    f <- digits5 \\ (r2 ++ c3); let digits6 = digits5 \\ [f]
    g <- digits6 \\ (r3 ++ c1); let digits7 = digits6 \\ [g]
    h <- digits7 \\ (r3 ++ c2); let digits8 = digits7 \\ [h]
    i <- digits8 \\ (r3 ++ c3)
    return [a,b,c,d,e,f,g,h,i]

-- block solution -> tuple of constraints
constraints :: [Elem] -> (Constraints, Constraints)
constraints [a,b,c,d,e,f,g,h,i] = (([a,b,c],[d,e,f],[g,h,i]),([a,d,g],[b,e,h],[c,f,i]))

------------------------------------------------------------------------------------------

-- solution -> Integer
solution2ix :: Board -> Integer
solution2ix [a,b,c,d,e,f,g,h,i] =
    let (ar, ac) = constraints a
        (br, bc) = constraints b
        (_ , cc) = constraints c
        (dr, dc) = constraints d
        (er, ec) = constraints e
        (_ , fc) = constraints f
        (gr, _ ) = constraints g
        (hr, _ ) = constraints h
        (_ , _ ) = constraints i

        Just ixA = findIndex (a ==) $ assignments noCons      noCons
        Just ixB = findIndex (b ==) $ assignments ar          noCons
        Just ixC = findIndex (c ==) $ assignments (ar +++ br) noCons
        Just ixD = findIndex (d ==) $ assignments noCons      ac
        Just ixE = findIndex (e ==) $ assignments dr          bc
        Just ixF = findIndex (f ==) $ assignments (dr +++ er) cc
        Just ixG = findIndex (g ==) $ assignments noCons      (ac +++ dc)
        Just ixH = findIndex (h ==) $ assignments gr          (bc +++ ec)
        Just ixI = findIndex (i ==) $ assignments (gr +++ hr) (cc +++ fc)

    in foldr (\(i,m) acc -> fromIntegral i + m * acc) (fromIntegral ixA)
     $ zip [ixH, ixG, ixF, ixE, ixD, ixC, ixB] [maxH, maxG, maxF, maxE, maxD, maxC, maxB]

--    list of rows 
-- -> list of threes of triples
-- -> three triples of threes of triples 
-- -> three threes of triples of triples
-- -> nine triples of triples
-- -> nine blocks
toBoard :: [[Elem]] -> Board
toBoard = map concat . concat . map transpose . triples . map triples

toBase95 :: Integer -> String
toBase95 0 = ""
toBase95 ix = toEnum (32 + fromInteger (ix `mod` 95)) : toBase95 (ix `div` 95)

------------------------------------------------------------------------------------------

ix2solution :: Integer -> Board
ix2solution ix =
    let (ixH', ixH) = ix   `divMod` maxH
        (ixG', ixG) = ixH' `divMod` maxG
        (ixF', ixF) = ixG' `divMod` maxF
        (ixE', ixE) = ixF' `divMod` maxE
        (ixD', ixD) = ixE' `divMod` maxD
        (ixC', ixC) = ixD' `divMod` maxC
        (ixA , ixB) = ixC' `divMod` maxB

        a = assignments noCons      noCons      !! fromIntegral ixA
        (ra, ca) = constraints a
        b = assignments ra          noCons      !! fromIntegral ixB
        (rb, cb) = constraints b
        c = assignments (ra +++ rb) noCons      !! fromIntegral ixC
        (_ , cc) = constraints c
        d = assignments noCons      ca          !! fromIntegral ixD
        (rd, cd) = constraints d
        e = assignments rd          cb          !! fromIntegral ixE
        (re, ce) = constraints e
        f = assignments (rd +++ re) cc          !! fromIntegral ixF
        (_ , cf) = constraints f
        g = assignments noCons      (ca +++ cd) !! fromIntegral ixG
        (rg, _ ) = constraints g
        h = assignments rg          (cb +++ ce) !! fromIntegral ixH
        (rh, _ ) = constraints h
        [i] = assignments (rg +++ rh) (cc +++ cf)
    in  [a,b,c,d,e,f,g,h,i]

--    nine blocks
-- -> nine triples of triples
-- -> three threes of triples of triples
-- -> three triples of threes of triples
-- -> list of threes of triples
-- -> list of rows
fromBoard :: Board -> [[Elem]]
fromBoard = map concat . concat . map transpose . triples . map triples

fromBase95 :: String -> Integer
fromBase95 ""     = 0
fromBase95 (x:xs) = (toInteger $ fromEnum x) - 32 + 95 * fromBase95 xs

------------------------------------------------------------------------------------------

main = do line <- getLine
          if length line <= 12
             then putStrLn $ unlines $ map (intersperse ' ') $ fromBoard $ ix2solution $ fromBase95 line
             else do nextLines <- replicateM 8 getLine
                     putStrLn $ toBase95 $ solution2ix $ toBoard $ map (map head.words) $ line:nextLines

The test case results:

q`3T/v50 =3,
^0NK(F4(V6T(
d KTTB{pJc[
B]^v[omnBF-*
WZslDPbcOm7'
)
ukVl2x/[+6F
qzw>GjmPxzo%
KE:*GH@H>(m!
SeM=kA`'3(X*

The code isn't pretty, but it works. The basis of the algorithm is that while enumerating all solutions would take too long, enumerating all solutions within a single block is rather quick - in fact, it's faster than the subsequent conversion to base95. The whole thing runs within seconds in the interpreter on my low-end machine. A compiled program would finish immediately.

The heavy lifting is done by the solution2ix function, which, for each 3x3 block, it generates all possible permutations, subject to constraints from the left and from above, until it finds the one in the encoded solution, remembering only the index of said permutation. Then it combines the indexes using some precomputed weights and the Horner's scheme.

In the other direction, the ix2solution function first decomposes the index into nine values. Then for each block it indexes the list of possible permutations with its respective value, then extracts the constraints for the next blocks.

assignments is a simple but ugly unrolled recursion using the list monad. It generates the list of permutations given a set of constraints.

The real power comes from the tight bounds on the permutation list lengths:

  • The top left corner is unconstrained. The number of permutations is simply 9!. This value is never used except to find an upper bound for the output length.
  • The blocks next to it only have one set of constraints - from the top left. A naive upper bound 6*5*4*6! is seven times worse than the actual count found by enumeration: 12096
  • The top right corner is constrained twice from left. Each row can only have six permutations, and in the worst case (actually in every valid case), the assignment is independent. Similarly for the bottom left corner.
  • The center piece was the hardest to estimate. Once again the brute force wins - count the permutation for each possible set of constraints up to isomorphism. Takes a while, but it's only needed once.
  • The right center piece has a double constraint from the left, which forces each row up to a permutation, but also a single constraint from the top, which ensures only two permutations per row are actually possible. Similarly for the bottom center piece.
  • The bottom right corner is fully determined by its neighbors. The sole permutation is never actually verified when computing the index. Forcing evaluation would be easy, it's just not necessary.

The product of all these limits is 71025136897117189570560 ~= 95^11.5544, which means that no code is longer than 12 characters and almost a half of them should be 11 characters or fewer. I have decided not to distinguish between a shorter string and the same string right-padded with spaces. Spaces anywhere else are significant.

The theoretical limit of encoding efficiency for prefix-free codes - base-95 logarithm of 6670903752021072936960 - is 11.035, meaning that even an optimal algorithm cannot avoid producing length-12 outputs, though it will produce them in only 3.5% of all cases. Allowing length to be significant (or equivalently, adding trailing spaces) does add a few codes (1% of the total amount), but not enough to eliminate the need for length-12 codes.

John Dvorak

Posted 2014-11-15T00:26:23.360

Reputation: 9 048

Do you think working by blocks is more efficient than by rows? – xnor – 2014-11-15T05:42:08.657

@xnor it's definitely easier to verify the constraints that way – John Dvorak – 2014-11-15T05:43:19.603

... and to bound the permutation counts, which is even more important here – John Dvorak – 2014-11-15T05:44:32.373

@xnor, the bigger the blocks, the better the approximation to optimality. Dealing with the top three blocks in one go, then the next three blocks in one go, and finally the bottom ones is probably the logical next step in improving the score. – Peter Taylor – 2014-11-15T21:20:44.003

@PeterTaylor 9!^3 = 4.8e16. That's slightly too high, but handling the first row numerically, then enumerating the next two, the next three and finally the last one might be feasible. I might try that out. – John Dvorak – 2014-11-15T21:57:25.903

theoretically, you could use the length of the string to store information. instead of counting 0 1 2 3 4 5 6 7 8 9 10 11 ... you can count empty-string 0 1 2 3 4 5 6 7 8 9 00 01 02 03 04 ... 10 11 12 ... as they are all different strings – proud haskeller – 2014-11-15T22:08:26.207

It's 9! for the first row, worst case 654*6! for the second row, and 6^3 for the third row. Then the second triple-block will be fewer, and the third will be 6^8 worst case. The second triple-block is where it gets interesting. – Peter Taylor – 2014-11-15T22:11:29.887

@proudhaskeller amended. I find that more of a hassle to the end user. Most consoles don't reveal trailing spaces to the end user, meaning that you'd have to avoid a trailing newline (abusing the prompt as an end-of-string marker), choose a good console (essentially having to output to a file) or using some kind of quote (inflating the code length by 2). Even as a theory it doesn't have that much effect, actually. Allowing newlines and significant length still doesn't push us quite to not needing length-12 codes. Now allowing both newlines and tabs does push us over the boundary. – John Dvorak – 2014-11-15T22:29:15.153

but does the actual score change when taking it into account? – proud haskeller – 2014-11-15T22:38:31.850

Am I counting wrong, or is your result actually 107 characters? There don't appear to be trailing spaces on the two that are 11 characters long. – Mark Adler – 2014-11-17T07:03:30.547

@MarkAdler you are right, fixed. I shouldn't have counted by hand... – John Dvorak – 2014-11-17T08:40:28.770

11.035 is the entropy, but is not the optimal average number of characters. If you encode 0..6670903752021072936959 to base 95 numbers, discard leading zeros, sum the total number of digits, and divide by 6670903752021072936960, you get an average of 11.138 characters. You can do a little better by considering leading zeros to be different codes, e.g. 01 is different from 1, so all possible strings of each length are used, then you get an average of 11.129 characters. That maximizes the use of the hidden string length information. – Mark Adler – 2014-11-22T09:21:20.737

Admittedly, as has been pointed out, my scoring system is broken. However, I will still choose the answer with the lowest character count. This answer ties with another answer at 107 characters, but I'm choosing this one because it has more upvotes. Congrats! :) – kukac67 – 2014-11-24T20:42:02.953

10

Python, 130 points

j1:4}*KYm6?D
h^('gni9X`g'#
$2{]8=6^l=fF!
BS ;1;J:z"^a"
\/)gT)sixb"A+
WI?TFvj%:&3-\$
*iecz`L2|a`X0
eLbt<tf|mFN'&
;KH_TzK$erFa!
7T=1*6$]*"s"!

The algorithm works by encoding each position in the board, one at a time, into a big integer. For each position, it calculates the possible values given all the assignments encoded so far. So if [1,3,7,9] are the possible values for a given position, it takes 2 bits to encode the choice.

The nice thing about this scheme is that if a position has only a single remaining choice, it takes no space to encode.

Once we have the big integer we write it out in base 95.

There are probably better encoding orderings than lexicographic, but I haven't thought a lot about it.

Encoder:

import sys

sets = [range(i*9, i*9+9) for i in xrange(9)]
sets += [range(i, 81, 9) for i in xrange(9)]
sets += [[i/3*27+i%3*3+j/3*9+j%3 for j in xrange(9)] for i in xrange(9)]

M = []
for line in sys.stdin.readlines():
    M += [int(x) for x in line.split()]

A = 0
m = 1
for i in xrange(81):
    allowed = set(xrange(1,10))
    for s in sets:
        if i in s:
            for j in s:
                if j < i: allowed.discard(M[j])
    allowed = sorted(allowed)
    A += m * allowed.index(M[i])
    m *= len(allowed)

s=''
while A != 0:
    s+='%c'%(32+A%95)
    A /= 95
print s

Decoder:

sets = [range(i*9, i*9+9) for i in xrange(9)]
sets += [range(i, 81, 9) for i in xrange(9)]
sets += [[i/3*27+i%3*3+j/3*9+j%3 for j in xrange(9)] for i in xrange(9)]

s=raw_input()
A=0
m=1
while s != '':
    A += m * (ord(s[0])-32)
    s = s[1:]
    m *= 95

M=[]
for i in xrange(81):
    allowed = set(xrange(1,10))
    for s in sets:
        if i in s:
            for j in s:
                if j < i: allowed.discard(M[j])
    allowed = sorted(allowed)
    M += [allowed[A%len(allowed)]]
    A /= len(allowed)

for i in xrange(9):
    print ' '.join(str(x) for x in M[i*9:i*9+9])

Run it like this:

> cat sudoku1 | ./sudokuEnc.py | ./sudokuDec.py
9 7 3 5 8 1 4 2 6
5 2 6 4 7 3 1 9 8
1 8 4 2 9 6 7 5 3
2 4 7 8 6 5 3 1 9
3 9 8 1 2 4 6 7 5
6 5 1 7 3 9 8 4 2
8 1 9 3 4 2 5 6 7
7 6 5 9 1 8 2 3 4
4 3 2 6 5 7 9 8 1

Keith Randall

Posted 2014-11-15T00:26:23.360

Reputation: 19 865

What are the test case outputs? Just curious. The score is impressive given how short the code is compared to mine. – John Dvorak – 2014-11-15T07:11:43.340

@JanDvorak: I've added the encoded boards. – Keith Randall – 2014-11-15T16:09:04.533

7

perl - score 115 113 103 113

Output:

"#1!A_mb_jB)
FEIV1JH~vn"
$\\XRU*LXea.
EBIC5fPxklB
5>jM7(+0MrM
!'Wu9FS2d~!W
":`R60C"}z!k
:B&Jg[fL%\j
"L28Y?3`Q>4w
o0xPz8)_i%-

Output:

                  # note this line is empty
S}_h|bt:za        
%.j0.6w>?RM+
:H$>a>Cy{7C
'57UHjcWQmcw
owmK0NF?!Fv
# }aYExcZlpD
nGl^K]xH(.\
9ii]I$voC,x
!:MR0>I>PuTU

None of those lines have a terminating space. Note that the first line is empty.

This algorithm works as follows. To compress:

  1. Start with an empty 'current' string representing the Sudoku grid

  2. Consider adding in turn each of the digits 1 .. 9 to that string, and determine which is viable.

  3. Get the next digit from the answer grid (and add it to current)

  4. If only one is viable, there is nothing to code

  5. If more than one is viable, count the number of viable options, sort them, and code that digit as the index into the sorted array. Record the digit and the number viable as a 2-tuple in an array.

  6. When all done, code each of the 2-tuples (in reverse order) in a variable based number stored as a bigint.

  7. Express the bigint in base 95.

To decode:

  1. Start with an empty 'current' string representing the Sudoku grid

  2. Decode the base95 number to a bigint

  3. Consider adding in turn each of the digits 1 .. 9 to that string, and determine which is viable.

  4. If only one is viable, there is nothing to code; add that choice to the grid

  5. If more than one is viable, count the number of viable options, sort them, and code that digit as the index into the sorted array.

  6. Decode the variable-base bigint using the number of viable options as the base, and the modulus as the index into the array, and output that digit as a cell value.

In order to determine the number of viable options, Games::Sudoku::Solver is used. That's mainly for clarity as there are 3 line Sudoku solvers on this site.

To do all 10 took 8 seconds on my laptop.

The fudge operation sorts the array differently to achieve the minimal value for the test cases. As documented, this is a fudge. The fudge reduces the score from 115 to 103. It is handcrafted to ensure that the bigint code for the first test is 0. The worst-case score for any sudoku is 12 giving a score of 120. I thus don't think this counts as hard-coding; rather it optimises for the test data. To see it work without this, change sort fudge into sort in both places.

Code follows:

#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;
use bigint;
use Games::Sudoku::Solver qw (:Minimal set_solution_max count_occupied_cells);

# NOTE THIS IS NOT USED BY DEFAULT - see below and discussion in comments
my @fudgefactor = qw (9 7 3 5 8 1 4 2 6 5 2 6 4 7 3 1 9 8 1 8 4 2 9 6 7 5 3 2 4 7 8 6 5 3 1 9 3 9 8 1 2 4 6 7 5 6 5 1 7 3 9 8 4 2 8 1 9 3 4 2 5 6 7 7 6 5 9 1 8 2 3 4 4 3 2 6 5 7 9 8 1);
my $fudgeindex=0;
my $fudging=0; # Change to 1 to decrease score by 10

sub isviable
{
    no bigint;
    my $current = shift @_;
    my @test = map {$_ + 0} split(//, substr(($current).("0"x81), 0, 81));
    my @sudoku;
    my @solution;
    set_solution_max (2);
    my $nsolutions;

    eval
    {
        sudoku_set(\@sudoku, \@test);
        $nsolutions = sudoku_solve(\@sudoku, \@solution);
    };
    return 0 unless $nsolutions;
    return ($nsolutions >=1);
}

sub getnextviable
{
    my $current = shift @_; # grid we have so far
    my %viable;

    for (my $i = 1; $i<=9; $i++)
    {
        my $n;
        my $solution;
        $viable{$i} = 1 if (isviable($current.$i));
    }
    return %viable;
}

sub fudge
{
    return $a<=>$b unless ($fudging);
    my $k=$fudgefactor[$fudgeindex];
    my $aa = ($a+10-$k) % 10;
    my $bb = ($b+10-$k) % 10;
    return $aa<=>$bb;
}


sub compress
{
    my @data;
    while (<>)
    {
        chomp;
        foreach my $d (split(/\s+/))
        {
            push @data, $d;
        }
    }

    my $code = 0;
    my $current = "";
    my @codepoints;
    foreach my $d (@data)
    {
        my %viable = getnextviable($current);
        die "Digit $d is unexpectedly not viable - is sudoku impossible?" unless ($viable{$d});

        my $nviable = scalar keys(%viable);
        if ($nviable>1)
        {
            my $n=0;
            foreach my $k (sort fudge keys %viable)
            {
                if ($k==$d)
                {
                    no bigint;
                    my %cp = ( "n"=> $n, "v"=> $nviable);
                    unshift @codepoints, \%cp;
                    last;
                }
                $n++;
            }
        }
        $fudgeindex++;
        $current .= $d;
    }

    foreach my $cp (@codepoints)
    {
        $code = ($code * $cp->{"v"})+$cp->{"n"};
    }

    # print in base 95
    my $out="";
    while ($code)
    {
        my $digit = $code % 95;
        $out = chr($digit+32).$out;
        $code -= $digit;
        $code /= 95;
    }

    print "$out";
}

sub decompress
{
    my $code = 0;

    # Read from base 95 into bigint
    while (<>)
    {
        chomp;
        foreach my $char (split (//, $_))
        {
            my $c =ord($char)-32;
            $code*=95;
            $code+=$c;
        }
    }

    # Reconstruct sudoku
    my $current = "";
    for (my $cell = 0; $cell <81; $cell++)
    {
        my %viable = getnextviable($current);
        my $nviable = scalar keys(%viable);
        die "Cell $cell is unexpectedly not viable - is sudoku impossible?" unless ($nviable);

        my $mod = $code % $nviable;
        $code -= $mod;
        $code /= $nviable;

        my @v = sort fudge keys (%viable);
        my $d = $v[$mod];
        $current .= $d;
        print $d.(($cell %9 != 8)?" ":"\n");
        $fudgeindex++;
    }
}

my $decompress;
GetOptions ("d|decompress" => \$decompress);


if ($decompress)
{
    decompress;
}
else
{
    compress;
}

abligh

Posted 2014-11-15T00:26:23.360

Reputation: 777

5"I thus don't think this counts as hard-coding" is a pretty bold statement, considering one of the test cases is in your code verbatim. – Aaron Dufour – 2014-11-16T22:16:43.097

@AaronDufour you missed out the words following: "but it optimises for the test data". See also the discussion under the question; essentially if you optimise, you get to drop 0 to 12 symbols from 120. By luck the unoptimised solution gives 115; a random constant modulus offset takes that to 113. You can provably subtract up to 12 due to the way the challenge is scored. I'm pretty confident this method still gives the lowest average solution size for a random input set (if you think about it, it must, or must be pretty close to it), which is why I'm saying it's not relying on hard coding. – abligh – 2014-11-16T23:05:34.250

1A perfect coder (i.e. one that enumerates the 6670903752021072936960 cases) can easily have added to it a hard coding of all ten test cases, resulting in a score of 9. Just add 10 to the big integer, and replace it with 0..9 for the special cases. 0 codes as the empty string, and the rest code as one character, hence a score of 9. The impact of this on the average number of characters per board is an increase of 3.3x10^-22, which is undetectable. – Mark Adler – 2014-11-16T23:50:30.740

... which is why the scoring here is broken. I suggested an alternative. – abligh – 2014-11-17T07:27:56.183

-1 for the fudging - even if it's just to demonstrate a problem with the scoring... – John Dvorak – 2014-11-17T16:28:02.783

@JanDvorak: Hang on a sec, see Marc Adler's comments below. It may not be nice, but it doesn't break the rules. The "no hard coding" rule is here: http://meta.codegolf.stackexchange.com/questions/1061/standard-loopholes-which-are-no-longer-funny/1063#1063 - clearly this does not hardcode anything. The solution works, and codes any input. What is in there is one set of input data. By the meta post referred to under standard loopholes, I can't see what the issue is here.

– abligh – 2014-11-17T18:13:49.603

@JanDvorak OK after this discussion on meta: http://meta.codegolf.stackexchange.com/questions/2506/standard-loopholes-what-counts-as-hard-coding and this new loophole that got put in as a result: http://meta.codegolf.stackexchange.com/a/2507/8478 I've removed the fudging and reverted the points to where they were before. Given this wasn't a loophole that was mentioned before, and was unclear, I hope that will persuade you to retract the -1.

– abligh – 2014-11-17T19:04:02.217

6

CJam, 309 bytes

This is just a quick baseline solution. I'm sorry I did this in a golfing language, but it was actually the simplest way to do it. I'll add an explanation of the actual code tomorrow, but I've outlined the algorithm below.

Encoder

q~{);}%);:+:(9b95b32f+:c

Decoder

l:i32f-95b9bW%[0]64*+64<W%:)8/{_:+45\-+}%z{_:+45\-+}%z`

Test it here.

The input of the encoder (on STDIN) and the output of the decoder (on STDOUT) are in the form of a nested CJam array. E.g.

[[8 3 5 4 1 6 9 2 7] [2 9 6 8 5 7 4 3 1] [4 1 7 2 9 3 6 5 8] [5 6 9 1 3 4 7 8 2] [1 2 3 6 7 8 5 4 9] [7 4 8 5 2 9 1 6 3] [6 5 2 7 8 1 3 9 4] [9 8 1 3 4 5 2 7 6] [3 7 4 9 6 2 8 1 5]]

The 10 test outputs are:

U(5wtqmC.-[TM.#aMY#k*)pErHQcg'{
EWrn"^@p+g<5XT5G[r1|bk?q6Nx4~r?
#489pLj5+ML+z@y$]8a@CI,K}B$$Mwn
LF_X^"-h**A!'VZq kHT@F:"ZMD?A0r
?gD;"tw<yG%8y!3S"BC:ojQ!#;i-:\g
qS#"L%`4yei?Ce_r`{@EOl66m^hx77
"EF?` %!H@YX6J0F93->%90O7T#C_5u
9V)R+6@Jx(jg@@U6.DrMO*5G'P<OHv8
(Ua6z{V:hX#sV@g0s<|!X[T,Jy|oQ+K
N,F8F1!@OH1%%zs%dI`Q\q,~oAEl(:O

The algorithm is very simple:

  • Remove the last column and row.
  • Treat the remaining 64 digits as a base-9 number (after decrementing each digit by 1).
  • Convert that to base-95, add 32 to each digit and turn that into the corresponding ASCII character.
  • For the decoding, reverse the base conversion and fill in the the final column and row with the missing numbers.

Martin Ender

Posted 2014-11-15T00:26:23.360

Reputation: 184 808

I added the 10 test cases. Score is the sum of the number of characters in all 10 now. – kukac67 – 2014-11-15T01:08:52.420

@kukac67 Yep, already fixed. – Martin Ender – 2014-11-15T01:10:18.427

I'm sorry, I seem to have changed the 8th test case just after you ran it. I wasn't fast enough. :D – kukac67 – 2014-11-15T01:17:03.493

Decoding the 7th test case, I noticed it didn't work. I think you have a bug. "[[4 5 7 9 2 8 3 3 4]..." – kukac67 – 2014-11-15T01:19:32.390

@kukac67 Should be fixed. I forgot to pad the 64-digit result with leading 0s. – Martin Ender – 2014-11-15T01:24:30.560

Yup, it's working :) – kukac67 – 2014-11-15T01:37:52.447

If there are ten test cases, and each outputs the same number of characters, how come this is 309 bytes? It should be a multiple of 10. Looks to me like you might be wrongly counting \n on all but the last. – abligh – 2014-11-15T17:11:47.257

@abligh One test case has only 30 bytes (the one that has a 1 in the upper left corner). – Martin Ender – 2014-11-15T17:16:03.237

6

Mathematica, score: 130 9

Update:

After this answer was posted, it inspired a new loophole closer: "Optimising for the given test cases". I will however leave this answer as is, as an example of the loophole. Feel free to downvote. I won't be hurt.


This encodes a cell at a time in raster order, and for each cell rules out its value appropriately for subsequent cells using the basic rules of Sudoku. So, for example, when a cell is encoded and only has four possibilities, then a base 4 digit is added to the large integer. It also codes the test cases directly as small integers, still correctly compressing and decompressing all valid Sudoku boards with an average compressed length of ~12.5 characters, 1.5 more than the optimal 11.035, with relatively simple code and no Sudoku solver required.
rule=({#}&/@Union[Join[
        Range[#+1,Ceiling[#,9]],Range[#+9,81,9],
        Flatten[Outer[Plus,Range[Floor[#+8,9],Ceiling[#,27]-9,9],
            Floor[Mod[#-1,9],3]+Range[3]]]]])&/@Range[81];

encode[board_]:=
Block[{step,code,pos},
    step[{left_,x_,m_},n_]:={
        MapAt[Complement[#,{board[[n]]}]&,left,rule[[n]]],
        x+m(FirstPosition[left[[n]],board[[n]]][[1]]-1),m Length[left[[n]]]};
    code=Fold[step,{Table[Range[9],{81}],0,1},Range[81]][[2]];
    pos=Position[{206638498064127103948214,1665188010993633759502287,
        760714067080859855534739,1454154263752219616902129,6131826927558056238360710,
        237833524138130760909081600,8968162948536417279508170,3284755189143784030943149,
        912407486534781347155987,556706937207676220045188},code];
    code=If[pos==={},code+10,pos[[1,1]]-1];
    FromCharacterCode[If[code==0,{},IntegerDigits[code,95]+32]]
]    

decode[str_]:=
Block[{step,code},
    code=FromDigits[ToCharacterCode[str]-32,95];
    code=If[code<10,{206638498064127103948214,1665188010993633759502287,
        760714067080859855534739,1454154263752219616902129,6131826927558056238360710,
        237833524138130760909081600,8968162948536417279508170,3284755189143784030943149,
        912407486534781347155987,556706937207676220045188}[[code+1]],code-10];
    step[{left_,x_,board_},n_]:=Function[z,{
        MapAt[Complement[#,{z}]&,left,rule[[n]]],Quotient[x,Length[left[[n]]]],
        Append[board,z]}][left[[n,Mod[x,Length[left[[n]]]]+1]]];
    Fold[step,{Table[Range[9],{81}],code,{}},Range[81]][[3]]
]

Encoded test cases:

     <- empty string
!
"
#
$
%
&
'
(
)

This does not result in perfect coding (average ~11), since the basic rules do not rule out some choices for which there is in fact no solution. The performance could be made perfect (i.e. the large integer would always be less than the number of possible Sudoku boards) by checking to see if there is no solution to some of the current choices using a Sudoku solver, and eliminating those as well.

Mark Adler

Posted 2014-11-15T00:26:23.360

Reputation: 759

And yes, it is unfortunate that the rules of this challenge allow this solution. – Mark Adler – 2014-11-16T21:52:30.237

1

Yes, the challenge as written falls to this trap, but hardcoding is a standard loophole: http://meta.codegolf.stackexchange.com/a/1063/20260

– xnor – 2014-11-17T05:02:48.290

1From that meta post "your program is expected to do work, not just print a pre-calculated result." In fact this program does all the work to compress the test results, and then simply remaps the resulting large integers representing those boards to the integers 0..9 in order to get this optimal result. There exist boards that map to those integers no matter what. I simply picked the test cases to be those boards. The program encodes and decodes all possible boards, so it does all the work required in the challenge. – Mark Adler – 2014-11-17T05:14:18.943

You're right, that meta post doesn't cover it. A new one was just posted to do so: http://meta.codegolf.stackexchange.com/a/2507/20260

– xnor – 2014-11-17T22:51:38.250

6

Python 2.7, 107 chars total

TL;DR brute-force enumeration of 3x3 squares with top+left constraints

test cases:

import itertools

inputs = """
9 7 3 5 8 1 4 2 6
5 2 6 4 7 3 1 9 8
1 8 4 2 9 6 7 5 3
2 4 7 8 6 5 3 1 9
3 9 8 1 2 4 6 7 5
6 5 1 7 3 9 8 4 2
8 1 9 3 4 2 5 6 7
7 6 5 9 1 8 2 3 4
4 3 2 6 5 7 9 8 1

7 2 4 8 6 5 1 9 3
1 6 9 2 4 3 8 7 5
3 8 5 1 9 7 2 4 6
8 9 6 7 2 4 3 5 1
2 7 3 9 5 1 6 8 4
4 5 1 3 8 6 9 2 7
5 4 2 6 3 9 7 1 8
6 1 8 5 7 2 4 3 9
9 3 7 4 1 8 5 6 2

1 5 7 6 8 2 3 4 9
4 3 2 5 1 9 6 8 7
6 9 8 3 4 7 2 5 1
8 2 5 4 7 6 1 9 3
7 1 3 9 2 8 4 6 5
9 6 4 1 3 5 7 2 8
5 4 1 2 9 3 8 7 6
2 8 9 7 6 1 5 3 4
3 7 6 8 5 4 9 1 2

8 3 5 4 1 6 9 2 7
2 9 6 8 5 7 4 3 1
4 1 7 2 9 3 6 5 8
5 6 9 1 3 4 7 8 2
1 2 3 6 7 8 5 4 9
7 4 8 5 2 9 1 6 3
6 5 2 7 8 1 3 9 4
9 8 1 3 4 5 2 7 6
3 7 4 9 6 2 8 1 5

6 2 8 4 5 1 7 9 3
5 9 4 7 3 2 6 8 1
7 1 3 6 8 9 5 4 2
2 4 7 3 1 5 8 6 9
9 6 1 8 2 7 3 5 4
3 8 5 9 6 4 2 1 7
1 5 6 2 4 3 9 7 8
4 3 9 5 7 8 1 2 6
8 7 2 1 9 6 4 3 5

1 2 3 4 5 6 7 8 9
4 5 6 7 8 9 1 2 3
7 8 9 1 2 3 4 5 6
2 1 4 3 6 5 8 9 7
3 6 5 8 9 7 2 1 4
8 9 7 2 1 4 3 6 5
5 3 1 6 4 8 9 7 2
6 4 8 9 7 2 5 3 1
9 7 2 5 3 1 6 4 8

1 4 5 7 9 2 8 3 6
3 7 6 5 8 4 1 9 2
2 9 8 3 6 1 7 5 4
7 3 1 9 2 8 6 4 5
8 5 9 6 4 7 3 2 1
4 6 2 1 3 5 9 8 7
6 2 4 8 7 3 5 1 9
5 8 7 4 1 9 2 6 3
9 1 3 2 5 6 4 7 8

5 2 7 4 1 6 9 3 8
8 6 4 3 2 9 1 5 7
1 3 9 5 7 8 6 4 2
2 9 1 8 5 4 3 7 6
3 4 8 6 9 7 5 2 1
6 7 5 1 3 2 4 8 9
7 1 2 9 4 5 8 6 3
4 8 3 2 6 1 7 9 5
9 5 6 7 8 3 2 1 4

2 4 6 7 1 3 9 8 5
1 8 5 4 9 6 7 3 2
9 3 7 8 2 5 1 4 6
6 7 8 5 4 2 3 9 1
4 9 3 1 6 8 2 5 7
5 1 2 3 7 9 4 6 8
8 2 4 9 5 7 6 1 3
7 5 9 6 3 1 8 2 4
3 6 1 2 8 4 5 7 9

8 6 1 2 9 4 5 7 3
4 7 5 3 1 8 6 9 2
3 9 2 5 6 7 8 1 4
2 3 6 4 5 9 7 8 1
1 5 4 7 8 3 2 6 9
9 8 7 6 2 1 3 4 5
5 2 9 1 7 6 4 3 8
6 4 8 9 3 2 1 5 7
7 1 3 8 4 5 9 2 6
""".strip().split('\n\n')

helper function to print sudoku

def print_sudoku(m):
    for k in m:
        print' '.join(str(i) for i in k)

generates all possible squares given constraints above and left

see code comment for more details

def potential_squares(u1, u2, u3, l1, l2, l3):
    """
    returns generator of possible squares given lists of digits above and below

           u1 u2 u3
           |  |  |
    l1 --  a  b  c
    l2 --  d  e  f
    l3 --  g  h  i

    if no items exist the empty list must be given
    """
    for a, b, c, d, e, f, g, h, i in itertools.permutations(xrange(1, 10)):
        if a not in u1 and a not in l1 and b not in u2 and b not in l1 and c not in u3 and c not in l1 and d not in u1 and d not in l2 and e not in u2 and e not in l2 and f not in u3 and f not in l2 and g not in u1 and g not in l3 and h not in u2 and h not in l3 and i not in u3 and i not in l3:
            yield (a, b, c, d, e, f, g, h, i)

extracts all squares from sudoku board as tuples

see code comment for more details

def board_to_squares(board):
    """
    finds 9 squares in a 9x9 board in this order:
    1 1 1 2 2 2 3 3 3
    1 1 1 2 2 2 3 3 3
    1 1 1 2 2 2 3 3 3
    4 4 4 5 5 5 6 6 6
    4 4 4 5 5 5 6 6 6
    4 4 4 5 5 5 6 6 6
    7 7 7 8 8 8 9 9 9
    7 7 7 8 8 8 9 9 9
    7 7 7 8 8 8 9 9 9

    returns tuple for each square as follows:
    a b c
    d e f   -->  (a,b,c,d,e,f,g,h,i)
    g h i
    """
    labels = [[3 * i + 1] * 3 + [3 * i + 2] * 3 + [3 * i + 3] * 3 for i in [0, 0, 0, 1, 1, 1, 2, 2, 2]]
    labelled_board = zip(sum(board, []), sum(labels, []))
    return [tuple(a for a, b in labelled_board if b == sq) for sq in xrange(1, 10)]

converts squares back to sudoku board

basically an inverse of the above function

def squares_to_board(squares):
    """
    inverse of above
    """
    board = [[i / 3 * 27 + i % 3 * 3 + j / 3 * 9 + j % 3 for j in range(9)] for i in range(9)]
    flattened = sum([list(square) for square in squares], [])
    for i in range(9):
        for j in range(9):
            board[i][j] = flattened[board[i][j]]
    return board

given squares left, return constraints

see code comment for more details

def sum_rows(*squares):
    """
    takes tuples for squares and returns lists corresponding to the rows:
    l1 -- a b c   j k l
    l2 -- d e f   m n o  ...
    l3 -- g h i   p q r
    """
    l1 = []
    l2 = []
    l3 = []
    if len(squares):
        for a, b, c, d, e, f, g, h, i in squares:
            l1 += [a, b, c]
            l2 += [d, e, f]
            l3 += [g, h, i]
        return l1, l2, l3
    return [], [], []

given squares above, return constraints

see code comment for more details

def sum_cols(*squares):
    """
    takes tuples for squares and returns lists corresponding to the cols:

    u1 u2 u3
    |  |  |
    a  b  c
    d  e  f
    g  h  i

    j  k  l
    m  n  o
    p  q  r

      ...

    """
    u1 = []
    u2 = []
    u3 = []
    if len(squares):
        for a, b, c, d, e, f, g, h, i in squares:
            u1 += [a, d, g]
            u2 += [b, e, h]
            u3 += [c, f, i]
        return u1, u2, u3
    return [], [], []

makes a string

def base95(A):
    if type(A) is int or type(A) is long:
        s = ''
        while A > 0:
            s += chr(32 + A % 95)
            A /= 95
        return s
    if type(A) is str:
        return sum((ord(c) - 32) * (95 ** i) for i, c in enumerate(A))

this is a hardcoded list of dependencies for each square

see code comment for more details

"""
dependencies: every square as labeled
1 2 3
4 5 6
7 8 9
is dependent on those above and to the left

in a dictionary, it is:
square: ([above],[left])
"""
dependencies = {1: ([], []), 2: ([], [1]), 3: ([], [1, 2]), 4: ([1], []), 5: ([2], [4]), 6: ([3], [4, 5]),
                7: ([1, 4], []), 8: ([2, 5], [7]), 9: ([3, 6], [7, 8])}

this is a hardcoded list of max number of possible options for each square

see code comment for more details

"""
max possible options for a given element

  9 8 7   ? ? ?   3 2 1
  6 5 4  (12096)  3 2 1
  3 2 1   ? ? ?   3 2 1

  ? ? ?   ? ? ?   2 2 1
 (12096)  (420)   2 1 1    (limits for squares 2,4 determined experimentally)
  ? ? ?   ? ? ?   1 1 1    (limit for square 5 is a pessimistic guess, might be wrong)

  3 3 3   2 2 1   1 1 1
  2 2 2   2 1 1   1 1 1
  1 1 1   1 1 1   1 1 1
"""
possibilities = [362880, 12096, 216, 12096, 420, 8, 216, 8, 1]

these combine the above functions and convert a board to a list of integers

def factorize_sudoku(board):
    squares = board_to_squares(board)
    factors = []

    for label in xrange(1, 10):
        above, left = dependencies[label]
        u1, u2, u3 = sum_cols(*[sq for i, sq in enumerate(squares) if i + 1 in above])
        l1, l2, l3 = sum_rows(*[sq for i, sq in enumerate(squares) if i + 1 in left])
        for i, k in enumerate(potential_squares(u1, u2, u3, l1, l2, l3)):
            if k == squares[label - 1]:
                factors.append(i)
                continue
    return factors

and back to a board

def unfactorize_sudoku(factors):
    squares = []
    for label in xrange(1, 10):
        factor = factors[label - 1]
        above, left = dependencies[label]
        u1, u2, u3 = sum_cols(*[sq for i, sq in enumerate(squares) if i + 1 in above])
        l1, l2, l3 = sum_rows(*[sq for i, sq in enumerate(squares) if i + 1 in left])
        for i, k in enumerate(potential_squares(u1, u2, u3, l1, l2, l3)):
            if i == factor:
                squares.append(k)
                continue
    return squares

okay that's all the functions

for each board, make string and print it

strings = []
for sudoku in inputs:
    board = [[int(x) for x in line.split()] for line in sudoku.strip().split('\n')]
    print_sudoku(board)
    factors = factorize_sudoku(board)

    i = 0
    for item, modulus in zip(factors, possibilities):
        i *= modulus
        i += item

    strings.append(base95(i))
    print 'integral representation:', i
    print 'bits of entropy:', i.bit_length()
    print 'base95 representation:', strings[-1]
    print ''

now print the total length of all strings

print 'overall output:', strings
print 'total length:', len(''.join(strings))
print ''

and un-stringify, to prove it's not a one-way compression

for string in strings:
    print 'from:', string

    i = base95(string)
    retrieved = []
    for base in possibilities[::-1]:
        retrieved.append(i % base)
        i /= base

    squares = unfactorize_sudoku(retrieved[::-1])
    print_sudoku(squares_to_board(squares))
    print ''

output:

9 7 3 5 8 1 4 2 6
5 2 6 4 7 3 1 9 8
1 8 4 2 9 6 7 5 3
2 4 7 8 6 5 3 1 9
3 9 8 1 2 4 6 7 5
6 5 1 7 3 9 8 4 2
8 1 9 3 4 2 5 6 7
7 6 5 9 1 8 2 3 4
4 3 2 6 5 7 9 8 1
integral representation: 65073646522550110083448
bits of entropy: 76
base95 representation: 23f!dvoR[pI+

7 2 4 8 6 5 1 9 3
1 6 9 2 4 3 8 7 5
3 8 5 1 9 7 2 4 6
8 9 6 7 2 4 3 5 1
2 7 3 9 5 1 6 8 4
4 5 1 3 8 6 9 2 7
5 4 2 6 3 9 7 1 8
6 1 8 5 7 2 4 3 9
9 3 7 4 1 8 5 6 2
integral representation: 45592184788002754998731
bits of entropy: 76
base95 representation: +gel3sJ?vL!(

1 5 7 6 8 2 3 4 9
4 3 2 5 1 9 6 8 7
6 9 8 3 4 7 2 5 1
8 2 5 4 7 6 1 9 3
7 1 3 9 2 8 4 6 5
9 6 4 1 3 5 7 2 8
5 4 1 2 9 3 8 7 6
2 8 9 7 6 1 5 3 4
3 7 6 8 5 4 9 1 2
integral representation: 3351617758498333760666
bits of entropy: 72
base95 representation: !"=W3R"`w|W

8 3 5 4 1 6 9 2 7
2 9 6 8 5 7 4 3 1
4 1 7 2 9 3 6 5 8
5 6 9 1 3 4 7 8 2
1 2 3 6 7 8 5 4 9
7 4 8 5 2 9 1 6 3
6 5 2 7 8 1 3 9 4
9 8 1 3 4 5 2 7 6
3 7 4 9 6 2 8 1 5
integral representation: 54077388556332388193975
bits of entropy: 76
base95 representation: zAu5Rvno.2P)

6 2 8 4 5 1 7 9 3
5 9 4 7 3 2 6 8 1
7 1 3 6 8 9 5 4 2
2 4 7 3 1 5 8 6 9
9 6 1 8 2 7 3 5 4
3 8 5 9 6 4 2 1 7
1 5 6 2 4 3 9 7 8
4 3 9 5 7 8 1 2 6
8 7 2 1 9 6 4 3 5
integral representation: 38664325462033435490761
bits of entropy: 76
base95 representation: ?8KJHGXS^hk&

1 2 3 4 5 6 7 8 9
4 5 6 7 8 9 1 2 3
7 8 9 1 2 3 4 5 6
2 1 4 3 6 5 8 9 7
3 6 5 8 9 7 2 1 4
8 9 7 2 1 4 3 6 5
5 3 1 6 4 8 9 7 2
6 4 8 9 7 2 5 3 1
9 7 2 5 3 1 6 4 8
integral representation: 9
bits of entropy: 4
base95 representation: )

1 4 5 7 9 2 8 3 6
3 7 6 5 8 4 1 9 2
2 9 8 3 6 1 7 5 4
7 3 1 9 2 8 6 4 5
8 5 9 6 4 7 3 2 1
4 6 2 1 3 5 9 8 7
6 2 4 8 7 3 5 1 9
5 8 7 4 1 9 2 6 3
9 1 3 2 5 6 4 7 8
integral representation: 2146071528999475941021
bits of entropy: 71
base95 representation: ]ib2[x.u*pC

5 2 7 4 1 6 9 3 8
8 6 4 3 2 9 1 5 7
1 3 9 5 7 8 6 4 2
2 9 1 8 5 4 3 7 6
3 4 8 6 9 7 5 2 1
6 7 5 1 3 2 4 8 9
7 1 2 9 4 5 8 6 3
4 8 3 2 6 1 7 9 5
9 5 6 7 8 3 2 1 4
integral representation: 31150627593616723824594
bits of entropy: 75
base95 representation: BFK1'H9}r9M%

2 4 6 7 1 3 9 8 5
1 8 5 4 9 6 7 3 2
9 3 7 8 2 5 1 4 6
6 7 8 5 4 2 3 9 1
4 9 3 1 6 8 2 5 7
5 1 2 3 7 9 4 6 8
8 2 4 9 5 7 6 1 3
7 5 9 6 3 1 8 2 4
3 6 1 2 8 4 5 7 9
integral representation: 9659549243898865961967
bits of entropy: 74
base95 representation: ;EOSPiy9T?b!

8 6 1 2 9 4 5 7 3
4 7 5 3 1 8 6 9 2
3 9 2 5 6 7 8 1 4
2 3 6 4 5 9 7 8 1
1 5 4 7 8 3 2 6 9
9 8 7 6 2 1 3 4 5
5 2 9 1 7 6 4 3 8
6 4 8 9 3 2 1 5 7
7 1 3 8 4 5 9 2 6
integral representation: 56473223126891371769434
bits of entropy: 76
base95 representation: 3TLSl3hPU3x)

overall output: ['23f!dvoR[pI+', '+gel3sJ?vL!(', '!"=W3R"`w|W', 'zAu5Rvno.2P)', '?8KJHGXS^hk&', ')', ']ib2[x.u*pC', "BFK1'H9}r9M%", ';EOSPiy9T?b!', '3TLSl3hPU3x)']
total length: 107

from: 23f!dvoR[pI+
9 7 3 5 8 1 4 2 6
5 2 6 4 7 3 1 9 8
1 8 4 2 9 6 7 5 3
2 4 7 8 6 5 3 1 9
3 9 8 1 2 4 6 7 5
6 5 1 7 3 9 8 4 2
8 1 9 3 4 2 5 6 7
7 6 5 9 1 8 2 3 4
4 3 2 6 5 7 9 8 1

from: +gel3sJ?vL!(
7 2 4 8 6 5 1 9 3
1 6 9 2 4 3 8 7 5
3 8 5 1 9 7 2 4 6
8 9 6 7 2 4 3 5 1
2 7 3 9 5 1 6 8 4
4 5 1 3 8 6 9 2 7
5 4 2 6 3 9 7 1 8
6 1 8 5 7 2 4 3 9
9 3 7 4 1 8 5 6 2

from: !"=W3R"`w|W
1 5 7 6 8 2 3 4 9
4 3 2 5 1 9 6 8 7
6 9 8 3 4 7 2 5 1
8 2 5 4 7 6 1 9 3
7 1 3 9 2 8 4 6 5
9 6 4 1 3 5 7 2 8
5 4 1 2 9 3 8 7 6
2 8 9 7 6 1 5 3 4
3 7 6 8 5 4 9 1 2

from: zAu5Rvno.2P)
8 3 5 4 1 6 9 2 7
2 9 6 8 5 7 4 3 1
4 1 7 2 9 3 6 5 8
5 6 9 1 3 4 7 8 2
1 2 3 6 7 8 5 4 9
7 4 8 5 2 9 1 6 3
6 5 2 7 8 1 3 9 4
9 8 1 3 4 5 2 7 6
3 7 4 9 6 2 8 1 5

from: ?8KJHGXS^hk&
6 2 8 4 5 1 7 9 3
5 9 4 7 3 2 6 8 1
7 1 3 6 8 9 5 4 2
2 4 7 3 1 5 8 6 9
9 6 1 8 2 7 3 5 4
3 8 5 9 6 4 2 1 7
1 5 6 2 4 3 9 7 8
4 3 9 5 7 8 1 2 6
8 7 2 1 9 6 4 3 5

from: )
1 2 3 4 5 6 7 8 9
4 5 6 7 8 9 1 2 3
7 8 9 1 2 3 4 5 6
2 1 4 3 6 5 8 9 7
3 6 5 8 9 7 2 1 4
8 9 7 2 1 4 3 6 5
5 3 1 6 4 8 9 7 2
6 4 8 9 7 2 5 3 1
9 7 2 5 3 1 6 4 8

from: ]ib2[x.u*pC
1 4 5 7 9 2 8 3 6
3 7 6 5 8 4 1 9 2
2 9 8 3 6 1 7 5 4
7 3 1 9 2 8 6 4 5
8 5 9 6 4 7 3 2 1
4 6 2 1 3 5 9 8 7
6 2 4 8 7 3 5 1 9
5 8 7 4 1 9 2 6 3
9 1 3 2 5 6 4 7 8

from: BFK1'H9}r9M%
5 2 7 4 1 6 9 3 8
8 6 4 3 2 9 1 5 7
1 3 9 5 7 8 6 4 2
2 9 1 8 5 4 3 7 6
3 4 8 6 9 7 5 2 1
6 7 5 1 3 2 4 8 9
7 1 2 9 4 5 8 6 3
4 8 3 2 6 1 7 9 5
9 5 6 7 8 3 2 1 4

from: ;EOSPiy9T?b!
2 4 6 7 1 3 9 8 5
1 8 5 4 9 6 7 3 2
9 3 7 8 2 5 1 4 6
6 7 8 5 4 2 3 9 1
4 9 3 1 6 8 2 5 7
5 1 2 3 7 9 4 6 8
8 2 4 9 5 7 6 1 3
7 5 9 6 3 1 8 2 4
3 6 1 2 8 4 5 7 9

from: 3TLSl3hPU3x)
8 6 1 2 9 4 5 7 3
4 7 5 3 1 8 6 9 2
3 9 2 5 6 7 8 1 4
2 3 6 4 5 9 7 8 1
1 5 4 7 8 3 2 6 9
9 8 7 6 2 1 3 4 5
5 2 9 1 7 6 4 3 8
6 4 8 9 3 2 1 5 7
7 1 3 8 4 5 9 2 6

averykhoo

Posted 2014-11-15T00:26:23.360

Reputation: 101

4

J, 254 points

Compression
fwrite&'sudoku.z' 1 u: u: 32 + (26$95) #: (9 $ !9x)#. A."1 (1&".);._2 stdin''
Decompression
echo A.&(>:i.9)"1 (9 $ !9x) #: 95x #. 32 -~ 3 u: fread'sudoku.z'

Standard I/O is a bit clumsy in J since jconsole is actually a REPL, so I took the liberty to write the compressed output to file.

Finds the anagram index of each line, treats the resulting nine numbers as a base-(9!) number, and then finally converts to base-95, adds 32 and converts to ASCII just like in Martin Büttner's solution. The anagram index of a permutation of 1..n is simply the index of the permutation in the lexically sorted list of all such permutations, e.g. 5 4 3 2 1 has anagram index 5! - 1 = 119.

All the operations have easy inverses, so decompression is simple.

As a bonus, the examples are in a very J-friendly format, so input/output for decompressed sudokus are exactly as given in the examples (although the input to the encoder requires a trailing newline).


Compressed strings for the testcases:

#p8<!w=C6Cpgi/-+vn)FU]AHr\
"bC]wPv{8ze$l,+jkCPi0,e>-D
2}2EZZB;)WZQF@JChz}~-}}_<
#2Ofs0Mm]).e^raUu^f@sSMWc"
":kkCf2;^U_UDC?I\PC"[*gj|!
#TISE3?d7>oZ_I2.C16Z*gg
,@ CE;zX{.l\xRAc]~@vCw)8R
!oN{|Y6V"C.q<{gq(s?M@O]"]9
VORd2"*T,J;JSh<G=rR*8J1LT
#?bHF:y@oRI8e1Zdl5:BzYO.P.

FireFly

Posted 2014-11-15T00:26:23.360

Reputation: 7 107

If you compress only the first 8 rows, the 9th row is easy to compute. – Keith Randall – 2014-11-15T04:39:07.940

@KeithRandall yes, I thought about that too. I think one can do even better by always leaving out the biggest row, and then storing the index of the row to re-compute. I don't think I'll bother to implement it since it wouldn't get me all the way down to 1xx though. – FireFly – 2014-11-15T13:13:25.580

3

Python 3, 120 points

This program lists all possible 3x3-blocks and remembers which one of them was actually present in the original Sudoku, then puts all those numbers together in a base-95 representation. Although this is very close to hard-coding, it compresses and decompresses the examples in about 5 seconds each on my machine.

import functools

def readSudoku(s):
    values = [int(c) for c in s.split()]
    blocks = []
    for i in range(3):
        for j in range(3):
            block = []
            for k in range(3):
                for l in range(3):
                    block.append(values[i * 27 + k * 9 + j * 3 + l])
            blocks.append(block)
    return blocks

def writeSudoku(blocks):
    text = ""
    for i in range(9):
        for j in range(9):
            text += str(blocks[3 * (i // 3) + (j // 3)][3 * (i % 3) + (j % 3)]) + " "
        text += "\n"
    return text

def toASCII(num):
    chars = "".join(chr(c) for c in range(32, 127))
    if num == 0:
        return chars[0]
    else:
        return (toASCII(num // len(chars)).lstrip(chars[0]) + chars[num % len(chars)])

def toNum(text):
    chars = "".join(chr(c) for c in range(32, 127))
    return sum((len(chars) ** i * chars.index(c) for (i, c) in enumerate(text[::-1])))

def compress(sudoku):
    info = compressInfo(readSudoku(sudoku))
    return toASCII(functools.reduce(lambda old, new: (old[0] + new[0] * old[1], old[1] * new[1]), info, (0, 1))[0])

def compressInfo(sudoku):
    finished = [[0]*9]*9
    indices = [(-1, 0)]*9
    for (index, block) in enumerate(sudoku):
        counter = 0
        actual = -1
        for (location, solution) in enumerate(possibleBlocks(finished, index)):
            counter += 1
            if block == solution:
                actual = location
        if actual == -1:
            print(finished)
            print(block)
            raise ValueError
        finished[index] = block
        indices[index] = (actual, counter)
    return indices

def decompress(text):
    number = toNum(text)
    finished = [[0]*9]*9
    for i in range(9):
        blocks = list(possibleBlocks(finished, i))
        index = number % len(blocks)
        number //= len(blocks)
        finished[i] = blocks[index]
    return writeSudoku(finished)

def possibleBlocks(grid, index):
    horizontals = [grid[i] for i in (3 * (index // 3), 3 * (index // 3) + 1, 3 * (index // 3) + 2)]
    verticals = [grid[i] for i in (index % 3, index % 3 + 3, index % 3 + 6)]
    for i1 in range(1, 10):
        if any((i1 in a[0:3] for a in horizontals)) or\
           any((i1 in a[0::3] for a in verticals)):
            continue
        for i2 in range(1, 10):
            if i2 == i1 or\
               any((i2 in a[0:3] for a in horizontals)) or\
               any((i2 in a[1::3] for a in verticals)):
                continue
            for i3 in range(1, 10):
                if i3 in (i2, i1) or\
                   any((i3 in a[0:3] for a in horizontals)) or\
                   any((i3 in a[2::3] for a in verticals)):
                    continue
                for i4 in range(1, 10):
                    if i4 in (i3, i2, i1) or\
                       any((i4 in a[3:6] for a in horizontals)) or\
                       any((i4 in a[0::3] for a in verticals)):
                        continue
                    for i5 in range(1, 10):
                        if i5 in (i4, i3, i2, i1) or\
                           any((i5 in a[3:6] for a in horizontals)) or\
                           any((i5 in a[1::3] for a in verticals)):
                            continue
                        for i6 in range(1, 10):
                            if i6 in (i5, i4, i3, i2, i1) or\
                               any((i6 in a[3:6] for a in horizontals)) or\
                               any((i6 in a[2::3] for a in verticals)):
                                continue
                            for i7 in range(1, 10):
                                if i7 in (i6, i5, i4, i3, i2, i1) or\
                                   any((i7 in a[6:9] for a in horizontals)) or\
                                   any((i7 in a[0::3] for a in verticals)):
                                    continue
                                for i8 in range(1, 10):
                                    if i8 in (i7, i6, i5, i4, i3, i2, i1) or\
                                       any((i8 in a[6:9] for a in horizontals)) or\
                                       any((i8 in a[1::3] for a in verticals)):
                                        continue
                                    for i9 in range(1, 10):
                                        if i9 in (i8, i7, i6, i5, i4, i3, i2, i1) or\
                                           any((i9 in a[6:9] for a in horizontals)) or\
                                           any((i9 in a[2::3] for a in verticals)):
                                            continue
                                        yield [i1, i2, i3, i4, i5, i6, i7, i8, i9]

The main functions are compress(sudoku) and decompress(text).

Outputs:

!%XIjS+]P{'Y
$OPMD&Sw&tlc
$1PdUMZ7K;W*
*=M1Ak9Oj6i\
!SY5:tDJxVo;
!F ]ki%jK>*R
'PXM4J7$s?#%
#9BJZP'%Ggse
*iAH-!9%QolJ
#&L6W6i> Dd6

pascalhein

Posted 2014-11-15T00:26:23.360

Reputation: 141

3

Python 2.5, 116 points

Code:

emptysud=[[' ']*9 for l in range(9)]

def encconfig(dig,sud):
 conf1=[(sud[i].index(dig),i) for i in range(9)]; out=[]
 for xgroup in range(3):
  a=filter(lambda (x,y): xgroup*3<=x<(xgroup+1)*3, conf1)
  b=[x-xgroup*3 for (x,y) in sorted(a,key = lambda (x,y): y)]
  out.append([[0,1,2],[0,2,1],[1,0,2],[1,2,0],[2,0,1],[2,1,0]].index(b))
 for ygroup in range(3):
  a=filter(lambda (x,y): ygroup*3<=y<(ygroup+1)*3, conf1)
  b=[y-ygroup*3 for (x,y) in sorted(a,key = lambda (x,y): x)]
  out.append([[0,1,2],[0,2,1],[1,0,2],[1,2,0],[2,0,1],[2,1,0]].index(b))
 return sum([out[i]*(6**i) for i in range(6)])

def decconfig(conf,dig,sud=emptysud):
 inp=[]; conf1=[]; sud=[line[:] for line in sud]
 for i in range(6):
  inp.append([[0,1,2],[0,2,1],[1,0,2],[1,2,0],[2,0,1],[2,1,0]][conf%6]); conf/=6
 for groupx in range(3):
  for groupy in range(3):
   conf1.append((groupx*3+inp[groupx][groupy],groupy*3+inp[groupy+3][groupx]))
 for (x,y) in conf1: sud[y][x]=dig
 return sud

def compatible(sud,conf,dig):
 a=reduce(lambda x,y: x+y, sud)
 b=decconfig(conf,dig,sud)
 c=reduce(lambda x,y: x+y, b)
 return a.count(' ')-c.count(' ')==9

def encode(sud):
 k=[encconfig(str(i),sud) for i in range(1,10)]; m=k[0]; p=6**6
 cursud=decconfig(k[0],'1')
 for i in range(1,9):
  t=filter(lambda u: compatible(cursud,u,str(i+1)), range(6**6))
  m=m+p*t.index(k[i]); p*=len(t)
  cursud=decconfig(k[i],str(i+1),cursud)
 return m

def decode(n):
 k=[n%46656]; n/=46656; cursud=decconfig(k[-1],'1')
 for i in range(2,10):
  t=filter(lambda u: compatible(cursud,u,str(i)), range(6**6))
  k.append(n%len(t)); n/=len(t); cursud=decconfig(t[k[-1]],str(i),cursud)
 return cursud

def base95(n):
 out=''
 while n: out+=chr(32+n%95); n/=95
 return out[::-1]

def base10(s): s=s[::-1]; return sum([(ord(s[i])-32)*(95**i) for i in range(len(s))])

import time
t0=time.clock()
for part in file('sudoku.txt','rb+').read().split('\r\n\r\n'):
 sudoku=[line.split(' ') for line in part.split('\r\n')]
 encsud=base95(encode(sudoku)); sud2=decode(base10(encsud))
 print encsud,sud2==sudoku
print time.clock()-t0

Results:

!|/FC,">;&3z
rUH">FLSgT|
)3#m|:&Zxl1c
jh _N@MG/zr
%Iye;U(6(p;0
!21.+KD0//yG
"O\B*O@8,h`y
A$`TUE#rsQu
J}ANCYXX*y5
".u2KV#4K|%a

Very slow. Took 517 seconds to run and verify on my machine.

encconfig takes a sudoku board and a digit from 1-9, lists the x-y coordinates where that digit appears, and outputs a number in range(6**6) that represents those coordinates. (the "digit configuration")

decconfig is the reverse function. It takes a number in range(6**6), a digit, and a sudoku board (defaults to empty). It outputs the sudoku board overlayed with the digit configuration. If one of the positions in the digit configuration is already taken in the inputted sudoku board, the digit in that position is overwritten by the new digit.

compatible takes a sudoku board and a digit configuration (defined by conf and dig), overlays the digit configuration over the sudoku board and checks for conflicts. It then returns True or False depending on the result.

encode is the compression function. It takes a sudoku board and outputs a number representing it. It does this by first copying the positions of the 1's to an empty board and making a list of all the configurations of the number 2 which are compatible with the 1-configuration (that don't take up any of the places already taken by the 1's). It then finds the order of the board's actual 2-configuration in the list and stores it, then copies that configuration to the new board, which now contains only the 1's and 2's. It then lists all the configurations of the number 3 which are compatible with the positions of the 1's and 2's, and so on.

decode is the reverse function.

Python 2.5.

user1729061

Posted 2014-11-15T00:26:23.360

Reputation: 131

2

C#, 150 bytes

Compressed output:

KYxnUjIpNe/YDnA
F97LclGuqeTcT2c
i6D1SvMVkS0jPlQ
32FOiIoUHpz5GGs
aAazPo2RJiH+IWQ
CwAA5NIMyNzSt1I
Cc2jOjU1+buCtVM
OgQv3Dz3PqsRvGA
eSxaW3wY5e6NGFc
olQvtpDOUPJXKGw

How it works:

It generates all possible permutations of 123456789 and remembers them. Then it compares the permutations with the rows in the sudoku. When a matching permutation for a giving row is found it remembers the index of that permutation. After each row it will remove all permutations where there is atleast one char in same position as the current row. This makes sure every number is unique in its column. Then it takes out all permutations that do not work anymore by the box-criteria. Since the last row is trivial it generates 8 numbers. I tested what the max value of each of those numbers would be and generated a digit-count-mask for each position of those. { 6, 5, 3, 5, 3, 1, 2, 1, 1 }. The first is obviously the longest with 362880 permutations. Using the digitmask i construct a BigInteger with a leading 1 to make it 28 digits long. This results in 11 bytes total. Then those bytes get converted to base64. To save one char i remove the = sign at the end.

The reconstrcution works similiar.

It reconstructs the BigInteger from the base64string and then turns it into a string again and splitting it up again using the mentiond digit-count-mask. Those strings get parsed back to the indexes.

Then the algorithm does almost the same, instead of finding the row in the permutations it just uses the index to get the row, the rest works the same.

Probably this could be a bit better to really use the 94 possible charachters instead of only 64 but i lack the brainz to do this.

Source: Copy- and pasteable to make it run with the 10 examples. .dotNet-Fiddle tells me this exceeds the memorylimit so you need to run it on your machine to text.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using System.Text;

public class Programm
{
    public static void Main(string[] args)
    {
        string[] input = new[] {
            "973581426526473198184296753247865319398124675651739842819342567765918234432657981",
            "724865193169243875385197246896724351273951684451386927542639718618572439937418562", 
            "157682349432519687698347251825476193713928465964135728541293876289761534376854912", 
            "835416927296857431417293658569134782123678549748529163652781394981345276374962815", 
            "628451793594732681713689542247315869961827354385964217156243978439578126872196435", 
            "123456789456789123789123456214365897365897214897214365531648972648972531972531648", 
            "145792836376584192298361754731928645859647321462135987624873519587419263913256478",
            "527416938864329157139578642291854376348697521675132489712945863483261795956783214", 
            "246713985185496732937825146678542391493168257512379468824957613759631824361284579",
            "861294573475318692392567814236459781154783269987621345529176438648932157713845926" };

        string[] permutations = GetPermutations();
        foreach (string sudoku in input)
        {

            int[] indices = _compressSudoku(sudoku, permutations).ToArray();
            string compressedRepresentation = _toCompressedRepresentation(indices);

            Console.WriteLine(compressedRepresentation);
            indices = _fromCompressedRepresentation(compressedRepresentation);
            string decompressedSudoku = _decompressSudoku(indices, permutations);

            if (decompressedSudoku != sudoku)
                throw new Exception();
        }
        Console.ReadKey();
    }

    static int[] _digitMask = new int[] { 6, 5, 3, 5, 3, 1, 2, 1, 1 };

    private static int[] _fromCompressedRepresentation(string compressedRepresentation)
    {
        BigInteger big = new BigInteger(Convert.FromBase64String(compressedRepresentation + "="));

        string stringValue = big.ToString().Substring(1);

        List<int> indexes = new List<int>();
        int i = 0;
        while (stringValue.Length > 0)
        {
            int length = _digitMask[i++];
            string current = stringValue.Substring(0, length);
            stringValue = stringValue.Substring(length);
            indexes.Add(int.Parse(current));
        }
        return indexes.ToArray(); ;
    }

    private static string _toCompressedRepresentation(int[] indices)
    {
        StringBuilder builder = new StringBuilder("1");
        int i = 0;
        foreach (int index in indices)
        {
            string mask = "{0:D" + _digitMask[i++].ToString() + "}";
            builder.AppendFormat(mask, index);
        }

        string base64 = Convert.ToBase64String(BigInteger.Parse(builder.ToString()).ToByteArray());
        return base64.Substring(0, base64.Length - 1); // remove the = at the end.
    }

    private static IEnumerable<int> _compressSudoku(string input, string[] remainingPermutations)
    {
        string[] localRemainingPermutations = null;
        List<HashSet<char>> localUsed = null;
        for (int i = 0; i < 8; i++)
        {
            string currentRow = _getCurrentRow(input, i);
            if (i % 3 == 0)
            {
                localRemainingPermutations = remainingPermutations;
                localUsed = _initLocalUsed();
            }

            int index = 0;
            foreach (string permutation in localRemainingPermutations)
            {
                if (permutation == currentRow)
                {
                    yield return index;
                    break;
                }
                index++;
            }
            remainingPermutations = remainingPermutations.Where(permutation => _isStillValidPermutation(currentRow, permutation)).ToArray();
            if (i % 3 < 2)
            {
                for (int j = 0; j < 9; j++)
                    localUsed[j / 3].Add(currentRow[j]);
                localRemainingPermutations = localRemainingPermutations.Where(permutation => _isStillValidLocalPermutation(permutation, localUsed)).ToArray();
            }
        }
    }

    private static string _decompressSudoku(int[] indices, string[] remainingPermutations)
    {
        StringBuilder result = new StringBuilder();

        string[] localRemainingPermutations = null;
        List<HashSet<char>> localUsed = null;
        for (int i = 0; i < 9; i++)
        {
            if (i % 3 == 0)
            {
                localRemainingPermutations = remainingPermutations;
                localUsed = _initLocalUsed();
            }
            string currentRow = localRemainingPermutations[i < indices.Length ? indices[i] : 0];
            result.Append(currentRow);

            remainingPermutations = remainingPermutations.Where(permutation => _isStillValidPermutation(currentRow, permutation)).ToArray();
            if (i % 3 < 2)
            {
                for (int j = 0; j < 9; j++)
                    localUsed[j / 3].Add(currentRow[j]);
                localRemainingPermutations = localRemainingPermutations.Where(permutation => _isStillValidLocalPermutation(permutation, localUsed)).ToArray();
            }
        }
        return result.ToString();
    }

    private static string _getCurrentRow(string input, int i)
    {
        return new string(input.Skip(i * 9).Take(9).ToArray());
    }

    private static List<HashSet<char>> _initLocalUsed()
    {
        return new List<HashSet<char>> { new HashSet<char>(), new HashSet<char>(), new HashSet<char>() };
    }

    private static bool _isStillValidLocalPermutation(string permutation, List<HashSet<char>> localUsed)
    {
        for (int i = 0; i < 9; i++)
        {
            if (localUsed[i / 3].Contains(permutation[i]))
                return false;
        }
        return true;
    }

    private static bool _isStillValidPermutation(string currentRow, string permutation)
    {
        return permutation.Select((c, j) => c != currentRow[j]).All(b => b);
    }

    static string[] GetPermutations(char[] chars = null)
    {
        if (chars == null)
            chars = new[] { '1', '2', '3', '4', '5', '6', '7', '8', '9' };
        if (chars.Length == 2)
            return new[] { new String(chars), new String(chars.Reverse().ToArray()) };
        return chars.SelectMany(c => GetPermutations(chars.Where(sc => sc != c).ToArray()), (c, s) => c + s).ToArray();
    }
}

CSharpie

Posted 2014-11-15T00:26:23.360

Reputation: 381

1

Perl - 290 characters = 290 points

This program uses no hard coding and reliably compresses a grid into exactly 29 characters (theoretically it would be possible to find some smaller ones).

Here's how it works:

  • First convert the 9 x 9 array to 60 numbers. This can be done as the last column, the last row, and the final square of each 3 x 3 cell can be dropped.

  • Then convert using bigint to a single integer, using 9^60 elements.

  • Then convert the bigint to base 95.

Compressor and decompressor:

#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;
use bigint;

sub compress
{
    my @grid;
    my @nums;
    while (<>)
    {
        push @grid, [split];
    }

    # encode into 60 numbers omitting last from each row, column and 3 x 3 square
    my $i;
    my $j;
    for ($i=0; $i<=7; $i++)
    {
        for ($j=0; $j<=7; $j++)
        {
            push @nums, $grid[$i][$j] if (($i % 3 !=2 ) || ($j % 3 !=2));
        }
    }

    # encode into a big int
    my $code = 0;
    foreach my $n (@nums)
    {
        $code = $code * 9 + ($n-1);
    }

    # print in base 95
    my $out="";
    while ($code)
    {
        my $digit = $code % 95;
        $out = chr($digit+32).$out;
        $code -= $digit;
        $code /= 95;
    }

    print "$out";
}

sub decompress
{
    my @grid;
    my @nums;
    my $code = 0;

    # Read from base 95 into bigint
    while (<>)
    {
        chomp;
        foreach my $char (split (//, $_))
        {
            my $c =ord($char)-32;
            $code*=95;
            $code+=$c;
        }
    }

    # convert back to 60 numbers
    for (my $n = 0; $n<60; $n++)
    {
        my $d = $code % 9;
        $code -= $d;
        $code/=9;
        unshift @nums, $d+1;
    }

    # print filling in last column, row and 3 x 3 square
    for (my $i=0; $i<=8; $i++)
    {
        for (my $j=0; $j<=8; $j++)
        {
            if ($j == 8)
            {
                my $tot = 0;
                for (my $jj = 0; $jj<=7; $jj++)
                {
                    $tot += $grid[$i][$jj];
                }
                $grid[$i][$j]=45-$tot;
            }
            elsif ($i == 8)
            {
                my $tot = 0;
                for (my $ii = 0; $ii<=7; $ii++)
                {
                    $tot += $grid[$ii][$j];
                }
                $grid[$i][$j]=45-$tot;
            }
            elsif (($i % 3 == 2 ) && ($j % 3 == 2))
            {
                my $tot = 0;
                for (my $ii = $i-2; $ii<=$i; $ii++)
                {
                    for (my $jj = $j-2; $jj<=$j; $jj++)
                    {
                        next if (($ii % 3 == 2 ) && ($jj % 3 == 2));
                        $tot += $grid[$ii][$jj];
                    }
                }
                $grid[$i][$j]=45-$tot;
            }
            else
            {
                $grid[$i][$j] = shift @nums;
            }

            print $grid[$i][$j].(($j==8)?"":" ");
        }
        print "\n";
    }
}

my $decompress;
GetOptions ("d|decompress" => \$decompress);

if ($decompress)
{
    decompress;
}
else
{
    compress;
}

abligh

Posted 2014-11-15T00:26:23.360

Reputation: 777

Is that score bytes or points? – Bill Woodger – 2014-11-15T13:57:59.507

@BillWoodger I think bytes (well, characters) = points? – abligh – 2014-11-15T17:07:14.550

1

PHP, 214

<?php
// checks each row/col/block and removes impossible candidates
function reduce($cand){
    do{
        $old = $cand;
        for($r = 0; $r < 9; ++$r){
        for($c = 0; $c < 9; ++$c){
            if(count($cand[$r][$c]) == 1){ // if filled in
                // remove values from row and col and block
                $remove = $cand[$r][$c];
                for($i = 0; $i < 9; ++$i){
                    $cand[$r][$i] = array_diff($cand[$r][$i],$remove);
                    $cand[$i][$c] = array_diff($cand[$i][$c],$remove);
                    $br = floor($r/3)*3+$i/3;
                    $bc = floor($c/3)*3+$i%3;
                    $cand[$br][$bc] = array_diff($cand[$br][$bc],$remove);
                }
                $cand[$r][$c] = $remove;
            }
        }}
    }while($old != $cand);
    return $cand;
}

// checks candidate list for completion
function done($cand){
    for($r = 0; $r < 9; ++$r){
    for($c = 0; $c < 9; ++$c){
        if(count($cand[$r][$c]) != 1)
            return false;
    }}
    return true;
}

// board format: [[1,2,0,3,..],[..],..], $b[$row][$col]
function solve($board){
    $cand = [[],[],[],[],[],[],[],[],[]];
    for($r = 0; $r < 9; ++$r){
    for($c = 0; $c < 9; ++$c){
        if($board[$r][$c]){ // if filled in
            $cand[$r][$c] = [$board[$r][$c]];
        }else{
            $cand[$r][$c] = range(1, 9);
        }
    }}
    $cand = reduce($cand);

    if(done($cand))  // goto not really necessary
        goto end;    // but it feels good to use it 
    else return false;

    end:
    // back to board format
    $b = [];
    for($r = 0; $r < 9; ++$r){
        $b[$r] = [];
        for($c = 0; $c < 9; ++$c){
            if(count($cand[$r][$c]) == 1)
                $b[$r][$c] = array_pop($cand[$r][$c]);
            else 
                $b[$r][$c] = 0;
        }
    }
    return $b;
}

function add_zeros($board, $ind){
    for($r = 0; $r < 9; ++$r){
    for($c = 0; $c < 9; ++$c){
        $R = ($r + (int)($ind/9)) % 9;
        $C = ($c + (int)($ind%9)) % 9;
        if($board[$R][$C]){
            $tmp = $board[$R][$C];
            $board[$R][$C] = 0;
            if(!solve($board))
                $board[$R][$C] = $tmp;
        }   
    }}
    return $board;
}

function base95($str, $b, $z){
    $tmp = gmp_init($str, $b); $zero = gmp_init(0); $gmp95 = gmp_init(95);
    $out = '';
    while(gmp_cmp($tmp, $zero) > 0){
        $arr = gmp_div_qr($tmp, $gmp95);
        $tmp = $arr[0];
        $out .= chr(32+gmp_intval($arr[1]));
    }
    $out = chr((32+($z << 2))|($b - 10)) . strrev($out);
    return $out;
}

function encode($board, $ind){
    // remove last row+col
    $board[8] = [0,0,0,0,0,0,0,0,0];
    foreach($board as &$j) $j[8] = 0;

    // remove bottom corner of each box
    $board[2][2] = $board[2][5] = $board[5][2] = $board[5][5] = 0;

    $board = add_zeros($board, $ind);

    $str = '';$z=0;
    for($r = 0; $r < 8; ++$r){
        for($c = 0; $c < 8; ++$c){
            if(($r==2||$r==5)&&($c==2||$c==5)) continue;
            if($str == '' && !$board[$r][$c]) ++$z;
            else $str .= $board[$r][$c];
        }
    }

    $b10 = base95(rtrim($str,'0'), 10, $z);
    $b11 = base95(rtrim(str_replace(['00'],['A'],$str),'0'), 11, $z);
    $b12 = base95(rtrim(str_replace(['000','00'],['B','A'],$str),'0'), 12, $z);

    $l10 = strlen($b10);
    $l11 = strlen($b11);
    $l12 = strlen($b12);
    var_dump($z);
    if($l10 < $l11)
        if($l10 < $l12)
            return $b10;
        else 
            return $b12;
    else
        if($l11 < $l12)
            return $b11;
        else 
            return $b12;    
}

function decode($str){
    $fc = ord($str[0]);
    $base = 10 + ($fc & 3);
    $z = ($fc - 32) >> 2;

    $tmp = gmp_init(0);
    $zero = gmp_init(0); $gmp95 = gmp_init(95);
    while(strlen($str = substr($str, 1))){
        $tmp = gmp_mul($tmp, $gmp95);
        $tmp = gmp_add($tmp, gmp_init(ord($str[0])-32));
    }
    $str = gmp_strval($tmp, $base);
    $expanded = str_repeat('0', $z) . str_replace(['a','b'],['00','000'],$str) . str_repeat('0', 81);

    $board = [];
    $ind = 0;
    for($i = 0; $i < 8; ++$i){
        $board[$i] = [];
        for($j = 0; $j < 8; ++$j){
            if(($i == 2 || $i == 5) && ($j == 2 || $j == 5)) 
                $board[$i][$j] = 0;
            else
                $board[$i][$j] = (int)$expanded[$ind++];
        }
        $board[$i][8] = 0;
    }
    $board[8] = [0,0,0,0,0,0,0,0,0];
    return solve($board);
}

function printBoard($board){
    for($i = 0; $i < 9; ++$i){
        echo implode(' ', $board[$i]) . PHP_EOL;
    }
    flush();
}

function readBoard(){
    $board = [];
    for($r = 0; $r < 9; ++$r){
        $board[$r] = fscanf(STDIN, "%d%d%d%d%d%d%d%d%d");
    }
    return $board;
}
if(isset($argv[1])){
    if($argv[1] === 'enc'){
        $board = readBoard();
        $bests = ''; $bestl = 999;
        for($i = 0; $i < 71; ++$i){
            $str = encode($board, $i);
            $len = strlen($str);
            if($len < $bestl){
                $bestl = $len;
                $bests = $str;
            }
        }
        echo $bests . PHP_EOL;
    }else if($argv[1] === 'dec'){
        echo printBoard(decode(trim(fgets(STDIN))));
    }
}else{
    echo "Missing argument. Use `{$argv[0]} [enc|dec]`.\n";
}

This solution first clears out the right column and bottom row, as well as the bottom-right corner of each 3x3 block. It then tries clearing out a cell. If a simple solution exists, the cell remains blank.

Then, the sudoku grid is formatted into a string, from left to right and top to bottom, excluding the right column, bottom row, and bottom-right corner. Leading zeros are counted (let this be z) and removed. Trailing zeros are likewise removed.

The string is formatted into either a base 10, 11, or 12 integer (let this base be b), with A representing two zeros, and B, three.

This is converted into a base-95 integer, and prepended by the base-95 digit representing z << 2 | (b - 10).

Call php sudoku-compress.php enc to encode, and php sudoku-compress.php dec to decode. Encoder takes the format given in the question, with a mandatory trailing newline.

Test outputs:

R'Ngxgi#Hu~+cR)0nE)+
Veu-b454j|:tRm(b-Xk'I
V.{mi;*6-/9Ufu[~GE"e>
F/YgX]PeyeKX5=M_+,z+Z
R&3mEHyZ6sSF'-$L<:VmX
"#b'npsIv0%L,t0yr^a.+'&
UNjx*#~I/siBGck7u9eaC%
Z!SuM^f{e<ji@F&hP-S<
*0:43tD r;=x8|&I0/k[&%
B1Mm-dx@G}[2lZId/-'h{zU

es1024

Posted 2014-11-15T00:26:23.360

Reputation: 8 953

1

Java, 330 Points

Before I get ridiculed for such a high score let me clarify that I attempted to try and solve this in a different kind of way knowing it probably wouldn't be quite as optimal as some of the better answers here. I was more or less curious if I could get close which to my surprise I didn't realize just how much worse it would turn out. Here is the run down of what my approach was here:

  1. Develop an algo for solving a Sudoku puzzle.

  2. Develop a scrambling algo that can still be solvable. It does this somewhat randomly while removing clues that can be trivially determined before hand. I could get to about 22 clues reliably before it took far too long.

  3. Once scrambled, the puzzle could be represented by a triplet of single digit integers for each clue, in my case 22 triplets of 3. I thought if I could combine these into a single 66 digit number then base95 encode this then I have something that can be easily decoded.

The encoded string ended up being longer than I hoped at generally around 33 characters long. At which point I tried an alternative way than using Java BigInteger where I created a large number from an 81 bit mask representing the 81 cells of a grid where 1 means a clue exists for this cell. I then combined that bitmask to 4 bit representations of each cell value in sequential order, rounded up to bytes and found that I roughly got the same encoded string length after base95 encoded.

So basically I am posting my code in case anybody was interested in a different approach that didn't work out so well.

Class Puzz

public class Puzz {

    enum By {
        Row, Column, Block
    }

    static final List<Integer> NUMBERS = Arrays.asList(new Integer[] { 1, 2, 3,
            4, 5, 6, 7, 8, 9 });

    List<Square> entries = new ArrayList<Square>();
    HashMap<Integer, List<Square>> squaresByRow = new HashMap<Integer, List<Square>>();
    HashMap<Integer, List<Square>> squaresByColumn = new HashMap<Integer, List<Square>>();
    HashMap<Integer, List<Square>> squaresByBlock = new HashMap<Integer, List<Square>>();

    public Puzz(int[][] data) {

        // Create squares put them in squares by row hashtable
        for (int r = 0; r < 9; r++) {
            List<Square> squaresInRow = new ArrayList<Square>();
            for (int c = 0; c < 9; c++) {
                Square square = new Square(r, c, data[r][c], this);
                entries.add(square);
                squaresInRow.add(square);
            }
            squaresByRow.put(r, squaresInRow);
        }

        // Put squares in column hash table
        for (int c = 0; c < 9; c++) {
            List<Square> squaresInColumn = new ArrayList<Square>();
            for (int r = 0; r < 9; r++) {
                squaresInColumn.add(squaresByRow.get(r).get(c));
            }
            squaresByColumn.put(c, squaresInColumn);
        }

        // Put squares in block hash table
        for (int i = 1; i < 10; i++) {
            squaresByBlock.put(i, new ArrayList<Square>());
        }
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                int block = getBlock(r, c);
                squaresByBlock.get(block).add(get(r, c));
            }
        }

        // Discover the possibilities
        updatePossibilities();
    }

    public void updatePossibilities() {
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                Square theSquare = get(r, c);
                if (theSquare.value != 0) {
                    theSquare.possibilities.removeAll(NUMBERS);
                    continue;
                } else {
                    theSquare.possibilities.addAll(NUMBERS);
                }
                int block = getBlock(r, c);
                HashSet<Square> squares = new HashSet<Square>();
                squares.addAll(squaresByRow.get(r));
                squares.addAll(squaresByColumn.get(c));
                squares.addAll(squaresByBlock.get(block));
                for (Square s : squares) {
                    if (s == theSquare)
                        continue;

                    theSquare.possibilities.remove(s.value);
                }
            }
        }
    }

    public int getValue(int row, int column) {
        return squaresByRow.get(row).get(column).value;
    }

    public Square get(int row, int column) {
        return squaresByRow.get(row).get(column);
    }

    public boolean set(int row, int column, int value) {
        if (value == 0) {
            squaresByRow.get(row).get(column).value = 0;
            updatePossibilities();
            return true;
        }

        if (isValid(row, column, value)) {
            squaresByRow.get(row).get(column).value = value;
            updatePossibilities();
            return true;
        } else {
            return false;
        }
    }

    public boolean isValidSubset(By subset, int row, int column, int value) {
        List<Dubs> dubss = new ArrayList<Dubs>();
        List<Trips> tripss = new ArrayList<Trips>();
        Square theSquare = get(row, column);
        int block = getBlock(row, column);
        List<Square> squares = new ArrayList<Square>();
        switch (subset) {
        case Row:
            squares.addAll(squaresByRow.get(row));
            break;
        case Column:
            squares.addAll(squaresByColumn.get(column));
            break;
        default:
            squares.addAll(squaresByBlock.get(block));
            break;
        }

        for (Square r : squares) {
            if (r == theSquare)
                continue;
            // if any of the impacted squares have this value then it is not a
            // valid value
            if (r.value == value)
                return false;

            if (r.possibilities.size() == 3) {
                List<Integer> poss = new ArrayList<Integer>(r.possibilities);
                tripss.add(new Trips(poss.get(0), poss.get(1), poss.get(2),
                        r.row, r.col));
            }

            if (r.possibilities.size() == 2) {
                List<Integer> poss = new ArrayList<Integer>(r.possibilities);
                dubss.add(new Dubs(poss.get(0), poss.get(1), r.row, r.col));
            }
        }

        // Find the trips and rule out the value if a triplet exists in squares
        List<Trips> tripsCopy = new ArrayList<Trips>(tripss);
        for (Trips trips : tripsCopy) {
            int countOfOccurrences = 0;
            for (Trips tr : tripss) {
                if (tr.equals(trips) && !(tr.row == row && tr.col == column))
                    countOfOccurrences++;
            }

            for (Dubs dubs : dubss) {
                if (trips.containedWithin(dubs)
                        && !(dubs.row == row && dubs.col == column))
                    countOfOccurrences++;
            }

            if (countOfOccurrences == 3 && trips.containedWithin(value))
                return false;
        }

        // Find the dubs and rule out the value if a double exists in squares
        List<Dubs> dubsCopy = new ArrayList<Dubs>(dubss);
        for (Dubs dubs : dubsCopy) {
            int countOfOccurrences = 0;
            for (Dubs du : dubss) {
                // Count occurrences of Dubs that are not the tested square
                if (du.equals(dubs) && !(du.row == row && du.col == column))
                    countOfOccurrences++;
            }

            if (countOfOccurrences == 2 && dubs.containedWithin(value))
                return false;
        }

        return true;
    }

    public boolean isValid(int row, int column, int value) {

        return isValidSubset(By.Row, row, column, value)
                && isValidSubset(By.Column, row, column, value)
                && isValidSubset(By.Block, row, column, value);
    }

    public int getBlock(int row, int column) {
        int blockRow = (int) Math.floor(row / 3);
        int columnRow = (int) Math.floor(column / 3) + 1;
        return (blockRow * 3) + columnRow;
    }

    public Puzz solve(Puzz arg, boolean top) throws Exception {
        // Make an original copy of the array
        Puzz p = (Puzz) arg.clone();
        for (int i = 1; i < 10; i++) {
            for (Square s : p.squaresByBlock.get(i)) {
                if (s.value == 0) {
                    for (Integer number : NUMBERS) {
                        if (p.set(s.row, s.col, number)) {
                            // System.out.println(p);
                            Puzz solved = solve(p, false);
                            if (solved != null)
                                return solved;
                        }
                    }
                    // no numbers fit here, return null and backtrack
                    p.set(s.row, s.col, 0);
                    return null;
                }
            }
        }

        // Check for remaining 0's
        for (Square s : p.entries) {
            if (s.value == 0)
                return null;
        }
        return p;
    }

    public Puzz scramble(int clues) throws Exception {
        Puzz p = (Puzz) clone();
        Random rand = new Random();
        int removed = 0;

        //Remove the last row, it is a freebie
        int toRemove = 81 - clues - 15;
        for (int c = 0; c < 9; c++) {
            p.set(8, c, 0);
        }
        p.set(0, 0, 0);
        p.set(0, 3, 0);
        p.set(0, 6, 0);
        p.set(3, 0, 0);
        p.set(3, 3, 0);
        p.set(3, 6, 0);

        // Keeping track of this because randomly removing squares can potentially create an 
        // unsolvable situation
        HashSet<Square> alreadyTried = new HashSet<Square>();
        while (removed < toRemove) {
            if (alreadyTried.size() >= ((toRemove + clues) - removed)) {
                // Start over
                removed = 0;
                alreadyTried = new HashSet<Square>();
                p = (Puzz)clone();
                for (int c = 0; c < 9; c++) {
                    p.set(8, c, 0);
                }
                p.set(0, 0, 0);
                p.set(0, 3, 0);
                p.set(0, 6, 0);
                p.set(3, 0, 0);
                p.set(3, 3, 0);
                p.set(3, 6, 0);
            }
            int randX = rand.nextInt((7) + 1);
            int randY = rand.nextInt((8) + 1);
            int existingValue = p.getValue(randX, randY);
            if (existingValue != 0) {
                p.set(randX, randY, 0);
                // confirm it is still solvable after removing this item
                Puzz psol = solve(p, true);
                if (psol != null && psol.equals(this)) {
                    removed++;
                    alreadyTried = new HashSet<Square>();
                    System.out.println("Clues Remaining: " + (81 - 15 - removed));
                } else {
                    // otherwise set it back to what it was and try again
                    p.set(randX, randY, existingValue);
                    Square s = new Square(randX, randY, existingValue, p);
                    alreadyTried.add(s);
                }
            }

        }
        p.updatePossibilities();
        return p;
    }

    public static String encode(Puzz p) { // Remove all zero'ed items
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < 9; i++) {
            for (Square s : p.squaresByRow.get(i)) {
                if (s.value == 0)
                    continue;
                sb.append(s.row).append(s.col).append(s.value);
            }
        }

        // number mod 95 gives lowest digit, subtract that from original number
        BigInteger num = new BigInteger(sb.toString());
        byte[] numBytes = num.toByteArray();

        StringBuffer retVal = new StringBuffer();
        while (num.compareTo(BigInteger.ZERO) > 0) {
            int modu = num.mod(new BigInteger("95")).intValue();
            retVal.append((char) (modu + 32));
            num = num.subtract(new BigInteger("" + modu));
            num = num.divide(new BigInteger("95"));
        }
        return retVal.toString();
    }


    @Override
    public boolean equals(Object arg0) {
        if (arg0 == null || !(arg0 instanceof Puzz))
            return false;

        Puzz p = (Puzz) arg0;
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                int val1 = getValue(r, c);
                int val2 = p.getValue(r, c);
                if (val1 != val2)
                    return false;
            }
        }
        return true;
    }

    @Override
    protected Object clone() throws CloneNotSupportedException {
        int[][] data = new int[9][9];
        for (Square square : entries) {
            data[square.row][square.col] = square.value;
        }

        return new Puzz(data);
    }

    @Override
    public String toString() {
        if (entries == null)
            return "";
        StringBuffer sb = new StringBuffer();
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                sb.append(getValue(r, c)).append(' ');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

}

class Square {

    public Square(int row, int col, Puzz p) {
        this.row = row;
        this.col = col;
        this.p = p;
    }

    public Square(int row, int col, int value, Puzz p) {
        this(row, col, p);
        this.value = value;
    }

    int row;
    int col;
    int value;
    HashSet<Integer> possibilities = new HashSet<Integer>(Puzz.NUMBERS);
    Puzz p;

    @Override
    protected Object clone() throws CloneNotSupportedException {
        Square s = new Square(row, col, value, p);
        s.possibilities = new HashSet<Integer>();
        for (Integer val : possibilities) {
            s.possibilities.add(new Integer(val));
        }
        return s;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Square))
            return false;

        Square s = (Square) obj;
        return row == s.row && col == s.col && value == s.value
                && p.equals(s.p);
    }

    @Override
    public int hashCode() {
        return row ^ col ^ value ^ p.hashCode();
    }
}

class Dubs {
    int p1;
    int p2;

    int row, col;

    public Dubs(int p1, int p2) {
        this.p1 = p1;
        this.p2 = p2;
    }

    public Dubs(int p1, int p2, int row, int col) {
        this(p1, p2);
        this.row = row;
        this.col = col;
    }

    public boolean containedWithin(int value) {
        return (p1 == value || p2 == value);
    }

    @Override
    public boolean equals(Object arg0) {
        if (!(arg0 instanceof Dubs))
            return false;

        Dubs d = (Dubs) arg0;
        return (this.p1 == d.p1 || this.p1 == d.p2)
                && (this.p2 == d.p1 || this.p2 == d.p2);
    }
}

class Trips {
    int p1;
    int p2;
    int p3;
    int row, col;

    public Trips(int p1, int p2) {
        this.p1 = p1;
        this.p2 = p2;
    }

    public Trips(int p1, int p2, int p3) {
        this(p1, p2);
        this.p3 = p3;
    }

    public Trips(int p1, int p2, int p3, int row, int col) {
        this(p1, p2, p3);
        this.row = row;
        this.col = col;
    }

    public boolean containedWithin(int value) {
        return (p1 == value || p2 == value || p3 == value);
    }

    public boolean containedWithin(Dubs d) {
        return (d.p1 == p1 || d.p1 == p2 || d.p1 == p3)
                && (d.p2 == p1 || d.p2 == p2 || d.p2 == p3);
    }

    public boolean equals(Object arg0) {
        if (!(arg0 instanceof Trips))
            return false;

        Trips t = (Trips) arg0;
        return (this.p1 == t.p1 || this.p1 == t.p2 || this.p1 == t.p3)
                && (this.p2 == t.p1 || this.p2 == t.p2 || this.p2 == t.p3)
                && (this.p3 == t.p1 || this.p3 == t.p2 || this.p3 == t.p3);
    }
}

My Test Case

public class TestCompression extends TestCase {

    public static int[][] test1 = new int[][] {
            new int[] { 9, 7, 3, 5, 8, 1, 4, 2, 6 },
            new int[] { 5, 2, 6, 4, 7, 3, 1, 9, 8 },
            new int[] { 1, 8, 4, 2, 9, 6, 7, 5, 3 },
            new int[] { 2, 4, 7, 8, 6, 5, 3, 1, 9 },
            new int[] { 3, 9, 8, 1, 2, 4, 6, 7, 5 },
            new int[] { 6, 5, 1, 7, 3, 9, 8, 4, 2 },
            new int[] { 8, 1, 9, 3, 4, 2, 5, 6, 7 },
            new int[] { 7, 6, 5, 9, 1, 8, 2, 3, 4 },
            new int[] { 4, 3, 2, 6, 5, 7, 9, 8, 1 } };
    public static int[][] test2 = new int[][] {
            new int[] { 7, 2, 4, 8, 6, 5, 1, 9, 3 },
            new int[] { 1, 6, 9, 2, 4, 3, 8, 7, 5 },
            new int[] { 3, 8, 5, 1, 9, 7, 2, 4, 6 },
            new int[] { 8, 9, 6, 7, 2, 4, 3, 5, 1 },
            new int[] { 2, 7, 3, 9, 5, 1, 6, 8, 4 },
            new int[] { 4, 5, 1, 3, 8, 6, 9, 2, 7 },
            new int[] { 5, 4, 2, 6, 3, 9, 7, 1, 8 },
            new int[] { 6, 1, 8, 5, 7, 2, 4, 3, 9 },
            new int[] { 9, 3, 7, 4, 1, 8, 5, 6, 2 } };
    public static int[][] test3 = new int[][] {
            new int[] { 1, 5, 7, 6, 8, 2, 3, 4, 9 },
            new int[] { 4, 3, 2, 5, 1, 9, 6, 8, 7 },
            new int[] { 6, 9, 8, 3, 4, 7, 2, 5, 1 },
            new int[] { 8, 2, 5, 4, 7, 6, 1, 9, 3 },
            new int[] { 7, 1, 3, 9, 2, 8, 4, 6, 5 },
            new int[] { 9, 6, 4, 1, 3, 5, 7, 2, 8 },
            new int[] { 5, 4, 1, 2, 9, 3, 8, 7, 6 },
            new int[] { 2, 8, 9, 7, 6, 1, 5, 3, 4 },
            new int[] { 3, 7, 6, 8, 5, 4, 9, 1, 2 } };
    public static int[][] test4 = new int[][] {
            new int[] { 8, 3, 5, 4, 1, 6, 9, 2, 7 },
            new int[] { 2, 9, 6, 8, 5, 7, 4, 3, 1 },
            new int[] { 4, 1, 7, 2, 9, 3, 6, 5, 8 },
            new int[] { 5, 6, 9, 1, 3, 4, 7, 8, 2 },
            new int[] { 1, 2, 3, 6, 7, 8, 5, 4, 9 },
            new int[] { 7, 4, 8, 5, 2, 9, 1, 6, 3 },
            new int[] { 6, 5, 2, 7, 8, 1, 3, 9, 4 },
            new int[] { 9, 8, 1, 3, 4, 5, 2, 7, 6 },
            new int[] { 3, 7, 4, 9, 6, 2, 8, 1, 5 } };
    public static int[][] test5 = new int[][] {
            new int[] { 6, 2, 8, 4, 5, 1, 7, 9, 3 },
            new int[] { 5, 9, 4, 7, 3, 2, 6, 8, 1 },
            new int[] { 7, 1, 3, 6, 8, 9, 5, 4, 2 },
            new int[] { 2, 4, 7, 3, 1, 5, 8, 6, 9 },
            new int[] { 9, 6, 1, 8, 2, 7, 3, 5, 4 },
            new int[] { 3, 8, 5, 9, 6, 4, 2, 1, 7 },
            new int[] { 1, 5, 6, 2, 4, 3, 9, 7, 8 },
            new int[] { 4, 3, 9, 5, 7, 8, 1, 2, 6 },
            new int[] { 8, 7, 2, 1, 9, 6, 4, 3, 5 } };
    public static int[][] test6 = new int[][] {
            new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 },
            new int[] { 4, 5, 6, 7, 8, 9, 1, 2, 3 },
            new int[] { 7, 8, 9, 1, 2, 3, 4, 5, 6 },
            new int[] { 2, 1, 4, 3, 6, 5, 8, 9, 7 },
            new int[] { 3, 6, 5, 8, 9, 7, 2, 1, 4 },
            new int[] { 8, 9, 7, 2, 1, 4, 3, 6, 5 },
            new int[] { 5, 3, 1, 6, 4, 8, 9, 7, 2 },
            new int[] { 6, 4, 8, 9, 7, 2, 5, 3, 1 },
            new int[] { 9, 7, 2, 5, 3, 1, 6, 4, 8 } };
    public static int[][] test7 = new int[][] {
            new int[] { 1, 4, 5, 7, 9, 2, 8, 3, 6 },
            new int[] { 3, 7, 6, 5, 8, 4, 1, 9, 2 },
            new int[] { 2, 9, 8, 3, 6, 1, 7, 5, 4 },
            new int[] { 7, 3, 1, 9, 2, 8, 6, 4, 5 },
            new int[] { 8, 5, 9, 6, 4, 7, 3, 2, 1 },
            new int[] { 4, 6, 2, 1, 3, 5, 9, 8, 7 },
            new int[] { 6, 2, 4, 8, 7, 3, 5, 1, 9 },
            new int[] { 5, 8, 7, 4, 1, 9, 2, 6, 3 },
            new int[] { 9, 1, 3, 2, 5, 6, 4, 7, 8 } };
    public static int[][] test8 = new int[][] {
            new int[] { 5, 2, 7, 4, 1, 6, 9, 3, 8 },
            new int[] { 8, 6, 4, 3, 2, 9, 1, 5, 7 },
            new int[] { 1, 3, 9, 5, 7, 8, 6, 4, 2 },
            new int[] { 2, 9, 1, 8, 5, 4, 3, 7, 6 },
            new int[] { 3, 4, 8, 6, 9, 7, 5, 2, 1 },
            new int[] { 6, 7, 5, 1, 3, 2, 4, 8, 9 },
            new int[] { 7, 1, 2, 9, 4, 5, 8, 6, 3 },
            new int[] { 4, 8, 3, 2, 6, 1, 7, 9, 5 },
            new int[] { 9, 5, 6, 7, 8, 3, 2, 1, 4 } };
    public static int[][] test9 = new int[][] {
            new int[] { 2, 4, 6, 7, 1, 3, 9, 8, 5 },
            new int[] { 1, 8, 5, 4, 9, 6, 7, 3, 2 },
            new int[] { 9, 3, 7, 8, 2, 5, 1, 4, 6 },
            new int[] { 6, 7, 8, 5, 4, 2, 3, 9, 1 },
            new int[] { 4, 9, 3, 1, 6, 8, 2, 5, 7 },
            new int[] { 5, 1, 2, 3, 7, 9, 4, 6, 8 },
            new int[] { 8, 2, 4, 9, 5, 7, 6, 1, 3 },
            new int[] { 7, 5, 9, 6, 3, 1, 8, 2, 4 },
            new int[] { 3, 6, 1, 2, 8, 4, 5, 7, 9 } };
    public static int[][] test10 = new int[][] {
            new int[] { 8, 6, 1, 2, 9, 4, 5, 7, 3 },
            new int[] { 4, 7, 5, 3, 1, 8, 6, 9, 2 },
            new int[] { 3, 9, 2, 5, 6, 7, 8, 1, 4 },
            new int[] { 2, 3, 6, 4, 5, 9, 7, 8, 1 },
            new int[] { 1, 5, 4, 7, 8, 3, 2, 6, 9 },
            new int[] { 9, 8, 7, 6, 2, 1, 3, 4, 5 },
            new int[] { 5, 2, 9, 1, 7, 6, 4, 3, 8 },
            new int[] { 6, 4, 8, 9, 3, 2, 1, 5, 7 },
            new int[] { 7, 1, 3, 8, 4, 5, 9, 2, 6 } };

    @Test
    public void test2() throws Exception {
        int encodedLength = 0;
        Puzz expected = new Puzz(test1);
        Puzz test = (Puzz) expected.clone();

        long start = System.currentTimeMillis();
        test = test.scramble(22);
        long duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

        String encoded = Puzz.encode(test);

        System.out.println("Encoded Length with BigInteger: " + encoded.length());
        encodedLength += encoded.length();


        expected = new Puzz(test2);
        test = (Puzz) expected.clone();

        start = System.currentTimeMillis();
        test = test.scramble(22);
        duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

        encoded = Puzz.encode(test);

        System.out.println("Encoded Length with BigInteger: " + encoded.length());
        encodedLength += encoded.length();

        expected = new Puzz(test3);
        test = (Puzz) expected.clone();

        start = System.currentTimeMillis();
        test = test.scramble(22);
        duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

        encoded = Puzz.encode(test);

        System.out.println("Encoded Length with BigInteger: " + encoded.length());
        encodedLength += encoded.length();

        expected = new Puzz(test4);
        test = (Puzz) expected.clone();

        start = System.currentTimeMillis();
        test = test.scramble(22);
        duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

        encoded = Puzz.encode(test);

        System.out.println("Encoded Length with BigInteger: " + encoded.length());
        encodedLength += encoded.length();

        expected = new Puzz(test5);
        test = (Puzz) expected.clone();

        start = System.currentTimeMillis();
        test = test.scramble(22);
        duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

        encoded = Puzz.encode(test);

        System.out.println("Encoded Length with BigInteger: " + encoded.length());
        encodedLength += encoded.length();

        expected = new Puzz(test6);
        test = (Puzz) expected.clone();

        start = System.currentTimeMillis();
        test = test.scramble(22);
        duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

        encoded = Puzz.encode(test);

        System.out.println("Encoded Length with BigInteger: " + encoded.length());
        encodedLength += encoded.length();

        expected = new Puzz(test7);
        test = (Puzz) expected.clone();

        start = System.currentTimeMillis();
        test = test.scramble(22);
        duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

        encoded = Puzz.encode(test);

        System.out.println("Encoded Length with BigInteger: " + encoded.length());
        encodedLength += encoded.length();

        expected = new Puzz(test8);
        test = (Puzz) expected.clone();

        start = System.currentTimeMillis();
        test = test.scramble(22);
        duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

        encoded = Puzz.encode(test);

        System.out.println("Encoded Length with BigInteger: " + encoded.length());
        encodedLength += encoded.length();

        expected = new Puzz(test9);
        test = (Puzz) expected.clone();

        start = System.currentTimeMillis();
        test = test.scramble(22);
        duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

        encoded = Puzz.encode(test);

        System.out.println("Encoded Length with BigInteger: " + encoded.length());
        encodedLength += encoded.length();

        expected = new Puzz(test10);
        test = (Puzz) expected.clone();

        start = System.currentTimeMillis();
        test = test.scramble(22);
        duration = System.currentTimeMillis() - start;
        System.out.println("Duration of scramble for 22 clue puzzle: " + duration);
        System.out.println("Scrambled");
        System.out.println(test);

encoded = Puzz.encode(test);
encodedLength += encoded.length();

        System.out.println("Final Result: " + encodedLength); 
    }

}

Test Output

Duration of scramble for 22 clue puzzle: 427614
Scrambled
0 0 3 0 0 0 0 0 6 
0 2 0 0 0 0 0 9 0 
0 0 0 0 9 6 7 5 0 
0 4 0 0 0 5 0 1 0 
0 0 0 1 0 0 0 0 0 
0 5 0 0 0 0 8 4 0 
0 0 0 3 0 0 5 0 7 
7 0 0 9 0 8 0 3 0 
0 0 0 0 0 0 0 0 0 

Building encoded string: U5[XZ+C6Bgf)}O."gDE)`\)kNv7*6}1w+
Encoded Length with BigInteger: 33

Duration of scramble for 22 clue puzzle: 167739
Scrambled
0 2 4 0 0 0 0 0 0 
1 6 0 0 4 0 8 0 5 
0 0 5 0 9 7 2 0 0 
0 0 0 0 2 4 0 0 1 
0 0 3 9 0 0 0 0 0 
0 0 0 0 0 0 0 0 7 
0 4 0 0 0 0 0 0 8 
0 1 0 5 0 0 0 3 0 
0 0 0 0 0 0 0 0 0 

Building encoded string: 7\c^oE}`H6@P.&E)Zu\t>B"k}Vf<[0a3&
Encoded Length with BigInteger: 33

Duration of scramble for 22 clue puzzle: 136364
Scrambled
0 0 7 0 8 0 0 0 0 
0 3 2 0 0 9 6 0 0 
0 0 0 0 0 0 2 5 0 
0 2 0 0 0 6 0 0 0 
0 0 0 9 0 0 0 0 0 
0 0 4 1 0 5 7 2 0 
5 0 1 0 0 0 0 7 0 
2 8 9 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 

Building encoded string: [S#bHlTDwS,&w,moQ{WN}Z9!{1C>.vN{-
Encoded Length with BigInteger: 33

Duration of scramble for 22 clue puzzle: 392150
Scrambled
0 0 0 0 0 6 0 0 0 
0 9 0 0 0 0 0 0 1 
4 0 0 0 0 3 6 0 8 
0 0 0 0 0 0 0 8 0 
0 0 3 0 7 8 0 0 9 
7 0 0 0 0 0 0 0 3 
6 0 2 0 0 0 0 9 0 
9 0 1 3 4 0 2 0 0 
0 0 0 0 0 0 0 0 0 

Building encoded string: T-yKJ2<d)Dj~[~>]334*9YpxM<JQNf2|<
Encoded Length with BigInteger: 33

Duration of scramble for 22 clue puzzle: 169355
Scrambled
0 0 0 0 0 1 0 0 0 
0 9 4 7 0 0 0 8 0 
0 1 3 0 0 0 5 0 2 
0 0 0 0 0 0 0 0 9 
0 0 0 0 2 7 3 5 4 
0 8 0 0 0 0 0 1 0 
0 0 0 0 4 0 9 0 8 
0 0 0 5 0 0 0 0 6 
0 0 0 0 0 0 0 0 0 

Building encoded string: 5@.=FmOKws7jl5*hWMQqqou\lv'e^Q}D:
Encoded Length with BigInteger: 33

Duration of scramble for 22 clue puzzle: 786
Scrambled
0 2 3 0 0 6 0 0 0 
0 5 0 7 0 0 1 2 3 
0 8 0 0 2 0 0 0 0 
0 0 0 0 0 5 0 0 7 
0 6 5 8 0 0 0 0 0 
0 0 7 0 0 4 3 0 0 
0 3 0 0 4 0 0 0 2 
0 0 0 0 0 2 0 0 0 
0 0 0 0 0 0 0 0 0 

Building encoded string: wY%(O9tOSDZu-PBaFl^.f0xH7C~e)=\3&
Encoded Length with BigInteger: 33

Duration of scramble for 22 clue puzzle: 826530
Scrambled
0 0 0 0 9 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 9 0 3 0 1 7 0 0 
0 3 0 0 0 8 0 4 5 
0 0 9 0 0 7 3 0 0 
0 0 2 0 3 0 0 8 0 
6 0 0 0 0 0 0 0 9 
5 0 0 4 1 0 2 0 3 
0 0 0 0 0 0 0 0 0 

Building encoded string: K|>.Aa?,8e&NRL;*ut=+Iqk8E$@&-zlF9
Encoded Length with BigInteger: 33

Duration of scramble for 22 clue puzzle: 4834
Scrambled
0 2 0 0 1 0 0 3 8 
8 6 0 3 0 0 1 0 0 
0 0 0 0 0 8 6 0 2 
0 0 0 0 0 0 0 7 0 
0 0 8 0 0 0 0 0 0 
0 0 0 0 3 0 0 0 0 
0 0 2 0 0 5 8 0 3 
4 0 0 0 0 1 7 9 0 
0 0 0 0 0 0 0 0 0 

Building encoded string: GOS0!r=&HR5PZ|ezy>*l7 HWU`wIN7Q4&
Encoded Length with BigInteger: 33

Duration of scramble for 22 clue puzzle: 42126
Scrambled
0 0 0 0 0 3 0 0 5 
0 0 5 4 0 0 0 3 2 
9 0 0 8 0 0 0 0 0 
0 0 0 0 0 2 0 0 0 
0 0 0 0 6 8 2 0 7 
5 1 0 0 7 0 0 0 8 
8 0 0 0 5 0 0 1 0 
7 0 0 0 0 0 0 0 4 
0 0 0 0 0 0 0 0 0 

Building encoded string: [4#9D_?I1.!h];Y_2!iqLyngbBJ&k)FF;
Encoded Length with BigInteger: 33

Duration of scramble for 22 clue puzzle: 156182
Scrambled
0 6 0 0 0 0 0 7 0 
4 0 5 3 1 0 0 0 2 
0 0 0 0 6 0 0 0 0 
0 3 0 0 0 9 0 8 1 
0 0 0 0 0 0 0 0 0 
0 0 7 0 0 1 0 4 5 
5 0 9 0 0 0 0 0 8 
6 0 0 0 3 2 0 0 0 
0 0 0 0 0 0 0 0 0 

Building encoded string: r+a;I%hGj4YCA-pXz+n=ioRL:agzH'K<(
Encoded Length with BigInteger: 33
Final Result: 330

maple_shaft

Posted 2014-11-15T00:26:23.360

Reputation: 421

I don't know if you are doing this already, but the last row of your scramble always ends up being all zeros, so one optimization you could do is take the last row for granted and just assume that whatever cells are missing from the end of your encoding are all zeros. – kukac67 – 2014-12-16T22:00:04.947

0

C++ 241, score: 82*10=820

Adds '!' to beginning of encoded string to determine which operation to perform.

Golfed 241 chars

void D(char i){static int x=0;cout<<(int)(i-'a')<<" ";if(x++%8==0) cout<<endl;}
int main()
{
int i=81;int n;string S;
char c=cin.peek();
if(c=='!'){cin>>S;for_each(S.begin()+1,S.end(),D);}
else{S.push_back('!');while(i--){cin>>n;S.push_back(n+'a');}cout<<S;}
}

Ungolfed 312 chars

void decode(char i) {
static int x=0;
cout<<(int)(i-'a')<<" ";
if(x++%8==0) cout<<endl;
}
int main()
{
int i=81;
int n;
string d;
char c=cin.peek();
if(c=='!'){
cin>>d;
for_each(d.begin()+1,d.end(),decode);
}
else{
d.push_back('!');
while(i--)
{
cin>>n;
d.push_back(n+'a');
}
cout<<d;
}
}

bacchusbeale

Posted 2014-11-15T00:26:23.360

Reputation: 1 235

4This isn't code golf. The point of this challenge is to minimise the encoded board length... – John Dvorak – 2014-11-15T22:08:23.210

1So every sudoku grid is conpressable to 82 bytes in length? – Beta Decay – 2014-11-16T00:09:34.823