Write a Kolmogorov Complexity Solver

16

3

The Kolmogorov complexity of a string S is the length of the shortest program P, written in some programming language L, whose output is exactly S.
(Yes, the real definition is more formal but this will suffice for the challenge.)

Your task in this challenge is to write the shortest possible "Kolmogorov complexity solver", that is, a program written in L itself that takes in a string S and returns the shortest P written in L that outputs S.

The naive approach to this is to iterate over all length 1 programs, then all length 2 programs, then all length 3 programs, and so on, running each of them and measuring the output until a program that outputs S is found. The issue with this approach is that some of these programs may never stop running, which means that the solver itself may never stop. And due to the halting problem there is no surefire way to avoid the programs that don't stop.

A simple, though imperfect solution is to put a time limit on the execution time of each of the potential P's. Programs that don't happen to halt in time may be passed over, but the solver will definitely stop (assuming that a program in L can indeed output S within the time limit).

Challenge

Write your solver as a program or function that takes in three things:

  • The string S.
  • A positive integer T that is the time limit in seconds or some smaller time span (e.g. milliseconds).
  • A string A of the alphabet of characters to use for potential P's.

And outputs the shortest P that only contains characters in A, runs in less than T time units, and outputs S.

This is the general pseudocode:

Function KolmogorovComplexitySolver(S, T, A):
    Assign N to 0
    Loop until function returns:
        In any order, iterate over all strings of length N that only contain characters from *A*. Call the current string P:
            Execute P, saving the output to O, stopping the execution if it takes longer than time T
            If (P did not throw any errors) and (P did not timeout) and (O and S are identical):
                Return P
        Add 1 to N

Details

  • You may assume that there will always be a P made from the characters in A that runs in time T that outputs S.
  • You may assume that the execution of the potential P's will not have side effects that prevent the solver from running or working correctly (like messing with the solver's allocated memory).
  • You may not assume that the potential P's are error free. Be sure to include try/catch blocks or whatever applicable around the execution call.
  • If there are multiple shortest P's, then any will suffice. "Shortness" is measured in characters not bytes.
  • The output of potential P's is what's printed to stdout (or your language's usual output area). The empty string is a potential P.
  • Ideally your solver will allow A to contain any characters. A must at least to be able to contain printable ASCII characters plus tabs and newlines.
  • Input may come from file/stdin/command line/function args. Output goes to stdout or similar, or can be returned as a string if you wrote a function.

Scoring

The submission with the fewest bytes wins. Tiebreaker goes to earliest posted submission.

Calvin's Hobbies

Posted 2015-03-04T15:34:02.857

Reputation: 84 000

7My brain hurts. – Alex A. – 2015-03-04T19:59:09.600

1Can you relax the requirement that the target language and language which the meta solver is written in must be the same? – n̴̖̋h̷͉̃a̷̭̿h̸̡̅ẗ̵̨́d̷̰̀ĥ̷̳ – 2015-03-05T09:37:25.940

And wouldn't it be possible to just write a program which converts the output into string literal representation of the language? – n̴̖̋h̷͉̃a̷̭̿h̸̡̅ẗ̵̨́d̷̰̀ĥ̷̳ – 2015-03-05T09:41:16.653

@n̴̖̋h̷͉̃a̷̭̿h̸̡̅ẗ̵̨́d̷̰̀ĥ̷̳ No, the point is to do it in the same language. Yes, but that would not always be the shortest program. – Calvin's Hobbies – 2015-03-06T02:34:10.277

@Calvin'sHobbies: So in the end, it is only a code golf challenge to find out which language can easily write code to call facilities (or implement compiler where absent) to compile the language itself? – n̴̖̋h̷͉̃a̷̭̿h̸̡̅ẗ̵̨́d̷̰̀ĥ̷̳ – 2015-03-06T03:02:33.800

Answers

11

Python 3, 240 236 bytes

import subprocess as s,itertools as i
def f(S,T,A,N=0):
 while 1:
  for P in i.product(A,repeat=N):
   try:
    P="".join(P);open("a","w").write(P)
    if s.check_output(["py","-3","a"],timeout=T).decode()==S:return P
   except:1
  N+=1

Don't run this. On my computer, at least, I found the program really hard to stop once it started running due to the pop-up windows created per process.

timeouts were only added to subprocess.check_output in Python 3, which is why we're using this rather than Python 2.

Here's an alternative version with a time.sleep that also prints all valid programs found along the way, as well as their corresponding output:

import subprocess as s,itertools as i
import time
def f(S,T,A,N=0):
 while 1:
  for P in i.product(A,repeat=N):
   time.sleep(0.2)
   try:
    P="".join(P);open("a","w").write(P);C=s.check_output(["py","-3","a"],timeout=T).decode()
    print(P,repr(C))
    if C==S:return P
   except:1
  N+=1

The program uses the filename a for each program P to be tested, so if you run this make sure you don't already have a file of that name. Replace ["py","-3","a"] with the appropriate command for your setup (e.g. ["python","a"] or ["python3","a"]).

Feel free to change the sleep duration at your own risk :). Call like f("1\r\n",1,"1()print"), where T is timeout in seconds.

First few lines of output from the tester with the above call:

 ''
1 ''
11 ''
() ''
111 ''
(1) ''
int ''
1111 ''

(If you want to help the program along a bit you can change P="".join(P) to P="print"+"".join(P))

Since the above programs all have no output, here's the result for f("1\r\n",1,["print","(",")","1"]) (tokens aren't part of the challenge, but I wanted to show what happens):

 ''
print ''
1 ''
() ''
11 ''
print() '\r\n'
(print) ''
(1) ''
111 ''
print(print) '<built-in function print>\r\n'
print(1) '1\r\n'

The return value is the string 'print(1)'.

Finally, just for fun, here's what happens if the alphabet is string.printable, i.e.

0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c

Pastebin link of all valid 0-2 char Python 3 programs

Sp3000

Posted 2015-03-04T15:34:02.857

Reputation: 58 729