Predictor Generator

Write a program that when given a string \$S\$, it generates a program of the same language as itself.

The generated program, when given a continuous substring of \$S\$, should predict the next character in an occurence of the subtring in \$S\$. You are guaranteed that this substring occurs exactly once in \$S\$ and does not occur at the end of \$S\$. \$S\$ will be a string of at most \$10000\$ characters consisting of only lowercase English letters.

The program that you are submitting will only be given \$S\$. The program that your program generates will be given a list of substrings and should return a list of answers in an acceptable format.

This is code-challenge, and the scoring criterion is as follows:

The score of the generated program is equal to \$\frac{|S|c}{1+Lt}\$, where \$|S|\$ is the length of the string, \$c\$ is the number of correct predictions, \$L\$ is the length of the generated program, and \$t\$ is the total number of predictions.

The score of your submission is the minimum score of your generated programs across all testcases.

Users are welcome to "hack" other people's submissions by proposing new test cases in order to decrease submissions' scores.

Some example test cases can be found here (currently only two test cases, I will add more later).

Baaing Cow

Posted 2020-02-13T12:20:14.633

Reputation: 349

2"Across all testcases" means over everyone's answers? Does this mean you don't know your score when you submit? – xnor – 2020-02-13T12:30:20.800

"Across all testcases" means min(score of the generated program for S) for your submission. (Can someone verify the link works?) – Baaing Cow – 2020-02-13T12:32:00.347

1I'm confused then what you mean about hacking other users' submissions. How can you affect their scores? – xnor – 2020-02-13T12:33:16.533

@xnor Maybe he meant has "hack" as in codeforces, I think. The person have to write code again -> (maybe more increased length?) – buttercrab – 2020-02-13T12:34:04.573

The hacker can propose a test case, and since your score is the minimum for all the test cases, your score will decrease if your generator performs particularly badly on a certain case. – Baaing Cow – 2020-02-13T12:34:07.903

@jaeyongsung Well the person's code may just perform very badly, causing their score to decrease, and everyone's scores will be asked to be re-evaluated (I do mean codeforces' hacking system( – Baaing Cow – 2020-02-13T12:34:48.177

4Oh, I see. It's really not clear though that users can add test cases, since usually these are written into the spec. And is this really better than just being required to prove that your program meets a score threshold? I understand it might not be obvious what the worst case is, but it seems to me it shouldn't be hard in practice. – xnor – 2020-02-13T12:38:56.783

4I'd really suggest putting this challenge in the Sandbox, both as general advice, and because this challenge has unusual format and scoring which can be both hard to balance well and hard to explain clearly. – xnor – 2020-02-13T12:41:19.260

4And maybe I'm missing something, but does anything stop me from "proposing" a random string of 10,000 lowercase letters along with a single random substring that appears once? Then, any program that doesn't guess the next letter gets a score of 0. And because my string is random as is the position of the substring, I think any generator has to encode the full string S, or a packed form, in is code. And this makes it seem like optimally compressing strings of lowercase letters into the character-space of your language is a crucial part of the challenge. – xnor – 2020-02-13T12:53:19.813

1You definitely want to rethink that scoring formula. Right now, the only possible score is 0: nothing prevents me from adding a test case where S is the empty string, right? And even if you ban the empty string, now the score is 1 / (1 + length of the program generated for a length-1 test case). – Grimmy – 2020-02-13T16:18:16.343

Predictor Generator

No answers