How can I score challenges with variable problem sizes?

21

2

There is fairly strong support on meta for challenge-writing questions to be on topic on main, provided these questions are specific and answerable. However, we don't have any such questions yet, so I figured I'd test the waters. This question is probably entering good subjective, bad subjective territory, but I think that's likely what challenge-writing questions will have to be. To ensure that they still generate high-quality content, please don't just post wild speculative ideas in the answers. Explain why they avoid the problems mentioned below, or ideally point to existing challenges which have successfully used the suggested technique in the past.

For certain optimisation challenges, a free parameter in setting the challenge is the size of the problem to be optimised. By "optimisation challenge" I don't mean things like our genre, where answers are usually required to be exact/optimal, and the challenge is scored either on a fixed problem size or on the largest problem size that can be handled in a fixed amount of time. I mean specifically challenges where suboptimal solutions to the underlying problem are allowed and even likely, and the goal is to do as well as possible.

For the sake of definiteness, consider busy beaver challenges, although in principle this applies to other challenge types without known optimal solutions as well (I'm just using busy beavers here because they exacerbate the problems mentioned below). Say, I wanted to make a challenge about finding the busiest Brainfuck beaver. The free parameter in busy beaver problems is the size of the code. I can't set the challenge without making reference to the code size in some way. In a way, each value of the problem-size parameter N gives a separate (increasingly difficult) challenge. My main question is how I can make such a challenge work without running into balancing problems.

The obvious solution is to fix N: "Find a terminating Brainfuck program with N bytes of source code which prints as many characters as possible/runs for as many ticks as possible." This has massive balancing issues: if I pick the size too small, someone might quickly find the busiest beaver and the challenge is over. If I pick the size too large, the optimal solution will print an astronomical amount of characters before terminating, which means that it will likely be trivial to find such programs and the challenge becomes a chore/exercise in patience — this also leaves the territory where busy beavers can be found programmatically, and instead people would need to start proving their results formally which a lot of people might not consider very fun. Of course, this problem is more pronounced in busy beaver challenges than other types, due to the growth of the optimal solutions, but it applies to other challenges nevertheless.

The next option would to leave N unconstrained and make it part of the scoring via some function. Even for "normal" challenges getting the balance of combined scores right is incredibly difficult, but in the case of busy beavers it's actually fundamentally impossible, due to the fact that optimal solutions grow faster with N than any computable function. That means I can always beat the best existing answer by going to a sufficiently large N where I can easily find a program that runs for so long that I can get a better score without much effort.

I have also considered setting a fixed N and allowing people to also submit beavers for larger N which will be used as successive tie breakers. This has a similar problem, in that someone can just "happen to find an equally good busy beaver" for a N, thereby creating a tie and then just submitting pretty much anything for the next N where finding a large score is easier (even if finding the optimal score gets harder). In these cases, how would you deal with multiple people using the same solution? Forbidding it would also be weird in case it's optimal.

Maybe one could strike a middle ground, by making an educated guess at a reasonable N and then asking for busy beavers for all sizes within (say) 5 bytes of N, so that there's some leeway in both directions (and then you combine the ~10 scores into a single one by one technique or another). This doesn't feel quite satisfactory either because my initial guess for N could still be widely off the range that makes for interesting challenges.

TL;DR: in cases where the challenge is to (suboptimally solve and) optimise a problem whose size is variable, how do I incorporate the size into the challenge? Ideally I would want people to be able to work with a value of N that is near the upper end of the range of tractable sizes. But just in case it turns out that optimal solutions are possible for that N, it would be great if solutions for slightly larger N would start to weigh in, such that the challenge could continue with a more interesting problem size.

Martin Ender

Posted 2016-04-19T18:24:33.733

Reputation: 184 808

6I like this as a model for challenge-writing questions because it's not specific to PPCG. It's not "How should we do this?" but "What's a good way to do this?". I could imagine such challenges being run on a programming hobbyist site or at an in-person competition. – xnor – 2016-04-19T20:59:15.310

Put the tldr at the top! – Majora320 – 2016-04-20T06:34:59.620

1@Majora320 ...but then change d to w :-) – Luis Mendo – 2016-04-20T11:39:59.620

Answers

3

Find the next N

The challenge would indicate an N that submissions should start at.

Then, people would submit answers at the current N. If a given submission is proven to be optimal, then N is increased by 1, and the process repeats.

There are several ways to score this:

  1. Score the best submission at the current N
  2. Give a point to the best submission at the current N, plus a point for each optimal solution
  3. Same as #2, but also give a point to the person that proved that a given submission was optimal.

Nathan Merrill

Posted 2016-04-19T18:24:33.733

Reputation: 13 591

1

Give points for solutions within a bounded N

Allow N to be within a fixed bounds. The lower bound should exclude obviously trivial answers, and that the higher bound shouldn't be too far from the lower bound.

Then, give 1 point for each person that has the best solution for each N within the bounds. If higher N means that the solution is harder, then give N points to them. (or some formula based on N).

This method is similar to how AZsPCs does it, but partial points aren't given.

Nathan Merrill

Posted 2016-04-19T18:24:33.733

Reputation: 13 591