42
5
Notice: Following popular demand I have slightly relaxed the rules:
- The maximum regex size grows by 1 byte every 5 answers. Answer N may use up to 29 + ⌈N/5⌉ bytes.
- The score of each answer will be (M/(30+N/5))N
In regex golf, you're given two sets of strings, and are asked to create the shortest regex which matches all strings in the first set, but fails on all strings in the second set.
That is what we're going to do, but each time someone answers, their regex itself will be added to one of the two sets of strings (of their own choice). Therefore, there is a strict order to answers in this challenge.
Let's go through an example:
- Say I start this with
abc
(which I won't), and put it in the match set. - Then a valid second answer would be
a
, as it matches the above (and there are no strings that need to fail yet). Say this answer goes in the fail set. - Now the third answer has to match
abc
but fail ona
. A possible third answer is thereforeb
. Let's put this in the match set. - The fourth answer now has to match
abc
andb
, but fail ona
. We'll disallow duplicate answers, so a valid regex would bec|b
.
What's important is that your answer should be as short as possible. This may be trivial for the first few answers, but once we get a few answers, it should get harder and harder to get the desired match in as few characters as possible.
For the actual challenge, initially the match set contains PPCG
and the fail set contains [PPCG]
, and I have already provided the first answer.
Answering
The key thing to understand about this challenge is that only one person can answer at a time and each answer depends on the one before it.
There should never be two answers with the same N
. If two people happen to simultaneously answer for some N
, the one who answered later (even if it's a few seconds difference) should graciously delete their answer.
To make this run a bit smoother, try to stick to the following steps when posting your answer:
- Make sure that someone has independently verified the correctness of the previous answer (and left a corresponding comment).
- Take the two test sets found in the previous answer, and write a regex which matches all strings in one set and none in the other.
Post your answer in the following format:
# N. [regex flavour] - [regex size in bytes] [regex] [link to online regex tester] [notes, explanation, observations, whatever] ### The next answer has to match the following strings: [match set] ### And fail on these strings: [fail set]
where
N
is the number of your answer. Please copy[match set]
and[fail set]
from the previous answer, and append your regex to one of them.This is absolutely vital to the challenge! I've provided a dashboard tool for the challenge to help with the bookkeeping, and it relies on the above template. (See bottom of the post.)
- Another user should now review your submission and leave a comment "Correctness verified" if your answer follows all the rules (see below). If it doesn't, they should leave a comment pointing out any flaws. You've then got 15 minutes to fix those issues. If you don't, your answer will be deemed invalid, should be deleted, and someone else may post a follow-up answer to the previous one. (If this happens, you're free to submit a new answer any time.)
These regulations may seem rather strict, but they are necessary to avoid invalid answers somewhere up the chain.
Rules
- A user may only submit one answer per 4 hour period. (This is to prevent users from constantly watching the question and answering as much as possible.)
- A user may not submit two answers in a row. (e.g. since I submitted answer 1 I can't do answer 2, but I could do 3.)
- Do not edit answers that have been verified. (Even if you find a way to shorten it!)
- Should a mistake be discovered earlier in the chain (i.e. after follow-up answers have been posted), the offending answer should be deleted and will be removed from the set of strings that new submissions should fail on. However, all answers that have been posted since should not be changed to reflect.
- Clearly state one flavour your regex is valid in. You may choose any flavour that is freely testable online. There's a good list of online testers over on StackOverflow. In particular, Regex101 and RegexPlanet should be useful, as they support a wide variety of flavours. Please include a link to the tester you chose in your answer. By switching on the
g
lobal andm
ultiline modifiers in the tester, you can test all strings at once, one on each line (these modifiers are not counted towards your regex size, because they aren't needed on any individual string). - Your regex must not be empty.
- Your regex for answer N must not be longer than 29 + ⌈N/5⌉ bytes. I.e. answers 1 to 5 may use up to 30 bytes (inclusive), answers 6 to 10 may use up to 31 bytes... answers 31 to 35 may use up to 36 bytes. Check the dashboard to see how many characters the next answer may use.
- Your regex must not be identical to any string in either test set.
- Do not include delimiters in your submission or byte count, even if the relevant host language uses them. If your regex uses modifiers, add one byte per modifier to the regex size. E.g.
/foo/i
would be 4 bytes.
Scoring
Each answer's score is calculated as (M/(30+N/5))N, where M is the size of the regex in bytes, and N is it's number. Each user's score is the product of all their answers. The user with the lowest overall score wins. In the unlikely event of a tie, the user with the latest submission wins. I will accept that user's latest answer.
If you prefer summing scores, you can calculate each answer's score as N * (log(M) - log(30)) and sum those up over all answers. That will give the same leaderboard order.
There is no need to include an answer's score in the answer, just report M. The challenge dashboard at the bottom of the question will compute the scores, and in the event of two very close scores, I'll double check the results using arbitrary-precision types.
Note that the score of each answer is less than 1, so you can improve your overall score by providing a new answer. However, the shorter each of your submissions, the more efficiently you can lower your score. Furthermore, later answers can achieve a lower score although being longer, due to the increasing exponent.
Dashboard
I've written a little Dashboard tool, using Stack Snippets, based on Optimizer's work over here. I hope this will help us get some order into these answer-dependent challenges.
This will display the current status of the challenge - in particular, if there are conflicting answers, if an answer needs to be verified, or if the next answer can be posted.
It also produces a list of all answers with scores, as well as a leaderboard of all users. Please stick to the challenge format above, so the dashboard can read out the relevant strings from your answers. Otherwise you might not be included in the leaderboard.
Please let me know (ideally in chat) if you spot any bugs or have some ideas how the usefulness of the tool could be improved.
function answersUrl(e){return"http://api.stackexchange.com/2.2/questions/"+QUESTION_ID+"/answers?page="+e+"&pagesize=100&order=desc&sort=creation&site=codegolf&filter="+ANSWER_FILTER}function commentsUrl(e,t){return"http://api.stackexchange.com/2.2/answers/"+e+"/comments?page="+t+"&pagesize=100&order=asc&sort=creation&site=codegolf&filter="+COMMENT_FILTER}function getAnswers(){$.ajax({url:answersUrl(page++),method:"get",dataType:"jsonp",crossDomain:true,success:function(e){answers.push.apply(answers,e.items);if(e.has_more)getAnswers();else{page=1;getFinalComments()}}})}function getFinalComments(){answers=answers.filter(shouldHaveHeading);answers=answers.filter(shouldHaveScore);$.ajax({url:commentsUrl(answers[0].answer_id,page++),method:"get",dataType:"jsonp",crossDomain:true,success:function(e){comments.push.apply(comments,e.items);if(e.has_more)getAnswers();else process()}})}function shouldHaveHeading(e){var t=false;try{t|=/^(#|<h).*/.test(e.body_markdown);t|=["-","="].indexOf(e.body_markdown.split("\n")[1][0])>-1}catch(n){}return t}function shouldHaveScore(e){var t=false;try{t|=SIZE_REG.test(e.body_markdown.split("\n")[0])}catch(n){}return t}function findDuplicates(e){var t=false;var n={};e.forEach(function(e){var r=e.body_markdown.split("\n")[0].match(NUMBER_REG)[0];if(n[r])t=t||r;n[r]=true});return t}function hasBeenVerified(e,t){var n=false;t.forEach(function(t){n|=/correctness verified/i.test(t.body_markdown)&&e!=t.owner.user_id});return n}function userTimedOut(e){return NOW-e.creation_date*1e3<MSEC_PER_ANSWER}function getAuthorName(e){return e.owner.display_name}function getAnswerScore(e,t){e=parseInt(e);t=parseInt(t);return Math.pow(t/(30+e/5),e)}function process(){$("#last-user").append(answers[0].owner.display_name);var e=answers.slice(1).filter(userTimedOut).map(getAuthorName).join(", ");if(e)$("#timed-out-users").append(e);else $("#timed-out-notice").hide();var t=answers[0].body_markdown.split("\n")[0].match(NUMBER_REG)[0];var n=findDuplicates(answers);if(n){var r=$("#status-conflict-template").html();$("#challenge-status").append(r.replace("{{NUMBER}}",n));$("#challenge-status").addClass("conflict")}else if(!hasBeenVerified(answers[0].owner.user_id,comments)){var r=$("#status-verification-template").html();$("#challenge-status").append(r.replace("{{NUMBER}}",t));$("#challenge-status").addClass("verification")}else{var r=$("#status-next-template").html();$("#challenge-status").append(r.replace("{{NUMBER}}",t).replace("{{NEXT}}",parseInt(t)+1).replace("{{SIZE}}",29+Math.ceil((parseInt(t)+1)/5)));$("#challenge-status").addClass("next")}var i={};var s={};answers.forEach(function(e){var t=e.body_markdown.split("\n")[0];var n=$("#answer-template").html();var r=t.match(NUMBER_REG)[0];var o=(t.match(SIZE_REG)||[0])[0];var u=getAnswerScore(r,o);var a=getAuthorName(e);n=n.replace("{{NAME}}",a).replace("{{NUMBER}}",r).replace("{{SIZE}}",o).replace("{{SCORE}}",u.toExponential(5)).replace("{{LINK}}",e.share_link);n=$(n);$("#answers").append(n);i[a]=(i[a]||1)*u;s[a]=(s[a]||0)+1});var o=[];for(var u in i)if(i.hasOwnProperty(u)){o.push({name:u,numAnswers:s[u],score:i[u]})}o.sort(function(e,t){return e.score-t.score});var a=1;o.forEach(function(e){var t=$("#user-template").html();t=t.replace("{{NAME}}",e.name).replace("{{NUMBER}}",a++).replace("{{COUNT}}",e.numAnswers).replace("{{SCORE}}",e.score.toExponential(5));t=$(t);$("#users").append(t)})}var QUESTION_ID=41462;var ANSWER_FILTER="!*cCFgu5yS6BFQP8Z)xIZ.qGoikO4jB.Ahv_g-";var COMMENT_FILTER="!)Q2B_A497Z2O1kEH(Of5MUPK";var HOURS_PER_ANSWER=4;var MSEC_PER_ANSWER=HOURS_PER_ANSWER*60*60*1e3;var NOW=Date.now();var answers=[],comments=[],page=1;getAnswers();var SIZE_REG=/\d+(?=[^\d&]*(?:<(?:s>[^&]*<\/s>|[^&]+>)[^\d&]*)*$)/;var NUMBER_REG=/\d+/
body{text-align:left!important}#challenge-status{font-weight:700;padding:10px;width:620px}#blocked-users{padding:10px;width:620px}.conflict{background:#994343;color:#fff}.verification{background:#FFDB12}.next{background:#75FF6E}#last-user,#timed-out-users{font-weight:700}#answer-list{padding:10px;width:300px;float:left}#leaderboard{padding:10px;width:290px;float:left}table thead{font-weight:700}table td{padding:5px}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script> <link rel="stylesheet" type="text/css" href="//cdn.sstatic.net/codegolf/all.css?v=83c949450c8b"> <div id="challenge-status"> </div><div id="blocked-users"> User <span id="last-user"></span> has posted the last answer, and may not post the next one. <div id="timed-out-notice"><span id="timed-out-users"></span> have answered within the last four hours and may not answer again yet. (If a user appears in this list twice, they must have answered twice within four hours!)</div></div><div id="answer-list"> <h2>List of Answers (newest first)</h2> <table class="answer-list"> <thead> <tr><td>No.</td><td>Author</td><td>Size</td><td>Score</td></tr></thead> <tbody id="answers"> </tbody> </table> </div><div id="leaderboard"> <h2>Leaderboard</h2> <table class="leaderboard"> <thead> <tr><td>No.</td><td>User</td><td>Answers</td><td>Score</td></tr></thead> <tbody id="users"> </tbody> </table> </div><table style="display: none"> <tbody id="answer-template"> <tr><td>{{NUMBER}}</td><td>{{NAME}}</td><td>{{SIZE}}</td><td>{{SCORE}}</td><td><a href="{{LINK}}">Link</a></td></tr></tbody> </table> <table style="display: none"> <tbody id="user-template"> <tr><td>{{NUMBER}}</td><td>{{NAME}}</td><td>{{COUNT}}</td><td>{{SCORE}}</td></tr></tbody> </table> <div id="status-conflict-template" style="display: none"> There is more than one answer with number {{NUMBER}}!<br>Please resolve this conflict before posting any further answer. </div><div id="status-verification-template" style="display: none"> Answer {{NUMBER}} has not been verified!<br>Please review the answer and post a comment reading "Correctness verified." on the answer if it is valid. Note that this has to be done by a different user than the author of the answer! </div><div id="status-next-template" style="display: none"> Answer {{NUMBER}} has been verified!<br>You may now post answer {{NEXT}}, using up to {{SIZE}} bytes. </div>
The rules turned out to be a bit stricter than I intended. After some discussion in chat I'm considering to relax the rules a little after the bounty has run out. I'll post 3 comments for the options I can think of below. Please indicate your preference by voting on the comments. – Martin Ender – 2014-11-22T12:36:15.607
2Rules are rules. Don't change them. It might be a shame that it's pretty much impossible to post another answer, but that doesn't justify changing the rules. – Martin Ender – 2014-11-22T12:37:22.820
2Allow for an additional byte every 10 answers. Correspondingly, change the answer score to (M/(30+N/10))^N. This would be applied retroactively, so the next answer could use up to 32 bytes. The change in scoring would not affect the top two places on the leaderboard, but the other users would be shuffled somewhat. – Martin Ender – 2014-11-22T12:38:31.200
8Allow for an additional byte every 5 answers. Correspondingly, change the answer score to (M/(30+N/5))^N. This would be applied retroactively, so the next answer could use up to 35 bytes. The change in scoring would not affect the top two places on the leaderboard, but the other users would be shuffled somewhat. – Martin Ender – 2014-11-22T12:39:33.537
(Note that the order of the leaderboard changes, because the alternative scoring methods would give a bigger advantage to late answers, with the 5-answer option emphasising late answers even more than the 10-answer option. This is probably a good thing, because I've received several complaints that the easy, early answers had too much of an advantage over late answers in the original rules.) – Martin Ender – 2014-11-22T13:08:43.773
4You people are strange and twisted. Why would you do this to yourselves? (It's fun to read though :P) – Joe – 2014-11-25T12:06:56.930