Make a regex polyglot

19

2

Write a regex that works in at least 2 regex flavors (or versions), and matches a different string in each flavor (or version) it is running in.

The strings to be matched in this challenge are the first words of the Ubuntu code names, which are listed below. Your regex has to match from the top of the list. That is, if your regex works in 3 flavors, it has to match Warty, Hoary and Breezy (one per flavor), and no others.

Warty
Hoary
Breezy
Dapper
Edgy
Feisty
Gutsy
Hardy
Intrepid
Jaunty
Karmic
Lucid
Maverick
Natty
Oneiric
Precise
Quantal
Raring
Saucy
Trusty
Utopic
Vivid
Wily
Xenial
Yakkety
Zesty
17.10
18.04
18.10
19.04
19.10
...

If your regex works in more than 26 flavors, you can match the Ubuntu version numbers instead. Starting from 17.10, for each new flavor, change the second number to 10 if it was 04, and increment the first number and change the second to 04 otherwise.
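As a rough illustration of that rule, here is a minimal Python sketch that generates the version-number targets. The function names `next_version` and `versions` are made up for this example and are not part of the challenge.

```python
def next_version(version):
    """Return the version that follows `version` under the rule above."""
    major, minor = version.split(".")
    if minor == "04":
        # Second number was 04: change it to 10, keep the first number.
        return major + ".10"
    # Otherwise: increment the first number and reset the second to 04.
    return str(int(major) + 1) + ".04"


def versions(count, start="17.10"):
    """Return the first `count` version strings, beginning at `start`."""
    out = []
    current = start
    for _ in range(count):
        out.append(current)
        current = next_version(current)
    return out


print(versions(5))  # ['17.10', '18.04', '18.10', '19.04', '19.10']
```

The printed values agree with the version numbers at the end of the list above.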

In each flavor, your regex should match only the intended string and nothing else (not limited to the code names). Trailing newlines don't matter. That means your regex could either match only the string without the trailing newline, match only the string with the trailing newline, or match both. And it doesn't have to be consistent across flavors. You can assume the input is in printable ASCII (except for the trailing newline, if there is one).

Your score is (the length of your code+10)/((number of flavors)^2). Lowest score wins.
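The formula can be written out as a small Python helper. This is only a sketch; the name `score` and its parameter names are chosen for illustration.

```python
def score(code_length, flavors):
    """Score = (code length + 10) / (number of flavors)^2; lower is better."""
    if flavors < 2:
        raise ValueError("the challenge requires at least 2 flavors")
    return (code_length + 10) / flavors ** 2


# A hypothetical 20-byte regex covering 2 flavors:
print(score(20, 2))  # 7.5
# The 87-byte, 5-flavor answer below:
print(score(87, 5))  # 3.88
```

Because the flavor count is squared, supporting one more flavor usually pays for quite a few extra bytes.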

jimmy23013

Posted 2016-12-03T07:38:36.137

Reputation: 34 042

Just to check - by "In each flavor, your regex should match only the supposed string and nothing else.", does this mean each regex should only match one Ubuntu version name and none of the other names but can potentially match other non-version name strings, or does it mean that the regex can only match that exact string and no other strings, even if it isn't a version name in the list above? – Sp3000 – 2016-12-03T09:47:50.963

@Sp3000 It should match that exact string, and no other strings. – jimmy23013 – 2016-12-03T10:05:52.817

Answers

24

87 bytes, 5 flavours, (87+10)/25 = 3.88

^(((?=W)[[:word:]&&]art|Ho(?=a)\ar|Bre(?=ez)[]e]\z|E(?=dg)[[d]]g)y|(?=Da)[D-[E]]apper)$

I've gone with the easy to test flavours for now, which are PCRE, Javascript, Python, .NET and Ruby.

The general structure is ^((...)y|...)$, i.e. factoring out the trailing ys and adding anchors.
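To show just the skeleton, here is a simplified sketch in Python with every flavour-specific trick removed. This stripped-down pattern is an illustration only, not the submitted regex: without the tricks it matches all five names in a single flavour.

```python
import re

# Same shape as ^((...)y|...)$: four stems share the factored-out "y",
# while "Dapper" sits in its own branch. No flavour-specific syntax here.
SKELETON = re.compile(r"^((Wart|Hoar|Breez|Edg)y|Dapper)$")

names = ["Warty", "Hoary", "Breezy", "Dapper", "Edgy", "Feisty"]
matched = [name for name in names if SKELETON.search(name)]
print(matched)  # ['Warty', 'Hoary', 'Breezy', 'Dapper', 'Edgy']
```

The real regex replaces each stem with a construct that only one flavour parses in a way that can match, which is what the sections below walk through.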

Warty (PCRE)

(?=W)[[:word:]&&]art

In PCRE and Ruby, [[:word:]] is a POSIX character class matching a word character — in other flavours you get a [[:word:] character class then literal &&], which fails the (?=W) assert. To make Ruby fail, && is used to intersect the POSIX class with nothing, whereas in PCRE && has no special meaning.

Hoary (Javascript)

Ho(?=a)\ar

For whatever reason, Javascript is the only flavour out of the bunch where \a is a literal a — in other flavours it matches the bell character (ASCII 7).
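The non-Javascript side of this can be checked in any of the other flavours. A minimal sketch using Python's `re` module (picked here only because it is one of the five flavours):

```python
import re

BELL = "\x07"  # ASCII 7, the bell character

# In Python, \a in a pattern means the bell character, not the letter "a".
matches_bell = re.fullmatch(r"\a", BELL) is not None
matches_letter = re.fullmatch(r"\a", "a") is not None
print(matches_bell, matches_letter)  # True False

# The lookahead demands a literal "a" while \a demands a bell at the same
# position, so in Python this branch cannot match anything.
branch = re.compile(r"Ho(?=a)\ar")
print(branch.fullmatch("Hoar") is None)           # True
print(branch.fullmatch("Ho" + BELL + "r") is None)  # True
```

The `(?=a)` lookahead is what stops the branch from matching `Ho`, bell, `r` in those flavours.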

Breezy (Python)

Bre(?=ez)[]e]\z

In Python and Javascript, \z is a literal z — in the other flavours it is equivalent to the $ end of string anchor. To make Javascript fail we use the char class []e], which is an empty char class [] then literal e] in Javascript, and a two-char class []e] in Python.
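The character class half of this can be tried directly in Python. The sketch below leaves out `\z`, whose handling varies between Python versions, and only looks at how `[]e]` is parsed.

```python
import re

# In Python, a "]" placed first inside a class is a literal, so []e] is a
# single class containing "]" and "e".
two_char_class = re.compile(r"[]e]")
accepted = [ch for ch in "]e[z" if two_char_class.fullmatch(ch)]
print(accepted)  # [']', 'e']

# With the lookahead in front, the class consumes the second "e" of "Breezy".
prefix = re.match(r"Bre(?=ez)[]e]", "Breezy")
print(prefix.group(0))  # Bree
```

In Javascript the same text is read as an empty class `[]`, which can never match, followed by a literal `e]`, so the branch fails there.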

Dapper (.NET)

(?=Da)[D-[E]]apper

In .NET, [D-[E]] is a set difference, removing the set [E] from [D]. In PCRE, Javascript and Python, we have the class [D-[E] then a literal ]. Ruby's a little different, but for some reason it parses as a class [D-[E]] which only matches E, and I have yet to figure out why...

Edgy (Ruby)

E(?=dg)[[d]]g

Ruby allows char classes inside char classes, so [[d]] is actually equivalent to [d], or just d. In the other flavours, we have [[d] then a literal ].

Sp3000

Posted 2016-12-03T07:38:36.137

Reputation: 58 729