Regex for Finding Non-Radioactive Elements

3

Find the shortest regex that matches every non-radioactive element in the periodic table and nothing else. This is the inverse of Regex for Finding Radioactive Elements.

Radioactive Elements

'Technetium','Promethium','Polonium','Astatine','Radon','Francium','Radium','Actinium','Thorium','Protactinium','Uranium','Neptunium','Plutonium','Americium','Curium','Berkelium','Californium','Einsteinium','Fermium','Mendelevium','Nobelium','Lawrencium','Rutherfordium','Dubnium','Seaborgium','Bohrium','Hassium','Meitnerium','Darmstadtium','Roentgenium','Copernicium','Ununtrium','Flerovium','Ununpentium','Livermorium','Ununseptium','Ununoctium'

Non-radioactive Elements

'Hydrogen','Helium','Lithium','Beryllium','Boron','Carbon','Nitrogen','Oxygen','Fluorine','Neon','Sodium','Magnesium','Aluminium','Silicon','Phosphorus','Sulphur','Chlorine','Argon','Potassium','Calcium','Scandium','Titanium','Vanadium','Chromium','Manganese','Iron','Cobalt','Nickel','Copper','Zinc','Gallium','Germanium','Arsenic','Selenium','Bromine','Krypton','Rubidium','Strontium','Yttrium','Zirconium','Niobium','Molybdenum','Ruthenium','Rhodium','Palladium','Silver','Cadmium','Indium','Tin','Antimony','Tellurium','Iodine','Xenon','Caesium','Barium','Lanthanum','Cerium','Praseodymium','Neodymium','Samarium','Europium','Gadolinium','Terbium','Dysprosium','Holmium','Erbium','Thulium','Ytterbium','Lutetium','Hafnium','Tantalum','Tungsten','Rhenium','Osmium','Iridium','Platinum','Gold','Mercury','Thallium','Lead','Bismuth'
  • Scored by character count in the regex.
  • Use standard Perl regex syntax (no specialized functions).
  • Assume all element names are lower case.
  • You only need to count the characters of the regex itself.

If you used a program to get started, please say so and, if you can, post how well it did. I'll post my best attempt as an answer to get things started and to show an example.
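For anyone who wants to test a candidate, here is a minimal Python sketch of a checker. It is only an illustration: the names `check`, `STABLE` and `RADIOACTIVE` are made up for this example, the lists are the two lists above in lower case, and it assumes the substring-search interpretation (a name counts as matched if the regex matches anywhere in it). Python's `re` is used as a stand-in for Perl regex, which is fine for the basic syntax used in this thread.

```python
import re

# The two lists from the question, lower-cased as the rules allow.
RADIOACTIVE = """
technetium promethium polonium astatine radon francium radium actinium
thorium protactinium uranium neptunium plutonium americium curium
berkelium californium einsteinium fermium mendelevium nobelium
lawrencium rutherfordium dubnium seaborgium bohrium hassium
meitnerium darmstadtium roentgenium copernicium ununtrium flerovium
ununpentium livermorium ununseptium ununoctium
""".split()

STABLE = """
hydrogen helium lithium beryllium boron carbon nitrogen oxygen fluorine
neon sodium magnesium aluminium silicon phosphorus sulphur chlorine argon
potassium calcium scandium titanium vanadium chromium manganese iron
cobalt nickel copper zinc gallium germanium arsenic selenium bromine
krypton rubidium strontium yttrium zirconium niobium molybdenum
ruthenium rhodium palladium silver cadmium indium tin antimony
tellurium iodine xenon caesium barium lanthanum cerium praseodymium
neodymium samarium europium gadolinium terbium dysprosium holmium
erbium thulium ytterbium lutetium hafnium tantalum tungsten rhenium
osmium iridium platinum gold mercury thallium lead bismuth
""".split()


def check(pattern):
    """Return (missed, false_hits) for a candidate regex.

    missed:     non-radioactive names the regex fails to match
    false_hits: radioactive names the regex wrongly matches
    """
    rx = re.compile(pattern)
    missed = [name for name in STABLE if not rx.search(name)]
    false_hits = [name for name in RADIOACTIVE if rx.search(name)]
    return missed, false_hits


# Sanity check with the trivial (and very long) solution: every name spelled out.
reference = "^(?:" + "|".join(STABLE) + ")$"
print(check(reference))  # ([], [])
print(len(reference))    # the score is just the character count of the regex
```

A candidate passes when both returned lists are empty; its score is then `len(pattern)`.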

Edit: Apparently bismuth is radioactive, but since I assume the universe will reach heat death before that is ever much of a problem, I'm not going to worry about it for now.

qw3n

Posted 2014-01-08T21:40:21.263

Reputation: 733

In addition to your other question, I also ran this through Peter Norvig's Regex Golf solver, but it spat out a 99-character solution. Sometimes humans come out on top. – EMBLEM – 2015-03-07T23:13:46.533

1 Nitpick: Bismuth is technically radioactive, although its decay rate is so vanishingly low that it can be treated as stable for all practical purposes. – Ilmari Karonen – 2014-01-10T01:24:31.583

@IlmariKaronen Interesting, I didn't know that. I copied the lists from the internet, and you know how reliable that is ;). – qw3n – 2014-01-10T01:47:44.450

I don't think the question is well specified. My understanding of "the shortest regex that matches all non-radioactive elements and nothing else" is that the string "y" shouldn't match, since "y" isn't the name of a non-radioactive element; but both of the solutions currently posted match "y", so effectively they do substring matches. My understanding of the question would be that the answer should be the shortest string representing a regular expression that is equivalent to the regular expression 'Hydrogen|Helium|...|Bismuth' (i.e. the RE formed by joining the names of all non-radioactive elements w – None – 2014-01-10T02:47:15.760
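To make the distinction in the comment above concrete, here is a small Python sketch. The 93-character regex is copied from dr jimbob's answer in this thread; the three-name list is a shortened, hypothetical stand-in for the full list of non-radioactive elements, and `strict` is just one way to write the fully anchored alternation the comment describes.

```python
import re

# 93-character answer from this thread, used with substring search.
answer = r"y|^v|h.[lfdn]|[^te]i[rodnts]|[lp].t[^o]|ru?[bs]|ll|^..ro|^[^r][^u].{1,4}$|ca..i|ma|s.l|tan|gs"

# The bare string "y" matches because the first alternative is just "y".
print(bool(re.search(answer, "y")))  # True

# Stricter reading: an anchored alternation of the names themselves.
# Shortened stand-in list, for illustration only.
names = ["hydrogen", "helium", "bismuth"]
strict = "^(?:" + "|".join(names) + ")$"

print(bool(re.search(strict, "y")))       # False
print(bool(re.search(strict, "helium")))  # True
```

Under the stricter reading, only whole element names count, so strings such as "y" are rejected.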

Answers

2

Small improvement on qw3n's solution (93 characters):

y|^v|h.[lfdn]|[^te]i[rodnts]|[lp].t[^o]|ru?[bs]|ll|^..ro|^[^r][^u].{1,4}$|ca..i|ma|s.l|tan|gs

Basically, I got rid of the ^t[^eh] clause (7 characters) and replaced it with tan|gs (6 characters). Also, I count qw3n's solution as 94 characters, not 95.
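The counts are easy to double-check. A quick Python sketch, with both regexes copied verbatim from the answers in this thread (the variable names are just for illustration):

```python
# dr jimbob's version and qw3n's version, as posted in this thread.
improved = r"y|^v|h.[lfdn]|[^te]i[rodnts]|[lp].t[^o]|ru?[bs]|ll|^..ro|^[^r][^u].{1,4}$|ca..i|ma|s.l|tan|gs"
original = r"y|^v|h.[lfdn]|^t[^eh]|[^te]i[rodnts]|[lp].t[^o]|ru?[bs]|ll|^..ro|^[^r][^u].{1,4}$|ca..i|ma|s.l"

print(len(improved))                  # 93
print(len(original))                  # 94
print(len("^t[^eh]"), len("tan|gs"))  # 7 6
```

Each clause also costs one `|` separator, so swapping a 7-character clause for a 6-character one saves exactly one character overall.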

dr jimbob

Posted 2014-01-08T21:40:21.263

Reputation: 336

3

Character count 94

y|^v|h.[lfdn]|^t[^eh]|[^te]i[rodnts]|[lp].t[^o]|ru?[bs]|ll|^..ro|^[^r][^u].{1,4}$|ca..i|ma|s.l

qw3n

Posted 2014-01-08T21:40:21.263

Reputation: 733

What language is this? – Timtech – 2014-01-08T22:17:24.730

1 @Timtech JavaScript's regex – qw3n – 2014-01-08T22:34:33.503

3

I was tempted (and am still debating internally) to vote to close this as a duplicate of your other regex golf question on the basis that the optimal solutions are within a constant of each other, as witness:

70 chars

^(?!no|c?u|ra|.*(e.[kht]|[^l][gecv]i|[^c]oh?[rn]iu|f.r|ac|sta|bn|has))
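The "within a constant" point can be sketched in Python: any substring-search regex can be wrapped in a negative lookahead so that it matches exactly the strings the original does not. The helper `invert` and the toy word list below are hypothetical, for illustration only. The 70-character regex above goes a step further by hoisting the start-anchored alternatives (no, c?u, ra) out of the `.*` group.

```python
import re


def invert(pattern):
    """Wrap a substring-search regex in a negative lookahead.

    The result matches a string exactly when `pattern` matches nowhere in it.
    """
    return "^(?!.*(?:" + pattern + "))"


# Toy data, not the real element lists.
words = ["radon", "radium", "neon", "gold"]
toy = "^ra"  # matches words that start with "ra"

inverted = invert(toy)
print([w for w in words if re.search(toy, w)])       # ['radon', 'radium']
print([w for w in words if re.search(inverted, w)])  # ['neon', 'gold']

# Fixed overhead of this particular wrapper, independent of the pattern.
print(len(invert("")))  # 11
```

So a solution of length n to one question gives a solution of length at most n + 11 to the other with this wrapper (fewer characters if a plain group is used instead of a non-capturing one).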

Peter Taylor

Posted 2014-01-08T21:40:21.263

Reputation: 41 901

I see your point, and in the back of my mind I had thought of this. What if there were a restriction that it had to be solved without using a global not? – qw3n – 2014-01-10T16:51:38.217

1 @qw3n, that has two problems. Firstly, as a general rule, if you need to prohibit the best option it indicates a weakness in the question. Secondly, it could potentially kill a better solution which casts a fairly wide net and needs to exclude a couple of cases. – Peter Taylor – 2014-01-10T17:00:46.607

When solving the general problem I went from both directions to see what the results would be. This regex was different from the other one and a distinct challenge to create, but yes, it ignores the fact that you can get a shorter regex by negating the other one. That was the reason I posted both: I thought it would be a different challenge from the other one. – qw3n – 2014-01-10T17:15:38.937