Degree of Unsaturation

11

This is not a particularly hard code puzzle - but I'm interested to see your multiple ways of solving it.

The Degree of Unsaturation (DoU) is the number of double bonds between atoms plus the number of rings in a chemical compound.

You will be given the molecular formula of a chemical compound in the form XaYbZc (where a, b and c are the numbers of atoms of X, Y and Z in the compound). The formula can be of any length and contain any chemical element in the periodic table, though elements other than C, H, N, F, Cl, Br and I may be ignored because they do not appear in the DoU formula. The compound will contain at least one atom of carbon. You must calculate and display its Degree of Unsaturation.

For example, the compound benzene (pictured below) has a DoU of 4 as it has three double bonds (shown by a double line between atoms), and a single ring (a number of atoms connected in a loop):

[Image: benzene ring]

As defined by LibreTexts:

DoU = (2C + 2 + N − X − H) / 2

Where:

  • C is the number of carbon atoms
  • N is the number of nitrogen atoms
  • X is the number of halogen atoms (F, Cl, Br, I)
  • H is the number of hydrogen atoms
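As a quick check of the formula, here is a minimal Python sketch that applies it to atom counts that have already been extracted. The function name `degree_of_unsaturation` and its parameter names are made up for illustration and are not part of the challenge.

```python
def degree_of_unsaturation(c, h=0, n=0, x=0):
    """Apply DoU = (2C + 2 + N - X - H) / 2 to already-counted atoms.

    c, h, n and x are the numbers of carbon, hydrogen, nitrogen and
    halogen (F, Cl, Br, I) atoms. Hypothetical helper for illustration.
    """
    return (2 * c + 2 + n - x - h) / 2


print(degree_of_unsaturation(c=6, h=6))       # benzene, C6H6 -> 4.0
print(degree_of_unsaturation(c=9, h=9, n=1))  # C9H9N1O4, oxygen ignored -> 6.0
```

For benzene this is (2·6 + 2 + 0 − 0 − 6) / 2 = 8 / 2 = 4, matching the three double bonds plus one ring.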

Test cases:

C6H6 --> 4
C9H20 --> 0
C9H9N1O4 --> 6
U1Pt1 --> Not a valid input, no carbon
Na2O1 --> Not a valid input, no carbon
C1H1 --> 1.5 (in practice this would be 1, but CH is a fragment of a compound rather than a complete compound)
N1H3 --> would return 0, but it contains no carbon (it isn't an organic compound), so the formula doesn't apply and it isn't a valid input

For an explanation of CH see here

In essence, you must identify if there are any of the above elements (C, H, N, F, Cl, Br, I) in the compound, and if so how many there are. Then, calculate the Degree of Unsaturation using the above formula.

Only C, H, N, F, Cl, Br, and I are valid inputs for the DoU formula. For the purposes of this puzzle, any other elements may be completely ignored (e.g. if the compound were C6H6Mn the result would still be 4). If there are none of the above elements the answer would be zero.

You may assume that all the compounds input are chemically possible, contain at least one atom of carbon, and are known to exist. If the input is invalid, the program may output either 0 or -1, or produce no result.
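For reference, the whole task can be sketched in ungolfed Python roughly as follows. This is only one possible reading of the rules: the names `count_atoms` and `dou_from_formula` are hypothetical, returning 0 for input without carbon is one of the allowed choices, and treating a missing count as 1 (so that C6H6Mn parses) is an assumption that goes beyond the guaranteed input format.

```python
import re

HALOGENS = ("F", "Cl", "Br", "I")


def count_atoms(formula):
    """Map each element symbol to its total count in the formula."""
    counts = {}
    # An element is an uppercase letter plus an optional lowercase letter,
    # followed by an optional count (assumed to be 1 when absent).
    for symbol, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(digits or 1)
    return counts


def dou_from_formula(formula):
    """Return the DoU, or 0 when the formula contains no carbon."""
    counts = count_atoms(formula)
    c = counts.get("C", 0)
    if c == 0:
        return 0  # invalid input: no carbon
    h = counts.get("H", 0)
    n = counts.get("N", 0)
    x = sum(counts.get(symbol, 0) for symbol in HALOGENS)
    # Every other element is simply never looked up, i.e. ignored.
    return (2 * c + 2 + n - x - h) / 2


for formula in ["C6H6", "C9H20", "C9H9N1O4", "U1Pt1", "Na2O1", "C1H1", "N1H3", "C6H6Mn"]:
    print(formula, "-->", dou_from_formula(formula))
```

Matching two-letter symbols as a unit is what keeps Cl from being read as carbon and Na from being read as nitrogen.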

Rules

Standard IO rules and loopholes apply. Input must be a standard string, and you can assume the input won't be empty. This is code-golf, so the shortest code in bytes wins.

Archie Roques

Posted 2017-06-27T20:57:22.930

Reputation: 111

Proposed test cases: Sodium oxide: Na2O and Methylidyne: CH and CCl4He. These are some corner cases that may break a few solutions. By the way, not that it matters for anyone other than Mathematica (probably), but can we assume that the compounds (can) exist? – Stewie Griffin – 2017-06-27T21:35:42.253

I don't understand C9H2O1 --> 0. Shouldn't it be 9? (2*9+2+0-0-2)/2 – DLosc – 2017-06-28T07:34:17.453

According to the last paragraph, do you mean the code must be able to deal with invalid inputs? By the way, is it guaranteed that every single element in the compound has a trailing '1' as in C1H1? – Keyu Gan – 2017-06-28T08:38:21.603

@KeyuGan yes and yes. – Archie Roques – 2017-06-28T08:40:52.320

Answers

2

Python 3, 142 151 148 bytes

import re
l=dict(re.findall("(\D+)(\d+)",input()))
m=lambda k:int(l.get(k,0))
print(m("C")and m("C")+1+(m("N")-sum(map(m,"F I H Cl Br".split())))/2)

Returns 0 on error.
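An ungolfed reading of the same idea, as a sketch rather than the author's own expansion: the function wrapper `dou` and the name `counts` are added here for illustration, and the formula is passed as an argument instead of being read from standard input. The expression is the LibreTexts formula with the division distributed, since (2C + 2 + N − X − H) / 2 = C + 1 + (N − X − H) / 2.

```python
import re


def dou(formula):
    # Same regex as the golfed answer: a run of non-digits, then a run of
    # digits. This relies on every element carrying an explicit count.
    counts = dict(re.findall(r"(\D+)(\d+)", formula))

    def m(symbol):
        # Count for one element, 0 when it is absent from the formula.
        return int(counts.get(symbol, 0))

    carbon = m("C")
    # `and` short-circuits: without carbon the result is 0.
    return carbon and carbon + 1 + (m("N") - sum(map(m, "F I H Cl Br".split()))) / 2


print(dou("C6H6"))      # 4.0
print(dou("C9H9N1O4"))  # 6.0
print(dou("Na2O1"))     # 0
```

Because the dictionary is keyed by whole element symbols, looking up "C" does not pick up "Cl", and elements outside the formula are never looked up at all.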

Thanks to @HyperNeutrino for bringing the bytes down.

Try it online!

MooseOnTheRocks

Posted 2017-06-27T20:57:22.930

Reputation: 191

oops - test cases updated! – Archie Roques – 2017-06-27T22:48:52.810

Doesn't quite work – HyperNeutrino – 2017-06-27T22:50:15.860

@HyperNeutrino Test cases were unclear for a bit. Now has no output on invalid input. – MooseOnTheRocks – 2017-06-27T23:18:55.953

148 bytes – HyperNeutrino – 2017-06-28T00:56:43.147

Nice use of dict there! – DLosc – 2017-06-28T07:41:57.100

146 bytes – Felipe Nardi Batista – 2017-06-28T12:45:17.890

2

JavaScript (ES6), 117 112 bytes

Returns 0 for invalid inputs.

s=>s.split(/(\d+)/).reduce((p,c,i,a)=>p+[0,k=a[i+1]/2,2*k,-k][n='NCFHIClBr'.search(c||0)+1,s|=n==2,n>2?3:n],1)*s

Test cases

let f =

s=>s.split(/(\d+)/).reduce((p,c,i,a)=>p+[0,k=a[i+1]/2,2*k,-k][n='NCFHIClBr'.search(c||0)+1,s|=n==2,n>2?3:n],1)*s

console.log(f("C6H6"))     // --> 4
console.log(f("C9H20"))    // --> 0
console.log(f("C9H9N1O4")) // --> 6
console.log(f("U1Pt1"))    // --> 0 (invalid)
console.log(f("Na2O1"))    // --> 0 (invalid)
console.log(f("C1H1"))     // --> 1.5
console.log(f("N1H3"))     // --> 0 (invalid)

Alternate version, 103 bytes

If the input were guaranteed to be valid -- as the challenge introduction misleadingly suggests -- we could just do:

s=>s.split(/(\d+)/).reduce((p,c,i,a)=>p+[0,k=a[i+1]/2,2*k,-k][n='NCFHIClBr'.search(c||0)+1,n>2?3:n],1)

Demo

let f =

s=>s.split(/(\d+)/).reduce((p,c,i,a)=>p+[0,k=a[i+1]/2,2*k,-k][n='NCFHIClBr'.search(c||0)+1,n>2?3:n],1)

console.log(f("C6H6"))     // --> 4
console.log(f("C9H20"))    // --> 0
console.log(f("C9H9N1O4")) // --> 6
console.log(f("C1H1"))     // --> 1.5

Arnauld

Posted 2017-06-27T20:57:22.930

Reputation: 111 334

0

Pip, 70 67 bytes

`C\d`Na&1+/2*VaR+XDs._R['C'NC`H|F|I|Cl|Br`].s["+2*"'+'-]RXU.XX"+0*"

Takes the chemical formula as a command-line argument. Outputs 0 for invalid inputs. Try it online!

Explanation

Uses a series of regex replacements to turn the chemical formula into a mathematical formula, evals it, and makes a couple of tweaks to get the final value.

The replacements (slightly ungolfed version):

aR+XDs._R"C ""+2*"R"N "'+R`(H|F|I|Cl|Br) `'-RXU.XX"+0*"

a                    Cmdline arg
 R+XD                 Replace runs of 1 or more digits (\d+)
     s._               with a callback function that prepends a space
                       (putting a space between each element and the following number)
 R"C "                Replace carbon symbol
      "+2*"            with +2* (add 2* the number of carbon atoms to the tally)
 R"N "                Replace nitrogen symbol
      '+               with + (add the number of nitrogen atoms to the tally)
 R`(H|F|I|Cl|Br) `    Replace hydrogen or halogen symbol
                  '-   with - (subtract the number of atoms from the tally)
 RXU.XX               Replace uppercase letter followed by another char ([A-Z].)
       "+0*"           with +0* (cancel out numbers of all other kinds of atoms)

We eval the resulting string with V. This gives us 2C + N − X − H. To get the correct value, we make the following adjustments:

`C\d`Na&1+/2*V...

             V...  Value of expression calculated above
          /2*      multiplied by 1/2
        1+         plus 1
`C\d`Na            Is carbon in the original formula? (i.e. C followed by a digit)
       &           Logical AND: if no carbon, return 0, otherwise return the formula value
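For readers who don't know Pip, the same rewrite-then-eval strategy can be sketched in Python. This is an approximation of the steps described above, not a translation of the Pip program: the function name `dou_by_rewriting` is hypothetical, and it assumes every element carries an explicit count, as confirmed in the comments on the challenge.

```python
import re


def dou_by_rewriting(formula):
    # Put a space between each element symbol and the number after it.
    expr = re.sub(r"\d+", lambda m: " " + m.group(), formula)
    # Carbon contributes +2 * count.
    expr = expr.replace("C ", "+2*")
    # Nitrogen contributes +count.
    expr = expr.replace("N ", "+")
    # Hydrogen and the halogens contribute -count.
    expr = re.sub(r"(H|F|I|Cl|Br) ", "-", expr)
    # Any remaining element (uppercase letter plus one more character,
    # which is either a lowercase letter or the inserted space) is zeroed out.
    expr = re.sub(r"[A-Z].", "+0*", expr)
    # expr is now plain arithmetic, e.g. "+2*9-9+1+0*4" for C9H9N1O4.
    # eval is acceptable for a golfing sketch, not for untrusted input.
    value = eval(expr)  # 2C + N - X - H
    # Halve, add 1, and return 0 when there is no carbon ("C" then a digit).
    return 1 + value / 2 if re.search(r"C\d", formula) else 0


print(dou_by_rewriting("C9H9N1O4"))  # 6.0
print(dou_by_rewriting("Na2O1"))     # 0
```

The trailing space is what lets the literal "C " and "N " replacements skip Cl, Na and other two-letter symbols.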

DLosc

Posted 2017-06-27T20:57:22.930

Reputation: 21 213

0

C (gcc), 195 197 202 bytes

Probably the longest answer.

d,c,b,e,n;f(char*a){for(c=d=0;b=*a;d+=e?e-1?b-66?b-67?0:e-2?0:-n:e-3?0:-n:b-67?b-78?b/70*73/b?-n:0:n:(c=2*n):0)e=*++a>57?*a-108?*a-114?0:3:2:1,a+=e>1,n=strtol(a,&a,10);printf("%.1f",c?d/2.+1:0);}

Try it online!

Returns 0 on error.

Keyu Gan

Posted 2017-06-27T20:57:22.930

Reputation: 2 028