Find the Mr of a Given Compound!

12

2

Challenge

Given the formula of a chemical, output the Mr of the compound.

Equation

Each element in the compound is followed by a number giving how many atoms of that element the compound contains. If there is no number, there is exactly one atom of that element.

Some examples are:

  • Ethanol is C2H6O: two carbon atoms, six hydrogen atoms and one oxygen atom.
  • Magnesium hydroxide is MgO2H2: one magnesium atom, two oxygen atoms and two hydrogen atoms.

Note that you will never have to handle brackets and each element is included only once in the formula.

Whilst most people will probably stick to the order they feel most comfortable with, there is no strict ordering system. For example, water may be given as either H2O or OH2.

Mr

Note: Here, assume formula mass is the same as molecular mass

The Mr of a compound, the molecular mass, is the sum of the atomic weights of the atoms in the molecule.

The only elements and their atomic weights to 1 decimal place that you have to support (hydrogen to calcium, not including noble gases) are as follows. They can also be found here

H  - 1.0      Li - 6.9      Be - 9.0
B  - 10.8     C  - 12.0     N  - 14.0
O  - 16.0     F  - 19.0     Na - 23.0
Mg - 24.3     Al - 27.0     Si - 28.1
P  - 31.0     S  - 32.1     Cl - 35.5
K  - 39.1     Ca - 40.1

You should always give the output to one decimal place.

For example, ethanol (C2H6O) has an Mr of 46.0 as it is the sum of the atomic weights of the elements in it:

12.0 + 12.0 + 1.0 + 1.0 + 1.0 + 1.0 + 1.0 + 1.0 + 16.0
(2*C + 6*H + 1*O)

Input

A single string in the above format. You are guaranteed that the elements in the formula will be actual elemental symbols.

The given compound isn't guaranteed to exist in reality.

Output

The total Mr of the compound, to 1 decimal place.

Rules

Builtins which access element or chemical data are disallowed (sorry Mathematica)

Examples

Input > Output
CaCO3 > 100.1
H2SO4 > 98.1
SF6 > 146.1
C100H202O53 > 2250.0
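
For reference, an ungolfed sketch of the task in Python, so the examples above can be checked by hand. The names WEIGHTS and mr are illustrative only and not part of the challenge; the weights are copied from the table above.

```python
import re

# Atomic weights to 1 decimal place, copied from the table in the challenge.
WEIGHTS = {
    "H": 1.0, "Li": 6.9, "Be": 9.0, "B": 10.8, "C": 12.0, "N": 14.0,
    "O": 16.0, "F": 19.0, "Na": 23.0, "Mg": 24.3, "Al": 27.0, "Si": 28.1,
    "P": 31.0, "S": 32.1, "Cl": 35.5, "K": 39.1, "Ca": 40.1,
}

def mr(formula):
    """Return the Mr of a bracket-free formula as a string with 1 decimal place."""
    total = 0.0
    # An element is one uppercase letter plus an optional lowercase letter,
    # followed by an optional count (a missing count means 1).
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += WEIGHTS[symbol] * int(count or 1)
    return f"{total:.1f}"

for formula in ["CaCO3", "H2SO4", "SF6", "C100H202O53", "OH2"]:
    print(formula, ">", mr(formula))
```

Because each element carries its own count, the order of the elements does not matter: H2O and OH2 go through the same loop and give the same total.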

Winning

Shortest code in bytes wins.

This post was adopted with permission from caird coinheringaahing. (Post now deleted)

Beta Decay

Posted 2017-06-11T10:44:29.877

Reputation: 21 478

Do we have to handle quantifiers, such as: 2H2O? – Mr. Xcoder – 2017-06-11T12:21:49.470

For the curious, this is the Mathematica solution (53 bytes): NumberForm[#&@@#~ChemicalData~"MolecularMass",{9,1}]& – JungHwan Min – 2017-06-11T13:59:25.983

Answers

6

Jelly, 63 bytes

ḟØDOP%⁽¡ṛị“ÇṚÆ’BH+“Ḳ"ɦṀ⁷6<s¡_-¦y⁼Ḟ¡¡FPɓ‘¤÷5
fØDVȯ1×Ç
Œs>œṗ⁸ḊÇ€S

A monadic link accepting a list of characters and returning a number.


How?

ḟØDOP%⁽¡ṛị“ÇṚÆ’BH+“Ḳ"ɦṀ⁷6<s¡_-¦y⁼Ḟ¡¡FPɓ‘¤÷5 - Link 1, Atomic weight: list of characters
                                            -                              e.g. "Cl23"
 ØD                                         - digit yield = "0123456789"
ḟ                                           - filter discard                      "Cl"
   O                                        - cast to ordinals                [67,108]
    P                                       - product                            7236
      ⁽¡ṛ                                   - base 250 literal = 1223
     %                                      - modulo                             1121
                                        ¤   - nilad followed by link(s) as a nilad:
          “ÇṚÆ’                             -   base 250 literal  = 983264
               B                            -   convert to binary = [    1,    1,     1,     1,   0,  0,  0,   0, 0,  0,  0, 0,     1,     1,     1, 0, 0,  0,  0,   0]
                H                           -   halve             = [  0.5,  0.5,   0.5,   0.5,   0,  0,  0,   0, 0,  0,  0, 0,   0.5,   0.5,   0.5, 0, 0,  0,  0,   0]
                  “Ḳ"ɦṀ⁷6<s¡_-¦y⁼Ḟ¡¡FPɓ‘    -   code-page indexes = [177  , 34  , 160  , 200  , 135, 54, 60, 115, 0, 95, 45, 5, 121  , 140  , 195  , 0, 0, 70, 80, 155]
                 +                          -   addition          = [177.5, 34.5, 160.5, 200.5, 135, 54, 60, 115, 0, 95, 45, 5, 121.5, 140.5, 195.5, 0, 0, 70, 80, 155]
         ị                                  - index into (1-indexed and modular)
                                            -    ...20 items so e.g. 1121%20=1 so 177.5
                                         ÷5 - divide by 5                          35.5

fØDVȯ1×Ç - Link 2: Total weight of multiple of atoms: list of characters   e.g. "Cl23"
 ØD      - digit yield = "0123456789"
f        - filter keep                                                            "23"
   V     - evaluate as Jelly code                                                  23
    ȯ1   - logical or with one (no digits yields an empty string which evaluates to zero)
       Ç - call last link (1) as a monad (get the atomic weight)                   35.5
      ×  - multiply                                                               816.5

Œs>œṗ⁸ḊÇ€S - Main link: list of characters                             e.g. "C24HCl23"
Œs         - swap case                                                      "c24hcL23"
  >        - greater than? (vectorises)                                      10011000
     ⁸     - chain's left argument                                          "C24HCl23"
   œṗ      - partition at truthy indexes                          ["","C24","H","Cl23"]
      Ḋ    - dequeue                                                 ["C24","H","Cl23"]
       Ç€  - call last link (2) as a monad for €ach                  [  288,  1,  816.5]
         S - sum                                                                 1105.5
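
For readers who don't speak Jelly, the three links can be mirrored roughly in Python. This is only a sketch of the steps described above, not a translation of the byte-level tricks; the names TABLE, atomic_weight and total_weight are made up for illustration, and the regex split stands in for the swap-case/partition step.

```python
import re

# The 20 code-page indexes from Link 1, with 0.5 added wherever the
# binary expansion of 983264 has a 1 bit.
BITS = [int(b) for b in format(983264, "b")]
CODES = [177, 34, 160, 200, 135, 54, 60, 115, 0, 95,
         45, 5, 121, 140, 195, 0, 0, 70, 80, 155]
TABLE = [code + bit / 2 for code, bit in zip(CODES, BITS)]

def atomic_weight(symbol):
    # Link 1: product of the ordinals, modulo 1223, then a 1-based
    # modular index into the 20-item table, divided by 5.
    product = 1
    for ch in symbol:
        product *= ord(ch)
    index = product % 1223
    return TABLE[(index - 1) % len(TABLE)] / 5

def total_weight(formula):
    # Main link: start a new part at every uppercase letter.
    total = 0
    for part in re.findall(r"[A-Z][^A-Z]*", formula):
        symbol = "".join(ch for ch in part if not ch.isdigit())
        digits = "".join(ch for ch in part if ch.isdigit())
        # Link 2: a missing count means 1.
        total += atomic_weight(symbol) * int(digits or 1)
    return total

print(atomic_weight("Cl"))
print(total_weight("C24HCl23"))
```

The `(index - 1) % len(TABLE)` expression reproduces Jelly's 1-indexed, modular indexing, where an index of 0 wraps round to the last item.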

Jonathan Allan

Posted 2017-06-11T10:44:29.877

Reputation: 67 804

This is one of the longest Jelly answers I have ever seen, but it still is less than half the length of the program currently in second, so good job! – Gryphon – 2017-06-11T22:45:33.723

6

Python 3,  189 182  168 bytes

-14 bytes by using the hash from Justin Mariner's JavaScript (ES6) answer.

import re
lambda s:sum([[9,35.5,39.1,24.3,28.1,14,16,31,40.1,23,32.1,10.8,12,27,6.9,19,0,1][int(a,29)%633%35%18]*int(n or 1)for a,n in re.findall("(\D[a-z]?)(\d*)",s)])



Below is the 182-byte version, which is the one the explanation covers. The version above just changes the order of the weights, uses int to convert the element name from base 29, and uses different divisors to compress the range of integers; see Justin Mariner's answer.

import re
lambda s:sum([[16,31,40.1,32.1,0,24.3,12,39.1,28.1,19,0,9,10.8,23,27,35.5,6.9,14,1][ord(a[0])*ord(a[-1])%1135%98%19]*int(n or 1)for a,n in re.findall("(\D[a-z]?)(\d*)",s)])

An unnamed function accepting a string, s, and returning a number.


How?

Uses a regex to split the input, s, into the elements and their counts using:
re.findall("(\D[a-z]?)(\d*)",s)
\D matches exactly one non-digit and [a-z]? matches 0 or 1 lowercase letter, together matching elements. \d* matches 0 or more digits. The parentheses make these into two groups, and as such findall("...",s) returns a list of tuples of strings, [(element, number),...].
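
To see what the regex hands back, here is a small check on two inputs; the variable name pattern is just for illustration.

```python
import re

# Same pattern as in the answer, written as a raw string.
pattern = r"(\D[a-z]?)(\d*)"

# Each match is an (element, count) tuple; a missing count is "".
print(re.findall(pattern, "C24HCl23"))
print(re.findall(pattern, "CaCO3"))
```

Note that an element with no explicit count comes back paired with an empty string rather than with "1".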

The number is simple to extract; the only thing to handle is that an empty string means 1. This is achieved with a logical or, since empty strings are falsey in Python: int(n or 1).
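
A tiny illustration of that idiom, wrapped in a hypothetical helper called count:

```python
def count(n):
    # "" is falsey, so `n or 1` yields 1; any non-empty string is kept
    # and converted by int().
    return int(n or 1)

print(count(""), count("3"), count("202"))
```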

The element string is given a unique number by taking the product of its first and last character's ordinals (usually these are the same e.g. S or C, but we need to differentiate between Cl, C, Ca, and Na so we cannot just use one character).

These numbers are then hashed into the much smaller range [0,18]; a search of the modulo space gave %1135%98%19. For example "Cl" has ordinals 67 and 108, which multiply to give 7236, which modulo 1135 is 426, which modulo 98 is 34, which modulo 19 is 15. This number is used to index into the list of weights; the 15th (0-indexed) value in the list:
[16,31,40.1,32.1,0,24.3,12,39.1,28.1,19,0,9,10.8,23,27,35.5,6.9,14,1]
is 35.5, the atomic weight of Cl, which is then multiplied by the number of such elements (as found above).
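
The arithmetic can be replayed step by step. This is just a sketch of the hash described above; the names WEIGHTS, SYMBOLS and slot are chosen here for illustration.

```python
WEIGHTS = [16, 31, 40.1, 32.1, 0, 24.3, 12, 39.1, 28.1, 19,
           0, 9, 10.8, 23, 27, 35.5, 6.9, 14, 1]
SYMBOLS = "H Li Be B C N O F Na Mg Al Si P S Cl K Ca".split()

def slot(symbol):
    # Product of the first and last character's ordinals, squeezed into 0..18.
    product = ord(symbol[0]) * ord(symbol[-1])
    return product % 1135 % 98 % 19

# Walk through the "Cl" example one modulo at a time.
product = ord("C") * ord("l")
print(product, product % 1135, product % 1135 % 98, slot("Cl"))
print(WEIGHTS[slot("Cl")])

# Every supported element should land in its own slot.
print(sorted(slot(s) for s in SYMBOLS))
```

The two slots that no element lands in are the ones holding the filler 0 values in the list.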

These products are then added together using sum(...).

Jonathan Allan

Posted 2017-06-11T10:44:29.877

Reputation: 67 804

You are a genius... Outgolfed me by over 350 bytes – Mr. Xcoder – 2017-06-11T16:10:25.083

4

PHP, 235 bytes

preg_match_all("#([A-Z][a-z]?)(\d*)#",$argn,$m);foreach($m[1]as$v)$s+=array_combine([H,Li,Be,B,C,N,O,F,Na,Mg,Al,Si,P,S,Cl,K,Ca],[1,6.9,9,10.8,12,14,16,19,23,24.3,27,28.1,31,32.1,35.5,39.1,40.1])[$v]*($m[2][+$k++]?:1);printf("%.1f",$s);


Instead of array_combine([H,Li,Be,B,C,N,O,F,Na,Mg,Al,Si,P,S,Cl,K,Ca],[1,6.9,9,10.8,12,14,16,19,23,24.3,27,28.1,31,32.1,35.5,39.1,40.1]) you can use [H=>1,Li=>6.9,Be=>9,B=>10.8,C=>12,N=>14,O=>16,F=>19,Na=>23,Mg=>24.3,Al=>27,Si=>28.1,P=>31,S=>32.1,Cl=>35.5,K=>39.1,Ca=>40.1] with the same Byte count

Jörg Hülsermann

Posted 2017-06-11T10:44:29.877

Reputation: 13 026

3

JavaScript (ES6), 150 bytes

c=>c.replace(/(\D[a-z]?)(\d+)?/g,(_,e,n=1)=>s+=[9,35.5,39.1,24.3,28.1,14,16,31,40.1,23,32.1,10.8,12,27,6.9,19,0,1][parseInt(e,29)%633%35%18]*n,s=0)&&s

Inspired by Jonathan Allan's Python answer, where he explained giving each element a unique number and hashing those numbers to be in a smaller range.

The elements were made into unique numbers by interpreting them as base-29 (0-9 and A-S). I then found that %633%35%18 narrows the values down to the range of [0, 17] while maintaining uniqueness.
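
The same hash can be tried out in Python, where int(e, 29) plays the role of parseInt(e, 29). This is only a sketch for checking the numbers; SYMBOLS and slot29 are names invented here.

```python
WEIGHTS = [9, 35.5, 39.1, 24.3, 28.1, 14, 16, 31, 40.1, 23,
           32.1, 10.8, 12, 27, 6.9, 19, 0, 1]
SYMBOLS = "H Li Be B C N O F Na Mg Al Si P S Cl K Ca".split()

def slot29(symbol):
    # Read the symbol as a base-29 number (digits 0-9 then A-S, case-insensitive).
    return int(symbol, 29) % 633 % 35 % 18

for s in SYMBOLS:
    print(s, int(s, 29), slot29(s), WEIGHTS[slot29(s)])
```

For example, "Cl" reads as 12*29 + 21 = 369 in base 29, which ends up at index 1, i.e. 35.5.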

Test Snippet

f=
c=>c.replace(/(\D[a-z]?)(\d+)?/g,(_,e,n=1)=>s+=[9,35.5,39.1,24.3,28.1,14,16,31,40.1,23,32.1,10.8,12,27,6.9,19,0,1][parseInt(e,29)%633%35%18]*n,s=0)&&s
Input: <input oninput="O.value=f(this.value)"><br>
Result: <input id="O" disabled>

Justin Mariner

Posted 2017-06-11T10:44:29.877

Reputation: 4 746

Oh, I think your way would save me a few bytes too! – Jonathan Allan – 2017-06-15T05:13:40.900

2

Clojure, 199 195 bytes

Update: better to use for than reduce.

#(apply +(for[[_ e n](re-seq #"([A-Z][a-z]?)([0-9]*)"%)](*(if(=""n)1(Integer. n))({"H"1"B"10.8"O"16"Mg"24.3"P"31"K"39.1"Li"6.9"C"12"F"19"Al"27"S"32.1"Ca"40.1"Be"9"N"14"Na"23"Si"28.1"Cl"35.5}e))))

Original:

#(reduce(fn[r[_ e n]](+(*(if(=""n)1(Integer. n))({"H"1"B"10.8"O"16"Mg"24.3"P"31"K"39.1"Li"6.9"C"12"F"19"Al"27"S"32.1"Ca"40.1"Be"9"N"14"Na"23"Si"28.1"Cl"35.5}e))r))0(re-seq #"([A-Z][a-z]?)([0-9]*)"%))

I'm wondering if there is a more compact way to encode the look-up table.

NikoNyrh

Posted 2017-06-11T10:44:29.877

Reputation: 2 361

2

Python 3, 253 bytes

def m(f,j=0):
 b=j+1
 while'`'<f[b:]<'{':b+=1
 c=b
 while'.'<f[c:]<':':c+=1
 return[6.9,9,23,40.1,24.3,27,28.1,35.5,31,32.1,39.1,1,10.8,12,14,16,19]['Li Be Na Ca Mg Al Si Cl P S K H B C N O F'.split().index(f[j:b])]*int(f[b:c]or 1)+(f[c:]>' 'and m(f,c))


ovs

Posted 2017-06-11T10:44:29.877

Reputation: 21 408

1

Python 3 - 408 bytes

This is mainly @ovs' solution, since he golfed it down by over 120 bytes... See the initial solution below.

e='Li Be Na Ca Mg Al Si Cl P S K H B C N O F'.split()
f,g=input(),[]
x=r=0
for i in e:
 if i in f:g+=[(r,eval('6.9 9 23 40.1 24.3 27 28.1 35.5 31 32.1 39.1 1 10.8 12 14 16 19'.split()[e.index(i)]))];f=f.replace(i,' %d- '%r);r+=1
h=f.split()
for c,d in zip(h,h[1:]):
 s=c.find('-')
 if-1<s:
  if'-'in d:
   for y in g:x+=y[1]*(str(y[0])==c[:s])
  else:
   for y in g:x+=y[1]*int(d)*(str(y[0])==c[:s])
print(x)


Python 3 - 550 548 535 bytes (lost the count with indentation)

Saved 10 bytes thanks to @cairdcoinheringaahing and 3 saved thanks to ovs

I had a personal goal to not use any regex, and do it the fun, old-school way... It turned out to be 350 bytes longer than the regex solution, but it only uses Python's standard library...

a='Li6.9 Be9. Na23. Ca40.1 Mg24.3 Al27. Si28.1 Cl35.5 P-31. S-32.1 K-39.1 H-1. B-10.8 C-12. N-14. O-16. F-19.'.split()
e,m,f,g,r=[x[:1+(x[1]>'-')]for x in a],[x[2:]for x in a],input(),[],0
for i in e:
 if i in f:g.append((r,float(m[e.index(i)])));f=f.replace(i,' '+str(r)+'- ');r+=1;
h,x=f.split(),0
for i in range(len(h)):
 if '-'in h[i]:
    if '-'in h[i+1]:
     for y in g:x+=y[1]*(str(y[0])==h[i][:h[i].index('-')])
    else:
        for y in g:
         if str(y[0])==h[i][:h[i].index('-')]:x+=(y[1])*int(h[i+1])
 else:1
print(x)  



If anyone is willing to golf it down (with indentation fixes and other tricks...), it will be 100% well received. It feels like there's a better way to do this...

Mr. Xcoder

Posted 2017-06-11T10:44:29.877

Reputation: 39 774

You can replace for y in g: if str(y[0])==h[i][:h[i].index('-')]:x+=y[1] with for y in g:x+=y[1]*(str(y[0])==h[i][:h[i].index('-')]) – caird coinheringaahing – 2017-06-11T14:19:30.757

@cairdcoinheringaahing ah, great... updating when I have access to a computer – Mr. Xcoder – 2017-06-11T15:45:23.403

@ovs Thanks a lot! Credited you in the answer – Mr. Xcoder – 2017-06-12T19:20:18.707

In Python, you can use a semicolon in place of a newline, which allows you to save bytes on indentation. – Pavel – 2017-06-12T19:48:05.703

@Phoenix not if there is a if/for/while on the next line. As this is the case on every indented line, you can't save bytes by this. – ovs – 2017-06-12T20:35:34.343

I know that @Phoenix but it doesn't help in this case – Mr. Xcoder – 2017-06-12T20:51:10.983

1

Mathematica, 390 338 329 Bytes

Saved 9 bytes due to actually being awake now and actually using the shortening I intended.

Version 2.1:

S=StringSplit;Total[Flatten@{ToExpression@S[#,LetterCharacter],S[#,DigitCharacter]}&/@S[StringInsert[#,".",First/@StringPosition[#,x_/;UpperCaseQ[x]]],"."]/.{"H"->1,"Li"->6.9,"Be"->9,"B"->10.8,"C"->12,"N"->14,"O"->16,"F"->19,"Na"->23,"Mg"->24.3,"Al"->27,"Si"->28.1,"P"->31,"S"->32.1,"Cl"->35.5,"K"->39.1,"Ca"->40.1}/.{a_,b_}->a*b]&

Explanation: Find the positions of all the uppercase characters and put a dot before each. Split the string at each dot. For each resulting substring, split it once on letters and once on digits. The parts left by the letter split are converted from strings to numbers. The parts left by the digit split are the chemical symbols, which are replaced with their atomic weights. Any pair of a weight and an atom count is replaced with their product. Then find the total.
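
The same idea reads fairly naturally in Python. This is a rough sketch of the steps in the explanation rather than a port of the Mathematica code; the function name total and the WEIGHTS dictionary are illustrative.

```python
import re

WEIGHTS = {
    "H": 1.0, "Li": 6.9, "Be": 9.0, "B": 10.8, "C": 12.0, "N": 14.0,
    "O": 16.0, "F": 19.0, "Na": 23.0, "Mg": 24.3, "Al": 27.0, "Si": 28.1,
    "P": 31.0, "S": 32.1, "Cl": 35.5, "K": 39.1, "Ca": 40.1,
}

def total(formula):
    # Put a dot before each uppercase letter: "H2SO4" -> ".H2.S.O4".
    dotted = re.sub(r"([A-Z])", r".\1", formula)
    # Split at the dots, dropping the empty leading piece.
    pieces = [p for p in dotted.split(".") if p]
    result = 0.0
    for piece in pieces:
        symbol = piece.rstrip("0123456789")   # the letters
        digits = piece[len(symbol):]          # the count, possibly empty
        result += WEIGHTS[symbol] * int(digits or 1)
    return result

print(f"{total('H2SO4'):.1f}")
```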

Version 1:

I'm sure this can be golfed lots (or just completely rewritten). I just wanted to figure out how to do it. (Will reflect on it in the morning.)

F=Flatten;d=DigitCharacter;S=StringSplit;Total@Apply[Times,#,2]&@(Transpose[{F@S[#,d],ToExpression@F@S[#,LetterCharacter]}]&@(#<>If[StringEndsQ[#,d],"","1"]&/@Fold[If[UpperCaseQ[#2],Append[#,#2],F@{Drop[#,-1],Last[#]<>#2}]&,{},StringPartition[#,1]]))/.{"H"->1,"Li"->6.9,"Be"->9,"B"->10.8,"C"->12,"N"->14,"O"->16,"F"->19,"Na"->23,"Mg"->24.3,"Al"->27,"Si"->28.1,"P"->31,"S"->32.1,"Cl"->35.5,"K"->39.1,"Ca"->40.1}&

Explanation: First split the string into characters. Then fold over the array, joining lowercase characters and digits back onto their capital. Next append a 1 to any chemical without a number on the end. Then split each term twice: once at all numbers and once at all letters. In the first split, replace the letters with their molar masses, then find the dot product of the two lists.
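
The fold-then-dot-product approach can be sketched in Python as follows. The helper names group and dot_product_mr are hypothetical, and the weights are passed in as a plain dictionary to keep the example short.

```python
def group(formula):
    # Fold over the characters: a capital starts a new term, anything
    # else is glued onto the previous term.
    groups = []
    for ch in formula:
        if ch.isupper():
            groups.append(ch)
        else:
            groups[-1] += ch
    # Append a 1 to any term that does not already end in a digit.
    return [g if g[-1].isdigit() else g + "1" for g in groups]

def dot_product_mr(formula, weights):
    terms = group(formula)
    symbols = [t.rstrip("0123456789") for t in terms]
    counts = [int(t[len(s):]) for t, s in zip(terms, symbols)]
    # Dot product of the weights list with the counts list.
    return sum(weights[s] * c for s, c in zip(symbols, counts))

print(group("CaCO3"))
print(dot_product_mr("CaCO3", {"Ca": 40.1, "C": 12.0, "O": 16.0}))
```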

Ian Miller

Posted 2017-06-11T10:44:29.877

Reputation: 727