Balance chemical equations!

30

11

Bernd is a high school student who has some problems in chemistry. In class he has to design chemical equations for some experiments they are doing, such as the combustion of heptane:

C7H16 + 11O2 → 7CO2 + 8H2O

Since mathematics isn't exactly Bernd's strongest subject, he often has a hard time finding the exact ratios between the pro- and educts of the reaction. Since you are Bernd's tutor, it is your job to help him! Write a program, that calculates the amount of each substance needed to get a valid chemical equation.

Input

The input is a chemical equation without amounts. In order to make this possible in pure ASCII, we write any subscriptions as ordinary numbers. Element names always start with a capital letter and may be followed by a minuscule. The molecules are separated with + signs, an ASCII-art arrow -> is inserted between both sides of the equation:

Al+Fe2O4->Fe+Al2O3

The input is terminated with a newline and won't contain any spaces. If the input is invalid, your program may do whatever you like.

You may assume, that the input is never longer than 1024 characters. Your program may either read the input from standard input, from the first argument or in an implementation defined way at runtime if neither is possible.

Output

The output of your program is the input equation augmented with extra numbers. The number of atoms for each element must be the same on both sides of the arrow. For the example above, a valid output is:

2Al+Fe2O3->2Fe+Al2O3

If the number for a molecule is 1, drop it. A number must always be a positive integer. Your program must yield numbers such that their sum is minimal. For instance, the following is illegal:

40Al+20Fe2O3->40Fe+20Al2O3

If there is no solution, print

Nope!

instead. A sample input that has no solution is

Pb->Au

Rules

  • This is code-golf. The shortest code wins.
  • Your program must terminate in reasonable time for all reasonable inputs.

Test Cases

Each test case has two lines: An input and a correct output.

C7H16+O2->CO2+H2O
C7H16+11O2->7CO2+8H2O

Al+Fe2O3->Fe+Al2O3
2Al+Fe2O3->2Fe+Al2O3

Pb->Au
Nope!

FUZxxl

Posted 2012-10-15T21:54:38.037

Reputation: 9 656

5"Since you are Bernds tutor, it is your job to help him!" - I would have thought a tutor should be teaching Bernd to think for himself, rather than write software for him so he doesn't have to :P – naught101 – 2015-01-30T04:44:19.383

1I could be wrong, but this seems like a natural candidate for a programming challenge rather code golf. – DavidC – 2012-10-16T02:06:20.100

The algorithm is not so dificult once you think about it. Hint: Vectors. – FUZxxl – 2012-10-16T05:24:15.117

1I once wrote a chemical equation solver on my TI-89 graphing calculator, using the built-in solve( function and eval( to interpret the input :) – mellamokb – 2012-10-16T12:48:05.873

3@mellamokb why don't you post it, you'll get an upvote from me for originality – ratchet freak – 2012-10-16T15:57:41.010

http://www.webqc.org/balance.php?reaction=Al%2BFe2O4%3DAl2O3%2BFe says that Al+Fe2O4->Fe+Al2O3 is 8Al+3Fe2O4->4Al2O3+6Fe – baby-rabbit – 2012-10-16T22:49:11.707

@baby-rabbit: The equation was wrong. Fixed. – FUZxxl – 2012-10-17T05:46:36.473

The equation under Input is still wrong, saying Al+Fe2O4 instead of Fe2O3 – Kuilin Li – 2016-12-25T05:26:46.427

1@KuilinLi It is not wrong, just different. – FUZxxl – 2016-12-25T09:37:59.227

What about ion charges? Cu2+ +2e- ->Cu ? – devRicher – 2016-12-25T22:50:50.140

@devRicher Out of scope. – FUZxxl – 2016-12-25T23:57:53.727

Answers

7

C, 442 505 chars

// element use table, then once parsed reused as molecule weights
u,t[99];

// molecules
char*s,*m[99]; // name and following separator
c,v[99][99]; // count-1, element vector

i,j,n;

// brute force solver, n==0 upon solution - assume at most 30 of each molecule
b(k){
    if(k<0)for(n=j=0;!n&&j<u;j++)for(i=0;i<=c;i++)n+=t[i]*v[i][j]; // check if sums to zero
    else for(t[k]=0;n&&t[k]++<30;)b(k-1); // loop through all combos of weights
}

main(int r,char**a){
    // parse
    for(s=m[0]=a[1];*s;){
        // parse separator, advance next molecule
        if(*s==45)r=0,s++;
        if(*s<65)m[++c]=++s;
        // parse element
        j=*s++;
        if(*s>96)j=*s+++j<<8;            
        // lookup element index
        for(i=0,t[u]=j;t[i]-j;i++);
        u+=i==u;
        // parse amount
        for(n=0;*s>>4==3;)n=n*10+*s++-48;
        n+=!n;
        // store element count in molecule vector, flip sign for other side of '->'
        v[c][i]=r?n:-n;
    }
    // solve
    b(c);
    // output
    for(i=0,s=n?"Nope!":a[1];*s;putchar(*s++))s==m[i]&&t[i++]>1?printf("%d",t[i-1]):0;
    putchar(10);
}

Run as:

./a.out "C7H16+O2->CO2+H2O"
./a.out "Al+Fe2O4->Fe+Al2O3"
./a.out "Pb->Au"

Results:

C7H16+11O2->7CO2+8H2O
8Al+3Fe2O4->6Fe+4Al2O3
Nope!

baby-rabbit

Posted 2012-10-15T21:54:38.037

Reputation: 1 623

+1 this is much more respectable than the pres. debate – ardnew – 2012-10-17T02:36:36.760

2Try using commas as statement separators to avoid curly braces. Also try replacing if-then-else-constructs with ternary operators to shorten the code. t[i]>1?printf("%s",t[i]):0; is one byte shorter. Also: m[0] is the same as *m. – FUZxxl – 2012-10-17T09:32:51.410

6

Mathematica 507

I employed the augmented chemical composition matrix approach described in

L.R.Thorne, An innovative approach to balancing chemical - reaction equations : a simplified matrix - inverse technique for determining the matrix null space. Chem.Educator, 2010, 15, 304 - 308.

One slight tweak was added: I divided the transpose of the null-space vector by the greatest common divisor of the elements to ensure integer values in any solutions. My implementation does not yet handle cases where there is more than one solution to balancing the equation.

b@t_ :=Quiet@Check[Module[{s = StringSplit[t, "+" | "->"], g = StringCases, k = Length, 
  e, v, f, z, r},
e = Union@Flatten[g[#, _?UpperCaseQ ~~ ___?LowerCaseQ] & /@ s];v = k@e;
s_~f~e_ := If[g[s, e] == {}, 0, If[(r = g[s, e ~~ p__?DigitQ :> p]) == {}, 1, 
   r /. {{x_} :> ToExpression@x}]];z = k@s - v;
r = #/(GCD @@ #) &[Inverse[Join[SparseArray[{{i_, j_} :> f[s[[j]], e[[i]]]}, k /@ {e, s}], 
Table[Join[ConstantArray[0, {z, v}][[i]], #[[i]]], {i, k[#]}]]][[All, -1]] &
   [IdentityMatrix@z]];
Row@Flatten[ReplacePart[Riffle[Partition[Riffle[Abs@r, s], 2], " + "], 
   2 Count[r, _?Negative] -> " -> "]]], "Nope!"]

Tests

b["C7H16+O2->CO2+H2O"]
b["Al+Fe2O3->Fe+Al2O3"]
b["Pb->Au"]

enter image description here

Analysis

It works by setting up the following chemical composition table, consisting of chemical species by elements, to which an addition nullity vector is added (becoming the augmented chemical composition table:

chemical composition table

The inner cells are removed as a matrix and inverted, yielding.

inversion

The right-most column is extracted, yielding:

{-(1/8), -(11/8), 7/8, 1}

Each element in the vector is divided by the gcd of the elements (1/8), giving:

{-1, -11, 7, 8}

where the negative values will be placed on the left side of the arrow. The absolute values of these are the numbers needed to balance the original equation:

solution

DavidC

Posted 2012-10-15T21:54:38.037

Reputation: 24 524

For some reason, I only came across your comment today. Yes, I did mean "right-hand column". So much time has passed since I worked on this that I cannot see (or remember) where padding is used. Sorry. – DavidC – 2015-06-26T16:00:34.433

don't forget to add the exclamation point! – ardnew – 2012-10-18T17:45:44.243

:} ok, and I upped the character count – DavidC – 2012-10-18T18:52:49.797

I think you mean the right-hand column, not the left-hand one. I appreciate the explanation (+1) but I do wonder: if it weren't the case that the number of molecules is one more than the number of elements, how do you pad? Off to read the paper now. – Peter Taylor – 2012-10-19T20:51:29.680

3

Python, 880 chars

import sys,re
from sympy.solvers import solve
from sympy import Symbol
from fractions import gcd
from collections import defaultdict

Ls=list('abcdefghijklmnopqrstuvwxyz')
eq=sys.argv[1]
Ss,Os,Es,a,i=defaultdict(list),Ls[:],[],1,1
for p in eq.split('->'):
 for k in p.split('+'):
  c = [Ls.pop(0), 1]
  for e,m in re.findall('([A-Z][a-z]?)([0-9]*)',k):
   m=1 if m=='' else int(m)
   a*=m
   d=[c[0],c[1]*m*i]
   Ss[e][:0],Es[:0]=[d],[[e,d]]
 i=-1
Ys=dict((s,eval('Symbol("'+s+'")')) for s in Os if s not in Ls)
Qs=[eval('+'.join('%d*%s'%(c[1],c[0]) for c in Ss[s]),{},Ys) for s in Ss]+[Ys['a']-a]
k=solve(Qs,*Ys)
if k:
 N=[k[Ys[s]] for s in sorted(Ys)]
 g=N[0]
 for a1, a2 in zip(N[0::2],N[1::2]):g=gcd(g,a2)
 N=[i/g for i in N]
 pM=lambda c: str(c) if c!=1 else ''
 print '->'.join('+'.join(pM(N.pop(0))+str(t) for t in p.split('+')) for p in eq.split('->'))
else:print 'Nope!'

Tests:

python chem-min.py "C7H16+O2->CO2+H2O"
python chem-min.py "Al+Fe2O4->Fe+Al2O3"
python chem-min.py "Pb->Au"

Output:

C7H16+11O2->7CO2+8H2O
8Al+3Fe2O4->6Fe+4Al2O3
Nope!

Could be much less than 880, but my eyes are killing me already...

jadkik94

Posted 2012-10-15T21:54:38.037

Reputation: 281

2

Python 2, 635 bytes

previous byte counts: 794, 776, 774, 765, 759, 747, 735, 734, 720, 683, 658, 655, 654, 653, 651, 638, 637, 636 bytes.

The second indentation level is only a tab, the third is a tab then a space.

To be honest, this is jadkik94's answer, but so many bytes were shaved, I had to do it. Tell me if I can shave any bytes off!

from sympy import*
import sys,re
from sympy.solvers import*
from collections import*
P=str.split
L=map(chr,range(97,123))
q=sys.argv[1]
S,O,a,i,u,v=defaultdict(list),L[:],1,1,'+','->'
w=u.join
for p in P(q,v):
 for k in P(p,u):
     c=L.pop(0)
     for e,m in re.findall('([A-Z][a-z]*)(\d*)',k):
      m=int(m or 1)
      a*=m
      S[e][:0]=[c,m*i],
 i=-1
Y=dict((s,Symbol(s))for s in set(O)-set(L))
Q=[eval(w('%d*%s'%(c[1],c[0])for c in S[s]),{},Y)for s in S]+[Y['a']-a]
k=solve(Q,*Y)
if k:
 N=[k[Y[s]]for s in sorted(Y)]
 g=gcd(N[:1]+N[1::2])
 print v.join(w((lambda c:str(c)*(c!=1))(N.pop(0)/g)+str(t)for t in P(p,u))for p in P(q,v))
else:print'Nope!'

Zacharý

Posted 2012-10-15T21:54:38.037

Reputation: 5 710

save three bytes: ''.join(map(chr,range(97,122))) :D – aliqandil – 2017-01-03T14:53:55.800

:(, that doesn't work. However, map(chr,range(97,123)) works for 12 bytes saved. – Zacharý – 2017-01-03T15:11:15.200

oh right! it's python 2! – aliqandil – 2017-01-11T23:25:45.087

1

JavaScript, 682 bytes

x=>{m=1;x.split(/\D+/g).map(i=>i?m*=i:0);e=new Set(x.replace(/\d+|\+|->/g,"").match(/([A-Z][a-z]*)/g));e.delete``;A=[];for(let z of e){t=x.split`->`;u=[];for(c=1;Q=t.shift();c=-1)Q.split`+`.map(p=>u.push(c*((i=p.indexOf(z))==-1?0:(N=p.substring(i+z.length).match(/^\d+/g))?N[0]:1)));A.push(u)}J=A.length;for(P=0;P<J;P++){for(i=P;!A[i][P];i++);W=A.splice(i,1)[0];W=W.map(t=>t*m/W[P]);A=A.map(r=>r[P]?r.map((t,j)=>t-W[j]*r[P]/m):r);A.splice(P,0,W)}f=e.size;if(!A[0][f])return"Nope!";g=m=-m;_=(a,b)=>b?_(b,a%b):a;c=[];A.map(p=>c.push(t=p.pop())&(g=_(g,t)));c.push(m);j=x.match(/[^+>]+/g);return c.map(k=>k/g).map(t=>(t^1?t:"")+(z=j.shift())+(z.endsWith`-`?">":"+")).join``.slice(0,-1);}

This is a much more golfed (decades of characters!) of Kuilin's answer. Might be noncompeting because certain JS features postdate the challenge.

Zacharý

Posted 2012-10-15T21:54:38.037

Reputation: 5 710

0

Javascript, 705 bytes

(non-competing, some features postdate the challenge)

Other solutions all had elements of brute-forcing. I tried for a more deterministic approach by representing the chemical equation as a set of linear equations, and then solving using the Gauss-Jordan algorithm to take the reduced row-echelon form of that matrix. In order to isolate out the trivial case where everything is zero, I assume that one of the elements is a constant number - and that number is determined by just all the numbers multiplied together, in order to not have fractions. Then as a final step we'll divide each by the gcd to satisfy the last condition.

Ungolfed:

function solve(x) {
 //firstly we find bigNumber, which will be all numbers multiplied together, in order to assume the last element is a constant amount of that
 bigNumber = 1;
 arrayOfNumbers = new Set(x.split(/\D+/g));
 arrayOfNumbers.delete("");
 for (let i of arrayOfNumbers) bigNumber *= parseInt(i);
 
 //first actual step, we split into left hand side and right hand side, and then into separate molecules
 //number of molecules is number of variables, number of elements is number of equations, variables refer to the coefficients of the chemical equation
 //note, the structure of this is changed a lot in the golfed version since right is the same as negative left
 left = x.split("->")[0].split("+");
 righ = x.split("->")[1].split("+");
 molecules = left.length + righ.length;
 
 //then let's find what elements there are - this will also become how many equations we have, or the columns of our matrix minus one
 //we replace all the non-element characters, and then split based on the uppercase characters
 //this also sometimes adds a "" to the array, we don't need that so we just delete it
 //turn into a set in order to remove repeats
 elems = new Set(x.replace(/\d+|\+|->/g,"").match(/([A-Z][a-z]*)/g));
 elems.delete("");
 
 rrefArray = [];//first index is rows, second index columns - each row is an equation x*(A11)+y*(A21)+z*(A31)=A41 etc etc, to solve for xyz as coefficients
 //loop thru the elements, since for each element we'll have an equation, or a row in the array
 for (let elem of elems) {
  buildArr = [];
  //loop thru the sides
  for (let molecule of left) {
   //let's see how many of element elem are in molecule molecule
   //ASSUMPTION: each element happens only once per molecule (no shenanigans like CH3COOH)
   index = molecule.indexOf(elem);
   if (index == -1) buildArr.push(0);
   else {
    index += elem.length;
    numberAfterElement = molecule.substring(index).match(/^\d+/g);
    if (numberAfterElement == null) buildArr.push(1);
    else buildArr.push(parseInt(numberAfterElement));
   }
  }
  //same for right, except each item is negative
  for (let molecule of righ) {
   index = molecule.indexOf(elem);
   if (index == -1) buildArr.push(0);
   else {
    index += elem.length;
    numberAfterElement = molecule.substring(index).match(/^\d+/g);
    if (numberAfterElement == null) buildArr.push(-1);
    else buildArr.push(parseInt(numberAfterElement)*(-1));
   }
  }
  rrefArray.push(buildArr);
 }
 
 //Gauss-Jordan algorithm starts here, on rrefArray
 for (pivot=0;pivot<Math.min(molecules, elems.size);pivot++) {
  //for each pivot element, first we search for a row in which the pivot is nonzero
  //this is guaranteed to exist because there are no empty molecules
  for (i=pivot;i<rrefArray.length;i++) {
   row = rrefArray[i];
   if (row[pivot] != 0) {
    workingOnThisRow = rrefArray.splice(rrefArray.indexOf(row), 1)[0];
   }
  }
  //then multiply elements so the pivot element of workingOnThisRow is equal to bigNumber we determined above, this is all to keep everything in integer-space
  multiplyWhat = bigNumber / workingOnThisRow[pivot]
  for (i=0;i<workingOnThisRow.length;i++) workingOnThisRow[i] *= multiplyWhat
  //then we make sure the other rows don't have this column as a number, the other rows have to be zero, if not we can normalize to bigNumber and subtract
  for (let i in rrefArray) {
   row = rrefArray[i];
   if (row[pivot] != 0) {
    multiplyWhat = bigNumber / row[pivot]
    for (j=0;j<row.length;j++) {
     row[j] *= multiplyWhat;
     row[j] -= workingOnThisRow[j];
     row[j] /= multiplyWhat;
    }
    rrefArray[i]=row;
   }
  }
  //finally we put the row back
  rrefArray.splice(pivot, 0, workingOnThisRow);
 }
 
 //and finally we're done!
 //sanity check to make sure it succeeded, if not then the matrix is insolvable
 if (rrefArray[0][elems.size] == 0 || rrefArray[0][elems.size] == undefined) return "Nope!";
 
 //last step - get the results of the rref, which will be the coefficients of em except for the last one, which would be bigNumber (1 with typical implementation of the algorithm)
 bigNumber *= -1;
 gcd_calc = function(a, b) {
  if (!b) return a;
  return gcd_calc(b, a%b);
 };
 coEffs = [];
 gcd = bigNumber;
 for (i=0;i<rrefArray.length;i++) {
  num = rrefArray[i][molecules-1];
  coEffs.push(num);
  gcd = gcd_calc(gcd, num)
 }
 coEffs.push(bigNumber);
 for (i=0;i<coEffs.length;i++) coEffs[i] /= gcd;
 
 //now we make it human readable
 //we have left and right from before, let's not forget those!
 out = "";
 for (i=0;i<coEffs.length;i++) {
  coEff = coEffs[i];
  if (coEff != 1) out += coEff;
  out += left.shift();
  if (left.length == 0 && righ.length != 0) {
   out += "->";
   left = righ;
  } else if (i != coEffs.length-1) out += "+";
 }
 return out;
}
console.log(solve("Al+Fe2O4->Fe+Al2O3"));
console.log(solve("Al+Fe2O3->Fe+Al2O3"));
console.log(solve("C7H16+O2->CO2+H2O"));
console.log(solve("Pb->Au"));

Golfed

s=x=>{m=1;x.split(/\D+/g).map(i=>i!=""?m*=i:0);e=(new Set(x.replace(/\d+|\+|->/g,"").match(/([A-Z][a-z]*)/g)));e.delete("");A=[];for(let z of e){t=x.split("->");u=[];for(c=1;Q=t.shift();c=-1)Q.split("+").map(p=>u.push(c*((i=p.indexOf(z))==-1?0:(N=p.substring(i+z.length).match(/^\d+/g))?N[0]:1)));A.push(u)}J=A.length;for(P=0;P<J;P++){for(i=P;!A[i][P];i++);W=A.splice(i,1)[0];W=W.map(t=>t*m/W[P]);A=A.map(r=>!r[P]?r:r.map((t,j)=>t-W[j]*r[P]/m));A.splice(P,0,W)}f=e.size;if (!A[0][f])return "Nope!";g=m=-m;_=(a,b)=>b?_(b,a%b):a;c=[];A.map(p=>c.push(t=p.pop())&(g=_(g,t)));c.push(m);j=x.match(/[^+>]+/g);return c.map(k=>k/g).map(t=>(t==1?"":t)+(z=j.shift())+(z.endsWith("-")?">":"+")).join("").slice(0,-1);}

console.log(s("Al+Fe2O4->Fe+Al2O3"));
console.log(s("Al+Fe2O3->Fe+Al2O3"));
console.log(s("C7H16+O2->CO2+H2O"));
console.log(s("Pb->Au"));

Kuilin Li

Posted 2012-10-15T21:54:38.037

Reputation: 421

1Noncompeting, as some features postdate the challenge. – Zacharý – 2016-12-31T00:55:05.123

Oh wow I didn't notice how old this was. Thanks! – Kuilin Li – 2017-01-01T02:27:33.830