The mad chemist and the clever programmer

12

2

Backstory

You wake up dizzy in a chemistry laboratory, and you realize you have been kidnapped by a old mad chemist. Since he cannot see very well because of his age, he wants you to work for him and only then, you can escape the laboratory.

Task

It is your task to return the structural formulae of the molecules whose chemical formula will be given as input. Note that only the carbon (C), oxygen (O) and hydrogen (H) atoms will be used as input. Unlike in chemical formulas, a 0 is a valid quantifier and a 1 cannot be omitted (e.g. C1H4O0 is valid input, but CH4 isn't).

To prevent ambiguity, we assume double and triple bonds do not appear in the molecules. All carbon atoms need 4 single bonds, all oxygen atoms need 2, and hydrogen atoms need one. We also assume that O-O bonds do not exist as well. The molecule does not have to exist nor be stable.

The input will never contain more than 3 carbon atoms to ensure lightness in the output's display.

You only should display the molecules whose carbons atoms are arranged in a straight line without interruption. Ergo, no C-O-C bonds.

You must return all possible molecules not excluded by the previous rules. You do not need to handle invalid inputs.

The following example displays all the solutions you have to handle for that molecule.

A rotation by 180 degrees in the plane of the page of one of the molecule's formula is considered a redundancy and does not need to be displayed.

In the example below I'll show all of the possible formulae for a molecule, then point out the ones that do not need to be displayed.

Example

Input: C2H6O2

First, here are all the possible formulae for this input (Thank you to @Jonathan Allan)

01        H
          |
          O   H
          |   |
  H - O - C - C - H
          |   |
          H   H

02            H
              |
          H   O
          |   |
  H - O - C - C - H
          |   |
          H   H

03        H   H
          |   |
  H - O - C - C - O - H
          |   |
          H   H

04        H   H
          |   |
  H - O - C - C - H
          |   |
          H   O
              |
              H

05        H   H
          |   |
  H - O - C - C - H
          |   |
          O   H
          |
          H

12        H   H
          |   |
          O   O
          |   |
      H - C - C - H
          |   |
          H   H

13        H
          |
          O   H
          |   |
      H - C - C - O - H
          |   |
          H   H

14        H
          |
          O   H
          |   |
      H - C - C - H
          |   |
          H   O
              |
              H


15        H
          |
          O   H
          |   |
      H - C - C - H
          |   |
          O   H
          |
          H

23            H
              |
          H   O
          |   |
      H - C - C - O - H
          |   |
          H   H

24            H
              |
          H   O
          |   |
      H - C - C - H
          |   |
          H   O
              |
              H

25            H
              |
          H   O
          |   |
      H - C - C - H
          |   |
          O   H
          |
          H

34        H   H
          |   |
      H - C - C - O - H
          |   |
          H   O
              |
              H

35        H   H
          |   |
      H - C - C - O - H
          |   |
          O   H
          |
          H

45        H   H
          |   |
      H - C - C - H
          |   |
          O   O
          |   |
          H   H

And here are the formulae that should be in the output if we take out the rotations of 180° in the plane of the page :

01        H
          |
          O   H
          |   |
  H - O - C - C - H
          |   |
          H   H



03        H   H
          |   |
  H - O - C - C - O - H
          |   |
          H   H


12        H   H
          |   |
          O   O
          |   |
      H - C - C - H
          |   |
          H   H

13        H
          |
          O   H
          |   |
      H - C - C - O - H
          |   |
          H   H

14        H
          |
          O   H
          |   |
      H - C - C - H
          |   |
          H   O
              |
              H


 15      H
         |
         O   H      
         |   |
     H - C - C - H
         |   |
         O   H
         |
         H 

23            H
              |
          H   O
          |   |
      H - C - C - O - H
          |   |
          H   H



25            H
              |
          H   O
          |   |
      H - C - C - H
          |   |
          O   H
          |
          H



35        H   H
          |   |
      H - C - C - O - H
          |   |
          O   H
          |
          H

You do not need to output the labels of the formulae and you can output either of the rotations when two exist. For example you can output either 02 or 35.

Here are some valid inputs to test your code:

C3H8O2 C1H4O0 C2H6O2 C1H4O1 C2H6O2

The PC the chemist gave you to complete your task is quite old so you do not have a lot of memory on it to save your code, thus this is and the shortest amount of byte wins!

user68509

Posted 2017-04-27T15:07:21.023

Reputation:

Do we need to handle cyclic molecules? – Luke – 2017-04-27T16:13:57.970

@Luke The inputs I gave cannot be cyclic so you do not need to handle that. But if you want to handle molecules that contains 4 C or more you can do it and earn bonus score :) Thanks for the edit by the way! english is not my native language ^^ – None – 2017-04-27T16:17:12.903

@Antoine Given your current challenge description, handling molecules with any number of Carbon atoms (including 4 or more) is required. – mbomb007 – 2017-04-27T16:18:50.190

1

The output you've suggested is missing a lot of potential molecules: you have two copies of propan-1,2-diol there, but is missing at least propan-1,1-diol, propan-1,3-diol, propan-2,2-diol, a large number of alcohol ethers, and various compounds where the two oxygen atoms connect to each other. Additionally, how specified is the output format? I can imagine molecules in which some of the bonds need to be drawn longer than others to fit everything in (e.g. dimethylpropane, which is apparently a real chemical).

– None – 2017-04-27T16:20:13.570

@ais523 Before the last edit it was specified that the output I wrote did not display ALL the molecules possible for the formula, I added it back. Concerning the neopentane you linked it contains 5 carbons, the inputs needed to work with the code contains 3 C maximum, which allows a clearer display – None – 2017-04-27T16:24:37.590

@JonathanAllan Can you explain what is missing to make the challenge clearer please ? – None – 2017-04-27T16:26:22.220

@JonathanAllan Ok I'll submit it to sandbox and edit later with the completed specifications – None – 2017-04-27T16:31:18.143

I hope that the chemist is really mad, so he won't find out that all of the molecular structures in the example are the same molecule (just rotated)... :) – mzuther – 2017-04-29T17:20:23.930

@mzuther Technically yes it's the same molecule, just different symetries, but he doesn't see very well anyway ;) – None – 2017-04-29T17:23:43.983

2>

  • Is it possible to have 2 OH groups on the same carbon? You seem to have excluded it from the examples, but I don't see anywhere in the spec that says we don't have to consider it (I know in reality these compounds exist in equilibrium with aldehydes) 2. Why is the HOCH2CH2OH with both OH groups pointing downards missing from the example? Isn't it a required output?
  • < – Level River St – 2017-04-30T03:35:17.567

    1 start="3">

  • Is it acceptable to have the outputs with the carbon chain vertical instead of horizontal?
  • < – Level River St – 2017-04-30T03:36:03.183

    @LevelRiverSt It is indeed possible to have 2 OH group on the same carbon I totally forgot to add them and I'll do it right away ^^ Concerning the ouput it is totally acceptable to rotate it to have the carbon displayed vertically. And for your concern about the symetry with both OH group poitning down, it's because it's a rotation of pi of the first molecule. Thanks for pointing these concerns :) – None – 2017-04-30T09:19:38.583

    In that case the first and last structures on the 3rd row should be absent, as they are rotations of structures given previously. (They are also mirror images but it would seem that mirror images are considered different.) – Level River St – 2017-04-30T10:14:07.200

    @LevelRiverSt I agree for the first of the 3rd row but can't find where the 3rd one is rotated ? – None – 2017-04-30T10:25:59.137

    @JonathanAllan The pi rotation is about the whole molecule not only a part of it. That's why you cannot consider that r1c1, r1c3, r2c2, r3c1, and r3c3are equivalent by rotation, because that would mean you only rotated a part of the molecule. Could you please explain further your labeling system? I'm interested – None – 2017-04-30T11:34:25.087

    @JonathanAllan Indeed the couples you mention have a symetry with the vertical or horizontal axis, however, if you turn the whole molecule you cannot achieve the transformation from r1c3 to r2c1 etc. That's why I consider them to be different, contrary to rotations which are redundancies. If you think it's a flaw I think we can correct the question – None – 2017-04-30T11:51:07.457

    Is it required to omit said symmetries, or is it allowable to do so? – TLW – 2017-04-30T22:02:39.710

    @TLW The redundancies do not need to be displayed but if you want to let them remain you can – None – 2017-04-30T22:42:10.257

    Can I answer it graphically instead of ASCII art? – sergiol – 2017-05-02T23:01:37.910

    @sergiol Sure as long as it respects the format – None – 2017-05-02T23:36:56.263

    Answers

    3

    Ruby, 275

    ->s{(k=4<<2*c=s[1].to_i).times{|i|z=" "*8
    t=("  H|O|"[i%2*2,4]+"C|"*c+"O|H   "[i>>c&2^2,4]).chars.map{|j|z+j+z}
    (c*2).times{|j|t[4+j&-2][j%2*10,7]="    H - O - H    "[[i>>j/2-1&4,-7-(i>>c*2-j/2-1&4)][j%2],7]}
    i*(k+1)>>c+1&k-1<i||("%b"%i).sum%16!=s[5].to_i||i%7>c*3||puts(t)}}
    

    Combined formulas for left and right sidechains and eliminated variable h

    Ruby, 279

    ->s{(k=1<<h=2+2*c=s[1].to_i).times{|i|t=("  H|O|"[i%2*2,4]+"C|"*c+"O|H   "[i>>c&2^2,4]).chars.map{|j|(z=" "*8)+j+z}
    c.times{|j|t[4+j*2][0,7]="    H - O -"[i>>j-1&4,7]
    t[4+j*2][10,7]="- O - H    "[i>>h-j-3&4^4,7]}
    i*(k+1)>>h/2&k-1<i||("%b"%i).sum%16!=s[5].to_i||i%7>c*3||puts(t)}}
    

    Ungolfed in test program

    f=->s{
      (k=1<<h=2+2*c=s[1].to_i).times{|i|                       #c=number of C atoms. h=number of H (calculated)
                                                               #iterate i from 0 to (k=1<<h)-1
    
      t=("  H|O|"[i%2*2,4]+"C|"*c+"O|H   "[i>>c&2^2,4]).       #compose a backbone string H-C...C-H. Insert O at the top where bit 0 of i set, and O at the bottom where bit c+1 of i set
      chars.map{|j|(z=" "*8)+j+z}                              #convert string to an array of characters, pad each character left and right with 8 spaces
    
      c.times{|j|t[4+j*2][0,7]="    H - O -"[i>>j-1&4,7]       #overwrite spaces on left with H or HO according to bits 1 up to c
                 t[4+j*2][10,7]="- O - H    "[i>>h-j-3&4^4,7]} #overwrite spaces on right with H or OH according to bits h-1 down to c+1
    
      i*(k+1)>>h/2&k-1<i||                                     #rotate the bits of i by h/2. if this is less than i, do not output the structure (eliminates rotations by 180deg by outputtng the lexically highest)
      ("%b"%i).sum%16!=s[5].to_i||                             #if the number of 1's in i differs from the number of O's indicated in the input, do not output
      i%7>c*3||                                                #if i%7>c*3, do not output (empirical solution to avoid 90deg rotations for C=1)
      puts(t)                                                  #if the above are all false, output the current structure.
      }
    }
    
    f[gets]
    

    Output

    Spacing is as per question output. Backbone vertical instead of horizontal allowed per comments. Rotations of the entire display through 90 or 180 degrees considered equivalent.

    C2H6O2
            H
            |
            O
            |
    H - O - C - H
            |
        H - C - H
            |
            H
    
    
    
            H
            |
            O
            |
        H - C - H
            |
    H - O - C - H
            |
            H
    
    
    
    
    
            H
            |
    H - O - C - H
            |
    H - O - C - H
            |
            H
    
    
    
            H
            |
            O
            |
        H - C - H
            |
        H - C - H
            |
            O
            |
            H
    
    
    
            H
            |
    H - O - C - H
            |
        H - C - H
            |
            O
            |
            H
    
    
    
            H
            |
        H - C - H
            |
    H - O - C - H
            |
            O
            |
            H
    
    
    
            H
            |
    H - O - C - H
            |
        H - C - O - H
            |
            H
    
    
    
    
    
            H
            |
        H - C - H
            |
    H - O - C - O - H
            |
            H
    
    
    
    
    
            H
            |
        H - C - O - H
            |
    H - O - C - H
            |
            H
    

    Level River St

    Posted 2017-04-27T15:07:21.023

    Reputation: 22 049

    Nice I'll run it when I get back at my computer :) – None – 2017-05-01T14:17:07.167