Parse the comments out of my esoteric code

30

2

Earlier this week, we learned about how to format esoteric languages for commenting. Today, we're going to do the inverse of that. I need you to write a program or function that takes some well-commented esoteric code and parses the comments out, returning just the code. Using some examples from the previous challenge, here is what well-commented code looks like:

a                #Explanation of what 'a' does
 bc              #Bc
   d             #d
    e            #Explanation of e
     fgh         #foobar
        ij       #hello world
          k      #etc.
           l     #so on
            mn   #and
              op #so forth

Here is what you need to do to extract the code out. First, remove the comment character (#), the space before it, and everything after the comment character.

a               
 bc             
   d            
    e           
     fgh        
        ij      
          k     
           l    
            mn  
              op

Then, collapse each line upwards into a single line. For example, since b is in the second column on line two, once we collapse it up, it will be in the second column on line one. Similarly, c will be put in the third column of line one, and d will be put on the fourth. Repeat this for every character, and you get this:

abcdefghijklmnop
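The two steps above can be sketched in Python 3. This is an ungolfed illustration, not a competing answer; the name `strip_comments` is made up here, and the code leans on the challenge's guarantees that a single space precedes each `#` and that no two code characters share a column.

```python
def strip_comments(lines):
    """Remove the ' #comment' tail of each line, then collapse the columns."""
    # Step 1: drop the '#', the one space before it, and everything after it.
    code = [line[: line.index("#") - 1] for line in lines]
    width = max(len(row) for row in code)

    # Step 2: for each column, keep the non-space character if there is one.
    out = []
    for col in range(width):
        chars = [row[col] for row in code if col < len(row) and row[col] != " "]
        out.append(chars[0] if chars else " ")
    return "".join(out)


example = [
    "a      #first",
    " bc    #second",
    "   d   #third",
    "    ef #fourth",
]
print(strip_comments(example))  # abcdef
```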

Important note: It seems like the trivial solution is to just remove the comments, remove every space, and join every line. This is not a valid approach! Because the original code might have spaces in it, these will get stripped out with this approach. For example, this is a perfectly valid input:

hello         #Line one
              #Line two
       world! #Line three

And the corresponding output should be:

hello  world!
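To make the difference concrete, here is a small Python 3 comparison of the invalid "delete every space" approach with a column-wise one. Both function names are hypothetical, and the column-wise version assumes every `#` sits in the same column, as the challenge promises.

```python
def naive(lines):
    # Invalid approach: strip comments, delete every space, join the lines.
    return "".join(line.split("#")[0].replace(" ", "") for line in lines)


def by_column(lines):
    # Strip " #comment", then take the largest character in each column.
    # A space is the smallest printable ASCII character, so it survives
    # only when no line has a code character in that column.
    code = [line.split("#")[0][:-1] for line in lines]
    return "".join(max(col) for col in zip(*code))


lines = [
    "hello         #Line one",
    "              #Line two",
    "       world! #Line three",
]
print(naive(lines))      # helloworld!
print(by_column(lines))  # hello  world!
```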

The Challenge:

Write a program or function that takes commented code as input, and outputs or returns the code with all the comments parsed out of it. You should output the code without any trailing spaces, although one trailing newline is permissible. The comment character will always be #, and there will always be one extra space before the comments start. # will not appear in the comment section of the input. In order to keep the challenge simpler, here are some inputs you do not have to handle:

  • You can assume that the code will not have two characters in the same column. For example, this is an input that violates this rule:

    a  #A character in column one
    bc #Characters in columns one and two
    
  • You can also assume that all comment characters appear in the same column. For example, this input:

    short       #this is a short line
          long        #This is a long line
    

    violates this rule. This also means that # will not be in the code section.

  • And lastly, you do not have to handle code sections with leading or trailing spaces. For example,

      Hello,          #
             World!   #
    

You may also assume that the input only contains printable ASCII characters.

Examples:

Input:
hello         #Line one
              #Line two
       world! #Line three

Output:
hello  world!

Input:
E                                                   #This comment intentionally left blank
 ac                                                 #
   h s                                              #
      ecti                                          #
          on is                                     #
                one c                               #
                     haracte                        #
                            r longer                #
                                     than the       #
                                              last! #

Output:
Each section is one character longer than the last!

Input:
4          #This number is 7
 8         #
  15       #That last comment is wrong.
    16     #
      23   #
        42 #

Output:
4815162342

Input:
Hello                     #Comment 1
      world               #Comment 2
           ,              #Comment 3
             how          #Comment 4
                 are      #Comment 5
                     you? #Comment 6

Output:
Hello world, how are you?

Input:
Prepare                               #
        for...                        #
                        extra spaces! #

Output:
Prepare for...          extra spaces!

You may take input in whatever reasonable format you like, for example, a list of strings, a single string with newlines, a 2d list of characters, etc. The shortest answer in bytes wins!

James

Posted 2016-09-13T01:31:17.460

Reputation: 54 537

Will we need to accept code with characters lower than the next? – wizzwizz4 – 2016-09-13T06:41:02.023

Could you add the test case with the empty line with just two spaces (like the hello world! you've showed)? Also, you state: "# will not appear in the comment section of the input.", but can it occur in the code-snippet itself? – Kevin Cruijssen – 2016-09-13T07:01:10.603

@KevinCruijssen See my edits – James – 2016-09-13T07:09:58.620

@wizzwizz4 I'm not sure if I understand your question – James – 2016-09-13T07:10:20.273

@DJMcMayhem Example: do {stuff} while (condition); with the explanation in order do while (condition); #Explainything then {stuff} #Explainything. – wizzwizz4 – 2016-09-13T15:45:49.093

@wizzwizz4 I'm pretty sure that's covered under You can assume that the code will not have two characters in the same column. Does that answer your question? – James – 2016-09-13T17:43:19.230

@DJMcMayhem The HTML parser on your / my browser stripped the whitespace. Imagine a gap in between while and (condition), and a gap before / after {stuff}. (I'm not very good at explaining, am I?) – wizzwizz4 – 2016-09-14T06:22:47.543

Answers

18

Jelly, 8 7 bytes

»/ṣ”#ḢṖ

Try it online!

How it works

»/ṣ”#ḢṖ  Main link. Argument: A (array of strings)

»/       Reduce the columns of A by maximum.
         Since the space is the lowest printable ASCII character, this returns the
         non-space character (if any) of each column.
  ṣ”#    Split the result at occurrences of '#'.
     Ḣ   Head; extract the first chunk, i.e., everything before the (first) '#'.
      Ṗ  Pop; remove the trailing space.
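
For readers who do not speak Jelly, the same pipeline can be approximated in Python 3. This is a rough transliteration rather than an exact equivalent: `zip` stops at the shortest line, whereas Jelly's vectorized maximum keeps the overhang of longer lines, but that only affects the comment text that is thrown away anyway. The function name is made up.

```python
def jelly_like(lines):
    # »/  : reduce the lines by element-wise maximum (column maxima).
    merged = "".join(max(col) for col in zip(*lines))
    # ṣ”# : split at '#';  Ḣ : keep the first chunk;  Ṗ : drop its last character.
    return merged.split("#")[0][:-1]


lines = [
    "hello         #Line one",
    "              #Line two",
    "       world! #Line three",
]
print(jelly_like(lines))  # hello  world!
```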

Dennis

Posted 2016-09-13T01:31:17.460

Reputation: 196 637

That is just ...wow. – Jonathan Allan – 2016-09-13T02:26:36.197

I am so jelly right now. – MonkeyZeus – 2016-09-13T13:49:39.350

How do you even hack that into your phone? – simbabque – 2016-09-13T16:14:06.127

@simbabque Patience and a lot of copy-pasting. – Dennis – 2016-09-13T16:49:59.320

I'm always putting using a 9-iron, maybe it's time I learned how to use a putter when on the green... – Magic Octopus Urn – 2016-09-13T20:19:23.470

13

Python 2, 48 43 bytes

lambda x:`map(max,*x)`[2::5].split(' #')[0]

Thanks to @xnor for golfing off 5 bytes!

Test it on Ideone.

Dennis

Posted 2016-09-13T01:31:17.460

Reputation: 196 637

I think you can just do map(max,*x) because max takes any number of arguments and None is small. – xnor – 2016-09-13T03:15:29.180

Right, I always forget that map can be used like that... Thanks! – Dennis – 2016-09-13T03:28:09.280

How does the `...`[2::5] trick work? – smls – 2016-09-14T06:06:12.640

@smls `...` is equivalent to repr(...), so for the list of singleton strings ['a', 'b', 'c'], you get the string "['a', 'b', 'c']". Finally, [2::5] chops off the first two characters ("['") and takes every fifth character of the remaining string. – Dennis – 2016-09-14T06:22:28.607
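
The repr-and-slice trick carries over to Python 3 with small changes, sketched below. Backticks are gone, so `repr` is called explicitly, and `map` is lazy and stops at the shortest line instead of padding with None, so it is wrapped in `list`. The function name is hypothetical. One caveat that follows from how `repr` works: a backslash in the input is shown escaped, which breaks the fixed stride of five characters.

```python
def repr_slice(lines):
    # Column maxima; in Python 3, map is lazy and truncates to the shortest line.
    columns = list(map(max, *lines))
    # repr(columns) looks like "['h', 'e', 'l', ...]": the first payload
    # character is at index 2 and each entry occupies five characters.
    return repr(columns)[2::5].split(" #")[0]


lines = [
    "hello         #Line one",
    "              #Line two",
    "       world! #Line three",
]
print(repr(list("abc")))        # ['a', 'b', 'c']
print(repr(list("abc"))[2::5])  # abc
print(repr_slice(lines))        # hello  world!
```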

5

JavaScript (ES6), 97 75 60 bytes

Thanks to @Neil for helping golf off 22 bytes

a=>a.reduce((p,c)=>p.replace(/ /g,(m,o)=>c[o])).split` #`[0]

Input is an array of lines.

  • a is array input
  • p is previous item
  • c is current item
  • m is match string
  • o is offset
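
The same fold can be written in Python 3 as a sketch of the idea. The helper names are invented, and the bounds check is an addition: in JavaScript an out-of-range `c[o]` yields `undefined` instead of raising, which only ever happens inside the comment text.

```python
from functools import reduce


def fill_spaces(prev, cur):
    # Counterpart of p.replace(/ /g, (m, o) => c[o]): every space in prev is
    # replaced by the character at the same offset in cur.
    return "".join(
        cur[i] if ch == " " and i < len(cur) else ch
        for i, ch in enumerate(prev)
    )


def reduce_replace(lines):
    # Fold all lines into the first one, then cut at the " #" separator.
    return reduce(fill_spaces, lines).split(" #")[0]


lines = [
    "hello         #Line one",
    "              #Line two",
    "       world! #Line three",
]
print(reduce_replace(lines))  # hello  world!
```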

ASCII-only

Posted 2016-09-13T01:31:17.460

Reputation: 4 687

I count 96 bytes? Also, the m regexp flag is unnecessary (did you have a $ at one point?) as is the space in (p, c). Finally, I think replace will work out shorter than [...p].map().join. – Neil – 2016-09-13T08:00:39.917

97 for me, both from manual length and userscript, maybe you didn't count the newline, but only because I accidentally included the semicolon – ASCII-only – 2016-09-13T08:39:18.757

I see now - I hadn't copied the ; which isn't required (JavaScript has ASI). – Neil – 2016-09-13T08:41:37.077

Yeah, sorry, I had it to make sure Chromium console puts the function call outside the function body (had it once on a badly written lambda) – ASCII-only – 2016-09-13T08:43:19.097

Oh wow, I didn't realise replace would help so much, that's really neat! – Neil – 2016-09-13T08:47:52.533

Isn't there a g missing at the end of the first regex? – Cedric Reichenbach – 2016-09-14T07:43:51.747

@CedricReichenbach Try it, it works perfectly fine, plus I explain its absence right after the snippet – ASCII-only – 2016-09-14T08:30:09.117

4

Perl, 35 34 32 bytes

Includes +1 for -p

Give input on STDIN

eso.pl

#!/usr/bin/perl -p
y/ /\0/;/.#/;$\|=$`}{$\=~y;\0; 

Notice that there is a space after the final ;. The code works as shown, but replace \0 by the literal character to get the claimed score.
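
As far as I can tell, the one-liner turns spaces into NUL bytes, ORs together the part of each line that precedes the character before `#`, and finally maps NUL back to a space. Because NUL is the identity for bitwise OR, each column ends up holding its single code character. The Python 3 sketch below mirrors that reading; the function name and the padding step are my own.

```python
def or_merge(lines):
    acc = bytearray()
    for line in lines:
        # y/ /\0/ then /.#/ with $`: keep the code section, spaces as NUL.
        code = line[: line.index("#") - 1].replace(" ", "\0").encode("ascii")
        # Grow the accumulator with NULs if this code section is longer.
        acc.extend(bytes(max(0, len(code) - len(acc))))
        # $\ |= $` : bitwise OR, byte by byte.
        for i, byte in enumerate(code):
            acc[i] |= byte
    # Final transliteration: remaining NULs become spaces again.
    return acc.decode("ascii").replace("\0", " ")


lines = [
    "hello         #Line one",
    "              #Line two",
    "       world! #Line three",
]
print(or_merge(lines))  # hello  world!
```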

Ton Hospel

Posted 2016-09-13T01:31:17.460

Reputation: 14 114

Very nice code. That $a|=... is rather well done, it took me a while to figure out what you were doing! One question though : *_=a seems to be roughly equivalent to $_=$a, why is that? – Dada – 2016-09-13T12:45:26.740

*_=a is a very obscure glob assignment which aliases the _ globals and the a globals. So it's not so much a copy from $a to $_ but from that point on (global) $a and $_ are actually the same variable. All to save 1 byte... – Ton Hospel – 2016-09-13T13:27:51.077

Ok, thanks for the explanation! (and nice improvement thanks to $\\) – Dada – 2016-09-13T14:46:26.293

3

Python 2, 187 bytes

def f(x,o=""):
 l=[i[:i.index("#")-1]for i in x]
 for n in range(len(l[0])):
  c=[x[n]for x in l]
  if sum([1for x in c if x!=" "])<1:o+=" "
  else:o+=[x for x in c if x!=" "][0]
 print o

I'm gonna golf this more tomorrow; I have school ;)

Daniel

Posted 2016-09-13T01:31:17.460

Reputation: 6 425

1 for can be reduced to 1for. Also, if the sum of the list (at line 5) can't be negative, you can just check for <1 instead of ==0. Happy school day! :D +1. – Yytsi – 2016-09-13T21:35:25.567

2

J, 30 bytes

(#~[:<./\'#'~:])@(>./&.(3&u:))

Takes a list of strings as input. Basically uses the same approach as Dennis in his Jelly answer.

Commented and explained

ord =: 3 & u:
under =: &.
max =: >./
over =: @
maxes =: max under ord
neq =: ~:
arg =: ]
runningMin =: <./\
magic =: #~ [: runningMin ('#' neq arg)

f =: magic over maxes

Intermediate steps:

   p
Hello                     #Comment 1
      world               #Comment 2
           ,              #Comment 3
             how          #Comment 4
                 are      #Comment 5
                     you? #Comment 6
   maxes p
Hello world, how are you? #Comment 6
   magic
#~ ([: runningMin '#' neq arg)
   3 neq 4
1
   '#' neq '~'
1
   '#' neq '#'
0
   '#' neq maxes p
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1
   runningMin 5 4 2 5 9 0 _3 4 _10
5 4 2 2 2 0 _3 _3 _10
   runningMin '#' neq maxes p
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
   0 1 0 1 1 0 # 'abcdef'
bde
   'abcdef' #~ 0 1 0 1 1 0
bde
   (maxes p) #~ runningMin '#' neq maxes p
Hello world, how are you? 
   (#~ [: runningMin '#' neq arg) maxes p
Hello world, how are you? 
   ((#~ [: runningMin '#' neq arg) over maxes) p
Hello world, how are you? 
   (magic over maxes) p
Hello world, how are you? 
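
The running-minimum mask from the transcript above translates fairly directly to Python 3 with `itertools.accumulate`. This is only a sketch with an invented function name. Like the intermediate J results shown above, it keeps the space that precedes `#`.

```python
from itertools import accumulate


def running_min_filter(lines):
    # maxes: column-wise maximum of the characters.
    merged = [max(col) for col in zip(*lines)]
    # '#' neq arg: 1 where the character differs from '#', else 0.
    mask = [int(ch != "#") for ch in merged]
    # runningMin: once a 0 appears, every later entry stays 0.
    keep = accumulate(mask, min)
    # #~ : copy only the characters whose mask entry is 1.
    return "".join(ch for ch, k in zip(merged, keep) if k)


print(list(accumulate([5, 4, 2, 5, 9, 0, -3, 4, -10], min)))
# [5, 4, 2, 2, 2, 0, -3, -3, -10]
```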

Test case

   f =: (#~[:<./\'#'~:])@(>./&.(3&u:))
   a
Hello                     #Comment 1
      world               #Comment 2
           ,              #Comment 3
             how          #Comment 4
                 are      #Comment 5
                     you? #Comment 6
   $a
6 36
   f a
Hello world, how are you?

Conor O'Brien

Posted 2016-09-13T01:31:17.460

Reputation: 36 228

2

Ruby, 63 bytes

Basically a port of Dennis' Jelly answer. Takes input as an array of strings.

->a{l,=a
l.gsub(/./){a.map{|m|m[$`.size]||$/}.max}[/(.+) #/,1]}

See it on eval.in: https://eval.in/640757

Jordan

Posted 2016-09-13T01:31:17.460

Reputation: 5 001

2

CJam, 12 bytes

Thanks to Sp3000 for saving 2 bytes.

{:.e>_'##(<}

An unnamed block that takes a list of strings (one for each line) and replaces it with a single string.

Try it online!

Explanation

:.e>  e# Reduce the list of strings by elementwise maximum. This keeps non-spaces in
      e# favour of spaces. It'll also wreak havoc with the comments, but we'll discard
      e# those anyway.
_'##  e# Duplicate and find the index of '#'.
(<    e# Decrement that index and truncate the string to this length.

Martin Ender

Posted 2016-09-13T01:31:17.460

Reputation: 184 808

2

JavaScript (ES6), 63 bytes

a=>a.reduce((p,c)=>p+/(.+?)\s+#/.exec(c)[1].slice(p.length),'')

Takes input as an array of strings.
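
A Python 3 sketch of the same regex idea follows; the function name is made up. Note that the approach appears to rely on each line's code lying to the right of everything collected so far, which holds for all the test cases in the question.

```python
import re


def regex_append(lines):
    out = ""
    for line in lines:
        # (.+?)\s+# : lazily capture the text before the whitespace run
        # that leads up to '#'.
        code = re.search(r"(.+?)\s+#", line).group(1)
        # Append only the part that extends past what has been collected.
        out += code[len(out):]
    return out


lines = [
    "hello         #Line one",
    "              #Line two",
    "       world! #Line three",
]
print(regex_append(lines))  # hello  world!
```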

F=a=>a.reduce((p,c)=>p+/(.+?)\s+#/.exec(c)[1].slice(p.length),'')

input.oninput = update;
update();

function update() {
  try {
    output.innerHTML = F(input.value.trim().split`
`);
  } catch(e) {
    output.innerHTML = 'ERROR: INVALID INPUT';
  }
}
textarea {
  width: 100%;
  box-sizing: border-box;
  font-family: monospace;
}
<h2>Input:</h2>
<textarea id="input" rows="8">
a                #Explanation of what 'a' does
 bc              #Bc
   d             #d
    e            #Explanation of e
     fgh         #foobar
        ij       #hello world
          k      #etc.
           l     #so on
            mn   #and
              op #so forth
</textarea>
<hr />
<h2>Output:</h2>
<pre id="output">
</pre>

George Reith

Posted 2016-09-13T01:31:17.460

Reputation: 2 424

1

Retina, 32 bytes

Byte count assumes ISO 8859-1 encoding.

Rmr` #.+|(?<=^(?<-1>.)+).+?¶( )+

Try it online!

Martin Ender

Posted 2016-09-13T01:31:17.460

Reputation: 184 808

1

Pyke, 15 10 bytes

,FSe)s\#ch

Try it here!

Port of the Jelly answer

,          -     transpose()
 FSe)      -    map(max, ^)
     s     -   sum(^)
      \#c  -  ^.split("#")
         h - ^[0]

Blue

Posted 2016-09-13T01:31:17.460

Reputation: 26 661

1

C#, 157 122 bytes

Golfed 35 bytes thanks to @milk -- though I swear I tried that earlier.

Takes input as a 2-d array of characters.

string f(char[][]s){int i=0;foreach(var x in s)for(i=0;x[i]!=35;i++)if(x[i]!=32)s[0][i]=x[i];return new string(s[0],0,i);}

157 bytes:

string g(char[][]s){var o=new char[s[0].Length];foreach(var x in s)for(int i=0;x[i]!=35;i++)if(x[i]!=32|o[i]<1)o[i]=x[i];return new string(o).TrimEnd('\0');}

pinkfloydx33

Posted 2016-09-13T01:31:17.460

Reputation: 308

Shouldn't Trim() work instead of TrimEnd()? Even better, I think you can save a lot of bytes by using s[0] as the output var and using return new string(s[0],0,i) where i is the index of the last code character. That idea may require two for loops instead of the foreach, I'll think about it more and try to write actual code later today. – milk – 2016-09-13T19:43:04.143

Trim() will trim from the start as well, which I believe wouldn't be valid. I also was originally doing the loading into s[0] and I had int i; outside of the loop (to reuse it in the return) which I believe ultimately added bytes – pinkfloydx33 – 2016-09-13T19:48:00.690

1

sed, 126 bytes

:a;N;$!ba;s,#[^\n]*\n,#,g;s,^,#,;:;/#[^ ]/{/^# /s,^# *,,;t;H;s,#.,#,g}
t;/#[^ ]/!{H;s,#.,#,g};t;g;s,\n#(.)[^\n]*,\1,g;s,...$,,

Requires a newline at the end of the input.
I'm sure I can golf this a little more, but I'm just happy it works for now.

Riley

Posted 2016-09-13T01:31:17.460

Reputation: 11 345

1

Pyth, 11 bytes

PhceCSMCQ\#

A program that takes input of a list of strings on STDIN and prints a string.

Try it online

How it works

PhceCSMCQ\#  Program. Input: Q
       CQ    Transpose Q
     SM      Sort each element of that lexicographically
    C        Transpose that
   e         Yield the last element of that, giving the program ending with ' #' and some
             parts of the comments
  c      \#  Split that on the character '#'
 h           Yield the first element of that, giving the program with a trailing space
P            All but the last element of that, removing the trailing space
             Implicitly print
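
The transpose, sort, transpose sequence can be mimicked in Python 3 as follows. This is a sketch with a hypothetical function name: sorting each column puts its largest character last, so the last row after transposing back holds the column maxima.

```python
def pyth_like(lines):
    columns = zip(*lines)                            # CQ : transpose
    sorted_cols = [sorted(col) for col in columns]   # SM : sort each column
    rows = list(zip(*sorted_cols))                   # C  : transpose back
    last = "".join(rows[-1])                         # e  : last row
    return last.split("#")[0][:-1]                   # c \# , h , P


lines = [
    "hello         #Line one",
    "              #Line two",
    "       world! #Line three",
]
print(pyth_like(lines))  # hello  world!
```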

TheBikingViking

Posted 2016-09-13T01:31:17.460

Reputation: 3 674

1

Perl 6, 39 bytes

{[Zmax](@_».comb).join.split(' #')[0]}

Translation of the Python solution by Dennis.
Takes input as a list of strings, and returns a string.

(try it online)

smls

Posted 2016-09-13T01:31:17.460

Reputation: 4 352

0

Jelly, 27 bytes

żḟ€” ;€” Ḣ€
i€”#’©ḣ@"ç/ḣ®ṪṖ

Test it at TryItOnline

Uses the strictest spec - the extra space before the comment character is removed at the cost of a byte.

Input is a list of strings.

Jonathan Allan

Posted 2016-09-13T01:31:17.460

Reputation: 67 804

@Erik the Golfer - maybe so, but did you see the crushing he gave me here? – Jonathan Allan – 2016-09-13T12:45:50.027

0

TSQL, 216 175 bytes

Golfed:

DECLARE @ varchar(max)=
'hello         #Line one
              #Line two
       world! #Line three'

DECLARE @i INT=1,@j INT=0WHILE @i<LEN(@)SELECT @=stuff(@,@j+1,len(x),x),@j=iif(x=char(10),0,@j+1),@i+=1FROM(SELECT ltrim(substring(@,@i,1))x)x PRINT LEFT(@,patindex('%_#%',@))

Ungolfed:

DECLARE @ varchar(max)=
'hello         #Line one
              #Line two
       world! #Line three'

DECLARE @i INT=1,@j INT=0
WHILE @i<LEN(@)
  SELECT @=stuff(@,@j+1,len(x),x),@j=iif(x=char(10),0,@j+1),@i+=1
  FROM(SELECT ltrim(substring(@,@i,1))x)x
PRINT LEFT(@,patindex('%_#%',@))

Fiddle

t-clausen.dk

Posted 2016-09-13T01:31:17.460

Reputation: 2 874

0

Dyalog APL, 22 bytes

Inspiration.

(⎕UCS¯2↓⍳∘35↑⊢)⌈⌿∘⎕UCS

(

⎕UCS character representation of

¯2↓ all but the last two of

⍳∘35↑ up until the position of the first 35 ("#"), in that which is outside the parenthesis, taken from

that which is outside the parenthesis

) namely...

⌈⌿ the columnar maximums

of

⎕UCS the Unicode values

TryAPL online!

Adám

Posted 2016-09-13T01:31:17.460

Reputation: 37 779

How many bytes? – acrolith – 2016-09-14T01:34:06.453

0

Ruby, 77 bytes

puts File.readlines("stack.txt").join('').gsub(/\s{1}#.*\n/,'').gsub(/\s/,'')

Forwarding

Posted 2016-09-13T01:31:17.460

Reputation: 139

Hardcoding an input filename is not an acceptable method of input. – Mego – 2016-09-14T03:36:59.613

@Mego, where can I find the rules of what's "acceptable"? – Forwarding – 2016-09-14T03:53:22.413

http://meta.codegolf.stackexchange.com/q/2447/45941 – Mego – 2016-09-14T03:57:13.197

0

JavaScript, 56 34 bytes, non-competing

q=>q.split(/\n/).map(x=>/ (.?) #./.exec(x)[1]).join()

q=>q.replace(/^ *| *#.*$\n?/gm,'')

As @n̴̖̋h̷͉̃a̷̭̿h̸̡̅ẗ̵̨́d̷̰̀ĥ̷̳ pointed out, I am not prepared for extra spaces

BlackCap

Posted 2016-09-13T01:31:17.460

Reputation: 3 576

Doesn't pass the "Prepare for extra spaces" case – n̴̖̋h̷͉̃a̷̭̿h̸̡̅ẗ̵̨́d̷̰̀ĥ̷̳ – 2016-09-14T09:27:47.617