Code Explanation Formatter



Successful code golf submissions are, by nature, filled with crazy symbols all over the place. To make their submission easier to understand, many code-golfers choose to include an explanation of their code. In their explanation, the line of code is turned into a vertically exploded diagram.

For example, if this were my code:


One of the many possible diagrams I could create would look like this:

   [      ] 
   [3:    ] 
   [  ~   ] 
   [   2@ ] 
   [     +] 

The Goal

In this challenge, you will write an explanation auto-formatting tool which takes a line of code and creates a diagram to which explanatory text can be easily added.

In order to make this a more useful challenge, the user will be able to specify the contents of each line, by providing a formatting string. The formatting string will be a second line, containing only letters A-Za-z, that is the same length as the program. The letters show the order in which the characters of the program should be printed in the explanation.

Here is an example of I/O without any bracket-like formatting:




If more than one character in the program has the same priority level, then that set of characters acts as a single block of code (if they form a group) or a set of brackets (if they contain other characters in-between). The general rules are simple:

  1. Characters do not appear in a line of the diagram until all other characters of greater priority have already appeared on the lines above it in the diagram.

  2. Characters of equal priority are always printed on the same lines. If a certain character appears on a line, all other characters of equal priority appear on that line too.

  3. A set of characters of equal priority continues to appear on each line until all other characters enclosed by it have appeared at least once. This allows for "bracket-like" constructions. If bceab are the priorities, then the b characters will appear on the second line (they are second-highest priority) and will continue to appear until all of the cea characters have appeared. If the priority string is abcadeafga, then all of bcdefg are considered contained within it, and all four a's will continue to appear until the g has appeared.
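The three rules can be condensed into one test per character: a character with priority p is shown on the line for level h when p <= h and some character between the first and last occurrence of p (inclusive) has priority >= h. The Python sketch below is one reading of the rules, not part of the challenge. The name `explain` is made up for illustration, and the sketch assumes that priorities compare by character code (so uppercase letters sort before lowercase ones).

```python
def explain(code: str, fmt: str) -> list[str]:
    """Return the diagram rows for `code` under the priority string `fmt`."""
    if len(code) != len(fmt):
        raise ValueError("code and format string must have the same length")

    # For every priority letter, find the deepest (largest) letter that lies
    # between its first and last occurrence, inclusive.
    deepest = {}
    for p in set(fmt):
        span = fmt[fmt.index(p): fmt.rindex(p) + 1]
        deepest[p] = max(span)

    rows = []
    for level in sorted(set(fmt)):
        # A character is visible from its own level down to the deepest
        # level found inside its span; everything else is padding.
        rows.append("".join(
            ch if p <= level <= deepest[p] else " "
            for ch, p in zip(code, fmt)
        ))
    return rows


for row in explain("1_'[3:~2@+]`", "abbcddeffgch"):
    print(row)
```

Every row has the same length as the input because each position contributes exactly one character, either the code character or a space.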

More formatting requirements

All lines of output should be the same length (the length of the input lines), padded with spaces as necessary. The input program line may contain spaces, although those spaces will also be given a priority letter. Trailing newlines on output/input are optional.


This is code golf, fewest bytes wins.


Here is a commented example of a piece of code with more complex formatting.


1            #highest priority is denoted by the lowercase letter a
 _'          #priority b
   [      ]  #all characters with priority c
   [3:    ]  #priority d, but priority c still printed because it encloses more
   [  ~   ]  #priority e
   [   2@ ]  #priority f
   [     +]  #priority g, last line of c because all enclosed characters have appeared
           ` #priority h

An example in Perl:


      s/          /     /gi;     
      s/[^aeiou\W]/     /gi;     
      s/          /$&o$&/gi;     

Here are a few examples in CJam, courtesy of Martin Büttner:


    {                 }g 
    {_2%              }g 
    {   {   }{  }?    }g 
    {   {3*)}{  }?    }g 
    {   {   }{2/}?    }g 
    {             _p  }g 
    {               _(}g 


 {                          }/
 {_eu                       }/
 {   '[,66>                 }/
 {         "EIOU"-          }/
 {                #         }/
 {                 )g       }/
 {                   {    }*}/
 {                   {'o  }*}/
 {                   {  1$}*}/

Here is a crazy example just to mess with you:


   [ :    ] 
   [3: 2  ] 
   [3:~2 +] 
   [ :~ @+] 
  '        `

Here is a more explicit example of what happens when brackets overlap like abab. (Normally, this is not the way you would choose to format your explanation.)


aa      aa    
aabb    aa  bb
aabbcc  aa  bb
aabb  ddaa  bb
  bb      eebb #"aa" no longer appears because all of "bbccdd" have already appeared.
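To see why the rows come out this way, it can help to list the span of each priority letter. The following Python sketch assumes the priority string for this example is aabbccddaaeebb (the same as the code shown in the diagram); the helper name `spans` is hypothetical.

```python
def spans(fmt: str) -> dict[str, tuple[int, int, str]]:
    """Map each priority letter to (first index, last index, deepest letter enclosed)."""
    result = {}
    for p in sorted(set(fmt)):
        first, last = fmt.index(p), fmt.rindex(p)
        # The deepest letter is the largest one between first and last, inclusive.
        result[p] = (first, last, max(fmt[first:last + 1]))
    return result


for letter, (first, last, deepest) in spans("aabbccddaaeebb").items():
    print(letter, first, last, deepest)
```

Under that assumption the span of a runs from index 0 to 9 and its deepest enclosed letter is d, so the a's stop after the d row. The span of b runs from index 2 to 13 and reaches the e's, so the b's are still printed on the last row.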


Posted 2015-04-19T15:19:58.093




Pyth, 40 bytes (previously 33)


Try it online: Pyth Compiler/Executor


Generated with the string aabbbbbzccdeeegfffqhjiiikkpnmmllloooohec:

                                          implicit: z = first input line
Jw                                        J = second input line
  FHS{J                                   for H in sorted(set(J)):
        .e                             J    map each k,Y of enumerate(J) to:
        .e?                            J      .... if ... else ...
        .e @zk                        dJ      z[k] if ... else " "
        .e @zk gHY                    dJ        H >= Y
        .e @zk&                       dJ        and
        .e @zk     m                 JdJ        map each d of J to:
        .e @zk     m gdH             JdJ          d >= H
        .e @zk     m&                JdJ          and
        .e @zk     m    }d           JdJ          d in ...
        .e @zk     m          xJY    JdJ          index of Y in J
        .e @zk     m        >J       JdJ          substring of J (from index to end)
        .e @zk     m       _         JdJ          reverse substring
        .e @zk     m             x_JYJdJ          index of Y in reversed J
        .e @zk     m      >          JdJ          substring of reversed (from index to end)
        .e @zk    s                   dJ       sum up the booleans (acts as any)
       s                                    sum up the chars and print

So the first input line is z, the second input line is J.

The loop iterates over all chars of J in sorted order and without duplicates. The current char is called H.

Then, for each Y of J, I print the corresponding char of z if both of the following conditions are satisfied, and a space otherwise:

  • Y <= H (a char first appears in the line H)
  • there is a char d >= H, which appears in a block starting and ending with Y (brackets).
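For readers who do not know Pyth, the two conditions can be written out in Python. This is a rough transliteration that keeps the names z, J, H, Y, d and k from the explanation; the function name `pyth_like` is made up, and the block is found with `index`/`rindex` instead of the reversals used in the Pyth code.

```python
def pyth_like(z: str, J: str) -> list[str]:
    lines = []
    for H in sorted(set(J)):                      # FHS{J
        line = ""
        for k, Y in enumerate(J):                 # .e ... J
            # Block that starts at the first Y and ends at the last Y.
            block = J[J.index(Y): J.rindex(Y) + 1]
            # Condition 1: Y <= H.  Condition 2: some d >= H inside the block.
            keep = H >= Y and any(d >= H for d in block)
            line += z[k] if keep else " "
        lines.append(line)
    return lines


print(pyth_like("acab", "acab")[1])
```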


This shows how the fourth line of the input abcdaeb, abcdaeb is printed. The fourth line is a good illustration, since most of the possible cases occur:

code input:  "abcdaeb"
order input: "abcdaeb"

printing the fourth line, H = "d":

   "a" is printed, because "a" <= "d" and ("d" >= "d" and "d" is in "abcda")
   "b" is printed, because "b" <= "d" and ("d" >= "d" and "d" is in "bcdaeb")
   "c" is not printed, because neither "d" nor "e" (the chars >= "d") is in "c"
   "d" is printed, because "d" <= "d" and ("d" >= "d" and "d" is in "d")
   "a" is printed, because "a" <= "d" and ("d" >= "d" and "d" is in "abcda")
   "e" is not printed, because "e" > "d"
   "b" is printed, because "b" <= "d" and ("d" >= "d" and "d" is in "bcdaeb")

therefore the fourth line is: ab_da_b

And another example, based on a test case @Optimizer gave me (which destroyed my 33-byte solution).

code input:  "acab"
order input: "acab"

printing the second line, H = "b":

   "a" is printed, because "a" <= "b" and ("c" >= "b" and "c" is in "aca")
   "c" is not printed, because "c" > "b"
   "a" is printed, because "a" <= "b" and ("c" >= "b" and "c" is in "aca")
   "b" is printed, because "b" <= "b" and ("b" >= "b" and "b" is in "b")

therefore the second line is: a_ab

Old version: 52 bytes (previously 58, then 57)


Try it online: Pyth Compiler/Executor

This creates a mask, which I'll modify before and after printing each line. For more information see the edit history.




CJam, 82 bytes

Pretty long currently and I think I can shave off a few more bytes.



The basic algorithm is as follows:

  • leel:F]z::+ : Group the code, the formatting and the index of each character together
  • F$_&\f{{W=1$=},\;} : Group the above triplets by printing priority, using the formatting string. This code also makes sure that the priorities are sorted.
  • ]_{0f=_W=),\0=>Lf&Ra#)}, : For each priority group of triplets, get the bounding index range and see if any index is not printed yet. If there is an unprinted index, include this priority group into the "to be printed in this step" group.
  • F,S*\:+{~;1$L+:L;t}%oNo~}% : After getting all groups to be printed in this step, fill the code into the correct index of an empty space string and then print that string. Also update the array containing the list of printed indexes.
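The steps above can be sketched in Python as follows. This is not a transliteration of the CJam code: the name `explain_by_groups` is hypothetical, and restricting the candidates at each step to the priority groups seen so far is an assumption about how the loop is meant to work.

```python
def explain_by_groups(code: str, fmt: str) -> list[str]:
    # Group the indices of the code by priority letter, in sorted order.
    groups = [[i for i, p in enumerate(fmt) if p == letter]
              for letter in sorted(set(fmt))]
    printed = set()   # indices that have already appeared on some line
    lines = []
    for step in range(len(groups)):
        # Candidates are the groups seen so far. Keep those whose bounding
        # index range still contains an index that has not been printed.
        chosen = [g for g in groups[:step + 1]
                  if any(i not in printed for i in range(g[0], g[-1] + 1))]
        # Fill the chosen characters into a string of spaces.
        line = [" "] * len(fmt)
        for g in chosen:
            for i in g:
                line[i] = code[i]
                printed.add(i)
        lines.append("".join(line))
    return lines


for line in explain_by_groups("acab", "acab"):
    print(line)
```

The group for the current step is always chosen, because its own indices have not been printed yet.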

A code explanation will follow once I am done golfing this.


Here is the code run on the code itself:




                f{          }                                                     
                f{{     },  }                                                     
                f{{W=   },\;}                                                     
                f{{W=1$ },\;}                                                     
                f{{W=  =},\;}                                                     
                             {                                                }%];
                             {]_                                              }%];
                             {  {                   },                        }%];
                             {  {0f=                },                        }%];
                             {  {   _               },                        }%];
                             {  {    W=),           },                        }%];
                             {  {        \0=        },                        }%];
                             {  {           >       },                        }%];
                             {  {            Lf&    },                        }%];
                             {  {               Ra#)},                        }%];
                             {                        F,S*                    }%];
                             {                            \:+                 }%];
                             {                               {          }%    }%];
                             {                               {~;        }%    }%];
                             {                               {  1$L+:L; }%    }%];
                             {                               {         t}%    }%];
                             {                                            oNo~}%];

Try it online here



oNo can be replaced with n in TIO. – Esolanging Fruit – 2017-04-27T19:04:55.190


CJam, 48 bytes



l                                                Code.
 l                                               Priority.
  :i                                             Convert priority to integer.
    :T                                           Save to T.
      .{                                }        For corresponding items:
      .{___                             }        Copy the current priority 3 times.
      .{   T#                           }        First position with this priority.
      .{     TW%                        }        Reverse T.
      .{        @#                      }        First (last) position with this priority.
      .{          ~T<                   }        Cut T at the end of this priority.
      .{             >                  }        Cut at the beginning of this priority.
      .{              +                 }        Insert the current priority to
                                                 prevent the array being empty.
      .{               :e>              }        Array maximum.
      .{                  )1$-          }        Count of integers between the current
                                                 priority and the maximum, inclusive.
      .{                      @*        }        That number of the current character.
      .{                        123Se]  }        Fill irrelevant priorities with spaces.
      .{                              m>}        Rotate the array to make non-spaces
                                                 starting at the current priority.
                                                 Returns a column containing 123 items.
                                         z       Zip to get the rows from columns.
                                          _|     Remove duplicate rows, including
                                                 unused priorities and all-space rows.
                                            (;   Remove the first row (an all-space row).
                                              N* Insert newlines.
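The column-based idea can be sketched in Python as well. This follows the commented steps above rather than the CJam code itself; the name `explain_by_columns` is made up for illustration, and 123 is simply one more than the character code of z.

```python
def explain_by_columns(code: str, fmt: str) -> list[str]:
    t = [ord(p) for p in fmt]            # priorities as integers
    columns = []
    for ch, p in zip(code, t):
        start = t.index(p)                       # first position with this priority
        end = len(t) - 1 - t[::-1].index(p)      # last position with this priority
        deepest = max(t[start:end + 1])          # maximum priority inside the span
        # The character repeated once per level from p to deepest,
        # padded with spaces to 123 items.
        col = [ch] * (deepest - p + 1)
        col += [" "] * (123 - len(col))
        # Rotate right by p so that the non-space run starts at row p.
        col = col[-p:] + col[:-p]
        columns.append(col)
    # Zip columns into rows, drop duplicate rows (keeping first occurrences),
    # then drop the leading all-space row.
    rows = ["".join(r) for r in zip(*columns)]
    rows = list(dict.fromkeys(rows))
    return rows[1:]


for row in explain_by_columns("acab", "acab"):
    print(row)
```

Deduplication is what removes the rows for unused priority values. In this sketch, a format string that both skips letters and overlaps brackets (for example ababcfb) yields one row more than there are distinct letters, because the row for the unused letter d is not a duplicate of any other row.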




IDL 8.4, 304 bytes (previously 316, then 318)

New version, still too long, but shorter! And, in the true spirit of IDL, it is completely vectorized, which means (since there's no for loop) that I can now do it as one line and run it on itself once I get my version completely upgraded to 8.4. That'll be edited in later.

One line version:

c=(f='')&read,c,f&l=[0:strlen(f)-1]&c=strmid(c,l,1)&s=strmid(f,l,1)&u=s.uniq()&k=value_locate(u,s)&n=[0:max(k)]&d=hash(n,,y,z:max(z[(r=stregex(y,'('+x+'(.*))?'+x,len=w)):r+w-1])),f,k))&print,,l,c,d,i:i.reduce(lambda(x,i,l,c,d,n:x+(d[l[i]]ge n?c[i]:' ')),l,c,d,n)),k,c,d,l)&end

With line breaks (same number of bytes, subbing \n vs &), and commented:

c=(f='') ;initialize code and format as strings
read,c,f ;read two lines of input from the prompt
l=[0:strlen(f)-1] ;index array for the strings
c=strmid(c,l,1) ;split the code string into an array, via substrings of length 1
s=strmid(f,l,1) ;same for the format string, saving f for regex later
u=s.uniq() ;get the sorted unique values in the format string (sorts A->a)
k=value_locate(u,s) ;assign layer values to the format characters
n=[0:max(k)] ;index array for the unique format characters
print,,l,c,d,i:i.reduce(lambda(x,i,l,c,d,n:x+(d[l[i]]ge n?c[i]:' ')),l,c,d,n)),k,c,d,l)
end ;end the script

Here's an algorithmic breakdown for line 9:

r=stregex(y,'('+x+'(.*))?'+x,len=w) ; r, w = starting position & length of substring in y {format string} bracketed by x {character} (inclusive)
z[(r=...):r+w-1] ; grab a slice of z {layer array} from r to r+w-1 -> layer values for each character in the substring
max(z[...]) ; max layer (depth) of any characters in that slice
,y,z:max(...)),f,k) ;map an inline function of the above to every element of the unique-formatting-character array
d=hash(n, ; create a hash using the unique indices, the result is a hash of (character:max_substring_depth)

...and 10:

x+(d[l[i]]ge n?c[i]:' ')) ; ternary concatenation: if maxdepth for this character >= current depth, add the character, otherwise add ' '
i.reduce(lambda(x,i,c,d,l,n:...)),,l,c,d,n) ;accumulate elements of i {code/format index array} by passing them through the inline ternary concatenation function
print,,l,c,d,i:i.reduce(...)),k,c,d,l) ;map the depth index through the reduction, ending up with a string for each depth layer, then print it
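The same two steps can be sketched in Python, using the same regular expression to find the substring bracketed by each format character. This is an illustration of the idea, not a port of the IDL code: the name `explain_regex` is hypothetical, and the lower-bound check `layer[i] <= n` is an assumption added here so that a character does not appear before its own layer.

```python
import re
from functools import reduce


def explain_regex(code: str, fmt: str) -> list[str]:
    unique = sorted(set(fmt))                   # sorted unique format characters
    layer = [unique.index(p) for p in fmt]      # layer value of every position
    depth = {}
    for n, x in enumerate(unique):
        # '(x(.*))?x' matches greedily from the first x to the last x,
        # or just the single x if it occurs only once.
        m = re.search("(" + re.escape(x) + "(.*))?" + re.escape(x), fmt)
        depth[n] = max(layer[m.start():m.end()])   # max layer inside the slice
    rows = []
    for n in range(len(unique)):
        # Accumulate one row by concatenating a character or a space per index.
        rows.append(reduce(
            lambda acc, i: acc + (code[i] if layer[i] <= n <= depth[layer[i]] else " "),
            range(len(fmt)),
            "",
        ))
    return rows


for row in explain_regex("acab", "acab"):
    print(row)
```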

Lines 9 and 10 do the real work; the rest sets up the variables they need. I think this is about as golfed as it's going to get. I can't find anywhere else to shorten it.

Old version (everything below here is out of date):

This is nowhere near short enough to win, because this is a terrible golfing language, but no one ever answers in IDL so I'm just gonna go for it.

for i=0,max(f)do begin
print,f.reduce(lambda(x,y,z:x+(s.haskey(y)?z[y]:' '),s,a)
s=s.filter(lambda(x,y:x[1]gt y),i)

I'm not sure if there's any way I can cut it down more... I could call strmid on both a and b at the same time, but then I spend more bytes indexing d and it works out the same. I'll keep working on it, though! (And tomorrow I'll edit in an explanation of the algorithm.)

