I present to you, the first 3% of a Hexagony self-interpreter...
|./...\..._..>}{<$}=<;>'<..../;<_'\{*46\..8._~;/;{{;<..|M..'{.>{{=.<.).|.."~....._.>(=</.\=\'$/}{<}.\../>../..._>../_....@/{$|....>...</..~\.>,<$/'";{}({/>-'(<\=&\><${~-"~<$)<....'.>=&'*){=&')&}\'\'2"'23}}_}&<_3.>.'*)'-<>{=/{\*={(&)'){\$<....={\>}}}\&32'-<=._.)}=)+'_+'&<
Try it online! You can also run it on itself, but it will take about 5-10 seconds.
In principle this might fit into side-length 9 (for a score of 217 or less), because this uses only 201 commands, and the ungolfed version I wrote first (on side-length 30) needed only 178 commands. However, I'm pretty sure it would take forever to actually make everything fit, so I'm not sure whether I'll actually attempt it.
It should also be possible to golf this a bit in size 10 by avoiding the use of the last one or two rows, such that the trailing no-ops can be omitted, but that would require a substantial rewrite, as one of the first path joins makes use of the bottom left corner.
Explanation
Let's start by unfolding the code and annotating the control flow paths:
That's still quite messy, so here is the same diagram for the "ungolfed" code which I wrote first (in fact, this is side-length 20 and originally I wrote the code on side-length 30 but that was so sparse that it wouldn't improve the readability at all, so I compacted it just a little bit to make the size a bit more reasonable):
Click for larger version.
The colours are exactly the same apart from a few very minor details, the non-control-flow commands are also exactly the same. So I'll be explaining how this works based on the ungolfed version, and if you really want to know how the golfed one works, you can check which parts there correspond to which in the larger hexagon. (The only catch is that the golfed code starts with a mirror so that the actual code begins in the right corner going left.)
The basic algorithm is almost identical to my CJam answer. There are two differences:
- Instead of solving the centred hexagonal number equation, I just compute consecutive centred hexagonal numbers until one is equal to or larger than the length of the input. This is because Hexagony does not have a simple way to compute a square root.
- Instead of padding the input with no-ops right away, I check later if I've already exhausted the commands in the input and print a
.
instead if I have.
That means the basic idea boils down to:
- Read and store input string while computing its length.
- Find the smallest side-length
N
(and corresponding centred hexagonal number hex(N)
) which can hold the entire input.
- Compute the diameter
2N-1
.
- For each line, compute the indent and the number of cells (which sum to
2N-1
). Print the indent, print the cells (using .
if the input is already exhausted), print a linefeed.
Note that there are only no-ops so the actual code starts in the left corner (the $
, which jumps over the >
, so the we really start on the ,
in the dark grey path).
Here is the initial memory grid:
So the memory pointer starts out on edge labelled input, pointing North. ,
reads a byte from STDIN or a -1
if we've hit EOF into that edge. Hence, the <
right after is a conditional for whether we've read all the input. Let's remain in the input loop for now. The next code we execute is
{&32'-
This writes a 32 into the edge labelled space, and then subtracts it from the input value in the edge labelled diff. Note that this can never be negative because we're guaranteed that the input contains only printable ASCII. It will be zero when the input was a space. (As Timwi points out, this would still work if the input could contain linefeeds or tabs, but it would also strip out all other unprintable characters with character codes less than 32.) In that case, the <
deflects the instruction pointer (IP) left and the light grey path is taken. That path simply resets the position of the MP with {=
and then reads the next character - thus, spaces are skipped. Otherwise, if the character was not a space, we execute
=}}})&'+'+)=}
This first moves around the hexagon through the length edge until its opposite the diff edge, with =}}}
. Then it copies the value from opposite the length edge into the length edge, and increments it with )&'+'+)
. We'll see in a second why this makes sense. Finally, we move the a new edge with =}
:
(The particular edge values are from the last test case given in the challenge.) At this point, the loop repeats, but with everything shifted one hexagon northeast. So after reading another character, we get this:
Now you can see that we're gradually writing the input (minus spaces) along the northeast diagonal, with the characters on every other edge, and the length up to that character being stored parallel to the edge labelled length.
When we're done with the input loop, memory will look like this (where I've already labelled a few new edges for the next part):
The %
is the last character we read, the 29
is the number of non-space characters we read. Now we want to find the side-length of the hexagon. First, there is some linear initialisation code in the dark green/grey path:
=&''3{
Here, =&
copies the length (29 in our example) into the edge labelled length. Then ''3
moves to the edge labelled 3 and sets its value to 3
(which we just need as a constant in the computation). Finally {
moves to the edge labelled N(N-1).
Now we enter the blue loop. This loop increments N
(stored in the cell labelled N) then computes its centered hexagonal number and subtracts it from the input length. The linear code which does that is:
{)')&({=*'*)'-
Here, {)
moves to and increments N. ')&(
moves to edge labelled N-1, copies N
there and decrements it. {=*
computes their product in N(N-1). '*)
multiplies that by the constant 3
and increments the result in the edge labelled hex(N). As expected, this is the Nth centered hexagonal number. Finally '-
computes the difference between that and the input length. If the result is positive, the side-length is not large enough yet, and the loop repeats (where }}
move the MP back to the edge labelled N(N-1)).
Once the side-length is large enough, the difference will be zero or negative and we get this:
First off, there is now the really long linear green path which does some necessary initialisation for the output loop:
{=&}}}32'"2'=&'*){=&')&}}
The {=&
starts by copying the result in the diff edge into the length edge, because we later need something non-positive there. }}}32
writes a 32 into the edge labelled space. '"2
writes a constant 2 into the unlabelled edge above diff. '=&
copies N-1
into the second edge with the same label. '*)
multiplies it by 2 and increments it so that we get the correct value in the edge labelled 2N-1 at the top. This is the diameter of the hexagon. {=&')&
copies the diameter into the other edge labelled 2N-1. Finally }}
moves back to the edge labelled 2N-1 at the top.
Let's relabel the edges:
The edge we're currently on (which still holds the diameter of the hexagon) will be used to iterate over the lines of the output. The edge labelled indent will compute how many spaces are needed on the current line. The edge labelled cells will be used to iterate over the number of cells in the current line.
We're now on the pink path which computes indent. ('-
decrements the lines iterator and subtracts it from N-1 (into the indent edge). The short blue/grey branch in the code simply computes the modulus of the result (~
negates the value if it's negative or zero, and nothing happens if it's positive). The rest of the pink path is "-~{
which subtracts the indent from the diameter into the cells edge and then moves back to the indent edge.
The dirty yellow path now prints the indentation. The loop contents are really just
'";{}(
Where '"
moves to the space edge, ;
prints it, {}
moves back to indent and (
decrements it.
When we're done with that the (second) dark grey path searches for next character to print. The =}
moves in position (which means, onto the cells edge, pointing South). Then we have a very tight loop of {}
which simply moves down two edges in the South-West direction, until we hit the end of the stored string:
Notice that I've relabelled one edge there EOF?. Once we've processed this character, we'll make that edge negative, so that the {}
loop will terminate here instead of the next iteration:
In the code, we're at the end of the dark grey path, where '
moves back one step onto the input character. If the situation is one of the last two diagrams (i.e. there's still a character from the input we haven't printed yet), then we're taking the green path (the bottom one, for people who aren't good with green and blue). That one is fairly simple: ;
prints the character itself. '
moves to the corresponding space edge which still holds a 32 from earlier and ;
prints that space. Then {~
makes our EOF? negative for the next iteration, '
moves a back a step so that we can return to the North-West end of the string with another tight }{
loop. Which ends on the length cell (the non-positive one below hex(N). Finally }
moves back to the cells edge.
If we've already exhausted the input though, then the loop which searches for EOF? will actually terminate here:
In that case '
moves onto the length cell, and we're taking the light blue (top) path instead, which prints a no-op. The code in this branch is linear:
{*46;{{;{{=
The {*46;
writes a 46 into the edge labelled no-op and prints it (i.e. a period). Then {{;
moves to the space edge and prints that. The {{=
moves back to the cells edge for the next iteration.
At this point the paths join back together and (
decrements the cells edge. If the iterator is not zero yet, we will take the light grey path, which simply reverses the MP's direction with =
and then goes looking for the next character to print.
Otherwise, we've reached the end of the current line, and the IP will take the purple path instead. This is what the memory grid looks like at that point:
The purple path contains this:
=M8;~'"=
The =
reverses the direction of the MP again. M8
sets the sets its value to 778
(because the character code of M
is 77
and digits will append themselves to the current value). This happens to be 10 (mod 256)
, so when we print it with ;
, we get a linefeed. Then ~
makes the edge negative again, '"
moves back to the lines edge and =
reverses the MP once more.
Now if the lines edge is zero, we're done. The IP will take the (very short) red path, where @
terminates the program. Otherwise, we continue on the purple path which loops back into the pink one, to print another line.
Control flow diagrams created with Timwi's HexagonyColorer. Memory diagrams created with the visual debugger in his Esoteric IDE.
6
Also, I'm not sure whether you want to be that picky, but backticks are ignored in the process of determining the code width because they annotate the next character. So
– Martin Ender – 2015-12-15T20:01:05.353abc\
defg` would actually become http://pastebin.com/ZrdJmHiR2@MartinBüttner Oh, I didn't know that :). For this challenge, backticks will not be ignored. – Adnan – 2015-12-15T20:04:02.240
18I really want to see an answer in Hexagony for this question. – Arcturus – 2015-12-15T21:20:27.827
2@Adnan Probably a better response would be "You can assume that the input contains no debug flags (``` characters)." – Riking – 2015-12-16T01:03:05.350
4@Ampora Ask and you shall receive. – Martin Ender – 2015-12-17T22:46:18.863
@MartinBüttner Nice work! I was about 24 hours away from offering a bounty to someone who posted a Hexagony answer too. – Arcturus – 2015-12-18T07:55:18.323
@Ampora I guess I should have waited 24 hours then. ;) – Martin Ender – 2015-12-18T08:00:58.437
1@MartinBüttner I apologize, I know that for you, reputation is really hard to come by. :P – Arcturus – 2015-12-18T08:03:18.660