Write a ~ATH Interpreter


The popular webcomic Homestuck makes use of a programming language called ~ATH to destroy universes. While this code golf challenge is not to write a program to annihilate our existence, we will be destroying some more tame (albeit less interesting) entities: variables.

~ATH (pronounced "til death," notice how ~ath is "tilde ath") works by creating a variable called THIS, executing a command with EXECUTE, and finishing the program with THIS.DIE(). A wiki page for the language's use in Homestuck can be found here. The goal of this challenge will be to create a ~ATH interpreter.

For the sake of the challenge, I'm going to create some details of ~ATH that don't really exist but make it (somewhat) useful.

  • The language will only work with integers, which are declared with import <variable name>;. The variable will automatically be set to a value of 0. Only one variable at a time can be imported.
  • A variable x can be copied by writing bifurcate x[y,z];, which will delete the variable x and replace it with identical variables y and z. Note that it cannot create a variable with the same name as the one deleted. Essentially, a variable is renamed, then a copy of the variable with a different name is created. This seems like a stupid feature, but stupidity is very deeply ingrained in Homestuck.
  • The syntax for writing a program that executes code on x is ~ATH(x){EXECUTE(<code>)}. If you want to execute code on two variables simultaneously, the code becomes nested, like this: ~ATH(x){~ATH(y){EXECUTE(<code>)}}. All commands in <code> will be executed on both x and y.
  • Now let's move on to commands. + increments the relevant variable(s) by 1 and - decrements them by 1. And... that's it.
  • The final feature of ~ATH is that it kills whatever it works with. Variables are printed in the format <name>=<value> (followed by a newline) at the command [<name>].DIE();. Afterwards, the program prints the word DIE <name> and a newline a number of times equal to the absolute value of the value of the variable. When variables are killed simultaneously with [<name1>,<name2>].DIE(); (you can have as many variables killed as you want, so long as they exist), the DIE() command is executed on the variables sequentially.
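The rules above can be read as a small state machine: a table of integer variables, a stack of the ~ATH blocks currently open, and an output log. Below is a minimal, ungolfed Python 3 sketch of that reading. It is not a reference implementation from the challenge. The names run_ath and TOKEN are made up for illustration, and it assumes that malformed input and unknown variables may simply raise a Python error, since the rules do not say what should happen.

```python
import re

# One alternative per statement form in the challenge's dialect of ~ATH.
# (Hypothetical helper: the pattern and group names are illustrative only.)
TOKEN = re.compile(r"""
    import\s+(?P<imp>\w+)\s*;
  | bifurcate\s+(?P<src>\w+)\s*\[\s*(?P<left>\w+)\s*,\s*(?P<right>\w+)\s*\]\s*;
  | ~ATH\(\s*(?P<ath>\w+)\s*\)\s*\{
  | EXECUTE\((?P<ops>[+-]*)\)
  | (?P<close>\})
  | \[(?P<die>[\w\s,]+)\]\s*\.DIE\(\)\s*;
""", re.VERBOSE)


def run_ath(source):
    """Interpret a ~ATH program and return everything it prints as one string."""
    source = re.sub(r"//[^\n]*", "", source)   # drop // comments
    variables = {}                             # name -> integer value
    scope = []                                 # names of the ~ATH blocks currently open
    out = []
    for m in TOKEN.finditer(source):
        if m.group("imp"):
            variables[m.group("imp")] = 0
        elif m.group("src"):
            # bifurcate: delete the source, create two copies of its value.
            value = variables.pop(m.group("src"))
            variables[m.group("left")] = value
            variables[m.group("right")] = value
        elif m.group("ath"):
            scope.append(m.group("ath"))
        elif m.group("ops") is not None:
            # EXECUTE applies to every variable whose ~ATH block encloses it.
            ops = m.group("ops")
            delta = ops.count("+") - ops.count("-")
            for name in scope:
                variables[name] += delta
        elif m.group("close"):
            scope.pop()
        elif m.group("die"):
            # Variables die one after another, in the order listed.
            for name in (n.strip() for n in m.group("die").split(",")):
                value = variables.pop(name)
                out.append("%s=%d" % (name, value))
                out.extend(["DIE " + name] * abs(value))
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    demo = "import x;\n~ATH(x){EXECUTE(+-++--+--)}\n[x].DIE();\n"
    print(run_ath(demo), end="")
```

Keeping the open ~ATH names on a stack is what makes nested blocks work: an EXECUTE inside ~ATH(x){~ATH(y){...}} sees both names, while one placed after the inner closing brace sees only x.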

Example programs

Program 1:

import sollux;                  //calls variable "sollux"
import eridan;                  //calls variable "eridan"
~ATH(sollux){EXECUTE(--)}       //sets the value of "sollux" to -2
~ATH(eridan){EXECUTE(+++++)}    //sets the value of "eridan" to 5
[sollux].DIE();                 //kills "sollux", prints "DIE sollux" twice
~ATH(eridan){EXECUTE(+)}        //sets the value of "eridan" to 6
[eridan].DIE();                 //kills "eridan", prints "DIE eridan" 6 times

Output:

sollux=-2
DIE sollux
DIE sollux
eridan=6
DIE eridan
DIE eridan
DIE eridan
DIE eridan
DIE eridan
DIE eridan

Program 2:

import THIS;                    //calls variable "THIS"
~ATH(THIS){EXECUTE(++++)}       //sets the value of "THIS" to 4
bifurcate THIS[THIS1,THIS2];    //deletes "THIS", creates variables "THIS1" and "THIS2" both equal to 4
~ATH(THIS1){EXECUTE(++)}        //sets the value of "THIS1" to 6
[THIS1,THIS2].DIE();            //kills "THIS1" and "THIS2", prints "DIE THIS1" 6 times then "DIE THIS2" 4 times

import THAT;                                         //calls variable "THAT"
bifurcate THAT[THESE,THOSE];                         //deletes "THAT", creates variables "THESE" and "THOSE"
~ATH(THESE){~ATH(THOSE){EXECUTE(+++)}EXECUTE(++)}    //sets the value of "THESE" and "THOSE" to 3, then sets the value of "THESE" to 5
[THESE,THOSE].DIE();                                 //kills "THESE" and "THOSE", prints "DIE THESE" 5 times then "DIE THOSE" 3 times

Output:

THIS1=6
DIE THIS1
DIE THIS1
DIE THIS1
DIE THIS1
DIE THIS1
DIE THIS1
THIS2=4
DIE THIS2
DIE THIS2
DIE THIS2
DIE THIS2
THESE=5
DIE THESE
DIE THESE
DIE THESE
DIE THESE
DIE THESE
THOSE=3
DIE THOSE
DIE THOSE
DIE THOSE

This is code golf, so standard rules apply. Shortest code in bytes wins.

Arcturus

Posted 2015-11-22T02:39:01.923

Reputation: 6 537

Til death. I see what you did there. – Digital Trauma – 2015-11-22T03:03:17.503

@DigitalTrauma I gotta pass the credit to Andrew Hussie (the guy who writes Homestuck) for coming up with the name. – Arcturus – 2015-11-22T03:11:28.900

@ the downvoters, would you mind sharing what you think could be improved? – Arcturus – 2015-11-22T04:30:38.850

Should the interpreter be pedantic about semicolons on line-endings? or can it just ignore their existence or lack thereof? – cat – 2015-11-22T16:09:29.993

Also, does it have to function like a REPL or can we read from a file? – cat – 2015-11-22T16:10:02.540

Is case sensitivity important everywhere or just in output? – cat – 2015-11-22T16:14:36.667

@sysreq ~ATH uses semicolons as line-endings for the import, bifurcate, and DIE commands. Both REPL and files are fine. Case sensitivity is required in both the input and the output (I'm trying to match the actual ~ATH as much as possible). – Arcturus – 2015-11-22T16:14:39.037

Can we assume that EXECUTE statements like {EXECUTE(+-++--+--)} are not valid syntax and should be ignored? – cat – 2015-11-22T18:15:23.013

@sysreq Without ~ATH(<name>), those statements are not valid syntax. However, with ~ATH(<name>), that statement would decrement the variable by 1. – Arcturus – 2015-11-22T19:51:19.883

the last line in program 2 doesn't end in a semicolon, but it should. I just spent 15 minutes refactoring my parser to figure out why it was recording values incorrectly, but it was a missing semicolon. I never thought I'd fall for one of those :-( – cat – 2015-11-22T23:07:24.563

Also, the wiki's description of the language is wildly different (from a bare syntax perspective) from what's described here; I assume it's o.k. if I use strictly your spec? – cat – 2015-11-22T23:11:00.807

@sysreq I had to change a few things so the language would actually do something in real life; the specs I described are fine. – Arcturus – 2015-11-23T01:45:24.657

I'm honestly surprised this question hasn't gotten more responses, and even more surprised there's no horde of Perl wizards armed with regexy wands of magic – cat – 2015-11-23T01:48:09.773

Maybe it's because the final product is rather large or that after all, this is an interpreter for a fake language that doesn't exist in the human world. – Arcturus – 2015-11-23T01:50:06.047

That means I maintain the only interpreter for a fictional language from a fictional universe with a large nerd fanbase. Makes me want to improve it. – cat – 2015-11-23T13:08:58.910

Answers

3

Python 2.7.6, 1244 1308 1265 1253 1073 1072 1071 1065 1064 1063 bytes

Alright, I'm not breaking any records here but this is about the smallest Python will go insofar as reading input all at once from a file rather than sequentially over time. I'll try to one-up this later in a different language (and an interpreter, not just a parser). Until then, enjoy the disgustingly horrid monstrosity.

Note: opens a file called t in the working directory. To make it read the file named by a command-line argument instead, add import sys to the top of the file and change 't' to sys.argv[1].

n=s='\n';m=',';X='[';Y=']';c=';';A='~ATH';D='import';b,g,k=[],[],[];r=range;l=len;f=open('t','r').read().split(n)
def d(j,u):
 p=[]
 for e in j:
  if e!=u:p.append(e)
 return''.join(p)
for h in r(l(f)):f[h]=f[h].split('//')[0].split()
while[]in f:f.remove([])
for h in r(l(f)):
 i=f[h]
 if i[0]==D and l(i)==2and i[1][l(i[1])-1]==c and d(i[1],c)not in b:g.append(0);b.append(d(i[1],c))
 elif i[0].startswith(A):
  i=i[0].split('){')
  for e in r(l(i)):
   if i[e].startswith(A):
    i[e]=i[e].split('(')
    if i[0][1]in b:g[b.index(i[0][1])]+=(i[1].count('+')-i[1].count('-'))
 elif i[0].startswith('bifurcate')and l(i)==2and i[1][l(i[1])-1]==c:
  i=i[1].split(X)
  if i[0] in b:
   z=d(d(i[1],c),Y).split(m)
   for e in r(l(z)):g.append(g[b.index(i[0])]);b.append(z[e])
   g.remove(g[b.index(i[0])]);b.remove(i[0])
 elif i[0].startswith(X)and i[0].endswith('.DIE();')and l(i)==1:
  z=d(i[0],X).split(Y)[0].split(m)
  for e in r(l(z)):
   k.append((z[e],g[b.index(z[e])]))
for e in r(l(k)):k0=k[e][0];k1=k[e][1];s+=k0+'='+str(k1)+n+('DIE '+k0+n)*abs(k1)
print s

cat

Posted 2015-11-22T02:39:01.923

Reputation: 4 989

2

Python 2, 447 475 463 443 bytes

exec("eNp1UUtrAjEQvu+vCEshiYnrxl7KbqOUVmjvCoUkxUdiG7BRkpW2iP3tTVwrReppMsx8r4l936x9A8JXoN5kmu/2WeCxK0KjrSu8mWmEs0Ad96YI27lDPu/1is7wKqcQ0kBLenM+ty0nilu4zqnPtYCSQcXL2P2LmNvl1i9mjWlBUhwKbRt14uhHjlSvjzVy1tqswO/7AjsSpKtwIpGvt2zALqyNnkf3k/FIolb2ACjlpe2jR6lk8fAUQbKNulx7YIF1IDkqwmZlGwQpxNXGW9cASyCHZKqFVVOCoJQOEhjxABKLO7N5QGmET5qOs/Qfoqq6TGUfb3ZlgKvOnOxTwJKpDq6HSLzsVfK1k7g1iB7Hd9/JWh3T9wclkYwTlY4odP0nnvk0C3RUwj95/ZUq".decode('base64').decode('zip'))

It turns out that zipping and base64-encoding the program still saves bytes over the normal version. For comparison, here's the normal one:

import sys,re
d={}
s=sys.stdin.read()
s,n=re.subn(r"//.*?$",'',s,0,8)
s,n=re.subn(r"import (.*?);",r"d['\1']=0;",s,0,8)
s,n=re.subn(r"bifurcate (.*?)\[(.*?),(.*?)\];",r"d['\2']=d['\3']=d['\1'];del d['\1'];",s,0,8)
s,n=re.subn(r"([+-])",r"\g<1>1",s,0,8)
s,n=re.subn(r"EXECUTE\((.*?)\)",r"0\1",s,0,8)
s,n=re.subn(r"\[(.*?)\]\.DIE\(\);",r"for i in '\1'.split(','):print i+'='+`d[i]`+('\\n'+'DIE '+i)*abs(d[i])",s,0,8)
n=1
s=s[::-1]
while n:s,n=re.subn(r"\}([+-01]*);?([^}]*?)\{\)(.*?)\(HTA~",r";\g<2>0+\1=+]'\3'[d;\1",s,0,8)
exec(s[::-1])

Basically the "regexy wands of magic" solution that was desired. Reads in the entire program from stdin as a single string, replaces ~ATH expressions with Python expressions that do the described semantics, and exec()s the resulting string.

To see what it's doing, look at the Python program that the second provided test program gets translated to:

d['THIS']=0;                    
0+1+1+1+1;d['THIS']+=0+1+1+1+1+0;       
d['THIS1']=d['THIS2']=d['THIS'];del d['THIS'];    
0+1+1;d['THIS1']+=0+1+1+0;        
for i in 'THIS1,THIS2'.split(','):print i+'='+`d[i]`+('\n'+'DIE '+i)*abs(d[i])            

d['THAT']=0;                                         
d['THESE']=d['THOSE']=d['THAT'];del d['THAT'];                         
0+1+1;d['THESE']+=0+1+1+00+1+1+1;d['THOSE']+=0+1+1+1+0;    
for i in 'THESE,THOSE'.split(','):print i+'='+`d[i]`+('\n'+'DIE '+i)*abs(d[i])                                 

It's a good thing that 00 == 0 :P
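The translate-then-exec idea can be sketched in Python 3 for the easy cases. This is a simplified illustration, not the golfed answer: it swaps the reversed-string regex loop for a plain callback that handles only non-nested ~ATH(x){EXECUTE(...)} blocks, it collects output in a list instead of printing, and it assumes one statement per line with no indentation, as in the example programs. The names translate, run and _ath_block are hypothetical.

```python
import re


def _ath_block(match):
    """Turn one non-nested ~ATH(name){EXECUTE(ops)} block into 'd[name]+=delta;'."""
    name, ops = match.group(1), match.group(2)
    return "d['%s']+=%d;" % (name, ops.count("+") - ops.count("-"))


def translate(source):
    """Rewrite a ~ATH program (without nested blocks) as Python source text."""
    s = re.sub(r"//.*?$", "", source, flags=re.MULTILINE)   # strip comments
    s = re.sub(r"import (\w+);", r"d['\1']=0;", s)
    s = re.sub(r"bifurcate (\w+)\[(\w+),(\w+)\];",
               r"d['\2']=d['\3']=d['\1'];del d['\1'];", s)
    s = re.sub(r"~ATH\((\w+)\)\{EXECUTE\(([+-]*)\)\}", _ath_block, s)
    # The name line plus abs(value) "DIE name" lines, built by string multiplication.
    s = re.sub(r"\[([\w,]+)\]\.DIE\(\);",
               r"for i in '\1'.split(','):"
               r"out.append(i+'='+str(d[i])+('\\n'+'DIE '+i)*abs(d[i]))", s)
    return s


def run(source):
    """Translate, exec the generated Python, and return the collected output."""
    d, out = {}, []
    exec(translate(source), {"d": d, "out": out})
    return "\n".join(out)


if __name__ == "__main__":
    program = "import THIS;\n~ATH(THIS){EXECUTE(++++)}\n[THIS].DIE();\n"
    print(translate(program))
    print(run(program))
```

Because the del is kept in the bifurcate translation, killing a variable that has already been bifurcated raises a KeyError when the generated code runs, which matches the "there should be an error" reading of the rules.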

Obviously, a few bytes could be saved by exploiting ambiguity in the rules. For instance, it isn't said what should happen in the event someone tries to DIE() a variable that hasn't been imported, or that has already been bifurcated. My guess based on the description was that there should be an error. If no error is required, I could remove the del statement.

EDIT: Fixed a bug that the provided test cases didn't test for. Namely, the way it was, every ~ATH block reset the variable to zero before incrementing it. It cost me 28 bytes to fix that. If anyone sees a better way to replace ~ATH blocks, I'd love to know it.

EDIT 2: Saved 12 bytes by unrolling the regex loop, making them all subns and letting the compression take care of the repetition.
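For readers unfamiliar with the packing trick, here is a small Python 3 sketch of the same idea. As far as I know, Python 2's .decode('base64').decode('zip') corresponds to a base64 decode followed by a zlib decompress, so the standard base64 and zlib modules are used here. The function names pack, unpack and exec_stub are made up for illustration.

```python
import base64
import zlib


def pack(source):
    """Compress program text with zlib, then base64-encode it to printable ASCII."""
    compressed = zlib.compress(source.encode("utf-8"), 9)
    return base64.b64encode(compressed).decode("ascii")


def unpack(blob):
    """Inverse of pack(): base64-decode, then decompress back to program text."""
    return zlib.decompress(base64.b64decode(blob)).decode("utf-8")


def exec_stub(source):
    """Build a one-line Python 3 program that unpacks and runs `source`."""
    return ("import base64,zlib;"
            "exec(zlib.decompress(base64.b64decode(%r)).decode())" % pack(source))


if __name__ == "__main__":
    # Highly repetitive text, like a run of near-identical re.subn lines,
    # is where the compression pays for the base64 overhead.
    repetitive = "s,n=re.subn(r'a','b',s,0,8)\n" * 10
    print(len(repetitive), len(pack(repetitive)))
```

Base64 adds roughly a third to the compressed size and the decoding stub has a fixed cost, so the trick only pays off once the source is long and repetitive enough, which is why unrolling the loop into repeated subn lines can still come out ahead.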

EDIT 3: Saved 20 more bytes by replacing the inner for loop with a string multiplication.

quintopia

Posted 2015-11-22T02:39:01.923

Reputation: 3 899

Hey, finally the regexy wands of magic! I won't be able to beat this but well done! – cat – 2015-12-01T14:05:36.153

My implementation completely ignores things not explicitly covered by the rules, which means it's okay for you to not throw an error and just ignore those cases too. – cat – 2015-12-01T14:06:47.217

you could save some bytes by doing import sys,re rather than import sys;import re – cat – 2015-12-01T14:11:30.403

Also, how do you actually use this? You read from sys.stdin but as far as I can tell, termination/execution/printing never occurs because it never stops reading from stdin. (Testing on IDLE, 2.7.6) – cat – 2015-12-01T14:12:39.107

syntax highlighting makes this a lot easier to read – cat – 2015-12-01T14:18:25.837

@cat sorry i forgot to answer you ever so long ago. you run it from the command line and pipe the input in from a file: python ~ath.py < program.~ath – quintopia – 2016-01-12T05:40:52.057