Tiny Lisp, tiny interpreter

33

9

Lisp programmers boast that Lisp is a powerful language which can be built up from a very small set of primitive operations. Let's put that idea into practice by golfing an interpreter for a dialect called tinylisp.

Language specification

In this specification, any condition whose result is described as "undefined" may do anything in your interpreter: crash, fail silently, produce random gobbldegook, or work as expected. A reference implementation in Python 3 is available here.

Syntax

Tokens in tinylisp are (, ), or any string of one or more printable ASCII characters except parentheses or space. (I.e. the following regex: [()]|[^() ]+.) Any token that consists entirely of digits is an integer literal. (Leading zeros are okay.) Any token that contains non-digits is a symbol, even numeric-looking examples like 123abc, 3.14, and -10. All whitespace (including, at a minimum, ASCII characters 32 and 10) is ignored, except insofar as it separates tokens.

A tinylisp program consists of a series of expressions. Each expression is either an integer, a symbol, or an s-expression (list). Lists consist of zero or more expressions wrapped in parentheses. No separator is used between items. Here are examples of expressions:

4
tinylisp!!
()
(c b a)
(q ((1 2)(3 4)))

Expressions that are not well-formed (in particular, that have unmatched parentheses) give undefined behavior. (The reference implementation auto-closes open parens and stops parsing on unmatched close parens.)

Data types

The data types of tinylisp are integers, symbols, and lists. Built-in functions and macros can also be considered a type, though their output format is undefined. A list can contain any number of values of any type and can be nested arbitrarily deeply. Integers must be supported at least from -2^31 to 2^31-1.

The empty list ()--also referred to as nil--and the integer 0 are the only values that are considered logically false; all other integers, nonempty lists, builtins, and all symbols are logically true.

Evaluation

Expressions in a program are evaluated in order and the results of each sent to stdout (more on output formatting later).

  • An integer literal evaluates to itself.
  • The empty list () evaluates to itself.
  • A list of one or more items evaluates its first item and treats it as a function or macro, calling it with the remaining items as arguments. If the item is not a function/macro, the behavior is undefined.
  • A symbol evaluates as a name, giving the value bound to that name in the current function. If the name is not defined in the current function, it evaluates to the value bound to it at global scope. If the name is not defined at current or global scope, the result is undefined (reference implementation gives an error message and returns nil).

Built-in functions and macros

There are seven built-in functions in tinylisp. A function evaluates each of its arguments before applying some operation to them and returning the result.

  • c - cons[truct list]. Takes two arguments, a value and a list, and returns a new list obtained by adding the value at the front of the list.
  • h - head (car, in Lisp terminology). Takes a list and returns the first item in it, or nil if given nil.
  • t - tail (cdr, in Lisp terminology). Takes a list and returns a new list containing all but the first item, or nil if given nil.
  • s - subtract. Takes two integers and returns the first minus the second.
  • l - less than. Takes two integers; returns 1 if the first is less than the second, 0 otherwise.
  • e - equal. Takes two values of the same type (both integers, both lists, or both symbols); returns 1 if the two are equal (or identical in every element), 0 otherwise. Testing builtins for equality is undefined (reference implementation works as expected).
  • v - eval. Takes one list, integer, or symbol, representing an expression, and evaluates it. E.g. doing (v (q (c a b))) is the same as doing (c a b); (v 1) gives 1.

"Value" here includes any list, integer, symbol, or builtin, unless otherwise specified. If a function is listed as taking specific types, passing it different types is undefined behavior, as is passing the wrong number of arguments (reference implementation generally crashes).

There are three built-in macros in tinylisp. A macro, unlike a function, does not evaluate its arguments before applying operations to them.

  • q - quote. Takes one expression and returns it unevaluated. E.g., evaluating (1 2 3) gives an error because it tries to call 1 as a function or macro, but (q (1 2 3)) returns the list (1 2 3). Evaluating a gives the value bound to the name a, but (q a) gives the name itself.
  • i - if. Takes three expressions: a condition, an iftrue expression, and an iffalse expression. Evaluates the condition first. If the result is falsy (0 or nil), evaluates and returns the iffalse expression. Otherwise, evaluates and returns the iftrue expression. Note that the expression that is not returned is never evaluated.
  • d - def. Takes a symbol and an expression. Evaluates the expression and binds it to the given symbol treated as a name at global scope, then returns the symbol. Trying to redefine a name should fail (silently, with a message, or by crashing; the reference implementation displays an error message). Note: it is not necessary to quote the name before passing it to d, though it is necessary to quote the expression if it's a list or symbol you don't want evaluated: e.g., (d x (q (1 2 3))).

Passing the wrong number of arguments to a macro is undefined behavior (reference implementation crashes). Passing something that's not a symbol as the first argument of d is undefined behavior (reference implementation doesn't give an error, but the value cannot be referenced subsequently).

User-defined functions and macros

Starting from these ten built-ins, the language can be extended by constructing new functions and macros. These have no dedicated data type; they are simply lists with a certain structure:

  • A function is a list of two items. The first is either a list of one or more parameter names, or a single name which will receive a list of any arguments passed to the function (thus allowing for variable-arity functions). The second is an expression which is the function body.
  • A macro is the same as a function, except that it contains nil before the parameter name(s), thus making it a three-item list. (Trying to call three-item lists that do not start with nil is undefined behavior; the reference implementation ignores the first argument and treats them as macros as well.)

For example, the following expression is a function that adds two integers:

(q               List must be quoted to prevent evaluation
 (
  (x y)          Parameter names
  (s x (s 0 y))  Expression (in infix, x - (0 - y))
 )   
)

And a macro that takes any number of arguments and evaluates and returns the first one:

(q
 (
  ()
  args
  (v (h args))
 )
)

Functions and macros can be called directly, bound to names using d, and passed to other functions or macros.

Since function bodies are not executed at definition time, recursive functions are easily definable:

(d len
 (q (
  (list)
  (i list                      If list is nonempty
   (s 1 (s 0 (len (t list))))  1 - (0 - len(tail(list)))
   0                           else 0
  )
 ))
)

Note, though, that the above is not a good way to define a length function because it doesn't use...

Tail-call recursion

Tail-call recursion is an important concept in Lisp. It implements certain kinds of recursion as loops, thus keeping the call stack small. Your tinylisp interpreter must implement proper tail-call recursion!

  • If the return expression of a user-defined function or macro is a call to another user-defined function or macro, your interpreter must not use recursion to evaluate that call. Instead, it must replace the current function and arguments with the new function and arguments and loop until the chain of calls is resolved.
  • If the return expression of a user-defined function or macro is a call to i, do not immediately evaluate the branch that is selected. Instead, check whether it is a call to another user-defined function or macro. If so, swap out the function and arguments as above. This applies to arbitrarily deeply nested occurrences of i.

Tail recursion must work both for direct recursion (a function calls itself) and indirect recursion (function a calls function b which calls [etc] which calls function a).

A tail-recursive length function (with a helper function len*):

(d len*
 (q (
  (list accum)
  (i list
   (len*
    (t list)
    (s 1 (s 0 accum))
   )
   accum
  )
 ))
)
(d len
 (q (
  (list)
  (len* list 0)
 ))
)

This implementation works for arbitrarily large lists, limited only by the max integer size.

Scope

Function parameters are local variables (actually constants, since they can't be modified). They are in scope while the body of that call of that function is being executed, and out of scope during any deeper calls and after the function returns. They can "shadow" globally defined names, thereby making the global name temporarily unavailable. For example, the following code returns 5, not 41:

(d x 42)
(d f
 (q (
  (x)
  (s x 1)
 ))
)
(f 6)

However, the following code returns 41, because x at call level 1 is not accessible from call level 2:

(d x 42)
(d f
 (q (
  (x)
  (g 15)
 ))
)
(d g
 (q (
  (y)
  (s x 1)
 ))
)
(f 6)

The only names in scope at any given time are 1) the local names of the currently executing function, if any, and 2) global names.

Submission requirements

Input and output

Your interpreter may read the program from stdin or from a file specified via stdin or command-line argument. After evaluating each expression, it should output the result of that expression to stdout with a trailing newline.

  • Integers should be output in your implementation language's most natural representation. Negative integers can be output, with leading minus signs.
  • Symbols should be output as strings, with no surrounding quotes or escapes.
  • Lists should be output with all items space-separated and wrapped in parentheses. A space inside the parentheses is optional: (1 2 3) and ( 1 2 3 ) are both acceptable formats.
  • Outputting built-in functions and macros is undefined behavior. (The reference interpretation displays them as <built-in function>.)

Other

The reference interpreter includes a REPL environment and the ability to load tinylisp modules from other files; these are provided for convenience and are not required for this challenge.

Test cases

The test cases are separated into several groups so that you can test simpler ones before working up to more-complex ones. However, they will also work just fine if you dump them all in one file together. Just don't forget to remove the headings and the expected output before running it.

If you have properly implemented tail-call recursion, the final (multi-part) test case will return without causing a stack overflow. The reference implementation computes it in about six seconds on my laptop.

DLosc

Posted 2015-11-04T03:56:59.933

Reputation: 21 213

"Any token that consists entirely of digits is an integer literal. (Leading zeros are okay.) Any token that contains non-digits is a symbol, even numeric-looking examples like 123abc, 3.14, and -10." seems to contradict "Integers must be supported at least from -2^31 to 2^31-1." – msh210 – 2015-11-04T06:24:58.170

3@msh210 Not really, because the former is talking about tokens whereas the latter is talking about values. Even though there's no direct way to enter -1, I can still generate the value -1 by doing (s 0 1). – DLosc – 2015-11-04T06:51:44.390

The spec says it's undefined behavior to compare objects of different types. If I'm recursively comparing lists, and at some level have to compare two objects of different types, is that also undefined? – feersum – 2015-11-04T07:05:42.390

@feersum Correct. This means it's not strictly possible in tinylisp to write a function to compare two arbitrary lists (though the builtin e can tell you whether two lists are identical or not). Of course, if your submission is capable of handling this case gracefully, then so much the better. – DLosc – 2015-11-04T07:09:02.357

Is the evaluation order of items in a list defined? The case I have in mind is that one of them defines a global variable, which the others then reference. – feersum – 2015-11-04T08:42:44.243

@feersum Good question. I'm going to say that's undefined. I'm pretty sure the reference implementation does LTR, but there's nothing in the test cases that would require one way or the other. – DLosc – 2015-11-04T08:52:13.217

From your explanations, I guess that you want to have lexical binding of symbols. Maybe this should be made explicit. – coredump – 2015-11-04T14:29:53.820

1

@coredump After reading the pertinent Wikipedia article, I've concluded that the implementation is actually closer to dynamic, but with no scope nesting. Variables in function F are not available in function G if F calls G (as with dynamic scoping), but they are also not available in function H if H is a nested function defined inside F (as with lexical scoping)--see test case 5. So calling it "lexical" might be misleading.

– DLosc – 2015-11-04T18:37:57.160

1

To put it another way: because of the lack of scope nesting, an implementation could use either a dynamic or a lexical scoping strategy and come up with the same results. The only names in scope at any given time are 1) the local names of the currently executing function, if any, and 2) global names. Closures are not supported. (The reference implementation keeps a stack of name bindings corresponding to the call stack--a dynamic-style approach, which I think will be the easiest to implement.)

– DLosc – 2015-11-04T18:54:49.340

@Ell I brought this up in chat previously: http://chat.stackexchange.com/transcript/240?m=25126700#25126700

– feersum – 2015-11-05T03:57:34.513

What are the valid types for the condition of i? – feersum – 2015-11-05T07:22:33.890

@feersum Anything. (See second paragraph of "Data types" section.) – DLosc – 2015-11-05T07:30:40.663

Should v eval in the caller's scope or the global scope? – feersum – 2015-11-05T08:04:14.150

@feersum In the current local scope. It's just as if the code were being executed there directly. – DLosc – 2015-11-05T08:19:40.063

1

Obligatory xkcd.

– mınxomaτ – 2015-11-05T10:36:46.380

Can I assume a maximum name length? – feersum – 2015-11-05T12:18:15.040

@feersum Sure, I can live with that. How does 99 characters sound? – DLosc – 2015-11-05T21:00:24.447

@Ell v does not have to be tail recursive. (It isn't tail recursive in the reference implementation, incidentally.) But it's really neat that your implementation does it that way! – DLosc – 2015-11-06T23:47:49.250

The evaluation rule for a list should make explicit that the first item can be a symbol which is bound to a function, not just a function itself. For example: »A list of one or more items treats its first item as a function or macro (or a symbol which is bound to one) and calls it with the remaining items as arguments.« – Paŭlo Ebermann – 2015-11-07T15:05:53.673

To what should ((q ((#2) (s 6 #2))) 1) evaluate? As I see it, this takes (q ((#2) (s 6 #2))) as a function list, where q is the (unused) parameters name and ((#2) (s 6 #2)) the expression. And this tries to interpret (#2) as a function/macro, but it isn't (too few elements). Did I understand something wrong, or is there a mistake in your last example of In 5? – Paŭlo Ebermann – 2015-11-07T21:03:25.920

@PaŭloEbermann I think this edit will answer both of your points: "A list of one or more items evaluates its first item and treats it as a function or macro." To your first point, the symbol is evaluated to whatever that name is bound to (which should be a function/macro). To the second point, the list (q ((#2) (s 6 #2))) is first evaluated as code. That evaluation sends the list ((#2) (s 6 #2)) to the q macro, which returns it unevaluated as a literal list, and that list is then treated as a function. Does that help? – DLosc – 2015-11-07T23:14:16.600

@DLosc yes, this helps, thank you. I will post my solution as soon as I solved the last quirks (not related to these issues). – Paŭlo Ebermann – 2015-11-08T20:11:32.453

Answers

11

Python 2, 685 675 660 657 646 642 640 bytes

import sys,re
E=[]
G=zip("chtsle",[eval("lambda x,y=0:"+f)for f
in"[x]+y (x+[E])[0] x[1:] x-y +(x<y) +(x==y)".split()])
def V(e,L=E):
 while 1:
    try:return e and int("0%s"%e)
    except:A=e[1:]
    if""<e:return dict(G+L).get(e,e)
    f=V(e[0],L)
    if""<f:
     if f in"iv":t=V(A[0],L);e=(e[~bool(t)],t)[f>"u"];continue
     if"e">f:G[:]+=(A[0],V(A[1],L)),
     return A[0]
    if[]>f or f[0]:A=[V(a,L)for a in A]
    if[]>f:return f(*A)
    P,e=f[-2:];L=([(P,A)],zip(P,A))[P<""]
F=lambda x:V<x<""and"(%s)"%" ".join(map(F,x))or"%s"%x
for t in re.sub("([()])"," \\1 ",sys.stdin.read()).split():
 if")"==t:t=E.pop()
 if"("==t:E+=[],
 elif E:E[-1]+=t,
 else:print F(V(t))

Reads input from STDIN and writes output to STDOUT.

Although not strictly required, the interpreter supports nullary functions and macros, and optimizes tail calls executed through v.

Explanation

Parsing

To parse the input, we first surround each occurence of ( and ) with spaces, and split the resulting string into words; this gives us the list of tokens. We maintain an expression stack E, which is initially empty. We scan the tokens, in order:

  • if we encounter a (, we push an empty list at the top of the expression stack;
  • if we encounter a ), we pop the value at the top of the expression stack, and append it to the list that was previously below it on the stack;
  • otherwise, we append the current token, as a string, to the list at the top of the expression stack (we keep integers as strings at this stage, and parse them during evaluation.)

If, when processsing an ordinary token, or after popping an expression from the stack due to ), the expression stack is empty, we're at a top-level expression, and we evaluate the value we'd otherwise have appended, using V(), and print its result, formatted appropriately using F().

Evaluation

We maintain the global scope, G, as a list of key/value pairs. Initially, it contains only the builtin functions (but not the macros, and not v, which we treat as a macro), which are implemented as lambdas.

Evaluation happens inside V(), which takes the expression to evaluate, e, and the local scope, L, which is, too, a list of key/value pairs (when evaluating a top-level expression, the local scope is empty.) The guts of V() live inside an infinite loop, which is how we perform tail-call optimization (TCO), as explained later.

We process e according to its type:

  • if it's the empty list, or a string convertible to an int, we return it immediately (possibly after conversion to int); otherwise,

  • if it's a string, we look it up in a dictionary constructed from the concatenation of the global and local scopes. If we find an associated value, we return it; otherwise, e must be the name of a builtin macro (i.e. q, i, d or v), and we return it unchanged. Otherwise, if e is not a string,

  • e is a (nonempty) list, i.e., a function call. We evaluate the first element of the list, i.e., the function expression, by calling V() recursively (using the current local scope); we call the result f. The rest of the list, A, is the list of arguments. f can only be a string, in which case it's a builtin macro (or the function v), a lambda, in which case it's a builtin function, or a list, in which case it's a user-defined function or macro.

    If f is a a string, i.e., a builtin macro, we handle it in-place. If it's the macro i or v, we evaluate its first operand, and either select the second or third operand accordingly, in the case of i, or use the result of the first operand, in the case of v; instead of evaluating the selected expression recursively, which would defeat TCO, we simply replace e with the said expression, and jump to the beginning of the loop. If f is the macro d, we append a pair, whose first element is the first operand, and whose second element is the result of evaluating the second operand, to the global scope, G, and return the first operand. Otherwise, f is the macro q, in which case we simply return its operand directly.

    Othrtwise, if f is a lambda, or a list whose first element is not (), then it's a non-nullary function, not a macro, in which case we evaluate its arguments, i.e., the elements of A, and replace A with the result.

    If f is a lambda, we call it, passing it the unpacked arguments in A, and return the result.

    Otherwise, f is a list, i.e., a user-defined function or macro; its parameter list is the second-to-last element, and its body is the last element. Like in the case of the macros i and v, in order to perform TCO, we don't evaluate the body recursively, but rather replace e with the body and continue to the next iteration. Unlike i and v, however, we also replace the local scope, L, with the new local scope of the function. If the parameter list, P, is, in fact, a list, the new local scope is constructed by zipping the parameter list, P, with the argument list, A; otherwise, we're dealing with a variadic function, in which case the new local scope has only one element, the pair (P, A).

REPL

If you want to play with it, here's a REPL version of the interpreter. It supports redefining symbols, and importing files through either the command line arguments, or the (import <filename>) macro. To exit the interpreter, terminate the input (usually, Ctrl+D or Ctrl+Z).

try:import sys,re,readline
except:0
E=[];G=zip("chtsle",[eval("lambda x,y=0:"+f)for f
in"[x]+y (x+[E])[0] x[1:] x-y +(x<y) +(x==y)".split()])
def V(e,L=E):
 while 1:
    try:return e and int("0%s"%e)
    except:A=e[1:]
    if""<e:return dict(G+L).get(e,e)
    f=V(e[0],L)
    if""<f:
     if f in"iv":t=V(A[0],L);e=(e[~bool(t)],t)[f>"u"];continue
     if"e">f:G[:]+=(A[0],V(A[1],L)),
     elif"j">f:X(open(str(A[0])).read())
     return A[0]
    if[]>f or f[0]:A=[V(a,L)for a in A]
    if[]>f:return f(*A)
    P,e=f[-2:];L=([(P,A)],zip(P,A))[P<""]
F=lambda x:V<x<""and"(%s)"%" ".join(map(F,x))or"%s"%x
def X(s,v=0):
 for t in re.sub("([()])"," \\1 ",s).split():
    if")"==t:t=E.pop()
    if"("==t:E[:]+=[],
    elif E:E[-1]+=t,
    else:
     x=V(t)
     if v:print F(x)
for f in sys.argv[1:]:X("(g %s)"%f)
while 1:
 try:X(raw_input(">."[[]<E]*3+" "),1)
 except EOFError:break
 except KeyboardInterrupt:E=[];print
 except Exception as e:print"Error: "+e.message

And here's an example session, implementing merge sort:

>>> (d let d) (d if i) (d head h) (d tail t) (d prepend c) (d less l)
let
if
head
tail
prepend
less
>>>
>>> (let list (q (x... x...)))
list
>>> (let lambda (q (() (params body) (list params body))))
lambda
>>> (let def (q (() (name params body) (
...     v (list (q let) name (list (q lambda) params body))
... ))))
def
>>>
>>> (def else(body) body)
else
>>> (def or(x y) ( if x x y ))
or
>>> (def and(x y) ( if x y x ))
and
>>>
>>> (def front-half(L) ( front-half/impl L L ))
front-half
>>> (def front-half/impl(L M) (
...     if M (
...         prepend (head L)
...                 (front-half/impl (tail L) (tail (tail M)))
...     ) (else
...         ()
...     )
... ))
front-half/impl
>>>
>>> (def back-half(L) ( back-half/impl L L ))
back-half
>>> (def back-half/impl(L M) (
...     if M (
...         back-half/impl (tail L) (tail (tail M))
...     ) (else
...         L
...     )
... ))
back-half/impl
>>>
>>> (def merge(L M comp) (
...     if (and L M) (
...         if (comp (head M) (head L)) (
...             prepend (head M) (merge L (tail M) comp)
...         ) (else (
...             prepend (head L) (merge (tail L) M comp)
...         ))
...     ) (else (
...         or L M
...     ))
... ))
merge
>>>
>>> (def sort(L comp) (
...     if (and L (tail L)) (
...         merge (sort (front-half L) comp)
...               (sort (back-half L) comp)
...               comp
...     ) (else
...         L
...     )
... ))
sort
>>>
>>>
>>> (let my-list (list 4 7 2 5 9 1 6 10 8 3))
my-list
>>> my-list
(4 7 2 5 9 1 6 10 8 3)
>>> (sort my-list less)
(1 2 3 4 5 6 7 8 9 10)
>>> (sort my-list (lambda(x y) ( less y x )))
(10 9 8 7 6 5 4 3 2 1)

Ell

Posted 2015-11-04T03:56:59.933

Reputation: 7 317

You can get something ever shorter using zlib :) Compress your code converted in bytes, and replace it with : import zlib;exec(zlib.decompress(your_code_compressed_in_bytes)) – Labo – 2015-11-09T12:48:47.610

You could save two bytes by assigning A[0] to some one-char variable just after except block – Hannes Karppila – 2015-11-10T11:32:28.143

@HannesKarppila That's right, but this would break nullary functions (since A is empty in this case), and I don't want to "regress". – Ell – 2015-11-10T16:13:00.850

4

C (GNU), 1095 bytes

Much of the action takes place in the giant v function. Instead of implementing tail recursion explicitly, v is structured so that many of the calls from v to v will be handled by gcc's tail recursion optimization. There is no garbage collection.

This makes heavy use of GCC extensions, so it could only be compiled with gcc (use the command gcc -w -Os tl.c). It also uses some scanf extensions which were not available on Windows, which I usually use. The prospect of writing the parser with standard scanf was so awful that I used a Linux VM to test the program instead. Parsing without scanf character classes probably would have added 100+ bytes.

#define O(...)({o*_=malloc(32);*_=(o){__VA_ARGS__};_;})
#define P printf
#define F(I,B)({for(I;x->c;x=x->l)B;})
#define Z return
typedef struct o{struct o*n,*l,*c;int i,t;}o;E(o a,o b){Z
a.n?!strcmp(a.n,b.n):a.c?b.c&&E(*a.c,*b.c)&E(*a.l,*b.l):!b.c&a.i==b.i;}p(o*x){x->t?P("%d ",x->i):x->n?P("%s ",x->n):F(P("("),p(x->c);P(")"));}o*x,G,N;*C(o*h,o*t){Z
O(c:h,l:t);}o*v(o*x,o*e){o*W(o*l,o*e){Z
l->c?C(v(l->c,e),W(l->l,e)):&N;}o*y,*a,*f;int t;Z
x->c?y=v(x->c,e),x=x->l,t=y->i,t?9/t?a=v(x->c,e),t>7?(t>8?a->c:a->l)?:a:t>6?v(a,e):t<6?x=v(x->l->c,e),t>4?C(a,x):O(t:1,i:t>3?E(*a,*x):t>2?a->i<x->i:a->i-x->i):v((a-&N&&!a->t|a->i?x:x->l)->l->c,e):(t&1&&d(x->c->n,v(x->l->c,e)),x->c):(y->l->l->l?y=y->l:(x=W(x,e)),a=y->c,v(y->l->c,a->n?O(n:a->n,c:x,l:&G):F(f=&G,(f=O(n:a->c->n,c:x->c,l:f),a=a->l);f))):x->n?e->n?strcmp(x->n,e->n)?v(x,e->l):e->c:e:x;}d(o*n,o*x){*v(O(n:""),&G)=(o){n:n,c:x,l:O()};}*R(h){char*z,*q;Z
scanf(" %m[^ \n()]",&q)>0?h=strtoul(q,&z,10),C(*z?O(n:q):O(t:1,i:h),R()):~getchar()&1?q=R(),C(q,R()):&N;}main(i){for(;++i<12;)d(strndup("slecivthqd"+i-2,1),O(i:i));F(x=R(),p(v(x->c,&G)));}

Semi-ungolfed

typedef struct o o;
struct o {
    char* n;
    o* l, //next in this list
     * c; 
    int i,
        t;
} ;



#define O(...)({o*_=malloc(32);*_=(o){__VA_ARGS__};_;})

E(o a, o b) { //tests equality 
    return
        a.n ? !strcmp(a.n,b.n) :
        a.t ? a.i==b.i :
        a.c ? b.c && E(*a.c,*b.c)&E(*a.l,*b.l) :
        !b.c
    ;
}

#define P printf


p(o*x){
    x->t?P("%d ",x->i):x->n?P("%s ",x->n):({for(P("(");x->c;x=x->l)p(x->c);P(")");});
}


o*_,G,N; //N = nil



o*C(o*h,o*t){return O(c:h,l:t);}


/*
        2 3 4 5 6 7 8 9 10 11
        s l e c i v t h d  q
    */


o* v(o* x, o* e) { //takes list, int, or name
    o*W(o* l, o* e) { //eval each item in list
        return l->c ? C(v(l->c ,e), W(l->l, e)) : &N;
    }

    o*y,*a,*f;int t;
    return x->c ? //nonempty list = function/macro call
        y = v(x->c,e), //evals to function/macro
        x = x->l,   //list position of first arg (if it exists)
        (t=y->t)?   //builtin no., if any
             t>9 ?
              t&1 ? x->c // 11 = q
                : //10 = d
                (d(x->c,v(x->l->c,e)),x->c)
           : (a = v(x->c,e), //eval'd first arg
             t)>7 ? // t/h
                (t > 8 ? a->c : a->l) ?: a
           : t>6 ? //v
                v(a,e)
           : (x = x->l, //position of 2nd arg in list
             t)>5 ? //i
                v( (a->n||a->l||a->i|a->t>1 ? x : x->l)->c, e)
           : (x = v(x->c,e), //evaluated 2nd arg
             t)>4 ? // c
                C(a,x)
           : O(t:1,i:
                t>3 ? E(*a,*x) :  //e
                t>2 ? a->i<x->i : //l
                      a->i-x->i   //s
              )
        :
        (
            y->l->l->l ? //whether this is macro
                y = y->l :
                (x = W(x,e)),  //eval args
            a = y->c,  //a = arg list
            //a = a->n ? x=C(x, &N), C(a, &N) : a, //transform variadic style to normal
            v(y->l->c,
               a->n ? //variadic
                O(n:a->n,c:x,l:&G)
              : ({
                   for(f=&G; a->c; a=a->l,x=x->l)
                      f=O(n:a->c->n, c: x->c, l:f);
                   f;
                })
            )
        )
    :
    x->n ? // name
        e->n ?
            strcmp(x->n,e->n) ?
                v(x,e->l)
            : e->c
        : e
     : x; //int or nil
}

d(o*n,o*x){
    * v(O(n:""),&G) =
        (o){n:n->n,c:x,l:O()};
}


;
o*R(){
    char*z,*q;int h;
return scanf(" %m[^ \n()]",&q)>0?
    h=strtoul(q,&z,10),
    C(*z ? O(n:q) : O(t:1,i:h), R())
: getchar()&1?&N:(q=R(),C(q,R()));
}
main(i) {

    for(;++i<12;) d(O(n:strndup("slecivthdq"+i-2,1)),O(t:i));

    o *q;
    for(q=R(); q->c; q=q->l) p(v(q->c,&G));

}

feersum

Posted 2015-11-04T03:56:59.933

Reputation: 29 566

What is the usage of the compiled executable? Is it REPL? Does it take a filename as input? – ckjbgames – 2017-02-24T22:03:47.180

@ckjbgames It reads a program from stdin. – feersum – 2017-03-05T07:19:46.950

Okay. I think you should edit your answer and note that. – ckjbgames – 2017-03-05T12:55:06.550

1

Ceylon, 2422 bytes

(I think this is my longest golfed program yet.)

import ceylon.language{sh=shared,va=variable,fo=formal,O=Object}import ceylon.language.meta.model{F=Function}interface X{sh fo V v(S t);sh fo G g;}class G(va Map<S,V>m)satisfies X{v(S t)=>m[t]else nV;g=>this;sh S d(S s,V v){assert(!s in m);m=map{s->v,*m};return s;}}V nV=>nothing;class LC(G c,Map<S,V>m)satisfies X{g=>c;v(S t)=>m[t]else g.v(t);}alias C=>V|Co;interface Co{sh fo C st();}interface V{sh fo C l(X c,V[]a);sh default Boolean b=>0<1;sh fo C vO(X c);sh default V vF(X c){va C v=vO(c);while(is Co n=v){v=n.st();}assert(is V r=v);return r;}}class L(sh V*i)satisfies V{vO(X c)=>if(nonempty i)then i[0].vF(c).l(c,i.rest)else this;equals(O o)=>if(is L o)then i==o.i else 1<0;b=>!i.empty;string=>"(``" ".join(i)``)";hash=>i.hash;sh actual C l(X c,V[]p){value[h,ns,x]=i.size<3then[f,i[0],i[1]]else[m,i[1],i[2]];value n=if(is L ns)then[*ns.i.narrow<S>()]else ns;assert(is S|S[]n,is V x);V[]a=h(c,p);LC lC=if(is S n)then LC(c.g,map{n->L(*a)})else LC(c.g,map(zipEntries(n,a)));return object satisfies Co{st()=>x.vO(lC);};}}class S(String n)satisfies V{vO(X c)=>c.v(this);l(X c,V[]a)=>nV;equals(O o)=>if(is S o)then n==o.n else 1<0;hash=>n.hash;string=>n;}class I(sh Integer i)satisfies V{vO(X c)=>this;l(X c,V[]a)=>nV;equals(O o)=>if(is I o)then i==o.i else 1<0;hash=>i;b=>!i.zero;string=>i.string;}V[]f(X c,V[]a)=>[for(v in a)v.vF(c)];V[]m(X c,V[]a)=>a;L c(X c,V h,L t)=>L(h,*t.i);V h(X c,L l)=>l.i[0]else L();V t(X c,L l)=>L(*l.i.rest);I s(X c,I f,I s)=>I(f.i-s.i);I l(X c,I f,I s)=>I(f.i<s.i then 1else 0);I e(X c,V v1,V v2)=>I(v1==v2then 1else 0);C v(X c,V v)=>v.vO(c);V q(X c,V a)=>a;C i(X c,V d,V t,V f)=>d.vF(c).b then t.vO(c)else f.vO(c);S d(X c,S s,V x)=>c.g.d(s,x.vF(c));class B<A>(F<C,A>nat,V[](X,V[])h=f)satisfies V given A satisfies[X,V+]{vO(X c)=>nV;string=>nat.declaration.name;l(X c,V[]a)=>nat.apply(c,*h(c,a));}{<S->V>*}b=>{S("c")->B(`c`),S("h")->B(`h`),S("t")->B(`t`),S("s")->B(`s`),S("l")->B(`l`),S("e")->B(`e`),S("v")->B(`v`),S("q")->B(`q`,m),S("i")->B(`i`,m),S("d")->B(`d`,m)};[V*]p(String inp){value ts=inp.split(" \n()".contains,1<0,1<0);va[[V*]*]s=[];va[V*]l=[];for(t in ts){if(t in" \n"){}else if(t=="("){s=[l,*s];l=[];}else if(t==")"){l=[L(*l.reversed),*(s[0]else[])];s=s.rest;}else if(exists i=parseInteger(t),i>=0){l=[I(i),*l];}else{l=[S(t),*l];}}return l.reversed;}sh void run(){va value u="";while(exists l=process.readLine()){u=u+l+"\n";}V[]e=p(u);X c=G(map(b));for(v in e){print(v.vF(c));}}

I could have golfed some bytes more, as I used some two-letter identifiers in some places, but I've run out of somewhat meaningful single letters for those. Although even this way it doesn't look like Ceylon very much ...

This is an object-oriented implementation.

We have a value interface V with implementing classes L (list – just a wrapper around a Ceylon sequential of V), S (symbol – wrapper around a string), I (integer – wrapper around a Ceylon integer) and B (builtin function or macro, a wrapper around a Ceylon function).

I use the standard Ceylon equality notation by implementing the equals method (and also the hash attribute, which is really only needed for symbols), and also the standard string attribute for output.

We have a Boolean attribute b (which is true by default, overridden in I and L to return false for empty lists), and two methods l (call, i.e. use this object as a function) and vO (evaluate one step). Both return either a value or a Continuation object which allows then evaluation for one more step, and vF (evaluate fully) loops until the result is not a continuation anymore.

A context interface allows access to variables. There are two implementations, G for the global context (which allows adding variables using the d builtin) and LC for a local context, which is active when evaluating the expression of a user function (it falls back to the global context).

Symbols evaluation accesses the context, lists (if non empty) evaluate by first evaluating their first element and then calling its call method. Call is implemented just by lists and builtins – it first evaluates the argument (if a function, not if a macro) and then does the actual interesting stuff – for builtins just what is hardcoded, for lists it creates a new local context and returns a continuation with that.

For the builtins I used a trick similar to what I used in my Shift Interpreter, which allows me to define them with the argument types they need, but call them with a generic sequence using reflection (the types will be checked at call time). This avoids type conversion/assertion hassle inside the functions/macros, but needs top-level functions so I can get their meta-model Function objects.

The p (parse) function splits the string at spaces, newlines and parentheses, then loops over the tokens and builds lists using a stack and a running list.

The interpreter (in the run method, which is the entry point) then takes this list of expressions (which are just values), evaluates each of them, and prints the result.


Below is a version with comments and run through a formatter.

An earlier version before I started golfing (and still with some misunderstandings about list evaluation) is found at my Github repository, I'll put this one there soon (so make sure to look at the first version if you want the original).

//  Tiny Lisp, tiny interpreter
//
// An interpreter for a tiny subset of Lisp, from which most of the
// rest of the language can be bootstrapped.
//
// Question:   https://codegolf.stackexchange.com/q/62886/2338
// My answer:  https://codegolf.stackexchange.com/a/63352/2338
//
import ceylon.language {
    sh=shared,
    va=variable,
    fo=formal,
    O=Object
}
import ceylon.language.meta.model {
    F=Function
}

// context

interface X {
    sh fo V v(S t);
    sh fo G g;
}
// global (mutable) context, with the buildins 
class G(va Map<S,V> m) satisfies X {
    // get entry throws error on undefined variables. 
    v(S t) => m[t] else nV;
    g => this;
    sh S d(S s, V v) {
        // error when already defined
        assert (!s in m);
        // building a new map is cheaper (code-golf wise) than having a mutable one.
        m = map { s->v, *m };
        return s;
    }
}

// This is simply a shorter way of writing "this is not an allowed operation".
// It will throw an exception when trying to access it.
// nV stands for "no value".
V nV => nothing;

// local context
class LC(G c, Map<S,V> m) satisfies X {
    g => c;
    v(S t) => m[t] else g.v(t);
    // sh actual String string => "[local: ``m``, global: ``g``]";
}

// continuation or value
alias C => V|Co;

// continuation
interface Co {
    sh fo C st();
}

// value
interface V {
    // use this as a function and call with arguments.
    // will not work for all types of stuff.
    sh fo C l(X c, V[] a);
    // check the truthiness. Defaults to true, as
    // only lists and integers can be falsy.
    sh default Boolean b => 0 < 1;
    // evaluate once (return either a value or a continuation).
    // will not work for all kinds of expression.
    sh fo C vO(X c);
    /// evaluate fully
    sh default V vF(X c) {
        va C v = vO(c);
        while (is Co n = v) {
            v = n.st();
        }
        assert (is V r = v);
        return r;
    }
}
class L(sh V* i) satisfies V {

    vO(X c) => if (nonempty i) then i[0].vF(c).l(c, i.rest) else this;
    equals(O o) => if (is L o) then i == o.i else 1 < 0;
    b => !i.empty;
    string => "(``" ".join(i)``)";
    hash => i.hash;

    sh actual C l(X c, V[] p) {
        value [h, ns, x] =
                i.size < 3
                then [f, i[0], i[1]]
                else [m, i[1], i[2]];
        // parameter names – either a single symbol, or a list of symbols.
        // If it is a list, we filter it to throw out any non-symbols.
        // Should throw an error if there are any, but this is harder.
        value n = if (is L ns) then [*ns.i.narrow<S>()] else ns;
        assert (is S|S[] n, is V x);
        V[] a = h(c, p);

        // local context
        LC lC = if (is S n) then
            LC(c.g, map { n -> L(*a) })
        else
            LC(c.g, map(zipEntries(n, a)));
        // return a continuation instead of actually
        // calling it here, to allow stack unwinding.
        return object satisfies Co {
            st() => x.vO(lC);
        };
    }
}

// symbol
class S(String n) satisfies V {
    // evaluate: resolve
    vO(X c) => c.v(this);
    // call is not allowed
    l(X c, V[] a) => nV;
    // equal if name is equal
    equals(O o) => if (is S o) then n == o.n else 1 < 0;
    hash => n.hash;
    string => n;
}

// integer
class I(sh Integer i) satisfies V {

    vO(X c) => this;
    l(X c, V[] a) => nV;
    equals(O o) => if (is I o) then i == o.i else 1 < 0;
    hash => i;
    b => !i.zero;
    string => i.string;
}

// argument handlers for functions or macros
V[] f(X c, V[] a) => [for (v in a) v.vF(c)];
V[] m(X c, V[] a) => a;

// build-in functions
// construct
L c(X c, V h, L t) => L(h, *t.i);
// head
V h(X c, L l) => l.i[0] else L();
// tail
V t(X c, L l) => L(*l.i.rest);
// subtract
I s(X c, I f, I s) => I(f.i - s.i);
// lessThan
I l(X c, I f, I s) => I(f.i < s.i then 1 else 0);
// equal
I e(X c, V v1, V v2) => I(v1 == v2 then 1 else 0);
// eval (returns potentially a continuation)
C v(X c, V v) => v.vO(c);

// build-in macros
// quote
V q(X c, V a) => a;
// if (also returns potentially a continuation)
C i(X c, V d, V t, V f) => d.vF(c).b then t.vO(c) else f.vO(c);
// define symbol in global context
S d(X c, S s, V x) => c.g.d(s, x.vF(c));

// buildin function or macro, made from a native function and an argument handler
class B<A>(F<C,A> nat, V[](X, V[]) h = f)
        satisfies V
        given A satisfies [X, V+] {
    vO(X c) => nV;
    string => nat.declaration.name;
    // this "apply" is a hack which breaks type safety ...
    // but it will be checked at runtime.
    l(X c, V[] a) => nat.apply(c, *h(c, a));
}

// define buildins
{<S->V>*} b => {
    S("c") -> B(`c`),
    S("h") -> B(`h`),
    S("t") -> B(`t`),
    S("s") -> B(`s`),
    S("l") -> B(`l`),
    S("e") -> B(`e`),
    S("v") -> B(`v`),
    S("q") -> B(`q`, m),
    S("i") -> B(`i`, m),
    S("d") -> B(`d`, m)
};

// parses a string into a list of expressions.
[V*] p(String inp) {
    // split string into tokens (retain separators, don't group them –
    // whitespace and empty strings will be sorted out later in the loop)
    value ts = inp.split(" \n()".contains, 1 < 0, 1 < 0);
    // stack of not yet finished nested lists, outer most at bottom
    va [[V*]*] s = [];
    // current list, in reverse order (because appending at the start is shorter)
    va [V*] l = [];
    for (t in ts) {
        if (t in " \n") {
            // do nothing for empty tokens
        } else if (t == "(") {
            // push the current list onto the stack, open a new list.
            s = [l, *s];
            l = [];
        } else if (t == ")") {
            // build a lisp list from the current list,
            // pop the latest list from the stack, append the created lisp list. 
            l = [L(*l.reversed), *(s[0] else [])];
            s = s.rest;
        } else if (exists i = parseInteger(t), i >= 0) {
            // append an integer to the current list.
            l = [I(i), *l];
        } else {
            // append a symbol to the current list.
            l = [S(t), *l];
        }
    }
    return l.reversed;
}

// Runs the interpreter.
// This handles input and output, calls the parser and evaluates the expressions.
sh void run() {
    va value u = "";
    while (exists l = process.readLine()) {
        u = u + l + "\n";
    }
    V[] e = p(u);
    // create global context
    X c = G(map(b));
    // iterate over the expressions, ...
    for (v in e) {
        // print("  '``v``' → ...");
        // ... evaluate each (fully) and print the result.
        print(v.vF(c));
    }
}

Paŭlo Ebermann

Posted 2015-11-04T03:56:59.933

Reputation: 1 010