McCarthy's LISP

39

12

McCarthy's 1959 LISP

In early 1959, John McCarthy wrote a groundbreaking paper defining just nine primitive functions that when put together still form the basis for all LISP-like languages today. The paper is available digitized here:

http://www-formal.stanford.edu/jmc/recursive.pdf

Your job is to fully implement a parser and interpreter for McCarthy's LISP exactly as described in the 1960 paper: That is, the functions QUOTE, ATOM, EQ, CAR, CDR, CONS, COND, LAMBDA, and LABEL should all be functional. The paper will take precedence over this challenge text when considering the correctness of answers, but I've tried to summarize the nine functions below. Note that the language will be in ALL CAPS and no error checking is necessary, all input should be presumed to be valid.

Types

  • There are only two types in McCarthy's LISP: An atom, and a linked list, which is recursively defined as a head, which may be a list or an atom, and a list that the head is attached to (tail). NIL has the special property of being both an atom and a list.
  • As per the paper, atom names will only consist of capital letters, numbers, and the space character, though strings of consecutive spaces should be considered as just one space and all leading and trailing space characters should be removed. Example equivalent atom names (replace underscore with space character): ___ATOM__1__ = ATOM_1. Example not equivalent atom names: A_TOM_1 != ATOM_1
  • Lists are denoted by parentheses, and an implied NIL is at the end of every list. Elements in a list are separated by commas and not whitespace like in most modern Lisps. So the list (ATOM 1, (ATOM 2)) would be {[ATOM 1] -> {[ATOM 2] -> NIL} -> NIL}.

QUOTE:

  • Takes one argument which may be either an atom (single element) or a linked list. Returns the argument exactly.
  • Test cases:
  • (QUOTE, ATOM 1) -> ATOM 1
  • (QUOTE, (ATOM 1, ATOM 2)) -> (ATOM 1, ATOM 2)

ATOM:

  • Takes one argument which may be either an atom (single element) or a linked list. Returns T (true) if the argument is an atom, or NIL (false) if the argument is not an atom.
  • Test cases:
  • (ATOM, (QUOTE, ATOM 1)) -> T
  • (ATOM, (QUOTE, (ATOM 1, ATOM 2))) -> NIL

EQ:

  • Takes two arguments which must be atoms (behavior is undefined if either of the arguments are not atoms). Returns T (true) if the two atoms are equivalent, or NIL (false) if they are not.
  • Test cases:
  • (EQ, (QUOTE, ATOM 1), (QUOTE, ATOM 1)) -> T
  • (EQ, (QUOTE, ATOM 1), (QUOTE, ATOM 2)) -> NIL

CAR:

  • Takes one argument which must be a list (behavior is undefined if it is not a list). Returns the first atom (head) of that list.
  • Test cases:
  • (CAR, (QUOTE, (ATOM 1, ATOM 2))) -> ATOM 1

CDR:

  • Takes one argument which must be a list (behavior is undefined if it is not a list). Returns every atom but the first atom of the list, i.e. the tail. Note that every list ends in an implied NIL, so running CDR on a list that appears to just have one element will return NIL.
  • Test cases:
  • (CDR, (QUOTE, (ATOM 1, ATOM 2))) -> (ATOM 2)
  • (CDR, (QUOTE, (ATOM 1))) -> NIL

CONS:

  • Takes two arguments. The first may be an atom or a list but the second must be a list or NIL. Prepends the first argument to the second argument and returns the newly-created list.
  • Test cases:
  • (CONS, (QUOTE, ATOM 1), (QUOTE, NIL)) -> (ATOM 1)
  • (CONS, (QUOTE, ATOM 1), (CONS, (QUOTE, ATOM 2), (QUOTE, NIL))) -> (ATOM 1, ATOM 2)

COND:

  • This is LISP's "if-else" statement of sorts. Takes a variable-length amount of arguments, each of which must be a list of length exactly 2. For each argument list in order, evaluate the first term and if it is true (T), return the associated second term and exit the function. If the first term is not true, move on to the next argument and test its condition, and so on until the first true condition is reached. At least one of the argument conditions can be assumed to be true -- if they are all false, this is undefined behavior. See page 4 for a good example of the behavior of this function.
  • Test cases:
  • (COND, ((ATOM, (QUOTE, ATOM 1)), (QUOTE, 1)), ((ATOM, (QUOTE, (ATOM 1, ATOM 2))), (QUOTE, 2))) -> 1
  • (COND, ((ATOM, (QUOTE, (ATOM 1, ATOM 2))), (QUOTE, 2)), ((ATOM, (QUOTE, ATOM 1)), (QUOTE, 1))) -> 1

LAMBDA:

  • Defines an anonymous function. Takes two arguments, the first being a list of atoms which represent the arguments to the function and the second being any S-expression (the function body), which would typically use the arguments.
  • Test cases:
  • Defining and using an anonymous "isNull" function:
  • ((LAMBDA, (ATOM 1), (EQ, ATOM 1, (QUOTE, NIL))), (QUOTE, NIL)) -> T
  • ((LAMBDA, (ATOM 1), (EQ, ATOM 1, (QUOTE, NIL))), (QUOTE, ATOM 1)) -> NIL

LABEL:

  • Gives a name to an anonymous LAMBDA function, which also allows that function to be called recursively in the body of the LAMBDA. Takes two arguments, the first being a label and the second being the LAMBDA function to which the label should be bound. Returns the name supplied. The scope of all LABEL names is global, and redefining a LABEL is undefined behavior.
  • Fun fact, LABEL is not actually necessary to create recursive functions as we now know LAMBDA can be used with a 'Y-Combinator' to accomplish this task, but McCarthy wasn't aware of this method when writing the original paper. It makes programs much easier to write anyways.
  • Test cases:
  • (LABEL, SUBST, (LAMBDA, (X, Y, Z), (COND, ((ATOM, Z), (COND, ((EQ, Y, Z), X), ((QUOTE, T), Z))), ((QUOTE, T), (CONS, (SUBST, X, Y, (CAR, Z)), (SUBST, X, Y, (CDR, Z))))))) -> SUBST
  • (after running the above) (SUBST, (QUOTE, A), (QUOTE, B), (QUOTE, (A, B, C))) -> (A, A, C)

To help visualize the SUBST function above, it could be represented as this Python-like pseudocode:

def substitute(x, y, z): # substitute all instances of y (an atom) with x (any sexp) in z
    if isAtom(z):
        if y == z:
            return x
        elif True: 
            return z
    elif True:
        return substitute(x,y,z[0]) + substitute(x,y,z[1:])

FINAL TEST CASE:

If I have transcribed it correctly, your interpreter should be able to interpret EVAL with this code:

(LABEL, CAAR, (LAMBDA, (X), (CAR, (CAR, X))))
(LABEL, CDDR, (LAMBDA, (X), (CDR, (CDR, X))))
(LABEL, CADR, (LAMBDA, (X), (CAR, (CDR, X))))
(LABEL, CDAR, (LAMBDA, (X), (CDR, (CAR, X))))
(LABEL, CADAR, (LAMBDA, (X), (CAR, (CDR, (CAR, X)))))
(LABEL, CADDR, (LAMBDA, (X), (CAR, (CDR, (CDR, X)))))
(LABEL, CADDAR, (LAMBDA, (X), (CAR, (CDR, (CDR, (CAR, X))))))

(LABEL, ASSOC, (LAMBDA, (X, Y), (COND, ((EQ, (CAAR, Y), X), (CADAR, Y)), ((QUOTE, T), (ASSOC, X, (CDR, Y))))))

(LABEL, AND, (LAMBDA, (X, Y), (COND, (X, (COND, (Y, (QUOTE, T)), ((QUOTE, T), (QUOTE, NIL)))), ((QUOTE, T), (QUOTE, NIL)))))
(LABEL, NOT, (LAMBDA, (X), (COND, (X, (QUOTE, NIL)), ((QUOTE, T), (QUOTE, T)))))

(LABEL, NULL, (LAMBDA, (X), (AND, (ATOM, X), (EQ, X, (QUOTE, NIL)))))

(LABEL, APPEND, (LAMBDA, (X, Y), (COND, ((NULL, X), Y), ((QUOTE, T), (CONS, (CAR, X), (APPEND, (CDR, X), Y))))))

(LABEL, LIST, (LAMBDA, (X, Y), (CONS, X, (CONS, Y, (QUOTE, NIL))))) 

(LABEL, PAIR, (LAMBDA, (X, Y), (COND, ((AND, (NULL, X), (NULL, Y)), (QUOTE, NIL)), ((AND, (NOT, (ATOM, X)), (NOT, (ATOM, Y))), (CONS, (LIST, (CAR, X), (CAR, Y)), (PAIR, (CDR, X), (CDR, Y)))))))

(LABEL, EVAL, (LAMBDA, (E, A), (COND, ((ATOM, E), (ASSOC, E, A)), ((ATOM, (CAR, E)), (COND, ((EQ, (CAR, E), (QUOTE, QUOTE)), (CADR, E)), ((EQ, (CAR, E), (QUOTE, ATOM)), (ATOM, (EVAL, ((CADR, E), A)))), ((EQ, (CAR, E), (QUOTE, EQ)), (EQ, (EVAL, (CADR, E, A)), (EVAL, (CADDR, E, A)))), ((EQ, (CAR, E), (QUOTE, COND)), (EVCON, (CDR, E), A)), ((EQ, (CAR, E), (QUOTE, CAR)), (CAR, (EVAL, (CADR, E), A))), ((EQ, (CAR, E), (QUOTE, CDR)), (CDR, (EVAL, (CADR, E), A))), ((EQ, (CAR, E), (QUOTE, CONS)), (CONS, (EVAL, (CADR, E), A), (EVAL, (CADDR, E), A))), ((QUOTE, T), (EVAL, (CONS, (ASSOC, (CAR, E), A), (EVLIS, (CDR, E), A)), A)))), ((EQ, (CAAR, E), (QUOTE, LABEL)), (EVAL, (CONS, (CADDAR, E), (CDR, E)), (CONS, (CONS, (CADAR, E), (CONS, (CAR, E), (CONS, A, (QUOTE, NIL))))))), ((EQ, (CAAR, E), (QUOTE, LAMBDA)), (EVAL, (CADDAR, E), (APPEND, (PAIR, (CADAR, E), (EVLIS, (CDR, E), A)), A))))))

(LABEL, EVCON, (LAMBDA, (C, A), (COND, ((EVAL, (CAAR, C), A), (EVAL, (CADAR, C), A)), ((QUOTE, T), (EVCON, (CDR, C), A)))))

(LABEL, EVLIS, (LAMBDA, (M, A), (COND, ((NULL, M), (QUOTE, NIL)), ((QUOTE, T), (CONS, (EVAL, (CAR, M), A), (EVLIS, (CDR, M), A))))))

After running that behemoth, this line should return (A, B, C):

(EVAL, (QUOTE, (CONS, X, (QUOTE, (B, C)))), (QUOTE, ((X, A), (Y, B))))

However, to quote John McCarthy himself on page 16, it seems like he was running out of characters on his computer:

If more characters were available on the computer, it could be improved considerably ...

Therefore, this challenge is tagged and the shortest answer in characters will be the winner. Standard loopholes apply. Good luck!

Note on String Evals: I understand that some think it may be possible to win this challenge by using a Lisp and modifying the syntax to fit the host language and then using a string (eval). I'm not particularly convinced that this approach will necessarily win especially with the identifier naming rules, and even if it did I think banning string evals in all languages would be a subjective and slippery slope. But I don't want to punish people for doing this challenge the 'right' way, so I may allow two winners for this challenge, one in a Lisp-like language and one in a non-Lispy language, if this becomes a problem.

Harry

Posted 2016-12-15T19:20:06.333

Reputation: 1 189

1You have a Lambda example defining an "IsNull" function, but it looks like the Nil return Nil, when it seems to me like it should return T? – nmjcman101 – 2016-12-16T20:57:33.427

@nmjcman101 Sorry, forgot to add quotes to that lambda, but fixed now. It looks correct to me -- if the input ATOM 1 is NIL, then EQ will return T so the lambda will return T. – Harry – 2016-12-16T21:51:45.400

1You have ((LAMBDA, (ATOM 1), (EQ, ATOM 1, (QUOTE, NIL))), (QUOTE, NIL)) -> NIL Where the (QUOTE NIL) at the end is the input, so this should return T? – nmjcman101 – 2016-12-16T22:41:10.157

@nmjcman101 Yes, the (QUOTE NIL) at the end is the argument being applied the anonymous function. So because nil is null, EQ and thus the whole expression will return T. – Harry – 2016-12-16T23:43:26.887

1Right, but you've written -> NIL – nmjcman101 – 2016-12-17T03:03:45.190

@nmjcman101 Oh, haha, my mistake, you're 100% right. Fixed now. – Harry – 2016-12-17T07:19:39.243

1In your description for CONS you say "Appends the first argument to the second argument and returns the newly-created list," but the test cases show the second argument being appended to the first. Which is correct? – Jordan – 2016-12-19T04:17:50.530

@Jordan, you are right, I fixed the challenge text to say "prepends" and added another test case to be more clear. It acts similarly to the cons operator in Common Lisp, and remember that in the case of inconsistencies in my summary the original paper is the first authority. Would love for you to take this challenge up! – Harry – 2016-12-19T05:02:48.603

1

I'm basing my implementation on kjetilvalle's lisp tutorial, and the syntax is slightly different. Lowercase is used and no commas are present. Could I simply run a lowercase transformation and remove commas from the input string so that it conforms more or less to the above interpreter's design? I'm pretty new to Lisp, but wanted to explore this challenge in my own language. So far I have the parser implemented. (My language looks like Lisp, but is implemented in Node.js)

– Andrakis – 2016-12-19T11:31:24.860

1@Andrakis As long as it is a correct interpreter of the LISP in the original paper and passes the test cases, I have no control over how you are supposed to get there. kjetilvalle's tutorial is good, but he does note that there are some specific differences between his version and McCarthy's original version, so you'll need to make sure you bridge the gap between those differences. Thanks for working on this! – Harry – 2016-12-19T17:41:51.017

1In your description for CONS you say, "the second [argument] must be a list or NIL," but in the first test case, (CONS, (QUOTE, ATOM 1), (QUOTE, ATOM 2)), doesn't the second argument evaluate to ATOM 2, which isn't a list? The same for the nested CONS in the third test case. – Jordan – 2016-12-19T19:58:51.483

@Jordan You're right, I fixed the test cases. – Harry – 2016-12-19T23:12:33.537

1You're missing a comma after LAMBDA in (LABEL, NOT, (LAMBDA (X), (COND, (X, (QUOTE, NIL)), ((QUOTE, T), (QUOTE, T))))). – orlp – 2016-12-21T01:34:34.440

@orlp Thank you, fixed. Wish I had caught all these errors when it was in the sandbox, haha. – Harry – 2016-12-21T01:59:59.270

1Shouldn't (SUBST, ((QUOTE, A), (QUOTE, B), (QUOTE, (A, B, C)))) in the example be (SUBST, (QUOTE, A), (QUOTE, B), (QUOTE, (A, B, C)))? SUBST takes 3 arguments, but in the example it's given a single list of 3 elements as argument. – orlp – 2016-12-21T02:38:23.193

1Also, in the SUBST definition, you have written (CAR Z) instead of (CAR, Z). I'm sorry, but have you tested the examples at all? – orlp – 2016-12-21T02:41:10.953

1You have QUOTE NIL in the final test case, another missing comma. – orlp – 2016-12-21T02:50:34.033

@orlp 1. You are right, fixed. 2. Looks like Mr. McCarthy also had some trouble writing in his own language, because that same typo was in the original paper as well (page 16). Fixed though. 3. Fixed. As for whether I have tested the examples, the answer is no, because I am not aware of any implementation that follows the original syntax yet. I did put this question in the Sandbox for a few days before putting it here and it attracted a few comments there which I fixed, but I guess I didn't expect this many errors to remain and I apologize for that but it should be fixed now. – Harry – 2016-12-21T03:49:55.663

1@Harry SUBST is still broken. (COND, (EQ, Y, Z), X) this is not how COND operates. – orlp – 2016-12-21T03:56:54.457

@orlp You are right, that was also in the original paper but I believe I have fixed it now to match the pseudocode I wrote for that function. – Harry – 2016-12-21T04:06:53.817

1@Harry I updated SUBST with a fixed version. – orlp – 2016-12-21T05:17:53.160

Answers

17

Python 3, 770 bytes

This is a REPL on stdin/stdout. Expects every line to be a full statement or empty. eval is used to shorten the implementation, but is otherwise not necessary for logic.

import re,sys;S=re.sub
P=lambda l:eval(S("([A-Z0-9][A-Z0-9 ]*)",r"' '.join('\1'.strip().split())",S("NIL","()",S("\)",",)",l))))
d={"QUOTE":'(v,L[1])[1]',"EQ":'[(),"T"][E(L[1],v)==E(L[2],v)]',
"CDR":'E(L[1],v)[1:]',"CONS":'(E(L[1],v),)+E(L[2],v)',"CAR":'E(L[1],v)[0]',
"LAMBDA":'("#",)+L[1:]',"LABEL":'[v.update({L[1]:E(L[2],v)}),L[1]][1]'}
def E(L,v):
 if L*0=="":return v[L]
 elif L[0]in d:return eval(d[L[0]])
 elif L[0]=="COND":return next(E(l[1],v)for l in L[1:]if E(l[0],v)=="T")
 elif L[0]=="ATOM":o=E(L[1],v);return[(),"T"][o*0in["",o]]
 else:l=E(L[0],v);n=v.copy();n.update({a:E(p,v)for a,p in zip(l[1],L[1:])});return E(l[2],n)
R=lambda o:o==()and"NIL"or 0*o==()and"(%s)"%", ".join(R(e)for e in o)or o
g={}
for l in sys.stdin:
 if l.strip():print(R(E(P(l),g)))

orlp

Posted 2016-12-15T19:20:06.333

Reputation: 37 067

Thank you for completing this! I really appreciate the effort and it looks good. However if I run it for me it doesn't pass these test cases: 1. (CONS, (QUOTE, ATOM 1), (QUOTE, NIL)) -> (ATOM 1) 2. (CONS, (QUOTE, ATOM 1), (CONS, (QUOTE, ATOM 2), (QUOTE, NIL))) -> (ATOM 1, ATOM 2) 3. (SUBST, (QUOTE, A), (QUOTE, B), (QUOTE, (A, B, C))) -> (A, A, C) (I haven't tried it on the EVAL test case yet). I understand that there have been errors in the tests, but I think these tests are correct and it would need to pass these to get the bounty – Harry – 2016-12-21T04:06:23.570

1@Harry The first two test cases work after fixing a small bug that I introduced in the final touches. The eval works flawlessly. But the SUBST example is still broken (to my knowledge) as a testcase. One of the CONDs reaches the end before finding a T. – orlp – 2016-12-21T05:08:29.320

1Thank you for fixing that! This is very impressive! It works for me on all testcases now, including EVAL (so pleasantly surprised I got that one right on the first try!) I'm going to award you the bounty and the accepted answer now! – Harry – 2016-12-21T07:15:45.600

2Also I love the R(E(P(l) setup ;-) – Harry – 2016-12-21T07:33:51.853

2@Harry I kid you not that was an accident! R = repr, E = eval, P = parse, l = line. – orlp – 2016-12-21T08:08:28.667

4

Just wanted to let you know, I wrote an article mentioning your implementation here!

– Harry – 2016-12-25T04:18:07.360