Self-compiling compiler

36

12

This is based on a talk on compilers I listened to a while back, but I, unfortunately, can't remember when or where.

Create the shortest compiler in any language that can compile itself. Target any reasonable ISA (68K, x86, MIPS, ARM, SPARC, IBM BAL, etc.) that doesn't have a "compile-program" instruction (this may exclude some versions of VAX). Read source programs from stdin and print generated code to stdout. You may use the standard C library for I/O and string processing (e.g., _printf). You don't need to compile the whole language, just any subset that contains the compiler (i.e., just printing an assembly language quine, while impressive, does not count as a solution.)

Hoa Long Tam

Posted 2011-02-04T07:13:51.070

Reputation: 1 902

Question was closed 2019-09-09T15:20:55.940

5Actually, this reminds me of some guy who wrote a C-compiler in about 3000 bytes of C-code as a submission to the IOCCC. – FUZxxl – 2011-02-06T14:26:31.893

5

@FUZxxl: That became tcc, which is a pretty cool product.

– dmckee --- ex-moderator kitten – 2011-02-07T04:52:46.007

Is it okay to use C as a target "ISA", provided the language being compiled is sufficiently different than C? – Joey Adams – 2011-03-31T22:43:05.917

3@dmckee "C script supported: just add #!/usr/local/bin/tcc -run at the first line of your C source, and execute it directly from the command line." That's pretty cool. – Tobia – 2014-02-03T20:41:09.157

1You need to add some criteria to prevent people from ignoring the input and producing a hard-coded constant output. – R. Martinho Fernandes – 2011-02-04T07:19:24.790

Presumably the UCSD pcode machine is "reasonable" as there was a hardware implementation. What about the JVM? Lisp machine? OISC? Hardware forth implementation?

– dmckee --- ex-moderator kitten – 2011-02-04T20:24:45.123

1Wow... this is a challenge. (Especially for Perl.) – Nathan Osman – 2011-02-05T00:19:07.907

@dmckee: pcode looks fine, as do JVM bytecode and OISC. The LISP machine didn't run LISP directly, but the ISA for the LISP machine is fine. Forth strikes me as a bit too high-level to qualify as an ISA--even if there's hardware optimized to support it--but I'm not well-versed in the language so perhaps I'm misunderstanding its Wikipedia page. To throw it out there, MMIX is totally fine, too.

– Hoa Long Tam – 2011-02-05T09:44:58.373

Can we write a self-interpreting interpreter? – Nathan Osman – 2011-02-05T23:28:14.327

@George: Self-interpreting interpreters don't count, so I've split self-interpreting interpreters to this question

– Hoa Long Tam – 2011-02-06T02:22:29.100

Answers

33

Haskell subset → C - 18926 characters

This compiles a small subset of Haskell to C. Features it supports:

  • Pattern matching and guards
  • Data declarations
  • Select infix operators
  • Lazy evaluation

The biggest missing features are nested variables (meaning no lambda/let/where/case), type checking, and type classes. Resulting programs leak memory, and self-compilation takes about 200 megabytes on my system (the Boehm garbage collector helps a lot, but only if the compiler optimizes tail recursion well).

To bootstrap, uncomment the first three lines (not counted in the score) and compile with GHC. The compiler takes Haskell-subset code on stdin and produces C code on stdout.

It's long not because the language is complex, but because I'm lazy. However, it's currently the shortest solution Not anymore. Guess I won't be bored this weekend.

-- import Prelude hiding (fmap, lookup, snd, zip);import Data.Char
-- import Data.List hiding (lookup, zip);data P a b = P a b;data B = B
-- add=(+);sub=(-);showInt=show;append[]ys=ys;append(x:xs)ys=x:append xs ys
data Program = Program [[Constructor]] [Function]
data Toplevel = TD [Constructor] | TE Equation | TO
data Constructor = Constructor String Int
data Function = Function String Int [Equation]
data Equation = Equation String [Pattern] (Maybe Expression) Expression
data Pattern = PVar String | PCon String [Pattern]
data Expression = Var String | Con String | Int String | Char String | String String | Ap Expression Expression
data Environment = Environment [P String Int] [P String VarInfo]
data VarInfo = VBox String | VArg Int | VItem VarInfo Int
main = interact (compile . parse)
constructorName (Constructor name _) = name
functionName (Function name _ _) = name
equationName (Equation name _ _ _) = name
sortToplevels [] = (P [] [])
sortToplevels (TD x : xs) = applyFst ((:) x) (sortToplevels xs)
sortToplevels (TE x : xs) = applySnd ((:) x) (sortToplevels xs)
sortToplevels (TO : xs)   = sortToplevels xs
pcons x xs = PCon "Cons" [x, xs];pnil = PCon "Nil" []
ebinary op a b = Ap (Ap (Var op) a) b;ebinaryE op a b = Ap (Ap op a) b
econs x xs = Ap (Ap (Con "Cons") x) xs
enil = Con "Nil"
listEq eq [] [] = True
listEq eq (x:xs) (y:ys) | eq x y = listEq eq xs ys
listEq _ _ _ = False
snd (P a b) = b
zip = zipWith P
lookup q (P k v : _)  | listEq (==) q k = Just v
lookup q (_     : xs)                   = lookup q xs
lookup q _                              = Nothing
compose2 f g x y = f (g x y)
applyFst f (P x y) = P (f x) y
applySnd f (P x y) = P x (f y)
fMaybe f Nothing  = Nothing
fMaybe f (Just x) = Just (f x)
cond f t False = f
cond f t True = t
condList f t [] = f
condList f t xs = t xs
countFrom n = n : countFrom (add n 1)
range l h | l > h = []
range l h         = l : range (add l 1) h
parse = makeProgram . sortToplevels . concatMap parse_p . ((:) prelude) . preprocess
parse_p (P lineno line) = maybe (parse_err lineno line) snd (parseLine line)
parse_err lineno line = error (concat ["Parse error on line ", showInt lineno, ": `", line, "`"])
preprocess = filter (not . isCommentOrEmpty . snd) . zip (countFrom 1) . map (dropWhile isBlank) . lines
isCommentOrEmpty = parserSucceeds (pro (ignore (pro (parseS "--") (parseS "import "))) parseEof)
liftA2 f a b = ap (fmap f a) b
parserSucceeds p s = maybe False (const True) (p s)
fmap f p = fMaybe (applySnd f) . p
pure x s = Just (P s x)
ap1 b (P s x) = maybe Nothing (ap2 x) (b s)
ap2 x (P s y) = Just (P s (x y))
empty = (const Nothing)
pro a b s = maybe (b s) Just (a s)
ap a b = maybe Nothing (ap1 b) . a
prc = liftA2 (:)
pra = liftA2 append
prl = liftA2 const
prr = liftA2 (const id)
many p = pro (some p) (pure [])
some p = prc p (many p)
optional p = pro (fmap Just p) (pure Nothing)
choice = foldr pro (const Nothing)
parseEof = parseEof_1
parseEof_1 [] = Just (P "" B)
parseEof_1 _  = Nothing
parsePred pred = parsePred_1 pred
parsePred_1 pred (x:xs) | pred x = Just (P xs x)
parsePred_1 _    _               = Nothing
manyParsePred = justFlipSplit
justFlipSplit pred xs = Just (P (dropWhile pred xs) (takeWhile pred xs))
someParsePred pred = prc (parsePred pred) (manyParsePred pred)
parseC = parsePred . (==)
parseS = foldr (prc . parseC) (pure [])
wrapC = wrapSpace . parseC
wrapS = wrapSpace . parseS
skipPred pred = prr (parsePred pred) (pure B)
manySkipPred pred = prr (manyParsePred pred) (pure B)
preSep p sep = many (prr sep p)
sepBy1 p sep = prc p (many (prr sep p))
sepByChar p c = pro (sepByChar1 p c) (pure [])
sepByChar1 p c = sepBy1 p (wrapSpace (parseC c))
wrapSpace p = prl (prr skipSpace p) skipSpace
ignore = fmap (const B)
isBlank c | c == ' ' || c == '\t' = True
isBlank _                         = False
isDigit1 c = c >= '1' && c <= '9'
parseBetween l r p = prl (prr (parseC l) (wrapSpace p)) (parseC r)
skipSpace = manySkipPred isBlank
chainl1 f sep p = fmap (foldl1 f) (sepBy1 p sep)
chainr1 f sep p = fmap (foldr1 f) (sepBy1 p sep)
chainl f z sep p = pro (fmap (foldl f z) (sepBy1 p sep)) (pure z)
chainr f z sep p = pro (fmap (foldr f z) (sepBy1 p sep)) (pure z)
parseNonassoc ops term = liftA2 (flip ($)) term (pro (liftA2 flip (choice ops) term) (pure id))
parseVar = prc (parsePred (orUnderscore isLower)) (many (parsePred (orUnderscore isAlphaNum)))
orUnderscore p c | p c || c == '_' = True
orUnderscore _ _ = False
parseCon = prc (parsePred isUpper) (many (parsePred (orUnderscore isAlphaNum)))
parseInt = pro (parseS "0") (prc (parsePred isDigit1) (many (parsePred isDigit)))
parseEscape q (c:x:xs) | c == '\\' = Just (P xs (c:x:[]))
parseEscape q [c]      | c == '\\' = Just (P [] [c])
parseEscape q (c:xs)   | c /= q    = Just (P xs [c])
parseEscape q _                    = Nothing
parseStringLiteral q = pra (parseS [q]) (pra (fmap concat (many (parseEscape q))) (parseS [q]))
parsePattern = chainr1 pcons (wrapC ':') (pro (liftA2 PCon parseCon (preSep parsePatternPrimary skipSpace)) parsePatternPrimary)
parsePatternPrimary = choice [fmap PVar parseVar, fmap (flip PCon []) parseCon, parseBetween '(' ')' parsePattern, parseBetween '[' ']' (fmap (foldr pcons pnil) (sepByChar parsePattern ','))]
relops f = relops_1 (ops_c f)
otherops f = f ":" (Con "Cons") : otherops_1 (ops_c f)
ops_c f x y = f x (Var y)
relops_1 f   = [f "<=" "_le", f "<" "_lt", f "==" "_eq", f ">=" "_ge", f ">" "_gt", f "/=" "_ne"]
otherops_1 f = [f "$" "_apply", f "||" "_or", f "&&" "_and", f "." "_compose"]
parseRelops = parseNonassoc (relops parseRelops_f)
parseRelops_f op func = prr (wrapS op) (pure (ebinaryE func))
parseExpression = chainr1 (ebinary "_apply") (wrapC '$') $ chainr1 (ebinary "_or") (wrapS "||") $ chainr1 (ebinary "_and") (wrapS "&&") $ parseRelops $ chainr1 econs (wrapC ':') $ chainr1 (ebinary "_compose") (wrapC '.') $ chainl1 Ap skipSpace $ choice [fmap Var parseVar, fmap Con parseCon, fmap Int parseInt, fmap Char (parseStringLiteral '\''), fmap String (parseStringLiteral '"'), parseBetween '(' ')' (pro parseSection parseExpression), parseBetween '[' ']' (chainr econs enil (wrapC ',') parseExpression)]
parseSection = choice (append (relops parseSection_f) (otherops parseSection_f))
parseSection_f op func = prr (wrapS op) (pure func)
parseEquation = ap (ap (ap (fmap Equation parseVar) (many (prr skipSpace parsePatternPrimary))) (optional (prr (wrapC '|') parseExpression))) (prr (wrapC '=') parseExpression)
skipType = ignore (sepBy1 (sepBy1 skipTypePrimary skipSpace) (wrapS "->"))
skipTypePrimary = choice [ignore parseVar, ignore parseCon, parseBetween '(' ')' skipType, parseBetween '[' ']' skipType]
parseDataDecl = prr (parseS "data") (prr skipSpace (prr parseCon (prr (preSep parseVar skipSpace) (prr (wrapC '=') (sepByChar1 (liftA2 Constructor parseCon (fmap length (preSep skipTypePrimary skipSpace))) '|')))))
skipTypeSignature = prr parseVar (prr (wrapS "::") skipType)
skipTypeAlias = prr (parseS "type") (prr skipSpace (prr parseCon (prr (preSep parseVar skipSpace) (prr (wrapC '=') skipType))))
parseToplevel = choice [fmap (const TO) (pro skipTypeSignature skipTypeAlias), fmap TD parseDataDecl, fmap TE parseEquation]
parseLine = prl (prl (sepByChar1 parseToplevel ';') skipSpace) parseEof
patternCount (Equation _ ps _ _) = length ps
makeProgram (P ds es) = Program ds (makeFunctions es)
makeFunctions = map makeFunctions_f . groupBy makeFunctions_g
makeFunctions_f []     = error "Internal error: No equations in binding group"
makeFunctions_f (x:xs) = cond (error (concat ["Equations for ", equationName x, " have different numbers of arguments"])) (Function (equationName x) (patternCount x) (x:xs)) (all (((==) (patternCount x)) . patternCount) xs)
makeFunctions_g (Equation name_a _ _ _) (Equation name_b _ _ _) = listEq (==) name_a name_b
lookupCon name (Environment c _) = lookup name c
lookupVar name (Environment _ v) = lookup name v
walkPatterns f = walkPatterns_items f VArg
walkPatterns_items f base = concat . zipWith (walkPatterns_f2 f) (map base (countFrom 0))
walkPatterns_f2 f v (PCon name ps) = append (f v (PCon name ps)) (walkPatterns_items f (VItem v) ps)
walkPatterns_f2 f v p              = f v p
compile (Program decls funcs) = concat [header, declareConstructors decls, declareFunctions funcs, boxConstructors decls, boxFunctions funcs, compileConstructors decls, compileFunctions (globalEnv decls funcs) funcs]
globalEnv decls funcs = Environment (append (globalEnv_constructorTags decls) (globalEnv_builtinConstructors)) (append (map (globalEnv_f . functionName) funcs) globalEnv_builtinFunctions)
globalEnv_f name = (P name (VBox name))
globalEnv_constructorTags = concatMap (flip zip (countFrom 0) . map constructorName)
globalEnv_builtinConstructors = [P "Nil" 0, P "Cons" 1, P "P" 0]
globalEnv_builtinFunctions = map globalEnv_f ["add", "sub", "_lt", "_le", "_eq", "_ge", "_gt", "_ne", "_and", "_or", "divMod", "negate", "not", "error"]
localEnv ps (Environment t v) = Environment t (append (walkPatterns localEnv_f ps) v)
localEnv_f v (PVar name) = [P name v]
localEnv_f _ (PCon _ _)  = []
declareFunctions_f [] = ""
declareFunctions_f xs = concat ["static Function ", intercalate ", " xs, ";\n"]
declareConstructors = declareFunctions_f . map ((append "f_") . constructorName) . concat
declareFunctions = declareFunctions_f . map ((append "f_") . functionName)
boxConstructors = concatMap boxConstructors_f . concat
boxConstructors_f (Constructor name n) = boxThing name n
boxFunctions = concatMap boxFunctions_f
boxFunctions_f (Function name n _) = boxThing name n
boxThing name n | n == 0 = concat ["static Box b_", name, " = {0, f_", name, ", NULL};\n"]
boxThing name n = concat ["static Partial p_", name, " = {", showInt n, ", 0, f_", name, "};\n", "static Box b_", name, " = {1, NULL, &p_", name, "};\n"]
compileConstructors = concatMap (concat . zipWith compileConstructors_f (countFrom 0))
compileConstructors_f tag (Constructor name n) = concat ["static void *f_", name, "(Box **args)\n", "{\n", allocate n, "\tv->tag = ", showInt tag, ";\n", concatMap initialize (range 0 (sub n 1)), "\treturn v;\n", "}\n"]
allocate n | n == 0 = "\tValue *v = malloc(sizeof(Value));\n\t(void) args;\n"
allocate n = concat ["\tValue *v = malloc(sizeof(Value) + ", showInt n, " * sizeof(Box*));\n"]
initialize i = concat ["\tv->items[", showInt i, "] = args[", showInt i, "];\n"]
compileFunctions env = concatMap (compileFunction env)
compileFunction env (Function name argc equations) =  concat ["static void *f_", name, "(Box **args)\n", "{\n", concatMap (compileEquation env) equations, "\tNO_MATCH(", name, ");\n", "}\n"]
compileEquation genv (Equation _ patterns guard expr) = compileEquation_a (localEnv patterns genv) patterns guard expr
compileEquation_a env patterns guard expr = compileEquation_b (concat ["\treturn ", compileExpressionStrict env expr, ";\n"]) (append (compilePatterns env patterns) (compileGuard env guard))
compileEquation_b returnExpr preds = condList returnExpr (compileEquation_f returnExpr) preds
compileEquation_f returnExpr xs = concat ["\tif (", intercalate " && " xs, ")\n\t", returnExpr]
compilePatterns env = walkPatterns (compilePatterns_f env)
compilePatterns_f _ _ (PVar name) = []
compilePatterns_f env v (PCon name ps) = compilePatterns_h v name (lookupCon name env)
compilePatterns_h v name (Just n) = [concat ["match(", compileVarInfo v, ",", showInt n, ")"]]
compilePatterns_h v name Nothing  = error (append "Not in scope: data constructor " name)
compileGuard env Nothing     = []
compileGuard env (Just expr) = [concat ["isTrue(", compileExpressionStrict env expr, ")"]]
compileExpressionStrict env (Var name) = concat ["force(", compileVar (lookupVar name env) name, ")"]
compileExpressionStrict _   (Con name) = concat ["force(&b_", name, ")"]
compileExpressionStrict _   (Int s)    = concat ["mkInt(", s, ")"]
compileExpressionStrict _   (Char s)   = concat ["mkInt(", s, ")"]
compileExpressionStrict _   (String s) = concat ["mkString(", s, ")"]
compileExpressionStrict env (Ap f x)   = concat ["apply(", compileExpressionStrict env f, ",", compileExpressionLazy env x, ")"]
compileExpressionLazy env (Var name) = compileVar (lookupVar name env) name
compileExpressionLazy _   (Con name) = concat ["&b_", name, ""]
compileExpressionLazy _   (Int s)    = concat ["box(mkInt(", s, "))"]
compileExpressionLazy _   (Char s)   = concat ["box(mkInt(", s, "))"]
compileExpressionLazy _   (String s) = concat ["box(mkString(", s, "))"]
compileExpressionLazy env (Ap f x)   = concat ["deferApply(", compileExpressionLazy env f, ",", compileExpressionLazy env x, ")"]
compileVar (Just v) _    = compileVarInfo v
compileVar Nothing  name = error (append "Not in scope: " name)
compileVarInfo (VBox name) = append "&b_" name
compileVarInfo (VArg n)    = concat ["args[", showInt n, "]"]
compileVarInfo (VItem v n) = concat ["item(", compileVarInfo v, ",", showInt n, ")"]
header="#include <assert.h>\n#include <stdarg.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\ntypedef struct Box Box;\ntypedef struct Value Value;\ntypedef struct Partial Partial;\ntypedef void *Function(Box**);\nstruct Box{int state;Function *func;void*vc;Box*fx[];};\nstruct Value{int tag;Box *items[];};\nstruct Partial{int remaining;int applied;Function *func;Box *args[];};\n#define copy(...)memdup(&(__VA_ARGS__), sizeof(__VA_ARGS__))\n#define countof(...)(sizeof(__VA_ARGS__) / sizeof(*(__VA_ARGS__)))\n#define match(box, expectedTag)(((Value*)force(box))->tag == (expectedTag))\n#define item(box, n)(((Value*)(box)->vc)->items[n])\n#define isTrue(value)(!!*(int*)(value))\n#define NO_MATCH(func)fatal(\"Non-exhaustive patterns in function \" #func)\nstatic void fatal(const char *str){fprintf(stderr,\"*** Exception: %s\\n\", str);exit(EXIT_FAILURE);}\nstatic void *memdup(void *ptr, size_t size){void*ret=malloc(size);memcpy(ret,ptr,size);return ret;}\nstatic void *force(Box *box){switch(box->state){\ncase 0:box->state=2;box->vc=box->func(box->vc);box->state=1;\ncase 1:return box->vc;\ndefault:fatal(\"infinite loop\");}}\nstatic void *apply(Partial*f,Box*x){Partial*f2=malloc(sizeof(Partial)+(f->applied+1)*sizeof(Box*));\nmemcpy(f2->args,f->args,f->applied*sizeof(Box*));f2->args[f->applied]=x;\nif(f->remaining>1){f2->remaining=f->remaining-1;f2->applied=f->applied+1;f2->func=f->func;return f2;\n}else return f->func(f2->args);}\nstatic void*deferApply_cb(Box**a){return apply(force(a[0]),a[1]);}\nstatic Box*deferApply(Box*f,Box*x){\nBox*ret=malloc(sizeof(Box)+2*sizeof(Box*));\nret->state=0;\nret->func=deferApply_cb;\nret->vc=ret->fx;\nret->fx[0]=f;\nret->fx[1]=x;\nreturn ret;}\n\nstatic Box*defer(Function*func,void*ctx){\nBox*ret=malloc(sizeof(Box));\nret->state=0;\nret->func=func;\nret->vc=ctx;\nreturn ret;}\n\nstatic Box *box(void *value)\n{\n\tBox *ret = malloc(sizeof(Box));\n\tret->state = 1;\n\tret->func = NULL;\n\tret->vc = value;\n\treturn ret;\n}\n\nstatic int *mkInt(int n)\n{\n\tint *ret = malloc(sizeof(*ret));\n\t*ret = n;\n\treturn ret;\n}\n\nstatic Function f_Nil, f_Cons, f_P;\nstatic Box b_Nil, b_Cons, b_P, b_main;\n\n#define FUNCTION(name, argc) \\\n\tstatic Function f_##name; \\\n\tstatic Partial p_##name = {argc, 0, f_##name}; \\\n\tstatic Box b_##name = {1, NULL, &p_##name}; \\\n\tstatic void *f_##name(Box **args)\n\n#define intop(name, expr) \\\n\tFUNCTION(name, 2) \\\n\t{ \\\n\t\tint a = *(int*)force(args[0]); \\\n\t\tint b = *(int*)force(args[1]); \\\n\t\treturn mkInt(expr); \\\n\t}\n\n#define intop1(name, expr) \\\n\tFUNCTION(name, 1) \\\n\t{ \\\n\t\tint a = *(int*)force(args[0]); \\\n\t\treturn mkInt(expr); \\\n\t}\n\nintop(add,  a + b)\nintop(sub,  a - b)\n\nintop(_lt,  a <  b)\nintop(_le,  a <= b)\nintop(_eq,  a == b)\nintop(_ge,  a >= b)\nintop(_gt,  a >  b)\nintop(_ne,  a != b)\nintop(_and, a && b)\nintop(_or,  a || b)\n\nintop1(negate, -a)\nintop1(not,    !a)\n\nFUNCTION(divMod, 2)\n{\n\tint n = *(int*)force(args[0]);\n\tint d = *(int*)force(args[1]);\n\tint div = n / d;\n\tint mod = n % d;\n\t\n\tif ((mod < 0 && d > 0) || (mod > 0 && d < 0)) {\n\t\tdiv--;\n\t\tmod += d;\n\t}\n\t\n\tBox *pair[2] = {box(mkInt(div)), box(mkInt(mod))};\n\treturn f_P(pair);\n}\n\nstatic void *mkString(const char *str)\n{\n\tif (*str != '\\0') {\n\t\tBox *cons[2] =\n\t\t\t{box(mkInt(*str)), defer((Function*) mkString, (void*)(str + 1))};\n\t\treturn f_Cons(cons);\n\t} else {\n\t\treturn force(&b_Nil);\n\t}\n}\n\nstatic void putStr(Value *v, FILE *f)\n{\n\tif (v->tag == 1) {\n\t\tint c = *(int*)force(v->items[0]);\n\t\tputc(c, f);\n\t\tputStr(force(v->items[1]), f);\n\t}\n}\n\nFUNCTION(error, 1)\n{\n\tfflush(stdout);\n\tfputs(\"*** Exception: \", stderr);\n\tputStr(force(args[0]), stderr);\n\tputc('\\n', stderr);\n\texit(EXIT_FAILURE);\n}\n\nstruct mkStringFromFile\n{\n\tFILE *f;\n\tconst char *name;\n};\n\nstatic void *mkStringFromFile(struct mkStringFromFile *ctx)\n{\n\tint c = fgetc(ctx->f);\n\t\n\tif (c == EOF) {\n\t\tif (ferror(ctx->f))\n\t\t\tperror(ctx->name);\n\t\treturn force(&b_Nil);\n\t}\n\t\n\tBox *cons[2] = {box(mkInt(c)), defer((Function*) mkStringFromFile, ctx)};\n\treturn f_Cons(cons);\n}\n\nint main(void)\n{\n\tstruct mkStringFromFile c_in = {stdin, \"<stdin>\"};\n\tBox *b_in = defer((Function*) mkStringFromFile, copy(c_in));\n\tputStr(apply(force(&b_main), b_in), stdout);\n\treturn 0;\n}\n"
prelude = P 0 "_apply f x=f x;_compose f g x=f(g x);data List a=Nil|Cons a(List a);data P a b=P a b;data B=B;data Maybe a=Nothing|Just a;data Bool=False|True;id x=x;const x _=x;flip f x y=f y x;foldl f z[]=z;foldl f z(x:xs)=foldl f(f z x)xs;foldl1 f(x:xs)=foldl f x xs;foldl1 _[]=error\"foldl1: empty list\";foldr f z[]=z;foldr f z(x:xs)=f x(foldr f z xs);foldr1 f[x]=x;foldr1 f(x:xs)=f x(foldr1 f xs);foldr1 _[]=error\"foldr1: empty list\";map f[]=[];map f(x:xs)=f x:map f xs;filter p[]=[];filter p(x:xs)|p x=x:filter p xs;filter p(x:xs)=filter p xs;zipWith f(x:xs)(y:ys)=f x y:zipWith f xs ys;zipWith f _ _=[];append[]ys=ys;append(x:xs)ys=x:append xs ys;concat=foldr append[];concatMap f=concat.map f;length[]=0;length(_:l)=add 1(length l);take n _|n<=0=[];take _[]=[];take n(x:xs)=x:take(sub n 1)xs;takeWhile p[]=[];takeWhile p(x:xs)|p x=x:takeWhile p xs;takeWhile _ _=[];dropWhile p[]=[];dropWhile p(x:xs)|p x=dropWhile p xs;dropWhile p xs=xs;span p[]=P[][];span p(x:xs)|p x=span_1 x(span p xs);span p xs=P[]xs;span_1 x(P ys zs)=P(x:ys)zs;break p=span(not.p);reverse=foldl(flip(:))[];groupBy _[]=[];groupBy eq(x:xs)=groupBy_1 x eq(span(eq x)xs);groupBy_1 x eq(P ys zs)=(x:ys):groupBy eq zs;maybe n f Nothing=n;maybe n f(Just x)=f x;all p=foldr(&&)True.map p;intersperse _[]=[];intersperse _[x]=[x];intersperse sep(x:xs)=x:sep:intersperse sep xs;intercalate xs xss=concat(intersperse xs xss);isDigit c=c>='0'&&c<='9';isAlphaNum c=c>='0'&&c<='9'||c>='A'&&c<='Z'||c>='a'&&c<='z';isUpper c=c>='A'&&c<='Z';isLower c=c>='a'&&c<='z';showInt n|n<0='-':showInt(negate n);showInt n|n==0=\"0\";showInt n|n>0=reverse(map(add 48)(showInt_1 n));showInt_1 n|n==0=[];showInt_1 n=showInt_2(divMod n 10);showInt_2(P div mod)=mod:showInt_1 div;lines []=[];lines s=lines_1(break((==)'\\n')s);lines_1(P l[])=[l];lines_1(P l(_:s))=l:lines s;interact=id"

Joey Adams

Posted 2011-02-04T07:13:51.070

Reputation: 9 929

16

Custom Language → C - (7979)

Since the question doesn't preclude creating my own language, I thought I'd give that a try.

The Environment

The language has access to two stacks, The Call Stack, and The Data Stack. The Call Stack is used for the jumping instructions { and }, while The Data Stack is used by most other instructions. The Call Stack is opaque to applications.

The Data Stack can hold three different types of values: integer, text, and empty. Integers are of type intptr_t, while text is stored as C-style strings.

The ^ instruction has access to The Array. The Array is a constant array of length 17 of text items. You should probably see the source for the indexing scheme since it's a little wonky.

The Language

#   -   Begin number    - Marks the beginning of a number, for example: #42.
.   -   End number      - Marks the end of a number and pushes it to the data stack.
^   -   Translate       - Pops a number, and pushes the corresponding text from The Array.
<   -   Write           - Pops a value, and prints it to stdout.
>   -   Read            - Reads a character from stdin and pushes it as a number. If EOF,
                          exit.
{   -   Start Loop      - Pushes the current location in the program to the call stack.
}   -   End Loop        - Go to the position specified by the top of the call stack.
+   -   Add             - Pop two numbers from the data stack, add them, push the result.
-   -   Subtract        - Pop into A, pop into B, push B - A. Both B & A must be numbers.
!   -   Duplicate       - Pop from The Data Stack, push that value twice.
_   -   Discard         - Pop from The Data Stack.
=   -   Skip if Equal   - Pop two values, if they are equal skip the next instruction
                          and pop one item from the call stack.
?   -   Loop            - Pop one number, subtract one, if it's less than one, pop one
                          item from the call stack and skip the next instruction.
@   -   Array Separator - Marks the end of an array item.
$   -   Program End     - Marks the end of the program.

The Compiler

This is the compiler. It isn't golfed, and I expect it could be cut down considerably. It should be possible to use machine code directly and output a dos COM file, but I haven't gotten around to that yet. I know this looks like a C program, but the actual compiler implementation is down at the end.

Currently the compiler generates a lot of debugging information on stderr.

#include <string.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <setjmp.h>
#include <stdbool.h>

const char* position;
const char* array[] = {"@"};

void die(const char* reason)
{
    fprintf(stderr, "%s\n", reason);
    exit(1);
}

//
// Stack Functions
//

#define T_EMPTY     (0)
#define T_NUMBER    (1)
#define T_TEXT      (2)

typedef struct {
    unsigned char type;
    union {
        const char* text;
        intptr_t number;
    };
} stack_entry;

#define STACK_MAX   (1024)
stack_entry stack[STACK_MAX];
size_t stack_position = 0;

stack_entry* _push()
{
    if (stack_position >= STACK_MAX) {
        die("out of stack space");
    }
    return &stack[stack_position++];
}

void push(stack_entry v)
{
    if (v.type == T_EMPTY) {
        fprintf(stderr, "\tpushed: None\n");
    } else if (v.type == T_TEXT) {
        fprintf(stderr, "\tpushed: %s\n", v.text);
    } else {
        fprintf(stderr, "\tpushed: %d\n", v.number);
    }
    stack_entry* entry = _push();
    *entry = v;
}

void push_empty()
{
    fprintf(stderr, "\tpushed: None\n");
    stack_entry* entry = _push();
    entry->type = T_EMPTY;
    entry->number = 0;
}

void push_number(intptr_t number)
{
    fprintf(stderr, "\tpushed: %d\n", number);
    stack_entry* entry = _push();
    entry->type = T_NUMBER;
    entry->number = number;
}

void push_text(const char* text)
{
    fprintf(stderr, "\tpushed: %s\n", text);
    stack_entry* entry = _push();
    entry->type = T_TEXT;
    entry->text = text;
}

// Polymorphic Push (for literals)
#define PUSH0()     do { fprintf(stderr, "literal:\n"); push_empty(); } while (0)
#define PUSH1(a)    do { fprintf(stderr, "literal:\n"); push_number(a); } while (0)

#define GET_MACRO(_0, _1, NAME, ...) NAME
#define PUSH(...) GET_MACRO(_0, ##__VA_ARGS__, PUSH1, PUSH0)(__VA_ARGS__)

stack_entry pop()
{
    if (stack_position <= 0) {
        fprintf(stderr, "\tpopped: None\n");
        return (stack_entry) {.type = T_EMPTY, .number = 0};
    }

    stack_entry v = stack[--stack_position];
    if (v.type == T_EMPTY) {
        fprintf(stderr, "\tpopped: None\n");
    } else if (v.type == T_TEXT) {
        fprintf(stderr, "\tpopped: %s\n", v.text);
    } else {
        fprintf(stderr, "\tpopped: %d\n", v.number);
    }
    return v;
}

stack_entry peek()
{
    if (stack_position <= 0) {
        return (stack_entry) {.type = T_EMPTY, .number = 0};
    }

    return stack[stack_position-1];
}

//
// Jump Functions
//

#define JUMP_MAX    (1024)
jmp_buf jump[JUMP_MAX];
size_t jump_position = 0;

#define start()                                     \
    do {                                            \
    if (jump_position >= JUMP_MAX) {                \
        die("out of jump space");                   \
    }                                               \
    fprintf(stderr, "start: %d\n", jump_position);  \
    setjmp(jump[jump_position++]);                  \
    } while (0)

void pop_jump() {
    if (jump_position <= 0) {
        die("empty jump stack");
    }
    jump_position -= 1;
}

#define end()                                       \
    do {                                            \
    if (jump_position <= 0) {                       \
        die("empty jump stack");                    \
    }                                               \
    fprintf(stderr, "end: %d\n", jump_position-1);  \
    longjmp(jump[jump_position-1],1);               \
    } while (0)

//
// Program functions
//

void translate()
{
    fprintf(stderr, "translate:\n");
    stack_entry entry = pop();
    if (entry.type == T_TEXT) {
        die("translating text");
    } else if (entry.type == T_EMPTY) {
        push_empty();
    } else {
        switch (entry.number) {
            case 0:
            case 1:
                push_text(array[entry.number]);
                break;
            case 64:
                push_text(array[2]);
                break;
            case 94:
                push_text(array[3]);
                break;
            case 45:
                push_text(array[4]);
                break;
            case 43:
                push_text(array[5]);
                break;
            case 62:
                push_text(array[6]);
                break;
            case 60:
                push_text(array[7]);
                break;
            case 33:
                push_text(array[8]);
                break;
            case 95:
                push_text(array[9]);
                break;
            case 61:
                push_text(array[10]);
                break;
            case 63:
                push_text(array[11]);
                break;
            case 123:
                push_text(array[12]);
                break;
            case 125:
                push_text(array[13]);
                break;
            case 35:
                push_text(array[14]);
                break;
            case 46:
                push_text(array[15]);
                break;
            case 36:
                push_text(array[16]);
                break;
            default:
                push_empty();
                break;
        }
    }
}

void subtract()
{
    fprintf(stderr, "subtract:\n");
    stack_entry v1 = pop();
    stack_entry v2 = pop();

    if (v1.type != T_NUMBER || v2.type != T_NUMBER) {
        die("not a number");
    }

    push_number(v2.number - v1.number);
}

void add()
{
    fprintf(stderr, "add:\n");
    stack_entry v1 = pop();
    stack_entry v2 = pop();

    if (v1.type != T_NUMBER || v2.type != T_NUMBER) {
        die("not a number");
    }

    push_number(v2.number + v1.number);
}

void read()
{
    fprintf(stderr, "read:\n");
    int in = getchar();

    if (in >= 0) {
        push_number(in);
    } else {
        die("end of input");
    }
}

void write()
{
    fprintf(stderr, "write:\n");
    stack_entry v = pop();

    if (v.type == T_NUMBER) {
        putchar(v.number);
    } else if (v.type == T_TEXT) {
        const char* x = v.text;
        char y;
        while (0 != (y=*(x++))) {
            y -= 128;
            putchar(y);
        }
    }
}

void duplicate()
{
    fprintf(stderr, "duplicate:\n");
    stack_entry v = pop();
    push(v);
    push(v);
}

void discard()
{
    fprintf(stderr, "discard:\n");
    pop();
}

bool equals()
{
    fprintf(stderr, "equals:\n");
    stack_entry x = pop();
    stack_entry y = pop();

    bool skip;

    if (x.type != y.type) {
        skip = false;
    } else if (x.type == T_EMPTY) {
        skip = true;
    } else if (x.type == T_NUMBER) {
        skip = x.number == y.number;
    } else {
        skip = strcmp(x.text, y.text) == 0;
    }

    if (skip) {
        pop_jump();
    }

    return !skip;
}

bool question()
{
    fprintf(stderr, "question:\n");
    stack_entry x = pop();

    intptr_t value;

    if (x.type == T_EMPTY) {
        value = 0;
    } else if (x.type == T_NUMBER) {
        value = x.number;
    } else {
        die("it is bad form to question text");
    }

    value -= 1;

    if (value < 1) {
        pop_jump();
        return false;
    } else {
        push_number(value);
        return true;
    }
}

int main()
{
@","@translate();@subtract();@add();@read();@write();@duplicate();@discard();@if(equals())@if(question())@start();@end();@PUSH(@);@return 0;}@

#0.^<                           Emit the preface

#17.{                           Loop for as many array slots exist
    #.{<>#128.+!#192.=}         Copy characters, adding 128 until reaching an at sign
    #128.-
    ^<                          Emit the code between array items
?}                              Return to start

#1.^<                           Emit the prologue


{{
>!^<                            Read character, translate it, and print it
!#35.=}                         Check if we have a literal
#.{<>!#46.=}^<                  If so, verbatim copy characters until a period
}                               Continue executing
$

To compile the generated C code:

gcc -finput-charset=CP437 -fexec-charset=CP437 -std=gnu11

The charset is required because the compiler escapes special characters by adding 128.

The Bootstrap

To compile the first compiler, I wrote a python interpreter for the language.

import sys
from collections import defaultdict
KEYS = [0,1] + map(ord, ['@','^','-','+','>','<','!','_','=','?','{','}','#','.','$'])

# Read the source file
with file(sys.argv[1]) as f:
    data = f.read()
pos = 0

# Initialize the environment
array = defaultdict(str)
jmp = []
stk = []

def log(x):
    sys.stderr.write(x + '\n')

def read():
    global pos,data
    pos += 1
    return data[pos-1]

def pop():
    global stk
    try:
        x = stk.pop()
    except IndexError:
        x = None
    log('\tpopped ' + repr(x))
    return x

def push(value):
    global stk
    log('\tpushing ' + repr(value))
    stk.append(value)

# Read the array initialization section
for key in KEYS:
    while True:
        c = read()
        if c == '@':
            break
        array[key] += c

# Execute the program
while pos < len(data):
    c = read()
    if c == '^':
        log('translate:')
        push(array.get(pop(), None))
    elif c == '-':
        log('subtract:')
        x = pop()
        y = pop()
        push(y - x)
    elif c == '+':
        log('add:')
        x = pop()
        y = pop()
        push(y + x)
    elif c == '>':
        log('read:')
        push(ord(sys.stdin.read(1)))
    elif c == '<':
        log('write:')
        v = pop()
        if isinstance(v, int):
            sys.stdout.write(chr(v))
        elif v is not None:
            sys.stdout.write(v)
    elif c == '!':
        log('duplicate:')
        x = pop()
        push(x)
        push(x)
    elif c == '_':
        log('discard:')
        pop()
    elif c == '=':
        log('skip if equal:')
        x,y = pop(),pop()
        if x == y:
            pos += 1
            jmp.pop()
    elif c == '?':
        log('loop:')
        x = pop()
        x -= 1
        if x < 1:
            pos += 1
            jmp.pop()
        else:
            push(x)
    elif c == '{':
        log('start: ' + repr(pos))
        jmp.append(pos)
    elif c == '}':
        log('end:')
        pos = jmp[-1]
    elif c == '#':
        literal = ''
        while True:
            c = read()
            if c == '.':
                log('literal: ' + repr(literal))
                if literal == '':
                    push(None)
                else:
                    push(int(literal))
                break
            else:
                literal += c

Putting It All Together

Assuming you've saved the compiler as compiler.cmp and the bootstrap as bootstrap.py, here's how to build the compiler, and then use it to compile itself:

$ cat compiler.cmp |
  python bootstrap.py compiler.cmp 2> trace-bootstrap |
  gcc -finput-charset=CP437 -fexec-charset=CP437 -std=gnu11 -o result -xc -
$ cat compiler.cmp | ./result 2> trace-final

So I'm not much of a C programmer, nor am I much of a language designer, so any suggestions on improving this are quite welcome!

Example Programs

Hello, World!

Hello, World!@@@@@@@@@@@@@@@@@#0.^<$

tecywiz121

Posted 2011-02-04T07:13:51.070

Reputation: 1 127

4

You should add your language to esolangs.org

– mbomb007 – 2015-08-28T21:07:59.010

1While impressive, this violates the [tag:code-golf] rules as defined in the Help Center. – Iszi – 2014-02-03T17:48:52.183

1

Correction, the rule is found in the [tag:code-golf] tag wiki here as well as in Meta.

– Iszi – 2014-02-03T18:54:12.910

7@Iszi I hadn't come across that page before, and you are totally correct... However, I'd like to provide some mitigating facts before you consider sentencing: This language is general purpose, and does not provide a "one character solution", and secondly, if you include the language implementation in the solution, my score doesn't change, since it implements itself (granted, I couldn't claim that my solution compiles C, so it wouldn't solve the original problem.) – tecywiz121 – 2014-02-03T19:14:19.877

What's your language called? – Beta Decay – 2014-08-24T08:29:08.427

14

Extended Brainfuck v0.9: 618 bytes (not counting the uneccesary linefeeds)

:c:n:z:g:i:t:w:a:p++++++++[->++++++++<]>[->>>>>>>[>>>>>>>>]+[<<<<<<<<]>]>>>>>>>[->>>>>>>>]@i
$i,[[-$t+$w+$i]$t[-$i+$t]+$a+++[-$w-----------$a]$w---[$a++[-$w-----------$a]$w[--[--[--[$i.
$t+++++++[-$w++++++++$t]$w[-]]$t[-$p[-]$i.$n,.[-<[<<]+[>>]<]@n$c[<<]>[-<<<+>>>>[>>]@z$p+$c[<
<]>]<<<[->>>+<<<]>>>>[->>]@z$t]$w]$t[-$i.$p+$t]$w]$t[-$i.$p-$t]$w]$t[$i.$n,.[-<[<<]+[>>]<]@n
$g[-$t+$c[<<]>+>[>>]@z>]$c[<<]>>[->>]@z$t[-$g+$t]$t]$w]$t[-$i.[-]$n,.[-<[<<]+[>>]<]@n$c[<<]>
[-<<<+>>>>[>>]@z$i+$a+$c[<<]>]<<<[->>>+<<<]>>>>[->>]@z<++++++[->++++++++++<]$w+$p[$a[-$w-]<[
@w-$p[-$z.$p]+$t]$w+$p-]$z++$w-$a[-$z.$a]$z[-]$i[-$p+$i]$t]$w$i,]

This is a golfed version of my very first version of EBF with removed support for comments and dead code to support removing of variables.

So basically it's BrainFuck with variables. :x creates variables x. The compiler knows where you are so $y will produce <'s and >'s to get to that position. Sometimes you need asymmetric loops and then you need to tell the compiler where you are with @x. As current EBF it compiles to Brainfuck.

This first version had only one char variable names, but I have used this version to compile the next version and so on until the current version that has an impressive feature set. When compiling from github source it actually downloads the handcompiled binary to bootstrap 6 intermediate ebf versions in order to create the current version.

To bootstrap it you may use this first and only binary in the EBF git repository which was compiled by hand successfully after a couple of tries.

wget -nv https://raw.githubusercontent.com/westerp/ebf-compiler/34c378c8347aafa5dbf37f4973461d42c8120ea4/ebf-handcompiled.bf
beef ebf-handcompiled.bf < ebf09.ebf > ebf09a.bf
beef ebf09a.bf < ebf09.ebf > ebf09b.bf
diff -s ebf09a.bf ebf09b.bf # Files ebf09a.bf and ebf09b.bf are identical

Brainfuck has a few hardware implementations, eg. this, this and this to mention a few. But mostly it's so easy to implement you can practically implement an interpreter on any system at all. I use to joke that Zozotez LISP, which is written in EBF, probably is the most portable LISP ever.

Sylwester

Posted 2011-02-04T07:13:51.070

Reputation: 3 678

8

Hex, 550 bytes

This specifically targets x86_64 systems running Linux.

7f454c4602010100000000000000000002003e0001000000780040000000000040000000000000000000000000000000000000004000380001004000000000000100000005000000000000000000000000004000000000000000400000000000130100000000000013010000000000000000200000000000e81700000085c07c0b31ff01c7e879000000ebec31c089c7b03c0f05e84c0000004885c07c203c2178f2741b89c7e825000000c0e00450e83100000089c7e8150000005900c8c3e8210000003c0d7ff774ca3c0a75f1ebc489f831c93c400f9cc148ffc980e12780c13028c8c36a004889e631c089c2fec289c70f0531c985c0580f95c148ffc94809c8c3574889e631c0fec089c289c70f0558c3

In this language, source code consists of bytes represented as two lowercase hexadecimal digits, [0-9a-f][0-9a-f]. These bytes may have any amount of surrounding whitespace, but nothing may occur between the digits that form a single byte. Further, '!' is a line-comment character: it is ignored, as well as everything between it and the next '\n' character.

If you understand x86 assembly, here's a much more readable version of the source code:

! ELF Header !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
7f 45 4c 46                !e_ident[EI_MAG0] (0x7F "ELF")
02                         !e_ident[EI_CLASS] (64-bit)
01                         !e_ident[EI_DATA] (little-endian)
01                         !e_ident[EI_VERSION] (ELF v1)
00                         !e_ident[EI_OSABI] (System V ABI)
00                         !e_ident[EI_ABIVERSION] (version 0)
00 00 00 00 00 00 00       !e_ident [EI_PAD]
02 00                      !e_type (executable)
3e 00                      !e_machine (x86_64)
01 00 00 00                !e_version (ELF v1)
78 00 40 00 00 00 00 00    !e_entry (0x40078)
40 00 00 00 00 00 00 00    !e_phoff (0x   40)
00 00 00 00 00 00 00 00    !e_shoff (0x    0)
00 00 00 00                !e_flags
40 00                      !e_ehsize (ELF header size = 64 bytes)
38 00                      !e_phentsize (Program headers = 56 bytes)
01 00                      !e_phnum (1 program header)
40 00                      !e_shentsize (Section headers = 64 bytes)
00 00                      !e_shnum (no section headers)
00 00                      !e_shstrndx (section names, not useful here)

! Program Headers !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01 00 00 00                !p_type (LOAD)
05 00 00 00                !p_flags (R+E)
00 00 00 00 00 00 00 00    !p_offset (file-loc 0)
00 00 40 00 00 00 00 00    !p_vaddr (vmem-loc 0x40000)
00 00 40 00 00 00 00 00    !p_paddr (pmem-loc 0x40000)
13 01 00 00 00 00 00 00    !p_filesz (length 0x113 bytes)
13 01 00 00 00 00 00 00    !p_memsz (allocate 0x113 bytes)
00 00 20 00 00 00 00 00    !p_align (align pages in 0x20000 increments)


! Program Code !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! _start: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      e8 17 00 00 00 ! callq _gethx
               85 c0 ! test %eax,%eax
               7c 0b ! jl .+11
               31 ff ! xor %edi,%edi
               01 c7 ! add %eax,%eax
      e8 79 00 00 00 ! callq _putch
               eb ec ! jmp .-20
               31 c0 ! xor %eax,%eax
               89 c7 ! mov %eax,%edi
               b0 3c ! mov $0x3c,%al
               0f 05 ! syscall

!! _gethx: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      e8 4c 00 00 00 ! callq _getch
            48 85 c0 ! test %rax,%rax
               7c 20 ! jl _gethx+42
               3c 21 ! cmp $0x21,al
               78 f2 ! js _gethx
               74 1b ! je _gethx+43
               89 c7 ! mov %eax,%edi
      e8 25 00 00 00 ! callq _h2d
            c0 e0 04 ! sal $4,%al
                  50 ! push %rax
      e8 31 00 00 00 ! callq _getch
               89 c7 ! mov %eax,%edi
      e8 15 00 00 00 ! callq _h2d
                  59 ! pop %rcx
               00 c8 ! add %cl,%al
                  c3 ! retq
      e8 21 00 00 00 ! callq _getch
               3c 0d ! cmp $0xd,%al
               7f f7 ! jg _gethx+43
               74 ca ! je _gethx
               3c 0a ! cmp $0xa,%al
               75 f1 ! jne _gethx+43
               eb c4 ! jmp _gethx

!! _h2d: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
               89 f8 ! mov %edi,%eax
               31 c9 ! xor %ecx,%ecx
               3c 40 ! cmp $0x40,%al
            0f 9c c1 ! setl %cl
            48 ff c9 ! dec %rcx
            80 e1 27 ! and $0x27,%cl
            80 c1 30 ! add $0x30,%cl
               28 c8 ! sub %cl,%al
                  c3 ! retq

!! _getch: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
               6a 00 ! push $0
            48 89 e6 ! mov %rsp,%rsi
               31 c0 ! xor %eax,%eax
               89 c2 ! mov %eax,%edx
               fe c2 ! inc %dl
               89 c7 ! mov %eax,%edi
               0f 05 ! syscall
               31 c9 ! xor %ecx,%ecx
               85 c0 ! test %eax,%eax
                  58 ! pop %rax
            0f 95 c1 ! setne %cl
            48 ff c9 ! dec %rcx
            48 09 c8 ! or %rcx,%rax
                  c3 ! retq

!! _putch: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
                  57 ! push %rdi
            48 89 e6 ! mov %rsp,%rsi
               31 c0 ! xor %eax,%eax
               fe c0 ! inc %al
               89 c2 ! mov %eax,%edx
               89 c7 ! mov %eax,%edi
               0f 05 ! syscall
                  58 ! pop %rax
                  c3 ! retq

If you extract the assembly language from the comments below ! Program Code, you can assemble and run the Hex compiler. Input and output use stdin and stdout.

Fox

Posted 2011-02-04T07:13:51.070

Reputation: 341

Hex is not a language. – TuxCrafting – 2016-11-02T19:23:46.580

@TùxCräftîñg That's not exactly true. It doesn't have a Wikipedia page, but it did exist in this form, with this implementation, before I knew of this question (unnamed at the time). From my svn: http://lutras-hacking.ddns.net/websvn/listing.php?repname=sasm The idea was to build an assembler from nothing, but I never got that far.

– Fox – 2016-11-04T06:28:12.287

Oh, sorry >_> ... – TuxCrafting – 2016-11-04T08:37:21.130

Since it's not a very known esolang, you can put a link to the repo in the post – TuxCrafting – 2016-11-04T08:38:35.923

@TuxCopter I suppose, that "x86 linux assembly" is name of language – Евгений Новиков – 2017-10-25T06:26:13.977

3

Javascript subset -> Java, 504 bytes

document.write("public class Generated{public static void main(String[]args){"+prompt().replace(RegExp("[r]eplace(,"g"),"replaceAll(").replace(RegExp("[v]ar","g"),"double")+"}static class document{static void write(String s){System.out.print(s);}}static void prompt(){return javax.swing.JOptionPane.showInputDialog(\"\");}static void alert(String a){JOptionPane.showMessageDialog(null,a);}static double Number(String a){return Double.parseDouble(a);}static String RegExp(String a,String b){return a;}}");

SuperJedi224

Posted 2011-02-04T07:13:51.070

Reputation: 11 342

2

05AB1E, 2 bytes (possibly non-competing)

.V

Try it online!

Code on the first line of input, inputs on subsequent lines.

Geno Racklin Asher

Posted 2011-02-04T07:13:51.070

Reputation: 466

0

Lumber, 0 bytes

Lumber is a complete esoteric programming language invented by Unrelated String written in just 10 lines of Prolog code.

Can't believe it? These programs had comments removed and makes the interpreter source more concise.

lumber_corefuncs.pl:

:- use_module(lumber_types).

lumber_types.pl

:- module(lumber_types,
          []).

lumber_corefuncs.pl takes in the library lumber_types; and in turn, this library defines a module with nothing in it. Therefore, Lumber does nothing on arbitary inputs, which is in turn a self-compiler.

user85052

Posted 2011-02-04T07:13:51.070

Reputation:

2@A__ Just because it's a technically valid answer, doesn't mean it's a good one, especially if it is lazy or rather nonsensical (especially when there's already an existing answer using the same excuse with 3 downvotes) – Jo King – 2019-09-08T11:40:11.947

Also, it's not Perl, it's Prolog. (Also also I'll probably have a legitimate answer for this in Lumber some time this decade) – Unrelated String – 2019-09-08T18:29:13.260

-1

Nil, 0 bytes

Incredibly, despite not being Turing-complete, the Nil language is expressive enough to implement an interpreter for itself, much more concisely than many 'proper' languages can. The example presented here is a simple implementation, but using advanced compression techniques Nil developers have been able to produce working interpreters in as little as 0 lines of code.

l4m2

Posted 2011-02-04T07:13:51.070

Reputation: 5 985

@ØrjanJohansen https://codegolf.meta.stackexchange.com/a/10426/aa

– l4m2 – 2018-05-30T23:49:57.060

2Gah! remembering what on meta is currently relevant is too hard. – Ørjan Johansen – 2018-05-31T00:13:06.853