Make a finky syntax checker

8

You are to make a program that can check the syntax of programs written in its own language. For example, if you write it in Python, it checks Python syntax. Your program will receive a program on standard input and verify whether or not its syntax is correct. If it is correct, output just "true" on standard output. If not, output just "false" on standard output.

Your program must be incorrect in one instance though. If fed its own source code, it will just output "false" on standard output. This is code golf, so shortest program wins!

Note: Although this isn't technically a quine, you must follow the quine rules, meaning you can't access your source code through the file system or whatever.

Note: You can't claim a program that could not be run due to a syntax error solves this challenge, since it needs to be runnable.
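
As a rough illustration of the core task (not a golfed entry), the following Python 3 sketch checks syntax with the built-in compile(), which parses and compiles the input without running it. The function name syntax_ok and the "<stdin>" label are illustrative assumptions, and the self-rejection rule is deliberately left out.

```python
def syntax_ok(source: str) -> bool:
    """Return True if `source` compiles as a Python module; nothing is executed."""
    try:
        compile(source, "<stdin>", "exec")
    except (SyntaxError, ValueError):
        # SyntaxError covers malformed code (IndentationError is a subclass).
        # Some Python versions raise ValueError for source containing null bytes.
        return False
    return True
```

A complete program would read sys.stdin.read(), pass it to syntax_ok, and print "true" or "false". Because the input is only compiled, an input such as an infinite loop is accepted without hanging.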

PyRulez

Posted 2014-07-04T18:30:38.230

Reputation: 6 547

Is using standard and/or 3rd-party libraries allowed? (specifically, to use a working lexer/parser for your language) – Martin Ender – 2014-07-04T19:02:22.047

Is try:exec(raw_input())... allowed? – user80551 – 2014-07-04T19:07:00.467

@user80551 That won't work for syntactically correct input which loops forever. It's also a slight security risk. – Martin Ender – 2014-07-04T19:12:36.580

Shouldn't it have the quine tag? – Sylwester – 2014-07-04T19:12:49.453

@m.buettner What if we indent raw_input() into a new function and exec it so that the function is never actually called. BTW who the heck cares about security risks in code-golf? – user80551 – 2014-07-04T19:18:46.980

@user80551 Well that kind of security risk would prevent the code from working in all cases, because the executed code might crash the computer, kill the process, etc. Wrapping the code into something that's never executed and then calling exec sounds like a decent idea though! :) – Martin Ender – 2014-07-04T19:20:05.410

@Sylwester It's not quite a quine, since you don't technically have to produce the code. In fact, you could just put in a syntax error (according to, say, your language's standard) as long as it is ignored by the compiler (and the program is therefore runnable). – PyRulez – 2014-07-04T23:31:46.437

@PyRulez I don't think that would work, because then there would be other programs which would exhibit the same behaviour. To actually make the program behave differently only on itself, I'm pretty sure you'll have to use quine techniques in that you need to check the input against some string which happens to be equal to your own source code (like Ventero did). – Martin Ender – 2014-07-05T14:00:17.250

Answers

9

Ruby 2.0, 65 characters (previously 76 and 164)

eval r="gets p;$<.pos=0;`ruby -c 2>&0`;p$?==0&&$_!='eval r=%p'%r"

This uses Ruby's built-in syntax checker (ruby -c) to check the input's syntax, which means the code won't be evaluated.

Basic usage example:

ruby syntax.rb <<< foo
true
ruby syntax.rb <<< "'"
false
ruby syntax.rb < syntax.rb # assumes the file was saved without trailing newline
false

Explanation

This solution is (was) based on the standard Ruby quine:

q="q=%p;puts q%%q";puts q%q

%p is the format specifier for arg.inspect, which can be compared to uneval: when evaling the string returned by arg.inspect, you (usually) get the original value again. Thus, when formatting the q string with itself as argument, the %p inside the string will be replaced with the quoted string itself (i.e. we get something like "q=\"q=%p;puts q%%q\";puts q%q").
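
For readers more familiar with Python, roughly the same trick can be sketched with %r, which formats with repr() much as %p formats with inspect. This is an analogue in Python 3, not part of the Ruby answer; the names TEMPLATE, QUINE_SOURCE and run_and_capture are made up for the illustration.

```python
import contextlib
import io

# %r inserts repr(TEMPLATE), i.e. the quoted template itself; %% becomes a literal %.
TEMPLATE = "q=%r;print(q%%q)"
QUINE_SOURCE = TEMPLATE % TEMPLATE  # the complete one-line program


def run_and_capture(source: str) -> str:
    """Execute `source` in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()
```

Running QUINE_SOURCE prints its own text followed by a newline, which is the property the Ruby version relies on.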

Generalizing this type of quine leads to something like the following:

prelude;q="prelude;q=%p;postlude";postlude

This approach has one huge drawback though (at least in code golf): all code needs to be duplicated. Luckily, eval can be used to get around this:

eval r="some code;'eval r=%p'%r"

What happens here is that the code passed to eval is stored inside r before eval is called. As a result, the full source code of the eval statement can be obtained with 'eval r=%p'%r. If we do this inside the evaluated code and ensure that the top level of our program consists only of the one eval statement, that expression actually gives us the full source code of our program, since any additional code passed to eval is already stored inside r.

Side note: This approach actually allows us to write a Ruby quine in 26 characters: eval r="puts'eval r=%p'%r"
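
The store-then-eval idea carries over to Python 3.8+ if exec and the := operator are used in place of Ruby's eval r=. The sketch below is only an analogue under that assumption; PROGRAM and run_and_capture are illustrative names.

```python
import contextlib
import io

# One-line program: the body is bound to r first, then exec'd.
# Inside the body, 'exec(r:=%r)' % r rebuilds the whole line from r.
PROGRAM = "exec(r:=\"print('exec(r:=%r)'%r)\")"


def run_and_capture(source: str) -> str:
    """Execute `source` in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()
```

As in the Ruby version, nothing is written twice: any extra statements placed inside the string are already part of r when the source is reconstructed.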

Now, in this solution the additional code executed inside eval consists of four statements:

gets p

First, we read all input from STDIN and implicitly save it into $_.

$<.pos=0

Then, we rewind STDIN so the input is available again for the subprocess we start in the next step.

`ruby -c 2>&0`

This starts Ruby in its built-in syntax checking mode, reading the source code from stdin. If the syntax of the supplied script (filename or stdin) is ok, it prints Syntax OK to its stdout (which is captured by the parent process), but in case of a syntax error, a description of the error is printed to stderr - which would be visible, so we redirect that into nirvana (2>&0) instead.

p$?==0&&$_!='eval r=%p'%r

Afterwards, we check the subprocess's exit code $?, which is 0 if the syntax was ok. Lastly, the input we read earlier ($_) is compared against our own source code (which, as I described earlier, can be obtained with 'eval r=%p'%r).
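
The four steps above can be sketched in Python 3 roughly as follows. This is an analogue under stated assumptions rather than a translation: a child interpreter that only calls compile() stands in for ruby -c, and the program's own source is passed in as the hypothetical parameter own_source instead of being rebuilt quine-style.

```python
import subprocess
import sys

# Child program: parse stdin without executing it.
# An uncaught SyntaxError makes the child exit with a non-zero status.
CHILD = "import sys; compile(sys.stdin.read(), '<stdin>', 'exec')"


def finicky_check(source: str, own_source: str) -> bool:
    """True if `source` parses in a child interpreter and is not `own_source`."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=source,                # plays the role of the rewound STDIN
        text=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,   # hide the error description
    )
    return proc.returncode == 0 and source != own_source
```

The final line mirrors the Ruby expression: the exit status must be zero and the input must differ from the program's own text.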

Edit: Saved 14 characters thanks to @histocrat!

Ventero

Posted 2014-07-04T18:30:38.230

Reputation: 9 842

I do believe that your answer is the only one that has truly followed the rules so far. – PyRulez – 2014-07-04T23:29:41.823

Very nice! I think you can replace .write with <<, and >>1<1 with ==0. – histocrat – 2014-07-05T17:20:51.710

@histocrat Huh, somehow missed the part about == in the Process::Status docs. Thanks a lot! – Ventero – 2014-07-05T17:28:14.717

6

Rebol - 69 or 7475 fully compliant with all rules

New working versions thanks to @rgchris! Not sure if the first fails the "don't access the source" requirement as the interpreter holds the loaded and parsed code it has been passed as a cmd line parameter in the system object (system/options/do-arg) which is used to recognise itself.

probe not any[error? i: try[load/all input]i = system/options/do-arg]

This one follows all of the rules:

do b:[i: input prin block? if i <> join "do b:" mold b [try [load/all i]]]

Example usage:

First printing a valid integer, the second printing an invalid integer.

echo "rebol [] print 123" | rebol --do "probe not any[error? i: try[load/all input]i = system/options/do-arg]"
true
echo "rebol [] print 1w3" | rebol --do "probe not any[error? i: try[load/all input]i = system/options/do-arg]"
false
echo "probe not any[error? i: try[load/all input]i = system/options/do-arg]" | rebol --do "probe not any[error? i: try[load/all input]i = system/options/do-arg]"
false 

Fully compliant version:

echo 'rebol [] 123' |r3 --do 'do b:[i: input prin block? if i <> join "do b:" mold b [try [load/all i]]]'
true
echo 'rebol [] 123a' |r3 --do 'do b:[i: input prin block? if i <> join "do b:" mold b [try [load/all i]]]'
false
echo 'do b:[i: input prin block? if i <> join "do b:" mold b [try [load/all i]]]' |r3 --do 'do b:[i: input prin block? if i <> join "do b:" mold b [try [load/all i]]]'
false

Explanation:

First version

This uses Rebol's built-in load function to parse and load the code from stdin, but it does not execute it.

The try block catches any syntax errors and the error? function converts the error to a simple boolean.

The i = system/options/do-arg compares the input from stdin (assigned to i) with the code passed on the do-arg argument (sneaky but very golf :).

any is a great function which returns true if any-thing in the block evaluates to true (for example, any [ false false true ] would return true).

not then just inverts the boolean to give us the correct answer and probe displays the contents of the returned value.

Fully compliant version

Let's go through this in order ...

Assign the word b to the block [] that follows.

Use the do function to interpret the do dialect in the b block.

Inside the b block ...

Set the word i to refer to the contents of stdin (input).

Now, if we join the string "do b:" to the mold'ed block b and it is not equal (<>) to the stdin input i then we try to load the input i.

If the result is a block, then we have load'ed the passed data correctly; otherwise we would receive a none from the failed if.

Use prin to display the result of block? which returns true if the result is a block. Using prin as opposed to print does not display a carriage return after the output (and it saves us another char).

johnk

Posted 2014-07-04T18:30:38.230

Reputation: 459

I think you can shave that down to 36 chars with print not error? try[load/all input] – draegtun – 2014-07-08T08:30:24.867

That checks the syntax... but the problem stipulates that it return false on its own code. Offhand, ungolfed... c:[c: compose/only [c: (c)]print not error? try[if c = load/all input[1 / 0]]] – HostileFork says dont trust SE – 2014-07-08T14:07:51.677

I missed the bit about not valid code. How about this?

`(prin none? attempt[load/all input] halt) 1a`

The invalid integer would fail syntax evaluation, but the halt would stop an error in normal operation – johnk – 2014-07-08T22:55:31.777

"Still can't come up with a way of failing to validate my own code short of checksum'ing the code and returning false if I see my own code." If you can checksum/secure your program, and include the hash inside the source itself for purposes of comparison, then I would like to enlist your services in my cybercrime syndicate. Barring that, start from my code, which works. :-) – HostileFork says dont trust SE – 2014-07-09T13:58:34.807

A 69 that flunks loaded source: probe not any[error? i: try[load/all input]i = system/options/do-arg] – rgchris – 2014-07-10T05:51:12.313

An 83 that flunks exact source: probe not any [input = mold/only system/options/do-arg error? try [load/all input]] — note: can't be trimmed, has to match the loaded source after molding. – rgchris – 2014-07-10T05:52:06.717

This 78 is definitive, adheres to the quine rule and does not refer directly to source: do b:[probe not any [input = join "do b:" mold b error? try [load/all input]]] – rgchris – 2014-07-10T20:31:42.007

Down to 75: do b:[i: input probe block? if i <> join "do b:" mold b [try [load/all i]]] — could shave one more off by using prin instead of probe at the expense of a newline after true/false. – rgchris – 2014-07-10T21:52:01.973

RebMu at 53: do b:[pb bl? iu a cb ["do b:" ml b] [try [ld/all a]]] Usage: rebmu/args line input – rgchris – 2014-07-14T07:29:41.753

D'oh! RebMu at 49: do B[pb bl? iu a jn "do B" ml b [try [ld/all a]]] – rgchris – 2014-07-14T07:40:16.497

2

I think this follows the rules:

JS (✖╭╮✖)

function f(s){if(s==f.toString())return false;try{eval(s)}catch(e){return false}return true}

The code will be evaluated if it's correct.

Need to take a look at arrow notation to see if it can't be shortened more.

!function f(){try{s=prompt();return"!"+f+"()"!=s?eval(s):1}catch(e){return 0}}()

After a couple failed attempts and reverts - new version!

!function f(){try{s=prompt();"!"+f+"()"!=s?eval(s):o}catch(e){return 1}}()

And I'm back!

!function f(){try{s=prompt();"!"+f+"()"!=s?eval(s):o}catch(e){return !(e instanceof SyntaxError)}}()

And I'm gone! Unfortunately, due to the nature of eval and thanks to @scragar (damn you @scragar!), this approach will not work (seeing as throw new SyntaxError is valid JS code, which trips this method up). As such, I'd say it's impossible to create a syntax checker (at least using eval or any variation thereof).

(*see the comments!)

eithed

Posted 2014-07-04T18:30:38.230

Reputation: 1 229

Suppose the input is an infinite loop? I suggest you use eval("x=function(){"+t+"}"); – DankMemes – 2014-07-04T20:03:22.287

@ZoveGames That can be broken with input like }// or };{. – Ventero – 2014-07-04T20:05:42.673

I don't know if this is valid or not according to the rules since it has to be a program, not a function (I think) – Abraham – 2014-07-04T20:06:49.550

@ZoveGames Good point! While the browser's script handling should kick in (loop counter / timeout), it's still easy to write a script that will cause the "system" to hang. I'll hold off on the change until OP specifies the rule about this. – eithed – 2014-07-04T20:07:20.477

@Abraham - fair point. Though what would count as a JS program then? Let JS folks play as well ;) – eithed – 2014-07-04T20:10:11.517

@eithedog Compare with my JS answer, which runs automatically without requiring a function to be called. (Not trying to be rude, just want to make a level playing field) – Abraham – 2014-07-04T20:11:42.287

@Abraham great rebuttal :D will have a think when I'm home – eithed – 2014-07-04T20:14:27.060

@scragar - throw new Exception(''); gives me ReferenceError: Exception is not defined in console (tried it in jsfiddle as well to make sure that console doesn't override exception handling). Can you explain how can I replicate it? – eithed – 2014-07-09T12:44:05.303

@eithedog Sorry, got confused on languages, JavaScript's base throwable is actually called Error, not Exception. throw new Error('') causes the incorrect behaviour. – scragar – 2014-07-09T12:53:56.360

@scragar - I forgot as well :D Anyhow - good point; I'm putting it at hold for now (and can't really do anything about it - at work atm) – eithed – 2014-07-09T12:57:03.983

@scragar - hopefully this version should work against that. I'm not sure if ReferenceError should be treated as invalid syntax (it was earlier), soo... (I guess, thinking about it - why should it, it's a valid syntax :D). On the other hand... let me update the answer again! – eithed – 2014-07-09T13:11:15.483

@eithedog I've managed to resolve that problem, but it's still not quite safe, since it's still possible to generate SyntaxErrors by using internal eval statements. http://jsfiddle.net/kA2Wv/ – scragar – 2014-07-09T13:37:56.583

@scragar - yup - b('throw new _SyntaxError("")') gives false, and I think any redefining of SyntaxError won't really work. One more thing I was thinking of was to use new Function(s) but don't have time atm to look into that (don't think it will work though) – eithed – 2014-07-09T14:19:07.467

2

Javascript - 86 82

!function $(){b=(a=prompt())=='!'+$+'()';try{Function(a)}catch(e){b=1}alert(!b)}()

Paste into your browser's Javascript console to test.
Explanation:

// Declare a function and negate it to call it immediately
// <http://2ality.com/2012/09/javascript-quine.html>
!function $ () { 
  // Concatenating $ with strings accesses the function source
  invalid = (input = prompt()) == '!' + $ + '()';
  try {
    // Use the Function constructor to only catch syntax errors
    Function(input)
  }
  catch (e) {
    invalid = 1
  }
  alert(!invalid)
// Call function immediately
}()

Note: @m.buettner brought up the point that the program returns true for a naked return statement, such as return 0;. Because Javascript doesn't throw a syntax error for an illegal return statement until it actually runs (meaning code like if (0) { return 0; } doesn't throw a syntax error), I don't think there's any way to fix this short of writing a Javascript parser in Javascript. For example, consider the code:

while (1) {}
return 0;

If the code is executed, it will hang because of the loop. If the code is not executed, no error will be thrown for the illegal return statement. Therefore, this is as good as Javascript can get for this challenge. Feel free to disqualify Javascript if you feel that this doesn't fulfill the challenge sufficiently.
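
The naked-return issue can be reproduced outside JavaScript. The Function constructor treats its argument as a function body, where return is legal. The Python 3 sketch below shows the same effect as an analogue only: wrapping input in a def makes a bare return acceptable, while compiling it as a top-level program rejects it. All function names here are hypothetical.

```python
import textwrap


def compiles(source: str) -> bool:
    """True if `source` compiles as a module; the code is never run."""
    try:
        compile(source, "<input>", "exec")
    except (SyntaxError, ValueError):
        return False
    return True


def ok_as_program(source: str) -> bool:
    # Checked as top-level code: `return` outside a function is rejected.
    return compiles(source)


def ok_as_function_body(source: str) -> bool:
    # Checked as the body of a never-called function: `return` becomes legal.
    wrapped = "def _wrapper():\n" + textwrap.indent(source, "    ")
    return compiles(wrapped)
```

For the input "return 0", the first check reports an error and the second does not, which is the discrepancy raised in the comments on this answer.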

Abraham

Posted 2014-07-04T18:30:38.230

Reputation: 1 023

Now it will not report a syntax error when giving it return 0. – Martin Ender – 2014-07-04T20:39:07.297

Mm, good one @m.buettner – Abraham – 2014-07-04T21:12:13.707

2

Haskell - 222 bytes

Note that this uses a true parser. It does not depend on the eval-like functions of dynamic languages.

import Language.Haskell.Parser
main=interact(\s->case parseModule s of ParseOk _->if take 89s=="import Language.Haskell.Parser\nmain=interact(\\s->case parseModule s of ParseOk _->if take"then"False"else"True";_->"False")

This solution is not particularly pretty but does work.

gxtaillon

Posted 2014-07-04T18:30:38.230

Reputation: 577

Does it fail for itself? – PyRulez – 2014-07-04T23:27:26.570

It does. The if take ... instruction checks to see if the input matches a string literal which is the first part of the program. – gxtaillon – 2014-07-05T12:38:23.953

1

Python (95)

c=raw_input()
try:compile('"'if sum(map(ord,c))==7860 else c,'n','exec');print 1
except:print 0
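
One weakness of recognising the program by the sum of its character codes (7860 here) is that unrelated inputs can share the same sum. The Python 3 sketch below shows a two-character comment line that collides, and contrasts the sum with a SHA-256 digest. The helper names are illustrative, and how a program would obtain its own digest without reading its file is a separate problem not addressed here.

```python
import hashlib


def char_sum(text: str) -> int:
    """Sum of code points, the self-recognition check used in the answer."""
    return sum(map(ord, text))


def sha256_hex(text: str) -> str:
    """Digest of the UTF-8 encoded text; sensitive to character order."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# '#' is 35 and U+1E91 is 7825, so this valid comment line also sums to 7860.
collision = "#\u1e91"
```

Any reordering of the same characters also keeps the sum unchanged, whereas the digest changes.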

ɐɔıʇǝɥʇuʎs

Posted 2014-07-04T18:30:38.230

Reputation: 4 449

Doesn't that only work for one line of input? – Ian D. Scott – 2014-07-05T17:48:29.523

Doesn't work: c=u'#\u1e91' because ord('#') + ord(u'\u1e91') == 7860 – ThinkChaos – 2014-07-05T18:20:32.063

Try a hash instead. – PyRulez – 2014-07-06T12:59:12.073

1

PHP - 140

<?exec("echo ".escapeshellarg($argv[1])." | php -l",$A,$o);echo$o|(array_sum(array_map(ord,str_split($argv[1])))==77*150)?"false":"true";//Q

Comment necessary to keep the 'hash' (a shameless copy of ɐɔıʇǝɥʇuʎs's). Using php -l / lint to check for errors.

$ php finky_syntax_check.php '<?php echo "good php code"; ?>' 
true
$ php finky_syntax_check.php '<?php echo bad--php.code.; ?>'
false
$ php finky_syntax_check.php '<?exec("echo ".escapeshellarg($argv[1])." | php -l",$A,$o);echo$o|(array_sum(array_map(ord,str_split($argv[1])))==77*150)?"false":"true";//Q'
false
$ php finky_syntax_check.php '<?exec("echo ".escapeshellarg($argv[1])." | php -l",$A,$o);echo$o|(array_sum(array_map(ord,str_split($argv[1])))==77*150)?"false":"true";//D'
true // note that the last character was changed

Aurel Bílý

Posted 2014-07-04T18:30:38.230

Reputation: 1 083

0

C 174

Explanation: -Wall is needed to produce a system error while the program is still compilable. The "syntax error" is the missing return 0;. To enter input via stdin in the Windows console, press Ctrl-Z after pasting and then press Enter.

Golfed

char c[256];int i;int main(){FILE *f=fopen("a.c","w");while(fgets(c,256,stdin)!=NULL){fputs(c,f);}fclose(f);i=system("gcc a.c -Wall -o a.exe");printf("%s",i?"false":"true");}

Ungolfed:

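#include <stdio.h>
#include <stdlib.h>

char c[256];
int i;

int main()
{
    /* Copy stdin into a.c so gcc can read it */
    FILE *f = fopen("a.c", "w");
    while (fgets(c, 256, stdin) != NULL)
    {
        fputs(c, f);
    }
    fclose(f);
    /* -o must be followed directly by the output name; non-zero status means "false" */
    i = system("gcc a.c -Wall -o a.exe");
    printf("%s", i ? "false" : "true");
}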

bacchusbeale

Posted 2014-07-04T18:30:38.230

Reputation: 1 235

0

T-SQL - 110

Fairly simple. I've been wanting to try a challenge on here for a while and finally got round to doing it. This is not the fanciest code, but I had fun nonetheless.

The 'golfed' version.

BEGIN TRY DECLARE @ VARCHAR(MAX)='SET NOEXEC ON'+'//CODE GOES HERE//'EXEC(@)PRINT'TRUE'END TRY BEGIN CATCH PRINT'FALSE'END CATCH

A better formatted version.

BEGIN TRY 
    DECLARE @SQL VARCHAR(MAX)='SET NOEXEC ON'+'//CODE GOES HERE//'
    EXEC(@SQL)
    PRINT'TRUE'
END TRY 

BEGIN CATCH 
    PRINT'FALSE'
END CATCH

It's fairly self-explanatory: it uses SET NOEXEC ON, which makes it just parse the query instead of returning any results. The rest is mostly the try/catch that I use to determine what I need to print.

EDIT: I should have added that this will technically fail for itself, because it uses dynamic SQL: any single quotes in the input have to be doubled (' -> '').
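
The quote doubling mentioned in the edit can be sketched as a small helper, written here in Python 3 purely for illustration. The name tsql_string_literal is hypothetical, and the helper only doubles single quotes; it makes no claim to handle anything else about embedding text in dynamic SQL.

```python
def tsql_string_literal(text: str) -> str:
    """Wrap `text` in single quotes, doubling any embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"
```

For example, the input PRINT 'hi' becomes 'PRINT ''hi''', which can then be concatenated after the SET NOEXEC ON prefix.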

PenutReaper

Posted 2014-07-04T18:30:38.230

Reputation: 1