Fully Modular C: Grading

You are a Computer Science professor teaching the C programming language. One principle you seek to impart to the students is modularity. Unfortunately, past classes have tended not to get the message, submitting assignments with the entire program inside main(). Therefore, for this semester you have issued strict modularity guidelines upon which students will be graded.

A subset of C grammar and rules for "well-formed" translation unit are defined below. Code that follows these rules should be valid C89, UNLESS an identifier that is a keyword is used.

Task

You will receive as input a string purportedly containing C code. You may assume that this string contains only spaces, newlines, and the characters abcdefghijklmnopqrstuvwxyz123456789(){},+-*/%;=.

Your code must output the number of points the student's assignment should receive according to the following rubric:

Input is not a valid translation-unit according to the grammar: 0 points
Input follows the grammar but is not "well-formed" according to the rules below: 1 point
Input is a well-formed translation unit but not fully modular: 2 points
Input is a fully modular well-formed translation unit: 3 points

Token definitions

identifier: Any sequence of 1 or more lowercase English letters. If an identifier is a C89 reserved word¹, you may optionally return 0 instead of whatever the result would have been ignoring reserved words. You do not have to be consistent about detecting the use of reserved words as identifiers; you may flag them in some instances and let them pass in others.
integer-literal: A sequence of 1 or more of the digits 1-9 (recall that the character 0 is guaranteed not to appear in the input)
Other valid tokens are defined literally in the grammar.
A character must belong to a token if and only if it is not whitespace.
Two consecutive alphanumeric characters must be part of the same token.

EBNF grammar

var-expr = identifier
literal-expr = integer-literal
binary-op = "+" | "-" | "*" | "/" | "%"
binary-expr = expr binary-op expr
paren-expr = "(" expr ")"
call-expr = identifier "(" [ expr ( "," expr )* ] ")"
expr = var-expr | literal-expr | binary-expr | paren-expr | call-expr
assign-stmt = var-expr "=" expr ";"
if-stmt = "if" "(" expr ")" assign-stmt
return-stmt = "return" expr ";"
function-body = ( assign-stmt | if-stmt )* return-stmt
argument-list = [ identifier ( "," identifier )* ]
function-definition = identifier "(" argument-list ")" "{" function-body "}"
translation-unit = function-definition*

Well-formed program requirements

No two function definitions may have the same function name.
No two identifiers in an argument-list may be identical.
No identifier in an argument-list may be identical to a function name (whether from a function-definition or a call-expr).
The identifier in a var-expr must be included in the enclosing function's argument-list.
For a given function, all call-exprs and the function-definition (if any) must agree in number of arguments.

Fully modular

No more than 1 binary operator per function
No more than 1 assignment statement per function
No more than 1 function call per function

Examples (one per line)

Score 0

}}}}}
return 2;
f() { return -1; }
f() {}
f(x,) { return 1; }
f(x) { return 1 }
f(x) { returnx; }
f(x) { return1; }
f() { g(); return 1;}
f() { if(1) return 5; }
f(x) { if(1) if(1) x = 2; return x; }
f(x, y) { x = y = 2; return x; }

Score 1

f(){ return 1; } f(){ return 1; }
g(x, x) { return 1; }
g(f) { return 1; } f() { return 1; }
f(x) { x = write(); x = write(1); return 1; }
f() { return f(f); }
f() { return 1; } g() { return f(234567); }
f() { return(x); }
f() { j = 7; return 5; }

Score 2

f(x,y,zzzzz) { return x + y + zzzzz; }
f(x,a,b) { if(a) x = foo(); if(b) x = bar(); return x; }
f(j) { return g(h( i() / j, i() ), 1) ; }

Score 3

mod(x, y) { return ((x % y)); }
f() { return f(); }
f(c) { if(c) c = g(c) + 2; return c; }
fib(i){return bb(i,0,1);}aa(i,a,b){return bb(i,b,a+b);}bb(i,a,b){if(i)a=aa(i-1,a,b);return a;}

Score 0 or 1

h(auto, auto) { return 1; }

Score 0 or 3

if() { return 1; }

¹ _{Reserved word list: auto break case char const continue default do double else enum extern float for goto if int long register return short signed sizeof static struct switch typedef union unsigned void volatile while}

feersum

Posted 2018-12-01T23:28:03.563

Reputation: 29 566

May the input string be empty? (If so, I think it should score 3.) – Arnauld – 2018-12-02T02:05:58.167

@Arnauld Yes, that's correct. – feersum – 2018-12-02T02:47:13.120

How are we supposed to check the number of arguments of undeclared functions? Is the 4th example in "Score 1" not well-formed because write was called once with no argument and once with one argument? – Arnauld – 2018-12-02T11:24:10.013

@Arnauld Yes, you check that each call to a function which is not defined has the same number of arguments. – feersum – 2018-12-02T17:23:37.550

I see. Well, this is currently not compatible with my parser, so I'm deleting my answer for now. I may give it another try later. – Arnauld – 2018-12-02T17:30:10.030

Should it be well-formed to assign to variables not found in the argument list? The test cases don't cover that, and assign-stmt is currently defined using identifier, not var-expr which is required to be in the argument list. – PurkkaKoodari – 2018-12-05T14:35:45.537

@Pietu1998 Thanks, fixed. I changed the left side of an assignment to var-expr so this is no longer allowed by the spec. – feersum – 2018-12-06T04:18:04.280

What about the binary operators &&, ||, &, | and ^ – None – 2018-12-06T05:05:00.370

@Rogem The characters &|^ are guaranteed not to appear in the input. – feersum – 2018-12-06T05:13:59.437

Also, suggest adding the empty program as a test case. – None – 2018-12-06T06:13:34.450