PackCC

PackCC is a parser generator for C. Its main features are as follows:

  • Generates a parser in written C from a grammar described in a PEG,
  • Gives a parser great efficiency by packrat parsing,
  • Supports direct and indirect left-recursive grammar rules,
  • Generates a thread-safe and reentrant parser,
  • Consists of just a single compact source file.
PackCC
Developer(s)Arihiro Yoshida
Written inC
Operating systemCross-platform
TypeParser generator
LicenseMIT License
Websitegithub.com/arithy/packcc

The grammar of an output parser can be described in a PEG (Parsing Expression Grammar). The PEG is a top-down parsing language, and is similar to the regular expression grammar. Compared with a bottom-up parsing language, like Yacc's one, the PEG is much more intuitive and cannot be ambiguous. The PEG does not require tokenization to be a separate step, and tokenization rules can be written in the same way as any other grammar rules.

The generated parser can parse inputs very efficiently by packrat parsing. The packrat parsing is the recursive descent parsing algorithm that is accelerated using memoization. By using packrat parsing, any input can be parsed in linear time. Without it, however, the resulting parser could exhibit exponential time performance in the worst case due to the unlimited look-ahead capability.

Unlike common packrat parsers, PackCC can support direct and indirect left-recursive grammar rules.[1] This makes grammar rules much more intuitive.

The generated code is beautified and as ease-of-understanding as possible. Actually, it uses many goto statements, but the control flows are much more traceable than goto spaghetti storms generated by some other parser generators.

PackCC itself is under MIT license, but the generated code can be distributed under any license or can be used in proprietary software.

Input file example

A desktop calculator. Note that left-recursive grammar rules are included.

%prefix "calc"

statement <- _ e:expression _ EOL { printf("answer=%d\n", e); }
           / ( !EOL . )* EOL      { printf("error\n"); }

expression <- e:term { $$ = e; }

term <- l:term _ '+' _ r:factor { $$ = l + r; }
      / l:term _ '-' _ r:factor { $$ = l - r; }
      / e:factor                { $$ = e; }

factor <- l:factor _ '*' _ r:unary { $$ = l * r; }
        / l:factor _ '/' _ r:unary { $$ = l / r; }
        / e:unary                  { $$ = e; }

unary <- '+' _ e:unary { $$ = +e; }
       / '-' _ e:unary { $$ = -e; }
       / e:primary     { $$ = e; }

primary <- < [0-9]+ >               { $$ = atoi($1); }
         / '(' _ e:expression _ ')' { $$ = e; }

_      <- [ \t]*
EOL    <- '\n' / '\r\n' / '\r' / ';'

%%
int main() {
    calc_context_t *ctx = calc_create(NULL);
    while (calc_parse(ctx, NULL));
    calc_destroy(ctx);
    return 0;
}

Notes

  1. "Packrat Parsers Can Support Left Recursion" authored by A. Warth, J. R. Douglass, and T. Millstein
gollark: “As brand leader, my bandwidth is jammed with analysing flow-through and offering holistic solutions.” - GTech™ corporate management AI.
gollark: <@398575402865393665> is highly coolThose who dislike it have type Bool.
gollark: There is not.
gollark: Beware the beeFor GTech™️ will deploy apioforms, inevitablyYou cannot hide beneath a treeApioforms will track any escapeeThey do not respect caller IDApioforms control the capital of TenneseeThus none are safe to an adequate degreeChronoapioforms control realityApiothalassahazards monitor the seaFrom an apioform none can fleeDue to their use of multispectral imaging in order to seeApiointellectuoforms have a PhDAnd kiloapioecoekaorganohazards [DATA EXPUNGED] Holy See
gollark: ++remind 1y rap battles
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.