14
1
The challenge below requires you to be familiar with formal parser theory. If you don't know what the question is asking because you don't know what the terms mean, context-free grammars and first/follow sets are covered in a lot of university courses.
I can recommend this Stanford course, in particular handouts 08 and 09 (from page 7). I've extracted also extracted a cheat sheet from these handouts - I recommend anyone attempting this challenge to read it.
Write a program or function that given a context-free grammar finds the follow set of every nonterminal. Informally, the follow set of a nonterminal is a set of terminals and $
(meaning end-of-input) that you can possibly find after that terminal in a valid sentence.
The input is given as a single printable ASCII string or array of printable ASCII lines. You may output the sets in any reasonable format, using $
(either as literal output, or string inside a set, etc) to indicate the end of input. You may assume that the input is always valid according to the format below.
The context free grammar is given in a very simplified manner. Every line contains a single production. Every production is a space separated list of symbols. A terminal is a string of characters surrounded by apostrophes (e.g. '**'
). For simplicity you may assume that terminals do not contain spaces, but it would be nice if your program allows it. A nonterminal may be any string not containing spaces or $
. The empty production (normally indicated with ε) is simply a line containing only the left hand side nonterminal. The first line is the production defining the start symbol.
As an example, the following grammar:
S → aSa | bSb | ε
Will be given as:
S 'a' S 'a'
S 'b' S 'b'
S
Example input / outputs:
In:
S 'a' S 'a'
S 'b' S 'b'
S
Out:
S {'a', 'b', $}
In:
S A B C
A 'a'
A C 'b'
A
B C
B 'd' A
B
C 'e'
C 'f'
Out:
S {$}
A {'d', 'e', 'f'}
B {'e', 'f'}
C {'b', 'e', 'f', $}
In:
Start Alice Bob
Alice Charlie 'a'
Alice
Bob Bob 'a' Alice Charlie
Bob '!!!'
Charlie 'b'
Charlie
Out:
Start {$}
Alice {'a', '!!!', 'b', $}
Bob {'a', $}
Charlie {'a', $}
Shortest code in bytes wins.
4Assuming that people know what a context free grammar is seems fine, but I think it wouldn't hurt the challenge if you included the definition of a follow set right here instead of just linking to it. – Martin Ender – 2015-12-09T07:44:35.327
1This brings back some memories from "Compiler Construction" at university, where we had to solve lots of similar tasks. – insertusernamehere – 2015-12-09T12:49:52.943