Generalized context-free grammar

Generalized context-free grammar (GCFG) is a grammar formalism that expands on context-free grammars by adding potentially non-context free composition functions to rewrite rules.[1] Head grammar (and its weak equivalents) is an instance of such a GCFG which is known to be especially adept at handling a wide variety of non-CF properties of natural language.

Description

A GCFG consists of two components: a set of composition functions that combine string tuples, and a set of rewrite rules. The composition functions all have the form , where is either a single string tuple, or some use of a (potentially different) composition function which reduces to a string tuple. Rewrite rules look like , where , , ... are string tuples or non-terminal symbols.

The rewrite semantics of GCFGs is fairly straightforward. An occurrence of a non-terminal symbol is rewritten using rewrite rules as in a context-free grammar, eventually yielding just compositions (composition functions applied to string tuples or other compositions). The composition functions are then applied, successively reducing the tuples to a single tuple.

Example

A simple translation of a context-free grammar into a GCFG can be performed in the following fashion. Given the grammar in (1), which generates the palindrome language , where is the string reverse of , we can define the composition function conc as in (2a) and the rewrite rules as in (2b).

 

 

 

 

(1)

 

 

 

 

(2a)

 

 

 

 

(2b)

The CF production of abbbba is

S
aSa
abSba
abbSbba
abbbba

and the corresponding GCFG production is

Linear Context-free Rewriting Systems (LCFRSs)

Weir (1988)[1] describes two properties of composition functions, linearity and regularity. A function defined as is linear if and only if each variable appears at most once on either side of the =, making linear but not . A function defined as is regular if the left hand side and right hand side have exactly the same variables, making regular but not or .

A grammar in which all composition functions are both linear and regular is called a Linear Context-free Rewriting System (LCFRS). LCFRS is a proper subclass of the GCFGs, i.e. it has strictly less computational power than the GCFGs as a whole.

On the other hand, LCFRSs are strictly more expressive than linear-indexed grammars and their weakly equivalent variant tree adjoining grammars (TAGs).[2] Head grammar is another example of an LCFRS that is strictly less powerful than the class of LCFRSs as a whole.

LCFRS are weakly equivalent to (set-local) multicomponent TAGs (MCTAGs) and also with multiple context-free grammar (MCFGs ).[3] and minimalist grammars (MGs). The languages generated by LCFRS (and their weakly equivalents) can be parsed in polynomial time.[4]

gollark: Surely you can just pull a particular tag of the container.
gollark: I can come up with a thing to transmit ubqmachine™ details to osmarks.net or whatever which people can embed in their code.
gollark: It's an x86-64 system using debian or something.
gollark: > `import hashlib`Hashlib is still important!> `for entry, ubq323 in {**globals(), **__builtins__, **sys.__dict__, **locals(), CONSTANT: Entry()}.items():`Iterate over a bunch of things. I think only the builtins and globals are actually used.The stuff under here using `blake2s` stuff is actually written to be ridiculously unportable, to hinder analysis. This caused issues when trying to run it, so I had to hackily patch in the `/local` thing a few minutes before the deadline.> `for PyObject in gc.get_objects():`When I found out that you could iterate over all objects ever, this had to be incorporated somehow. This actually just looks for some random `os` function, and when it finds it loads the obfuscated code.> `F, G, H, I = typing(lookup[7]), typing(lookup[8]), __import__("functools"), lambda h, i, *a: F(G(h, i))`This is just a convoluted way to define `enumerate(range))` in one nice function.> `print(len(lookup), lookup[3], typing(lookup[3])) #`This is what actually loads the obfuscated stuff. I think.> `class int(typing(lookup[0])):`Here we subclass `complex`. `complex` is used for 2D coordinates within the thing, so I added some helper methods, such as `__iter__`, allowing unpacking of complex numbers into real and imaginary parts, `abs`, which generates a complex number a+ai, and `ℝ`, which provvides the floored real parts of two things.> `class Mаtrix:`This is where the magic happens. It actually uses unicode homoglyphs again, for purposes.> `self = typing("dab7d4733079c8be454e64192ce9d20a91571da25fc443249fc0be859b227e5d")`> `rows = gc`I forgot what exactly the `typing` call is looking up, but these aren't used for anything but making the fake type annotations work.> `def __init__(rows: self, self: rows):`This slightly nonidiomatic function simply initializes the matrix's internals from the 2D array used for inputs.> `if 1 > (typing(lookup[1]) in dir(self)):`A convoluted way to get whether something has `__iter__` or not.
gollark: If you guess randomly the chance of getting none right is 35%ish.

See also

References

  1. Weir, David Jeremy (Sep 1988). Characterizing mildly context-sensitive grammar formalisms (PDF) (Ph.D.). Paper. AAI8908403. University of Pennsylvania Ann Arbor.
  2. Laura Kallmeyer (2010). Parsing Beyond Context-Free Grammars. Springer Science & Business Media. p. 33. ISBN 978-3-642-14846-0.
  3. Laura Kallmeyer (2010). Parsing Beyond Context-Free Grammars. Springer Science & Business Media. p. 35-36. ISBN 978-3-642-14846-0.
  4. Johan F.A.K. van Benthem; Alice ter Meulen (2010). Handbook of Logic and Language (2nd ed.). Elsevier. p. 404. ISBN 978-0-444-53727-0.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.