Quotient automaton

In computer science, in particular in formal language theory, a quotient automaton can be obtained from a given nondeterministic finite automaton by joining some of its states. The quotient recognizes a superset of the given automaton; in some cases, handled by the Myhill–Nerode theorem, both languages are equal.

Formal definition

A (nondeterministic) finite automaton is a quintuple A = ⟨Σ, S, s0, δ, Sf⟩, where:

  • Σ is the input alphabet (a finite, non-empty set of symbols),
  • S is a finite, non-empty set of states,
  • s0 is the initial state, an element of S,
  • δ is the state-transition relation: δS × Σ × S, and
  • Sf is the set of final states, a (possibly empty) subset of S.[1][note 1]

A string a1...anΣ* is recognized by A if there exist states s1, ..., snS such that ⟨si-1,ai,si⟩ ∈ δ for i=1,...,n, and snSf. The set of all strings recognized by A is called the language recognized by A; it is denoted as L(A).

For an equivalence relation ≈ on the set S of A’s states, the quotient automaton A/ = ⟨Σ, S/, [s0], δ/, Sf/⟩ is defined by[2]:5

  • the input alphabet Σ being the same as that of A,
  • the state set S/ being the set of all equivalence classes of states from S,
  • the start state [s0] being the equivalence class of A’s start state,
  • the state-transition relation δ/ being defined by δ/([s],a,[t]) if δ(s,a,t) for some s ∈ [s] and t ∈ [t], and
  • the set of final states Sf/ being the set of all equivalence classes of final states from Sf.

The process of computing A/ is also called factoring A by ≈.

Example

Quotient examples
Automaton
diagram
Recognized
language
Is the quotient of
A by B by C by
A: 1+10+100
B: 1*+1*0+1*00 a≈b
C: 1*0* a≈b, c≈d c≈d
D: (0+1)* a≈b≈c≈d a≈c≈d a≈c

For example, the automaton A shown in the first row of the table[note 2] is formally defined by

  • ΣA = {0,1},
  • SA = {a,b,c,d},
  • sA
    0
    = a,
  • δA = { ⟨a,1,b⟩, ⟨b,0,c⟩, ⟨c,0,d⟩ }, and
  • SA
    f
    = { b,c,d }.

It recognizes the finite set of strings { 1, 10, 100 }; this set can also be denoted by the regular expression "1+10+100".

The relation (≈) = { ⟨a,a⟩, ⟨a,b⟩, ⟨b,a⟩, ⟨b,b⟩, ⟨c,c⟩, ⟨c,d⟩, ⟨d,c⟩, ⟨d,d⟩ }, more briefly denoted as a≈b,c≈d, is an equivalence relation on the set {a,b,c,d} of automaton A’s states. Building the quotient of A by that relation results in automaton C in the third table row; it is formally defined by

  • ΣC = {0,1},
  • SC = {a,c},[note 3]
  • sC
    0
    = a,
  • δC = { ⟨a,1,a⟩, ⟨a,0,c⟩, ⟨c,0,c⟩ }, and
  • SC
    f
    = { a,c }.

It recognizes the finite set of all strings composed of arbitrarily many 1s, followed by arbitrarily many 0s, i.e. { ε, 1, 10, 100, 1000, ..., 11, 110, 1100, 11000, ..., 111, ... }; this set can also be denoted by the regular expression "1*0*". Informally, C can be thought of resulting from A by glueing state a onto state b, and glueing state c onto state d.

The table shows some more quotient relations, such as B = A/a≈b, and D = C/a≈c.

Properties

  • For every automaton A and every equivalence relation ≈ on its states set, L(A/) is a superset of (or equal to) L(A).[2]:6
  • Given a finite automaton A over some alphabet Σ, an equivalence relation ≈ can be defined on Σ* by xy if ∀ zΣ*: xzL(A) ↔ yzL(A). By the Myhill–Nerode theorem, A/ is a deterministic automaton that recognizes the same language as A.[1]:65–66 As a consequence, the quotient of A by every refinement of ≈ also recognizes the same language as A.
gollark: See? Rust.
gollark: I just assume my users have infinite time and space, as well as the ability to infer what the bees I mean by things in the lack of docs.
gollark: Even my phone has it.
gollark: ALL are to have LLVM/rustc.
gollark: It wouldn't take that long unless you used async or something.

See also

Notes

  1. Hopcroft and Ullman (sect.2.3, p.20) use a slightly deviating definition of δ, viz. as a function from S × Σ to the power set of S.
  2. In the automaton diagrams in the table, symbols from the input alphabet and state names are colored in green and red, respectively; final states are drawn as double circles.
  3. Strictly formal, the set is SC = { [a], [b], [c], [d] } = { [a], [c] }. The class brackets are omitted for readability.

References

  1. John E. Hopcroft; Jeffrey D. Ullman (1979). Introduction to Automata Theory, Languages, and Computation. Reading/MA: Addison-Wesley. ISBN 0-201-02988-X.
  2. Tristan le Gall and Bertrand Jeannet (Mar 2007). Analysis of Communicating Infinite State Machines Using Lattice Automata (PDF) (Publication Interne). Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA) Campus Universitaire de Beaulieu. ISSN 1166-8687.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.