Quotient automaton

In computer science, in particular in formal language theory, a quotient automaton can be obtained from a given nondeterministic finite automaton by joining some of its states. The quotient recognizes a superset of the language recognized by the given automaton; in some cases, handled by the Myhill–Nerode theorem, both languages are equal.

Formal definition

A (nondeterministic) finite automaton is a quintuple A = ⟨Σ, S, s₀, δ, S_f⟩, where:

Σ is the input alphabet (a finite, non-empty set of symbols),
S is a finite, non-empty set of states,
s₀ is the initial state, an element of S,
δ is the state-transition relation: δ ⊆ S × Σ × S, and
S_f is the set of final states, a (possibly empty) subset of S.[1][note 1]

A string a₁...a_n ∈ Σ^* is recognized by A if there exist states s₁, ..., s_n ∈ S such that ⟨s_{i-1},a_i,s_i⟩ ∈ δ for i=1,...,n, and s_n ∈ S_f; for n = 0, the empty string is recognized if s₀ ∈ S_f. The set of all strings recognized by A is called the language recognized by A; it is denoted as L(A).
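The recognition condition above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `accepts`, the encoding of δ as a set of triples, and the sample automaton (which uses the same data as automaton A from the Example section) are assumptions made for this sketch.

```python
from typing import Iterable, Set, Tuple

Transition = Tuple[str, str, str]  # (source state, input symbol, target state)


def accepts(delta: Set[Transition], start: str, finals: Set[str],
            word: Iterable[str]) -> bool:
    """Return True if some run of the automaton on `word` ends in a final state."""
    current = {start}  # every state reachable after the symbols read so far
    for symbol in word:
        current = {t for (s, a, t) in delta if s in current and a == symbol}
    return bool(current & finals)


# Illustrative data: the transitions and final states of automaton A.
delta_A = {("a", "1", "b"), ("b", "0", "c"), ("c", "0", "d")}
finals_A = {"b", "c", "d"}

print(accepts(delta_A, "a", finals_A, "10"))  # True
print(accepts(delta_A, "a", finals_A, "01"))  # False
```

Tracking the whole set of reachable states avoids backtracking over the individual choices of s₁, ..., s_n.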

For an equivalence relation ≈ on the set S of A’s states, the quotient automaton A/_≈ = ⟨Σ, S/_≈, [s₀], δ/_≈, S_f/_≈⟩ is defined by[2]^:5

the input alphabet Σ being the same as that of A,
the state set S/_≈ being the set of all equivalence classes of states from S,
the start state [s₀] being the equivalence class of A’s start state,
the state-transition relation δ/_≈ being defined by ⟨[s],a,[t]⟩ ∈ δ/_≈ if ⟨s',a,t'⟩ ∈ δ for some s' ∈ [s] and t' ∈ [t], and
the set of final states S_f/_≈ being the set of all equivalence classes of final states from S_f.

The process of computing A/_≈ is also called factoring A by ≈.
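The construction can be sketched in Python as below. The function name `quotient`, its argument order, and the representation of ≈ as a list of equivalence classes are assumptions of this sketch; equivalence classes are modelled as `frozenset` objects so that they can serve as states of the quotient.

```python
def quotient(states, start, delta, finals, classes):
    """Factor an automaton by the equivalence whose classes are `classes`.

    `classes` is assumed to be a partition of `states`: disjoint, non-empty
    sets that together cover every state.
    """
    blocks = [frozenset(c) for c in classes]
    cls = {s: b for b in blocks for s in b}  # maps each state s to its class [s]
    assert set(cls) == set(states), "classes must cover exactly the state set"
    assert sum(len(b) for b in blocks) == len(cls), "classes must be disjoint"

    q_states = set(blocks)
    q_start = cls[start]
    # <[s], a, [t]> is a transition whenever <s, a, t> is one.
    q_delta = {(cls[s], a, cls[t]) for (s, a, t) in delta}
    q_finals = {cls[s] for s in finals}
    return q_states, q_start, q_delta, q_finals


# Hypothetical usage: a three-transition automaton factored by a≈b, c≈d.
result = quotient(
    {"a", "b", "c", "d"}, "a",
    {("a", "1", "b"), ("b", "0", "c"), ("c", "0", "d")},
    {"b", "c", "d"},
    [{"a", "b"}, {"c", "d"}],
)
print(len(result[0]), len(result[2]))  # 2 states, 3 transitions
```

Because the transition relation is a set, transitions that coincide after merging collapse automatically.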

Example

Quotient examples
Automaton	Diagram	Recognized language	Quotient of A by	Quotient of B by	Quotient of C by
A	(diagram)	1+10+100	-	-	-
B	(diagram)	1^*+1^*0+1^*00	a≈b	-	-
C	(diagram)	1^*0^*	a≈b, c≈d	c≈d	-
D	(diagram)	(0+1)^*	a≈b≈c≈d	a≈c≈d	a≈c

For example, the automaton A shown in the first row of the table[note 2] is formally defined by

Σ^A = {0,1},
S^A = {a,b,c,d},
s₀^A = a,
δ^A = { ⟨a,1,b⟩, ⟨b,0,c⟩, ⟨c,0,d⟩ }, and
S_f^A = { b,c,d }.

It recognizes the finite set of strings { 1, 10, 100 }; this set can also be denoted by the regular expression "1+10+100".

The relation (≈) = { ⟨a,a⟩, ⟨a,b⟩, ⟨b,a⟩, ⟨b,b⟩, ⟨c,c⟩, ⟨c,d⟩, ⟨d,c⟩, ⟨d,d⟩ }, more briefly denoted as a≈b,c≈d, is an equivalence relation on the set {a,b,c,d} of automaton A’s states. Building the quotient of A by that relation results in automaton C in the third table row; it is formally defined by

Σ^C = {0,1},
S^C = {a,c},[note 3]
s₀^C = a,
δ^C = { ⟨a,1,a⟩, ⟨a,0,c⟩, ⟨c,0,c⟩ }, and
S_f^C = { a,c }.

It recognizes the infinite set of all strings composed of arbitrarily many 1s, followed by arbitrarily many 0s, i.e. { ε, 1, 10, 100, 1000, ..., 11, 110, 1100, 11000, ..., 111, ... }; this set can also be denoted by the regular expression "1^*0^*". Informally, C can be thought of as resulting from A by gluing state a onto state b, and gluing state c onto state d.
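As a sanity check on the claimed language of C, the following sketch enumerates every string of length at most 6 that C recognizes and compares the result with Python's regular expression `1*0*`. The helper name `language_up_to` and the length bound are arbitrary choices for this illustration; a bounded comparison is evidence, not a proof, of equality.

```python
import re
from itertools import product

delta_C = {("a", "1", "a"), ("a", "0", "c"), ("c", "0", "c")}
start_C, finals_C = "a", {"a", "c"}


def language_up_to(delta, start, finals, max_len):
    """Collect all recognized words of length <= max_len by exploring runs."""
    frontier = {(start, "")}  # pairs (state reached, word read so far)
    found = set()
    for _ in range(max_len + 1):
        found |= {w for (s, w) in frontier if s in finals}
        frontier = {(t, w + a)
                    for (s, w) in frontier
                    for (p, a, t) in delta if p == s}
    return found


MAX_LEN = 6
recognized = language_up_to(delta_C, start_C, finals_C, MAX_LEN)
expected = {"".join(w)
            for n in range(MAX_LEN + 1)
            for w in product("01", repeat=n)
            if re.fullmatch(r"1*0*", "".join(w))}
print(recognized == expected)  # True
```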

The table shows some more quotient relations, such as B = A/_a≈b, and D = C/_a≈c.

Properties

For every automaton A and every equivalence relation ≈ on its set of states, L(A/_≈) is a superset of (or equal to) L(A).[2]^:6
Given a finite automaton A over some alphabet Σ, an equivalence relation ≈ can be defined on Σ^* by x ≈ y if ∀ z ∈ Σ^*: xz ∈ L(A) ↔ yz ∈ L(A). By the Myhill–Nerode theorem, A/_≈ is a deterministic automaton that recognizes the same language as A.[1]^:65–66 As a consequence, the quotient of A by every refinement of ≈ also recognizes the same language as A.
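The superset property can be checked experimentally on the example automaton A. The sketch below factors A by each of the 15 partitions of its four states and confirms that, among strings of length at most 5, nothing recognized by A is lost. The helper names, the bound, and the brute-force strategy are assumptions of this illustration; it tests one automaton up to a bounded length and does not replace the proof in the cited reference.

```python
def words_up_to(delta, start, finals, max_len):
    """All recognized words of length <= max_len, found by exploring runs."""
    frontier, found = {(start, "")}, set()
    for _ in range(max_len + 1):
        found |= {w for (s, w) in frontier if s in finals}
        frontier = {(t, w + a)
                    for (s, w) in frontier
                    for (p, a, t) in delta if p == s}
    return found


def partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        yield [[first]] + p                      # `first` in a block of its own
        for i in range(len(p)):                  # or added to an existing block
            yield p[:i] + [[first] + p[i]] + p[i + 1:]


states = ["a", "b", "c", "d"]
delta = {("a", "1", "b"), ("b", "0", "c"), ("c", "0", "d")}
finals = {"b", "c", "d"}
BOUND = 5
base = words_up_to(delta, "a", finals, BOUND)

checked = 0
for blocks in partitions(states):
    cls = {s: frozenset(b) for b in blocks for s in b}
    q_delta = {(cls[s], a, cls[t]) for (s, a, t) in delta}
    q_finals = {cls[s] for s in finals}
    q_words = words_up_to(q_delta, cls["a"], q_finals, BOUND)
    assert base <= q_words  # every word of L(A) survives in the quotient
    checked += 1
print(checked)  # 15
```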

Notes

Hopcroft and Ullman (sect. 2.3, p. 20) use a slightly different definition of δ, viz. as a function from S × Σ to the power set of S.
In the automaton diagrams in the table, symbols from the input alphabet and state names are colored in green and red, respectively; final states are drawn as double circles.
Strictly speaking, the set is S^C = { [a], [b], [c], [d] } = { [a], [c] }. The class brackets are omitted for readability.

References

John E. Hopcroft; Jeffrey D. Ullman (1979). Introduction to Automata Theory, Languages, and Computation. Reading/MA: Addison-Wesley. ISBN 0-201-02988-X.
Tristan le Gall and Bertrand Jeannet (Mar 2007). Analysis of Communicating Infinite State Machines Using Lattice Automata (PDF) (Publication Interne). Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA) — Campus Universitaire de Beaulieu. ISSN 1166-8687.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
