Kleene's algorithm

In theoretical computer science, in particular in formal language theory, Kleene's algorithm transforms a given deterministic finite automaton (DFA) into a regular expression. Together with other conversion algorithms, it establishes the equivalence of several description formats for regular languages.

Algorithm description

According to Gross and Yellen (2004),^[1] the algorithm can be traced back to Kleene (1956).^[2]

This description follows Hopcroft and Ullman (1979).^[3] Given a deterministic finite automaton M = (Q, Σ, δ, q₀, F), with Q = { q₀,...,q_n } its set of states, the algorithm computes

the sets R^k_{ij} of all strings that take M from state q_i to q_j without going through any state numbered higher than k.

Here, "going through a state" means entering and leaving it, so both i and j may be higher than k, but no intermediate state may. Each set R^k_{ij} is represented by a regular expression; the algorithm computes them step by step for k = -1, 0, ..., n. Since there is no state numbered higher than n, the regular expression R^n_{0j} represents the set of all strings that take M from its start state q₀ to q_j. If F = { q₁,...,q_f } is the set of accept states, the regular expression R^n_{01} | ... | R^n_{0f} represents the language accepted by M.

The initial regular expressions, for k = -1, are computed as

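R^{-1}_{ij} = a₁ | ... | a_m, if i≠j, where δ(q_i,a₁) = ... = δ(q_i,a_m) = q_j

R^{-1}_{ij} = a₁ | ... | a_m | ε, if i=j, where δ(q_i,a₁) = ... = δ(q_i,a_m) = q_j

Here a₁, ..., a_m are exactly the input symbols that lead from q_i to q_j. If there is no such symbol, R^{-1}_{ij} is ∅ for i≠j and ε for i=j.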

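After that, in each step the expressions R^k_{ij} are computed from the previous ones by

R^k_{ij} = R^{k-1}_{ik} (R^{k-1}_{kk})^* R^{k-1}_{kj} | R^{k-1}_{ij}

The first alternative describes the strings that go through q_k at least once: they lead from q_i to q_k, loop from q_k back to q_k any number of times, and finally lead from q_k to q_j. The second alternative describes the strings that do not go through q_k at all.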
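The base case and the recurrence translate directly into a table-filling procedure. The following Python sketch is one illustrative way to write it, not the formulation of Hopcroft and Ullman: the function name `kleene_tables`, the dictionary encoding of δ, and the use of `None` for ∅ are assumptions made for this example. States are the integers 0, ..., num_states - 1, so n = num_states - 1. Apart from dropping ∅ alternatives, no simplification is performed, so the expressions grow quickly.

```python
def kleene_tables(num_states, alphabet, delta):
    """Return the list [R^-1, R^0, ..., R^n] of expression tables.

    Each table maps (i, j) to a regular-expression string, or to None
    for the empty set. delta maps (state, symbol) to the next state.
    """
    states = range(num_states)

    # Base case k = -1: direct transitions, plus ε on the diagonal.
    table = {}
    for i in states:
        for j in states:
            symbols = [a for a in alphabet if delta.get((i, a)) == j]
            if i == j:
                symbols.append("ε")
            table[(i, j)] = "|".join(symbols) if symbols else None
    tables = [table]

    # Step k: additionally allow state k as an intermediate state.
    for k in states:
        prev, table = table, {}
        for i in states:
            for j in states:
                through_k = None
                if prev[(i, k)] is not None and prev[(k, j)] is not None:
                    # prev[(k, k)] always contains ε, so it is never None.
                    through_k = (
                        f"({prev[(i, k)]})({prev[(k, k)]})*({prev[(k, j)]})"
                    )
                direct = prev[(i, j)]
                if through_k is None:
                    table[(i, j)] = direct
                elif direct is None:
                    table[(i, j)] = through_k
                else:
                    table[(i, j)] = f"{through_k}|{direct}"
        tables.append(table)
    return tables
```

In this encoding, `tables[0]` holds the expressions for k = -1 and `tables[k + 1]` those for step k; the expression for the accepted language is the union of `tables[-1][(0, j)]` over all accept states j.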

By induction on k, it can be shown that the length^[4] of each expression R^k_{ij} is at most (4^{k+1}(6s+7) - 4)/3 symbols, where s denotes the number of characters in Σ. Therefore, the length of the regular expression representing the language accepted by M is at most (4^{n+1}(6s+7)f - f - 3)/3 symbols, where f denotes the number of final states.
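The closed form can be checked numerically against a worst-case recurrence. The recurrence used in the Python sketch below is an assumption about how the induction proceeds, not a quotation from the cited proof: a base expression a₁ | ... | a_s | ε has 2s+1 symbols, and each step combines four subexpressions using four operator symbols (two concatenations, one star, one union). The function names are chosen for this illustration.

```python
def worst_case_length(k, s):
    """Worst-case symbol count of R^k_ij under the assumed recurrence."""
    length = 2 * s + 1            # a_1 | ... | a_s | ε
    for _ in range(k + 1):        # steps 0, 1, ..., k
        length = 4 * length + 4   # four subexpressions, four operators
    return length


def stated_bound(k, s):
    """The bound (4^(k+1) (6s+7) - 4) / 3; the division is exact."""
    return (4 ** (k + 1) * (6 * s + 7) - 4) // 3


def accepted_language_bound(n, s, f):
    """The bound (4^(n+1) (6s+7) f - f - 3) / 3 for f accept states."""
    return (4 ** (n + 1) * (6 * s + 7) * f - f - 3) // 3
```

Under this assumption, the union of f worst-case expressions has f · worst_case_length(n, s) + (f - 1) symbols, which agrees with the second closed form.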

Example

Example DFA given to Kleene's algorithm

The automaton shown in the picture can be described as M = (Q, Σ, δ, q₀, F) with

the set of states Q = { q₀, q₁, q₂ },
the input alphabet Σ = { a, b },
the transition function δ with δ(q₀,a)=q₀, δ(q₀,b)=q₁, δ(q₁,a)=q₂, δ(q₁,b)=q₁, δ(q₂,a)=q₁, and δ(q₂,b)=q₁,
the start state q₀, and
the set of accept states F = { q₁ }.
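For experimentation, the automaton can be written down as plain data. The Python sketch below encodes δ as a dictionary and runs the automaton on a word; the names `DELTA`, `START`, `ACCEPT`, and `accepts` are chosen for this illustration, and states are represented by their indices.

```python
# Transition function of the example DFA: (state, symbol) -> next state.
DELTA = {
    (0, "a"): 0, (0, "b"): 1,
    (1, "a"): 2, (1, "b"): 1,
    (2, "a"): 1, (2, "b"): 1,
}
START = 0       # q0
ACCEPT = {1}    # F = { q1 }


def accepts(word):
    """Return True if the example DFA accepts the given word over {a, b}."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in ACCEPT
```

For instance, `accepts("aab")` follows q₀, q₀, q₀, q₁ and ends in the accept state, whereas `accepts("ba")` ends in q₂ and is rejected.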

Kleene's algorithm computes the initial regular expressions as

R⁻¹₀₀ = a | ε
R⁻¹₀₁ = b
R⁻¹₀₂ = ∅
R⁻¹₁₀ = ∅
R⁻¹₁₁ = b | ε
R⁻¹₁₂ = a
R⁻¹₂₀ = ∅
R⁻¹₂₁ = a | b
R⁻¹₂₂ = ε

After that, the R^k_{ij} are computed from the R^{k-1}_{ij} step by step for k = 0, 1, 2. Kleene algebra equalities are used to simplify the regular expressions as much as possible.
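Simplifications of this kind can be sanity-checked by comparing the languages of both sides on all words up to a fixed length. The Python sketch below does so for one equality from Step 0 and one from Step 1, representing a language as a finite set of strings; the helper names `concat` and `star` and the length limit `N` are assumptions of this example. Agreement up to a bounded length is supporting evidence, not a proof of the equality.

```python
def concat(left, right, max_len):
    """Concatenation of two languages, truncated to words of length <= max_len."""
    return {x + y for x in left for y in right if len(x) + len(y) <= max_len}


def star(lang, max_len):
    """Kleene star of a language, truncated to words of length <= max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest words by one more factor; keep only unseen words.
        frontier = concat(frontier, lang, max_len) - result
        result |= frontier
    return result


N = 6
A_OR_EPS = {"a", ""}    # a | ε
B_OR_EPS = {"b", ""}    # b | ε
A_STAR = star({"a"}, N)
B_STAR = star({"b"}, N)

# Step 0:  (a | ε) (a | ε)^* (a | ε) | a | ε  =  a^*
lhs0 = concat(concat(A_OR_EPS, star(A_OR_EPS, N), N), A_OR_EPS, N) | A_OR_EPS
rhs0 = A_STAR

# Step 1:  a^*b (b | ε)^* (b | ε) | a^* b  =  a^* b^* b
a_star_b = concat(A_STAR, {"b"}, N)
lhs1 = concat(concat(a_star_b, star(B_OR_EPS, N), N), B_OR_EPS, N) | a_star_b
rhs1 = concat(concat(A_STAR, B_STAR, N), {"b"}, N)

print(lhs0 == rhs0, lhs1 == rhs1)
```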

Step 0:

R⁰₀₀ = R⁻¹₀₀ (R⁻¹₀₀)^* R⁻¹₀₀ | R⁻¹₀₀ = (a | ε) (a | ε)^* (a | ε) | a | ε = a^*
R⁰₀₁ = R⁻¹₀₀ (R⁻¹₀₀)^* R⁻¹₀₁ | R⁻¹₀₁ = (a | ε) (a | ε)^* b | b = a^* b
R⁰₀₂ = R⁻¹₀₀ (R⁻¹₀₀)^* R⁻¹₀₂ | R⁻¹₀₂ = (a | ε) (a | ε)^* ∅ | ∅ = ∅
R⁰₁₀ = R⁻¹₁₀ (R⁻¹₀₀)^* R⁻¹₀₀ | R⁻¹₁₀ = ∅ (a | ε)^* (a | ε) | ∅ = ∅
R⁰₁₁ = R⁻¹₁₀ (R⁻¹₀₀)^* R⁻¹₀₁ | R⁻¹₁₁ = ∅ (a | ε)^* b | b | ε = b | ε
R⁰₁₂ = R⁻¹₁₀ (R⁻¹₀₀)^* R⁻¹₀₂ | R⁻¹₁₂ = ∅ (a | ε)^* ∅ | a = a
R⁰₂₀ = R⁻¹₂₀ (R⁻¹₀₀)^* R⁻¹₀₀ | R⁻¹₂₀ = ∅ (a | ε)^* (a | ε) | ∅ = ∅
R⁰₂₁ = R⁻¹₂₀ (R⁻¹₀₀)^* R⁻¹₀₁ | R⁻¹₂₁ = ∅ (a | ε)^* b | a | b = a | b
R⁰₂₂ = R⁻¹₂₀ (R⁻¹₀₀)^* R⁻¹₀₂ | R⁻¹₂₂ = ∅ (a | ε)^* ∅ | ε = ε

Step 1:

R¹₀₀ = R⁰₀₁ (R⁰₁₁)^* R⁰₁₀ | R⁰₀₀ = a^* b (b | ε)^* ∅ | a^* = a^*
R¹₀₁ = R⁰₀₁ (R⁰₁₁)^* R⁰₁₁ | R⁰₀₁ = a^* b (b | ε)^* (b | ε) | a^* b = a^* b^* b
R¹₀₂ = R⁰₀₁ (R⁰₁₁)^* R⁰₁₂ | R⁰₀₂ = a^* b (b | ε)^* a | ∅ = a^* b^* ba
R¹₁₀ = R⁰₁₁ (R⁰₁₁)^* R⁰₁₀ | R⁰₁₀ = (b | ε) (b | ε)^* ∅ | ∅ = ∅
R¹₁₁ = R⁰₁₁ (R⁰₁₁)^* R⁰₁₁ | R⁰₁₁ = (b | ε) (b | ε)^* (b | ε) | b | ε = b^*
R¹₁₂ = R⁰₁₁ (R⁰₁₁)^* R⁰₁₂ | R⁰₁₂ = (b | ε) (b | ε)^* a | a = b^* a
R¹₂₀ = R⁰₂₁ (R⁰₁₁)^* R⁰₁₀ | R⁰₂₀ = (a | b) (b | ε)^* ∅ | ∅ = ∅
R¹₂₁ = R⁰₂₁ (R⁰₁₁)^* R⁰₁₁ | R⁰₂₁ = (a | b) (b | ε)^* (b | ε) | a | b = (a | b) b^*
R¹₂₂ = R⁰₂₁ (R⁰₁₁)^* R⁰₁₂ | R⁰₂₂ = (a | b) (b | ε)^* a | ε = (a | b) b^* a | ε

Step 2:

R²₀₀ = R¹₀₂ (R¹₂₂)^* R¹₂₀ | R¹₀₀ = a^* b^* ba ((a | b) b^* a | ε)^* ∅ | a^* = a^*
R²₀₁ = R¹₀₂ (R¹₂₂)^* R¹₂₁ | R¹₀₁ = a^* b^* ba ((a | b) b^* a | ε)^* (a | b) b^* | a^* b^* b = a^* b (a (a | b) | b)^*
R²₀₂ = R¹₀₂ (R¹₂₂)^* R¹₂₂ | R¹₀₂ = a^* b^* ba ((a | b) b^* a | ε)^* ((a | b) b^* a | ε) | a^* b^* ba = a^* b^* b (a (a | b) b^*)^* a
R²₁₀ = R¹₁₂ (R¹₂₂)^* R¹₂₀ | R¹₁₀ = b^* a ((a | b) b^* a | ε)^* ∅ | ∅ = ∅
R²₁₁ = R¹₁₂ (R¹₂₂)^* R¹₂₁ | R¹₁₁ = b^* a ((a | b) b^* a | ε)^* (a | b) b^* | b^* = (a (a | b) | b)^*
R²₁₂ = R¹₁₂ (R¹₂₂)^* R¹₂₂ | R¹₁₂ = b^* a ((a | b) b^* a | ε)^* ((a | b) b^* a | ε) | b^* a = (a (a | b) | b)^* a
R²₂₀ = R¹₂₂ (R¹₂₂)^* R¹₂₀ | R¹₂₀ = ((a | b) b^* a | ε) ((a | b) b^* a | ε)^* ∅ | ∅ = ∅
R²₂₁ = R¹₂₂ (R¹₂₂)^* R¹₂₁ | R¹₂₁ = ((a | b) b^* a | ε) ((a | b) b^* a | ε)^* (a | b) b^* | (a | b) b^* = (a | b) (a (a | b) | b)^*
R²₂₂ = R¹₂₂ (R¹₂₂)^* R¹₂₂ | R¹₂₂ = ((a | b) b^* a | ε) ((a | b) b^* a | ε)^* ((a | b) b^* a | ε) | (a | b) b^* a | ε = ((a | b) b^* a)^*

Since q₀ is the start state and q₁ is the only accept state, the regular expression R²₀₁ denotes the set of all strings accepted by the automaton.
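The simplified expressions of Step 2 can be cross-checked against the automaton by brute force. The Python sketch below transcribes them into the syntax of Python's `re` module and compares, for every word over {a, b} up to a chosen length, whether the word matches entry (i, j) with whether it takes the automaton from q_i to q_j. The names `DELTA`, `R2`, `run`, and `mismatches` are assumptions of this example, `None` stands for ∅, and a bounded check of this kind does not replace a proof.

```python
import re
from itertools import product

# Transition function of the example DFA: (state, symbol) -> next state.
DELTA = {
    (0, "a"): 0, (0, "b"): 1,
    (1, "a"): 2, (1, "b"): 1,
    (2, "a"): 1, (2, "b"): 1,
}

# Simplified Step 2 expressions, written for Python's re module.
R2 = {
    (0, 0): "a*",
    (0, 1): "a*b(a(a|b)|b)*",
    (0, 2): "a*b*b(a(a|b)b*)*a",
    (1, 0): None,
    (1, 1): "(a(a|b)|b)*",
    (1, 2): "(a(a|b)|b)*a",
    (2, 0): None,
    (2, 1): "(a|b)(a(a|b)|b)*",
    (2, 2): "((a|b)b*a)*",
}


def run(state, word):
    """Return the state reached from the given state after reading word."""
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state


def mismatches(max_len):
    """List all (i, j, word) on which expression and automaton disagree."""
    bad = []
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            word = "".join(letters)
            for (i, j), regex in R2.items():
                in_regex = regex is not None and re.fullmatch(regex, word) is not None
                if in_regex != (run(i, word) == j):
                    bad.append((i, j, word))
    return bad
```

An empty result from `mismatches(8)` means that, on all words of length at most 8, entry (0, 1) matches exactly the words accepted from the start state, and the other entries agree with the corresponding state-to-state behaviour.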

References

[1] Jonathan L. Gross and Jay Yellen, ed. (2004). Handbook of Graph Theory. Discrete Mathematics and Its Applications. CRC Press. ISBN 1-58488-090-2. Here: sect.2.1, remark R13 on p.65
[2] Kleene, Stephen C. (1956). "Representation of Events in Nerve Nets and Finite Automata". Automata Studies, Annals of Math. Studies. Princeton Univ. Press. 34. Here: sect.9, p.37-40
[3] John E. Hopcroft, Jeffrey D. Ullman (1979). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley. ISBN 0-201-02988-X. Here: Theorem 2.4, p.33-34
[4] More precisely, the number of regular-expression symbols, "a_i", "ε", "|", "^*", "·"; not counting parentheses.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
