Non-deterministic finite automaton

Kapitoly: Finite automaton, Total automaton, Non-deterministic finite automaton, NKA Simulation, Converting NKA to DKA

Thefinite automaton introduced in the previous chapters (abbreviated KA or DKA) was always deterministic, which was manifested by the fact that it was clear what the automaton was going to do at any moment. We now look at the more general concept of nondeterministic finite automata (abbreviated NKA), in which there can be transitions that start from the same state and are for the same symbol.

Example

Look at the following diagram of an automaton:

What is special about it? That there are two edges leading from the node q₀ for the symbol 0. Thus, if we try to accept the word 001, it would seem that the automaton would not know whether to stay in the state q₀ or whether to follow the transition to q₁. Such an automaton would be non-deterministic.

Determinism in this sense means that in every situation it is clear what the automaton will do, there is no room for ambiguity. A deterministic automaton has only one transition for each state and each symbol. Non-determinism means ambiguity - there may be three transitions from one state for the same symbol.

So how does the automaton decide? A non-deterministic automaton always chooses (sometimes we also say "guesses") the transition that will lead it to accept the word; if possible.

If we put the word 001 into the automaton, the automaton has the following options for what to do with the word:

\begin{eqnarray} q_0 \rightarrow q_0 \rightarrow q_0 \rightarrow q_0 \rightarrow q_0 \rightarrow q_1 \rightarrow q_2 \end{eqnarray}

plus options where it doesn't read the whole word. In the first one, it still stays in the q₀ state, which it certainly can, the transition rules allow it. In the second branch, he ventured into other states and eventually reached the state q₂, which is terminal. In this branch, the automaton would accept the word. A non-deterministic automaton will always automatically choose the branch in which it accepts the word, if such a branch exists.

How does it automatically choose the "favorable" branch? We can think of it as a nondeterministic automaton traversing all possible branches, and if it finds a branch in which it accepts the word, then it accepts the word. How could it go through all the branches? In short, the moment it is in the state from which it leads for the current n edge symbol, the automaton will n-copy itself a few times, and each copy will take a different path. This will go through all the possibilities one by one.

Tree representation of the automaton computation

We modify the previous automaton slightly:

Non-determinism remains in the first state. The previous automaton accepted all words ending in 01. This modified automaton accepts all words that contain the subword 01. So it accepts, for example, the word 101 or 0101. We can easily visualize all the branches of the computation using a tree, e.g. for the word 01010 the tree might look like this:

At the beginning we are in the initial state q₀. In the input we have the word 01010, the first unread character is 0. From the state q₀ there are two transitions for the symbol 0, to the states q₀ and q₁. So the tree will have two children at this node: q₀ and q₁. Next we read the symbol 1. There is no ambiguity there, we move to the states q₀ and q₂. Next we read symbol 0, again there is ambiguity in the left branch, we split the computation into two more branches. In one we stay again in q₀ and in the other we go to q₁. And so on.

Note that a total of two branches end in q₂, the end state. There are two branches of computation, as the automaton can accept a given word. So this automaton would accept the word 01010. We can try to create a similar tree for the word 1000.

The computation either stays in the state q₀ or goes to the state q₁, but from there it has nowhere else to go because the input is already all zeros and the state q₁ has a transition only for symbol 1. No branch of the computation ends in the end state, so the automaton does not accept the word 1000.

Epsilon transitions

In non-deterministic automata we can additionally use epsilon transitions. These are transitions that we can perform whatever symbol is on the input without reading any symbol from the word. How could we write a non-deterministic automaton that would accept a blank word, all words consisting of only ones, and words of the form 10ⁿ, i.e. 10, 1010, ...?

We split the automaton into two "sub-automata". The upper one takes care of recognizing words consisting of 1, the lower one takes care of words of the form 10ⁿ. Let's try to accept the word 111. The automaton is in the state q₀ at the beginning and takes the word 111 as input. Since it has epsilon-transitions, it can choose to go to the state q₁ or q₂ as it sees fit. We see that it will be convenient for him to transition to the state q₁. The automaton will go there. Yet the input still has the word 111! It is only at this point that it will start reading symbols from the word and go from the q₁ state back to q₁, when it reads the whole word and accepts it.

If we tried to accept 1010, the automaton would go to the state q₂ at the beginning and only from there it would start processing the symbols and accept the word again.

For the word 1100 the automaton could do anything, it has no way to accept such a word.

Epsilon transitions and non-determinism make our job much easier in many ways, for example the proof that regular language unification is again a regular language can be written much more uniquely in epsilon transitions than in the case of finite automata. In this proof we had examples of such automata

and we further constructed a finite deterministic automaton that accepted the unification of both languages. With epsilon transitions we can construct such an automaton as follows:

In short, we add a new initial state, from which two epsilon transitions lead to the initial states of the original automata, and that's it. The automaton decides non-deterministically whether to try to accept the word with one or the other "sub-automata".

Definition of a non-deterministic automaton

We have already shown how deterministic and non-deterministic finite automata differ, it only increases to formalize it. We have defined a deterministic finite automaton as a quintuple $\left<Q, \Sigma, \delta, q_0, F\right>$, where Q is the set of states, $\Sigma$ is the alphabet, δ is the transition function, q₀ is the initial state, and F is the set of final states. The only thing that changes in the definition of non-deterministic automata is the definition of the transition function.

In the deterministic version, the function for calling δ(q, w) could return only one state, while in the non-deterministic version it can return multiple states - if multiple transitions are defined for a single symbol. Since we need to define epsilon-transitions as well, we mark the set $\Sigma\cup\left\{\varepsilon\right\}$ as the set $\Sigma_\varepsilon$, i.e. $\Sigma_\varepsilon=\Sigma\cup\left\{\varepsilon\right\}$. Thus, we define the function δ as

$$ \delta: Q \times \Sigma_\varepsilon\rightarrow 2^Q $$

That 2^Q denotes the potent set, the set of all subsets of the set Q. The formalization of the computation is very similar to that of finite automata. The configuration of the automaton is again the pair $\left<q, w\right> \in Q \times \Sigma^\ast$. The step of computing $\mapsto$ is a session on the set of configurations such that

$$ \left<q_i, w_0w_1\dots w_n\right> \mapsto \left<q_j, w_1\dots w_n\right>, $$

precisely if q_j ∈ δ(q_i, w₀). In this theorem was the biggest difference. In deterministic automata we had this condition written as q_j = δ(q_i, w₀), because the function δ always returned a single state. The nondeterministic version of the transition function returns a set of states, hence the symbol ∈. We will again denote the reflexive and transitive session closure by $\mapsto^\ast$. We will say that an automaton accepts the word w just if there exists q_f ∈ F such that

$$\left<q_0, w\right> \mapsto^\ast \left<q_f, \varepsilon\right>.$$

Resources

« Previous: Total automaton

Next: NKA Simulation »