Non-deterministic finite automaton

Kapitoly: Finite automaton, Total automaton, Non-deterministic finite automaton, NKA Simulation, Converting NKA to DKA

Thefinite automaton introduced in the previous chapters (abbreviated KA or DKA) was always deterministic, which was manifested by the fact that it was clear what the automaton was going to do at any moment. We now look at the more general concept of nondeterministic finite automata (abbreviated NKA), in which there can be transitions that start from the same state and are for the same symbol.

Example

Look at the following diagram of an automaton:

What is special about it? That there are two edges leading from the node q0 for the symbol 0. Thus, if we try to accept the word 001, it would seem that the automaton would not know whether to stay in the state q0 or whether to follow the transition to q1. Such an automaton would be non-deterministic.

Determinism in this sense means that in every situation it is clear what the automaton will do, there is no room for ambiguity. A deterministic automaton has only one transition for each state and each symbol. Non-determinism means ambiguity - there may be three transitions from one state for the same symbol.

So how does the automaton decide? A non-deterministic automaton always chooses (sometimes we also say "guesses") the transition that will lead it to accept the word; if possible.

If we put the word 001 into the automaton, the automaton has the following options for what to do with the word:

\begin{eqnarray} q_0 \rightarrow q_0 \rightarrow q_0 \rightarrow q_0 \rightarrow q_0 \rightarrow q_1 \rightarrow q_2 \end{eqnarray}

plus options where it doesn't read the whole word. In the first one, it still stays in the q0 state, which it certainly can, the transition rules allow it. In the second branch, he ventured into other states and eventually reached the state q2, which is terminal. In this branch, the automaton would accept the word. A non-deterministic automaton will always automatically choose the branch in which it accepts the word, if such a branch exists.

How does it automatically choose the "favorable" branch? We can think of it as a nondeterministic automaton traversing all possible branches, and if it finds a branch in which it accepts the word, then it accepts the word. How could it go through all the branches? In short, the moment it is in the state from which it leads for the current n edge symbol, the automaton will n-copy itself a few times, and each copy will take a different path. This will go through all the possibilities one by one.

Tree representation of the automaton computation

We modify the previous automaton slightly:

Non-determinism remains in the first state. The previous automaton accepted all words ending in 01. This modified automaton accepts all words that contain the subword 01. So it accepts, for example, the word 101 or 0101. We can easily visualize all the branches of the computation using a tree, e.g. for the word 01010 the tree might look like this:

At the beginning we are in the initial state q0. In the input we have the word 01010, the first unread character is 0. From the state q0 there are two transitions for the symbol 0, to the states q0 and q1. So the tree will have two children at this node: q0 and q1. Next we read the symbol 1. There is no ambiguity there, we move to the states q0 and q2. Next we read symbol 0, again there is ambiguity in the left branch, we split the computation into two more branches. In one we stay again in q0 and in the other we go to q1. And so on.

Note that a total of two branches end in q2, the end state. There are two branches of computation, as the automaton can accept a given word. So this automaton would accept the word 01010. We can try to create a similar tree for the word 1000.

The computation either stays in the state q0 or goes to the state q1, but from there it has nowhere else to go because the input is already all zeros and the state q1 has a transition only for symbol 1. No branch of the computation ends in the end state, so the automaton does not accept the word 1000.

Epsilon transitions

In non-deterministic automata we can additionally use epsilon transitions. These are transitions that we can perform whatever symbol is on the input without reading any symbol from the word. How could we write a non-deterministic automaton that would accept a blank word, all words consisting of only ones, and words of the form 10n, i.e. 10, 1010, ...?

We split the automaton into two "sub-automata". The upper one takes care of recognizing words consisting of 1, the lower one takes care of words of the form 10n. Let's try to accept the word 111. The automaton is in the state q0 at the beginning and takes the word 111 as input. Since it has epsilon-transitions, it can choose to go to the state q1 or q2 as it sees fit. We see that it will be convenient for him to transition to the state q1. The automaton will go there. Yet the input still has the word 111! It is only at this point that it will start reading symbols from the word and go from the q1 state back to q1, when it reads the whole word and accepts it.

If we tried to accept 1010, the automaton would go to the state q2 at the beginning and only from there it would start processing the symbols and accept the word again.

For the word 1100 the automaton could do anything, it has no way to accept such a word.

Epsilon transitions and non-determinism make our job much easier in many ways, for example the proof that regular language unification is again a regular language can be written much more uniquely in epsilon transitions than in the case of finite automata. In this proof we had examples of such automata

a

and we further constructed a finite deterministic automaton that accepted the unification of both languages. With epsilon transitions we can construct such an automaton as follows:

In short, we add a new initial state, from which two epsilon transitions lead to the initial states of the original automata, and that's it. The automaton decides non-deterministically whether to try to accept the word with one or the other "sub-automata".

Definition of a non-deterministic automaton

We have already shown how deterministic and non-deterministic finite automata differ, it only increases to formalize it. We have defined a deterministic finite automaton as a quintuple $\left<Q, \Sigma, \delta, q_0, F\right>$, where Q is the set of states, $\Sigma$ is the alphabet, δ is the transition function, q0 is the initial state, and F is the set of final states. The only thing that changes in the definition of non-deterministic automata is the definition of the transition function.

In the deterministic version, the function for calling δ(q, w) could return only one state, while in the non-deterministic version it can return multiple states - if multiple transitions are defined for a single symbol. Since we need to define epsilon-transitions as well, we mark the set $\Sigma\cup\left\{\varepsilon\right\}$ as the set $\Sigma_\varepsilon$, i.e. $\Sigma_\varepsilon=\Sigma\cup\left\{\varepsilon\right\}$. Thus, we define the function δ as

$$ \delta: Q \times \Sigma_\varepsilon\rightarrow 2^Q $$

That 2Q denotes the potent set, the set of all subsets of the set Q. The formalization of the computation is very similar to that of finite automata. The configuration of the automaton is again the pair $\left<q, w\right> \in Q \times \Sigma^\ast$. The step of computing $\mapsto$ is a session on the set of configurations such that

$$ \left<q_i, w_0w_1\dots w_n\right> \mapsto \left<q_j, w_1\dots w_n\right>, $$

precisely if qj ∈ δ(qi, w0). In this theorem was the biggest difference. In deterministic automata we had this condition written as qj = δ(qi, w0), because the function δ always returned a single state. The nondeterministic version of the transition function returns a set of states, hence the symbol . We will again denote the reflexive and transitive session closure by $\mapsto^\ast$. We will say that an automaton accepts the word w just if there exists qf ∈ F such that

$$\left<q_0, w\right> \mapsto^\ast \left<q_f, \varepsilon\right>.$$

Resources