Nondeterministic finite automaton

NFA for $(0|1)^* 1 (0|1)^3$. A DFA for that language has at least 16 states.

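In automata theory, a finite-state machine is called a deterministic finite automaton (DFA) if

  • each of its transitions is uniquely determined by its source state and input symbol, and
  • reading an input symbol is required for each state transition.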

A nondeterministic finite automaton (NFA), or nondeterministic finite-state machine, does not need to obey these restrictions. In particular, every DFA is also an NFA. Sometimes the term NFA is used in a narrower sense, referring to an NFA that is not a DFA, but not in this article.

Using the subset construction algorithm, each NFA can be translated to an equivalent DFA; i.e., a DFA recognizing the same formal language. [1] Like DFAs, NFAs only recognize regular languages.

NFAs were introduced in 1959 by Michael O. Rabin and Dana Scott, [2] who also showed their equivalence to DFAs. NFAs are used in the implementation of regular expressions: Thompson's construction is an algorithm for compiling a regular expression to an NFA that can efficiently perform pattern matching on strings. Conversely, Kleene's algorithm can be used to convert an NFA into a regular expression (whose size is generally exponential in the input automaton).

NFAs have been generalized in multiple ways, e.g., nondeterministic finite automata with ε-moves, finite-state transducers, pushdown automata, alternating automata, ω-automata, and probabilistic automata. Besides the DFAs, other known special cases of NFAs are unambiguous finite automata (UFA) and self-verifying finite automata (SVFA).

Informal introduction

There are two ways to describe the behavior of an NFA, and both of them are equivalent. The first way makes use of the nondeterminism in the name of an NFA. For each input symbol, the NFA transitions to a new state until all input symbols have been consumed. In each step, the automaton nondeterministically "chooses" one of the applicable transitions. If there exists at least one "lucky run", i.e. some sequence of choices leading to an accepting state after completely consuming the input, it is accepted. Otherwise, i.e. if no choice sequence at all can consume all the input [3] and lead to an accepting state, the input is rejected. [4] :19 [5] :319

In the second way, the NFA consumes a string of input symbols, one by one. In each step, whenever two or more transitions are applicable, it "clones" itself into appropriately many copies, each one following a different transition. If no transition is applicable, the current copy is in a dead end, and it "dies". If, after consuming the complete input, any of the copies is in an accept state, the input is accepted, else, it is rejected. [4] :19–20 [6] :48 [7] :56

Formal definition

For a more elementary introduction of the formal definition, see automata theory.

Automaton

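An NFA is represented formally by a 5-tuple, $(Q, \Sigma, \delta, q_0, F)$, consisting of

  • a finite set of states $Q$,
  • a finite set of input symbols $\Sigma$,
  • a transition function $\delta : Q \times \Sigma \to \mathcal{P}(Q)$,
  • an initial (or start) state $q_0 \in Q$, and
  • a set of accepting (or final) states $F \subseteq Q$.

Here, $\mathcal{P}(Q)$ denotes the power set of $Q$.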

Recognized language

Given an NFA $M = (Q, \Sigma, \delta, q_0, F)$, its recognized language is denoted by $L(M)$, and is defined as the set of all strings over the alphabet $\Sigma$ that are accepted by $M$.

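Loosely corresponding to the above informal explanations, there are several equivalent formal definitions of a string $w = a_1 a_2 \ldots a_n$ being accepted by $M$:

  • $w$ is accepted if a sequence of states, $r_0, r_1, \ldots, r_n$, exists in $Q$ such that:
      1. $r_0 = q_0$,
      2. $r_{i+1} \in \delta(r_i, a_{i+1})$, for $i = 0, \ldots, n-1$, and
      3. $r_n \in F$.
    In words, the first condition says that the machine starts in the start state $q_0$. The second condition says that given each character of string $w$, the machine will transition from state to state according to the transition function $\delta$. The last condition says that the machine accepts $w$ if the last input of $w$ causes the machine to halt in one of the accepting states. In order for $w$ to be accepted by $M$, it is not required that every state sequence ends in an accepting state; it is sufficient if one does. Otherwise, i.e. if it is impossible at all to get from $q_0$ to a state from $F$ by following $w$, it is said that the automaton rejects the string. The set of strings $M$ accepts is the language recognized by $M$ and this language is denoted by $L(M)$. [5] :320 [6] :54
  • Alternatively, $w$ is accepted if $\delta^*(q_0, w) \cap F \neq \emptyset$, where $\delta^* : Q \times \Sigma^* \to \mathcal{P}(Q)$ is defined recursively by:
      1. $\delta^*(r, \varepsilon) = \{r\}$, where $\varepsilon$ is the empty string, and
      2. $\delta^*(r, xa) = \bigcup_{r' \in \delta^*(r, x)} \delta(r', a)$ for all $x \in \Sigma^*$, $a \in \Sigma$.
    In words, $\delta^*(r, x)$ is the set of all states reachable from state $r$ by consuming the string $x$. The string $w$ is accepted if some accepting state in $F$ can be reached from the start state $q_0$ by consuming $w$. [4] :21 [7] :59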
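
The set-of-reachable-states definition above translates almost directly into code. The following is a minimal sketch, not a reference implementation: the dictionary encoding of the transition function, the function names `delta_star` and `accepts`, and the three-state example automaton (strings over {a, b} that contain "ab") are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# Assumed encoding: (state, symbol) -> set of next states; missing keys mean "no transition".
Delta = Dict[Tuple[str, str], Set[str]]


def delta_star(delta: Delta, state: str, word: str) -> Set[str]:
    """Return the set of states reachable from `state` by consuming `word`."""
    current = {state}  # base case: the empty string leaves the automaton in {state}
    for symbol in word:
        # recursive case: unite delta(r, symbol) over every state r reached so far
        current = set().union(*(delta.get((r, symbol), set()) for r in current))
    return current


def accepts(delta: Delta, start: str, accepting: Set[str], word: str) -> bool:
    """A word is accepted if some accepting state is reachable from the start state."""
    return bool(delta_star(delta, start, word) & accepting)


# Hypothetical NFA over {a, b} accepting the strings that contain "ab".
EXAMPLE_DELTA: Delta = {
    ("s", "a"): {"s", "t"},
    ("s", "b"): {"s"},
    ("t", "b"): {"u"},
    ("u", "a"): {"u"},
    ("u", "b"): {"u"},
}

print(accepts(EXAMPLE_DELTA, "s", {"u"}, "bab"))  # True
print(accepts(EXAMPLE_DELTA, "s", {"u"}, "ba"))   # False
```

Tracking one set of states per input symbol avoids enumerating individual runs, whose number can grow exponentially with the input length.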

Initial state

The above automaton definition uses a single initial state, which is not necessary. Sometimes, NFAs are defined with a set of initial states. There is an easy construction that translates an NFA with multiple initial states to an NFA with a single initial state, which provides a convenient notation.
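
The text does not spell out the construction, so the following sketch shows one possible way to do it, under stated assumptions: a fresh start state (here hypothetically named `q_new`) copies the outgoing transitions of all original initial states and is made accepting if any of them was accepting. The dictionary encoding and all names are illustrative.

```python
def single_initial(delta, initials, accepting, alphabet, new_start="q_new"):
    """Replace a set of initial states by one fresh initial state (one possible construction)."""
    new_delta = {key: set(targets) for key, targets in delta.items()}
    for symbol in alphabet:
        targets = set()
        for q in initials:
            targets |= delta.get((q, symbol), set())
        if targets:
            # the new start state mimics the first step of every old initial state
            new_delta[(new_start, symbol)] = targets
    new_accepting = set(accepting)
    if set(initials) & set(accepting):
        # the empty string was accepted iff some initial state was accepting
        new_accepting.add(new_start)
    return new_delta, new_start, new_accepting


def accepts(delta, start, accepting, word):
    """Simulate an NFA (single start state) by tracking the set of current states."""
    current = {start}
    for symbol in word:
        current = set().union(*(delta.get((r, symbol), set()) for r in current))
    return bool(current & accepting)


# Hypothetical NFA with initial states {x, y}: from x it accepts a*, from y it accepts "b".
delta = {("x", "a"): {"x"}, ("y", "b"): {"z"}}
d1, s1, f1 = single_initial(delta, {"x", "y"}, {"x", "z"}, "ab")
print([w for w in ["", "aa", "b", "ab", "ba"] if accepts(d1, s1, f1, w)])
```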

Example

The state diagram for M. It is not deterministic since in state p reading a 1 can lead to p or to q.
All possible runs of M on input string "10".
All possible runs of M on input string "1011". Arc label: input symbol, node label: state, green: start state, red: accepting state(s).

The following automaton $M$, with a binary alphabet, determines if the input ends with a 1. Let $M = (\{p, q\}, \{0, 1\}, \delta, p, \{q\})$, where the transition function $\delta$ can be defined by this state transition table (cf. upper left picture):

State   Input 0   Input 1
p       {p}       {p, q}
q       {}        {}

Since the set $\delta(p, 1)$ contains more than one state, $M$ is nondeterministic. The language of $M$ can be described by the regular language given by the regular expression (0|1)*1.

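All possible state sequences for the input string "1011" are shown in the lower picture. The string is accepted by $M$ since one state sequence satisfies the above definition; it does not matter that other sequences fail to do so. The picture can be interpreted in a couple of ways:

  • In terms of the above "lucky-run" explanation, each path in the picture denotes a sequence of choices of $M$.
  • In terms of the "cloning" explanation, each vertical column shows all clones of $M$ at a given point in time; multiple arrows emanating from a node indicate cloning, and a node without emanating arrows indicates the "death" of a clone.

The possibility of reading the same picture in two ways also indicates the equivalence of both above explanations.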

In contrast, the string "10" is rejected by $M$ (all possible state sequences for that input are shown in the upper right picture), since there is no way to reach the only accepting state, $q$, by reading the final 0 symbol. While $q$ can be reached after consuming the initial "1", this does not mean that the input "10" is accepted; rather, it means that an input string "1" would be accepted.
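
The runs discussed in this example can be enumerated mechanically. The sketch below assumes the automaton described above (state p loops on 0 and 1, moves to q on 1, and q has no outgoing transitions); the function names `runs` and `accepting_runs` and the dictionary encoding are illustrative choices, not part of the source.

```python
# Transition function of the example automaton, as described in the text.
DELTA = {("p", "0"): {"p"}, ("p", "1"): {"p", "q"}}
START, ACCEPTING = "p", {"q"}


def runs(state, word):
    """Yield every state sequence on `word`, including those that hit a dead end."""
    if not word:
        yield [state]
        return
    successors = DELTA.get((state, word[0]), set())
    if not successors:
        yield [state]  # dead end: input remains but no transition applies
        return
    for nxt in sorted(successors):
        for rest in runs(nxt, word[1:]):
            yield [state] + rest


def accepting_runs(word):
    """Keep only the runs that consume the whole input and end in an accepting state."""
    return [r for r in runs(START, word)
            if len(r) == len(word) + 1 and r[-1] in ACCEPTING]


print(list(runs(START, "1011")))   # four runs, three of them unsuccessful
print(accepting_runs("1011"))      # [['p', 'p', 'p', 'p', 'q']]
print(accepting_runs("10"))        # []
```

Each yielded list corresponds to one path in the run pictures; a list shorter than the input length plus one is a clone that "died".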

Equivalence to DFA

A deterministic finite automaton (DFA) can be seen as a special kind of NFA, in which for each state and symbol, the transition function yields exactly one state. Thus, it is clear that every formal language that can be recognized by a DFA can be recognized by an NFA.

Conversely, for each NFA, there is a DFA such that it recognizes the same formal language. The DFA can be constructed using the powerset construction.

This result shows that NFAs, despite their additional flexibility, are unable to recognize languages that cannot be recognized by some DFA. It is also important in practice for converting easier-to-construct NFAs into more efficiently executable DFAs. However, if the NFA has $n$ states, the resulting DFA may have up to $2^n$ states, which sometimes makes the construction impractical for large NFAs.
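
As an illustration of the powerset construction and of the state blow-up, the sketch below builds only the reachable part of the DFA, using frozensets of NFA states as DFA states. The function name, the dictionary encoding, and the five-state example NFA (strings over {0, 1} whose fourth-to-last symbol is 1) are assumptions chosen for this example.

```python
from collections import deque


def powerset_construction(delta, start, accepting, alphabet):
    """Return (states, transitions, start, accepting) of the reachable subset DFA."""
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        subset = queue.popleft()
        for symbol in alphabet:
            # the DFA successor is the union of the NFA successors of all members
            target = frozenset(t for s in subset for t in delta.get((s, symbol), ()))
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # a subset is accepting if it contains at least one accepting NFA state
    dfa_accepting = {s for s in seen if s & accepting}
    return seen, dfa_delta, start_set, dfa_accepting


# Hypothetical 5-state NFA: state 0 guesses that the current 1 is fourth from the end.
NFA_DELTA = {
    (0, "0"): {0}, (0, "1"): {0, 1},
    (1, "0"): {2}, (1, "1"): {2},
    (2, "0"): {3}, (2, "1"): {3},
    (3, "0"): {4}, (3, "1"): {4},
}
dfa_states, dfa_delta, dfa_start, dfa_accepting = powerset_construction(
    NFA_DELTA, 0, {4}, "01")
print(len(dfa_states))  # 16
```

Here 5 NFA states yield 16 reachable DFA states, below the worst-case bound of $2^5 = 32$ but already more than three times the size of the NFA.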

NFA with ε-moves

A nondeterministic finite automaton with ε-moves (NFA-ε) is a further generalization of the NFA. In this kind of automaton, the transition function is additionally defined on the empty string ε. A transition without consuming an input symbol is called an ε-transition and is represented in state diagrams by an arrow labeled "ε". ε-transitions provide a convenient way of modeling systems whose current states are not precisely known: i.e., if we are modeling a system and it is not clear whether the current state (after processing some input string) should be q or q', then we can add an ε-transition between these two states, thus putting the automaton in both states simultaneously.

Formal definition

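An NFA-ε is represented formally by a 5-tuple, $(Q, \Sigma, \delta, q_0, F)$, consisting of

  • a finite set of states $Q$,
  • a finite set of input symbols called the alphabet $\Sigma$,
  • a transition function $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to \mathcal{P}(Q)$,
  • an initial (or start) state $q_0 \in Q$, and
  • a set of accepting (or final) states $F \subseteq Q$.

Here, $\mathcal{P}(Q)$ denotes the power set of $Q$ and $\varepsilon$ denotes the empty string.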

ε-closure of a state or set of states

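For a state $q \in Q$, let $E(q)$ denote the set of states that are reachable from $q$ by following ε-transitions in the transition function $\delta$, i.e., $p \in E(q)$ if there is a sequence of states $q_1, \ldots, q_k$ such that

  • $q_1 = q$,
  • $q_{i+1} \in \delta(q_i, \varepsilon)$ for each $1 \le i < k$, and
  • $q_k = p$.

$E(q)$ is known as the epsilon closure (also ε-closure) of $q$.

The ε-closure of a set $P$ of states of an NFA is defined as the set of states reachable from any state in $P$ following ε-transitions. Formally, for $P \subseteq Q$, define $E(P) = \bigcup_{q \in P} E(q)$.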
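
Computing an ε-closure amounts to a graph search along ε-transitions. The following minimal sketch assumes the transition function is stored as a dictionary and that the empty string `""` stands for ε; both the encoding and the example states are hypothetical.

```python
EPS = ""  # assumed marker for ε in the transition table


def epsilon_closure(delta, states):
    """Return all states reachable from `states` by zero or more ε-transitions."""
    closure = set(states)   # every state belongs to its own ε-closure
    stack = list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, EPS), set()) - closure:
            closure.add(p)
            stack.append(p)
    return closure


# Hypothetical ε-transitions: a -> b -> c -> a (a cycle), while d has none.
delta = {("a", EPS): {"b"}, ("b", EPS): {"c"}, ("c", EPS): {"a"}}
print(sorted(epsilon_closure(delta, {"a"})))       # ['a', 'b', 'c']
print(sorted(epsilon_closure(delta, {"d"})))       # ['d']
```

Passing a set of states computes the closure of the whole set in one search, which equals the union of the closures of its members.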

Extended transition function

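Similar to NFA without ε-moves, the transition function $\delta$ of an NFA-ε can be extended to strings. Informally, $\delta^*(q, w)$ denotes the set of all states the automaton may have reached when starting in state $q \in Q$ and reading the string $w \in \Sigma^*$. The function $\delta^* : Q \times \Sigma^* \to \mathcal{P}(Q)$ can be defined recursively as follows.

  • $\delta^*(q, \varepsilon) = E(q)$, for each state $q \in Q$, where $E$ denotes the epsilon closure.
    Informally: Reading the empty string may drive the automaton from state $q$ to any state of the epsilon closure of $q$.
  • $\delta^*(q, wa) = \bigcup_{r \in \delta^*(q, w)} E(\delta(r, a))$, for each string $w \in \Sigma^*$ and each symbol $a \in \Sigma$.
    Informally: Reading the string $w$ may drive the automaton from state $q$ to any state $r$ in the recursively computed set $\delta^*(q, w)$; after that, reading the symbol $a$ may drive it from $r$ to any state in the epsilon closure of $\delta(r, a)$.

The automaton is said to accept a string $w$ if

$\delta^*(q_0, w) \cap F \neq \emptyset$,

that is, if reading $w$ may drive the automaton from its start state $q_0$ to some accepting state in $F$. [4] :25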

Example

The state diagram for M

Let $M$ be an NFA-ε, with a binary alphabet, that determines if the input contains an even number of 0s or an even number of 1s. Note that 0 occurrences is an even number of occurrences as well.

In formal notation, let

$M = (\{S_0, S_1, S_2, S_3, S_4\}, \{0, 1\}, \delta, S_0, \{S_1, S_3\})$

where

the transition relation $\delta$ can be defined by this state transition table:

State   Input 0   Input 1   Input ε
S0      {}        {}        {S1, S3}
S1      {S2}      {S1}      {}
S2      {S1}      {S2}      {}
S3      {S3}      {S4}      {}
S4      {S4}      {S3}      {}

$M$ can be viewed as the union of two DFAs: one with states $\{S_1, S_2\}$ and the other with states $\{S_3, S_4\}$. The language of $M$ can be described by the regular language given by the regular expression $(1^*01^*0)^*1^* \cup (0^*10^*1)^*0^*$. We define $M$ using ε-moves, but $M$ can also be defined without using ε-moves.
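
The transition table above can be executed directly. The sketch below encodes it as a dictionary (with `""` standing for ε), assumes start state S0 and accepting states S1 and S3 as implied by the description, and simulates the automaton by alternating symbol steps with ε-closures. All function names are illustrative.

```python
EPS = ""  # assumed marker for ε
DELTA = {
    ("S0", EPS): {"S1", "S3"},
    ("S1", "0"): {"S2"}, ("S1", "1"): {"S1"},
    ("S2", "0"): {"S1"}, ("S2", "1"): {"S2"},
    ("S3", "0"): {"S3"}, ("S3", "1"): {"S4"},
    ("S4", "0"): {"S4"}, ("S4", "1"): {"S3"},
}
START, ACCEPTING = "S0", {"S1", "S3"}


def closure(states):
    """ε-closure of a set of states."""
    result, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p in DELTA.get((q, EPS), set()) - result:
            result.add(p)
            stack.append(p)
    return result


def accepts(word):
    """Follow the extended transition function symbol by symbol."""
    current = closure({START})            # reading the empty string
    for symbol in word:
        moved = set()
        for r in current:
            moved |= DELTA.get((r, symbol), set())
        current = closure(moved)          # close under ε after each symbol
    return bool(current & ACCEPTING)


print(accepts("100"))  # True: the number of 0s is even
print(accepts("10"))   # False: one 0 and one 1, both counts odd
```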

Equivalence to NFA

To show NFA-ε is equivalent to NFA, first note that NFA is a special case of NFA-ε, so it remains to show for every NFA-ε, there exists an equivalent NFA.

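Given an NFA with epsilon moves $M = (Q, \Sigma, \delta, q_0, F)$, define an NFA $M' = (Q, \Sigma, \delta', q_0, F')$, where

$$F' = \begin{cases} F \cup \{q_0\} & \text{if } E(q_0) \cap F \neq \{\} \\ F & \text{otherwise} \end{cases}$$

and

$\delta'(q, a) = \delta^*(q, a)$

for each state $q \in Q$ and each symbol $a \in \Sigma$, using the extended transition function $\delta^*$ defined above.

One has to distinguish the transition functions of $M$ and $M'$, viz. $\delta$ and $\delta'$, and their extensions to strings, $\delta^*$ and $\delta'^*$, respectively. By construction, $M'$ has no ε-transitions.

One can prove that $\delta'^*(q_0, w) = \delta^*(q_0, w)$ for each string $w \neq \varepsilon$, by induction on the length of $w$.

Based on this, one can show that $\delta'^*(q_0, w) \cap F' \neq \{\}$ if, and only if, $\delta^*(q_0, w) \cap F \neq \{\}$, for each string $w \in \Sigma^*$:

  • If $w = \varepsilon$, this follows from the definition of $F'$.
  • Otherwise, let $w = va$ with $v \in \Sigma^*$ and $a \in \Sigma$. From $\delta'^*(q_0, w) = \delta^*(q_0, w)$ and $F \subseteq F'$, we have $\delta'^*(q_0, w) \cap F' \neq \{\} \Leftarrow \delta^*(q_0, w) \cap F \neq \{\}$; we still have to show the "$\Rightarrow$" direction.
      • If $\delta'^*(q_0, w)$ contains a state in $F' \setminus \{q_0\}$, then $\delta^*(q_0, w)$ contains the same state, which lies in $F$.
      • If $\delta'^*(q_0, w)$ contains $q_0$, and $q_0 \in F$, then $\delta^*(q_0, w)$ also contains a state in $F$, viz. $q_0$.
      • If $\delta'^*(q_0, w)$ contains $q_0$, and $q_0 \notin F$, then $q_0$ can only belong to $F'$ because $E(q_0) \cap F \neq \{\}$. Since $q_0 \in \delta^*(q_0, w) = \bigcup_{r \in \delta^*(q_0, v)} E(\delta(r, a))$ and an ε-closure contains every state reachable from its members by further ε-transitions, the state in $E(q_0) \cap F$ must be in $\delta^*(q_0, w)$ as well. [4] :26–27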

Since NFA is equivalent to DFA, NFA-ε is also equivalent to DFA.
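
The ε-elimination described in this section can be sketched in a few lines. The code below assumes the reading that the new transition on a symbol collects every state reachable by ε-moves, then the symbol, then ε-moves again, and that the start state becomes accepting when its ε-closure contains an accepting state. The encoding (`""` for ε), the function names, and the example automaton (even number of 0s or even number of 1s) are assumptions for illustration.

```python
EPS = ""  # assumed marker for ε


def closure(delta, states):
    """ε-closure of a set of states."""
    result, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, EPS), set()) - result:
            result.add(p)
            stack.append(p)
    return result


def remove_epsilon(states, alphabet, delta, start, accepting):
    """Build the transition function and accepting set of an ε-free NFA."""
    new_delta = {}
    for q in states:
        for a in alphabet:
            targets = set()
            for r in closure(delta, {q}):          # ε-moves before the symbol
                targets |= closure(delta, delta.get((r, a), set()))  # symbol, then ε-moves
            if targets:
                new_delta[(q, a)] = targets
    new_accepting = set(accepting)
    if closure(delta, {start}) & accepting:
        new_accepting.add(start)                   # keeps the empty string accepted
    return new_delta, new_accepting


def accepts_nfa(delta, start, accepting, word):
    """Simulate an ε-free NFA by tracking the set of current states."""
    current = {start}
    for a in word:
        current = set().union(*(delta.get((r, a), set()) for r in current))
    return bool(current & accepting)


STATES = ["S0", "S1", "S2", "S3", "S4"]
DELTA = {
    ("S0", EPS): {"S1", "S3"},
    ("S1", "0"): {"S2"}, ("S1", "1"): {"S1"},
    ("S2", "0"): {"S1"}, ("S2", "1"): {"S2"},
    ("S3", "0"): {"S3"}, ("S3", "1"): {"S4"},
    ("S4", "0"): {"S4"}, ("S4", "1"): {"S3"},
}
NEW_DELTA, NEW_ACCEPTING = remove_epsilon(STATES, "01", DELTA, "S0", {"S1", "S3"})
print(sorted(NEW_DELTA[("S0", "0")]))  # ['S2', 'S3']
print(sorted(NEW_ACCEPTING))           # ['S0', 'S1', 'S3']
```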

Closure properties

Composed NFA accepting the union of the languages of some given NFAs N(s) and N(t). For an input string w in the language union, the composed automaton follows an ε-transition from q to the start state (left colored circle) of an appropriate subautomaton, N(s) or N(t), which, by following w, may reach an accepting state (right colored circle); from there, state f can be reached by another ε-transition. Due to the ε-transitions, the composed NFA is properly nondeterministic even if both N(s) and N(t) were DFAs; conversely, constructing a DFA for the union language (even of two DFAs) is much more complicated.

The set of languages recognized by NFAs is closed under the following operations:

  • Union (cf. picture)
  • Intersection
  • Concatenation
  • Negation
  • Kleene closure

The constructions for union, concatenation, and Kleene closure are used in Thompson's construction algorithm, which constructs an NFA from any regular expression. The closure properties can also be used to prove that NFAs recognize exactly the regular languages.

Since NFAs are equivalent to nondeterministic finite automata with ε-moves (NFA-ε), the above closures are proved using closure properties of NFA-ε.
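
The union construction from the picture can be sketched as follows: a new start state q gets ε-transitions to the start states of both subautomata, and each of their accepting states gets an ε-transition to a new final state f. The tuple encoding of an automaton, the tagging of states to keep the two copies disjoint, and the two tiny example automata are assumptions made for this illustration.

```python
EPS = ""  # assumed marker for ε


def union_nfa(n1, n2):
    """Compose two automata given as (delta, start, accepting) into one NFA-ε."""
    delta = {}
    for tag, (d, start, accepting) in enumerate((n1, n2)):
        for (s, a), targets in d.items():
            # copy each subautomaton, tagging its states so the copies stay disjoint
            delta.setdefault(((tag, s), a), set()).update((tag, t) for t in targets)
        delta.setdefault(("q", EPS), set()).add((tag, start))
        for s in accepting:
            delta.setdefault(((tag, s), EPS), set()).add("f")
    return delta, "q", {"f"}


def closure(delta, states):
    """ε-closure of a set of states."""
    result, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, EPS), set()) - result:
            result.add(p)
            stack.append(p)
    return result


def accepts(nfa, word):
    """Simulate an NFA-ε given as (delta, start, accepting)."""
    delta, start, accepting = nfa
    current = closure(delta, {start})
    for a in word:
        moved = set()
        for r in current:
            moved |= delta.get((r, a), set())
        current = closure(delta, moved)
    return bool(current & accepting)


# Hypothetical subautomata: N_S accepts exactly "a", N_T accepts one or more b's.
N_S = ({(0, "a"): {1}}, 0, {1})
N_T = ({(0, "b"): {1}, (1, "b"): {1}}, 0, {1})
N_UNION = union_nfa(N_S, N_T)
print([w for w in ["", "a", "b", "bb", "ab"] if accepts(N_UNION, w)])  # ['a', 'b', 'bb']
```

Both subautomata here are deterministic, yet the composed automaton is not, matching the remark in the caption.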

Properties

The machine starts in the specified initial state and reads in a string of symbols from its alphabet. The automaton uses the state transition function $\delta$ to determine the next state using the current state, and the symbol just read or the empty string. However, "the next state of an NFA depends not only on the current input event, but also on an arbitrary number of subsequent input events. Until these subsequent events occur it is not possible to determine which state the machine is in". [8] If, when the automaton has finished reading, at least one of its possible runs has ended in an accepting state, the NFA is said to accept the string; otherwise, it is said to reject the string.

The set of all strings accepted by an NFA is the language the NFA accepts. This language is a regular language.

For every NFA a deterministic finite automaton (DFA) can be found that accepts the same language. Therefore, it is possible to convert an existing NFA into a DFA for the purpose of implementing a (perhaps) simpler machine. This can be performed using the powerset construction, which may lead to an exponential rise in the number of necessary states. For a formal proof of the powerset construction, please see the Powerset construction article.

Implementation

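There are many ways to implement an NFA:

  • Convert it to the equivalent DFA. In some cases this may cause an exponential blowup in the number of states. [9]
  • Keep a set data structure of all states the NFA might currently be in. On consuming an input symbol, unite the results of the transition function applied to all current states to get the set of next states; if ε-moves are allowed, include all states reachable by such moves (the ε-closure). After the last input symbol, the machine accepts if one of the current states is an accepting state.
  • Create multiple copies. Wherever several transitions are applicable, the NFA is cloned, one copy per transition; the input is accepted if, after the last input symbol, at least one copy is in an accepting state.
  • Explicitly propagate tokens through the transition structure of the NFA and match whenever a token reaches a final state. [10]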

Complexity

Application of NFA

NFAs and DFAs are equivalent in that if a language is recognized by an NFA, it is also recognized by a DFA and vice versa. The establishment of such equivalence is important and useful. It is useful because constructing an NFA to recognize a given language is sometimes much easier than constructing a DFA for that language. It is important because NFAs can be used to reduce the complexity of the mathematical work required to establish many important properties in the theory of computation. For example, it is much easier to prove closure properties of regular languages using NFAs than DFAs.

Notes

  1. Martin, John (2010). Introduction to Languages and the Theory of Computation. McGraw Hill. p. 108. ISBN 978-0071289429.
  2. Rabin, M. O.; Scott, D. (April 1959). "Finite Automata and Their Decision Problems". IBM Journal of Research and Development. 3 (2): 114–125. doi:10.1147/rd.32.0114.
  3. A choice sequence may lead into a "dead end" where no transition is applicable for the current input symbol; in this case it is considered unsuccessful.
  4. John E. Hopcroft and Jeffrey D. Ullman (1979). Introduction to Automata Theory, Languages, and Computation. Reading/MA: Addison-Wesley. ISBN 0-201-02988-X.
  5. Alfred V. Aho and John E. Hopcroft and Jeffrey D. Ullman (1974). The Design and Analysis of Computer Algorithms. Reading/MA: Addison-Wesley. ISBN 0-201-00029-6.
  6. Michael Sipser (1997). Introduction to the Theory of Computation. Boston/MA: PWS Publishing Co. ISBN 0-534-94728-X.
  7. John E. Hopcroft and Rajeev Motwani and Jeffrey D. Ullman (2003). Introduction to Automata Theory, Languages, and Computation (PDF). Upper Saddle River/NJ: Addison Wesley. ISBN 0-201-44124-1.
  8. FOLDOC Free Online Dictionary of Computing, Finite-State Machine
  9. Chris Calabro (February 27, 2005). "NFA to DFA blowup" (PDF). cseweb.ucsd.edu. Retrieved 6 March 2023.
  10. Allan, C., Avgustinov, P., Christensen, A. S., Hendren, L., Kuzins, S., Lhoták, O., de Moor, O., Sereni, D., Sittampalam, G., and Tibble, J. 2005. Adding trace matching with free variables to AspectJ. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object Oriented Programming, Systems, Languages, and Applications (San Diego, CA, USA, October 16–20, 2005). OOPSLA '05. ACM, New York, NY, 345–364.
  11. Historically shown in: Meyer, A. R.; Stockmeyer, L. J. (1972-10-25). "The equivalence problem for regular expressions with squaring requires exponential space". Proceedings of the 13th Annual Symposium on Switching and Automata Theory (SWAT). USA: IEEE Computer Society: 125–129. doi:10.1109/SWAT.1972.29. For a modern presentation, see
  12. Álvarez, Carme; Jenner, Birgit (1993-01-04). "A very hard log-space counting class". Theoretical Computer Science. 107 (1): 3–30. doi:10.1016/0304-3975(93)90252-O. ISSN 0304-3975.
