Unambiguous finite automaton

In automata theory, an unambiguous finite automaton (UFA) is a nondeterministic finite automaton (NFA) such that each word has at most one accepting path. Every deterministic finite automaton (DFA) is a UFA, but not vice versa. DFAs, UFAs, and NFAs recognize exactly the same class of formal languages. On the one hand, an NFA can be exponentially smaller than an equivalent DFA. On the other hand, some problems are easily solved on DFAs and not on UFAs. For example, given an automaton A, an automaton A′ which accepts the complement of the language of A can be computed in linear time when A is a DFA, whereas it is known that this cannot be done in polynomial time for UFAs. Hence UFAs are a mix of the worlds of DFAs and NFAs; in some cases, they lead to smaller automata than DFAs and quicker algorithms than NFAs.

Formal definition

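An NFA is represented formally by a 5-tuple, A = (Q, Σ, Δ, q₀, F), where Q is the finite set of states, Σ the alphabet, Δ the transition function, q₀ the initial state and F the set of accepting states. A UFA is an NFA such that, for each word w = a₁a₂…aₙ, there exists at most one sequence of states r₀, r₁, …, rₙ in Q with the following conditions:

  1. r₀ = q₀;
  2. rᵢ₊₁ ∈ Δ(rᵢ, aᵢ₊₁) for i = 0, …, n−1;
  3. rₙ ∈ F.

In words, those conditions state that, if w is accepted by A, there is exactly one accepting path, that is, one path from an initial state to a final state that is labelled by w.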
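
As an illustration of the definition, the number of accepting paths of a word can be counted by propagating path counts letter by letter; an NFA is a UFA exactly when this count never exceeds 1. The Python sketch below assumes a hypothetical encoding that is not part of the definition: transitions are a dict from (state, letter) to a set of successor states, there is a single initial state, and there are no ε-transitions. The two example automata are made up for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical encoding: (state, letter) -> set of successor states.
Delta = Dict[Tuple[str, str], Set[str]]


def count_accepting_paths(delta: Delta, start: str, finals: Set[str], word: str) -> int:
    """Return the number of distinct accepting paths labelled by `word`."""
    # paths[q] = number of paths from `start` to q labelled by the prefix read so far
    paths = {start: 1}
    for letter in word:
        nxt: Dict[str, int] = {}
        for q, k in paths.items():
            for r in delta.get((q, letter), set()):
                nxt[r] = nxt.get(r, 0) + k
        paths = nxt
    return sum(k for q, k in paths.items() if q in finals)


# Illustrative NFA in which the word "ab" has two accepting paths (ambiguous).
TWO_PATHS: Delta = {("s", "a"): {"p", "q"}, ("p", "b"): {"f"}, ("q", "b"): {"f"}}

# Illustrative NFA for "the word ends in ab"; each accepted word has one accepting path.
ENDS_IN_AB: Delta = {("s", "a"): {"s", "p"}, ("s", "b"): {"s"}, ("p", "b"): {"f"}}

if __name__ == "__main__":
    print(count_accepting_paths(TWO_PATHS, "s", {"f"}, "ab"))
    print(count_accepting_paths(ENDS_IN_AB, "s", {"f"}, "abab"))
```

Counting paths for individual words only tests the condition on those words; it does not by itself establish that an automaton is unambiguous.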

Example

Let L be the set of words over the alphabet {a,b} whose nth last letter is an a. The figures show a DFA and a UFA accepting this language for n=2.

[Figure: Deterministic automaton (DFA) for the language L for n=2]
[Figure: Unambiguous finite automaton (UFA) for the language L for n=2]

The minimal DFA accepting L has 2ⁿ states, one for each subset of {1...n}. There is a UFA with n+1 states which accepts L: it guesses the nth last letter, and then verifies that only n−1 letters remain. It is indeed unambiguous, as there exists only one nth last letter.
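
The gap between the two sizes can be observed experimentally. The following Python sketch builds the (n+1)-state UFA described above and counts the subsets reached by the powerset construction. The state numbering 0..n, the dict encoding of transitions and the function names are assumptions of this sketch, and counting reachable subsets only gives the size of one particular DFA, not a proof of minimality.

```python
def nth_last_ufa(n):
    """UFA with states 0..n for 'the n-th last letter is a' over {a, b}."""
    # State 0 loops on every letter and, on an 'a', may guess that this is the n-th last letter.
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    # States 1..n-1 simply count the remaining n-1 letters.
    for i in range(1, n):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta, 0, {n}


def reachable_subsets(delta, start, alphabet="ab"):
    """Reachable states of the DFA produced by the powerset construction."""
    init = frozenset({start})
    seen = {init}
    todo = [init]
    while todo:
        subset = todo.pop()
        for letter in alphabet:
            nxt = frozenset(r for q in subset for r in delta.get((q, letter), ()))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


if __name__ == "__main__":
    for n in range(1, 6):
        delta, start, finals = nth_last_ufa(n)
        # Columns: n, number of UFA states, number of reachable DFA states
        print(n, n + 1, len(reachable_subsets(delta, start)))
```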

Inclusion, universality, equivalence

Three problems that are PSPACE-hard for general NFAs but belong to PTIME for DFAs are now considered: inclusion, universality and equivalence.

Inclusion

It is decidable in polynomial time whether a UFA's language is a subset of another UFA's language.

Sketch of the proof of inclusion

Let A and B be two UFAs. Let L(A) and L(B) be the languages accepted by those automata. Then L(A)⊆L(B) if and only if L(A∩B)=L(A), where A∩B denotes the Cartesian product automaton, which accepts L(A)∩L(B) and can be proven to be also unambiguous. Now, L(A∩B) is a subset of L(A) by construction; hence both sets are equal if and only if, for each length n, the number of words of length n in L(A∩B) is equal to the number of words of length n in L(A). It can be proved that it is sufficient to check each n up to the product of the number of states of A and B.

The number of words of length n accepted by an automaton can be computed in polynomial time using dynamic programming, which ends the proof. [1]
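
The proof sketch above can be turned into a small Python sketch. It assumes a hypothetical encoding: each automaton is a tuple (transitions, initial state, accepting states, states), both inputs are already known to be unambiguous, and there are no ε-transitions. For a UFA, counting accepting paths of length n is the same as counting accepted words of length n, which is what the dynamic programme does. The loop bound used here is slightly larger than the product bound quoted in the text; that margin is a conservative choice of this sketch, not part of the cited proof.

```python
def count_words(delta, start, finals, alphabet, length):
    """Number of accepting paths with exactly `length` letters.

    For a UFA this equals the number of accepted words of that length.
    """
    paths = {start: 1}
    for _ in range(length):
        nxt = {}
        for q, k in paths.items():
            for letter in alphabet:
                for r in delta.get((q, letter), ()):
                    nxt[r] = nxt.get(r, 0) + k
        paths = nxt
    return sum(k for q, k in paths.items() if q in finals)


def product_automaton(a, b, alphabet):
    """Cartesian product automaton, accepting the intersection of both languages."""
    (da, sa, fa, qa), (db, sb, fb, qb) = a, b
    delta = {}
    for p in qa:
        for q in qb:
            for letter in alphabet:
                succ = {(p2, q2)
                        for p2 in da.get((p, letter), ())
                        for q2 in db.get((q, letter), ())}
                if succ:
                    delta[((p, q), letter)] = succ
    finals = {(p, q) for p in fa for q in fb}
    return delta, (sa, sb), finals


def ufa_included(a, b, alphabet):
    """Decide L(a) ⊆ L(b), assuming both a and b are unambiguous."""
    da, sa, fa, qa = a
    dp, sp, fp = product_automaton(a, b, alphabet)
    # Conservative bound (assumption of this sketch): product size plus the size of a.
    bound = len(qa) * len(b[3]) + len(qa)
    return all(
        count_words(dp, sp, fp, alphabet, n) == count_words(da, sa, fa, alphabet, n)
        for n in range(bound + 1)
    )


# Illustrative UFAs: words ending in "ab", and words ending in "b".
ENDS_IN_AB = ({("s", "a"): {"s", "p"}, ("s", "b"): {"s"}, ("p", "b"): {"f"}},
              "s", {"f"}, {"s", "p", "f"})
ENDS_IN_B = ({("x", "a"): {"x"}, ("x", "b"): {"x", "y"}},
             "x", {"y"}, {"x", "y"})

if __name__ == "__main__":
    print(ufa_included(ENDS_IN_AB, ENDS_IN_B, "ab"))
    print(ufa_included(ENDS_IN_B, ENDS_IN_AB, "ab"))
```

Recomputing the counts from scratch for every length keeps the sketch short; an implementation concerned with speed would advance both count vectors one step at a time instead.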

Universality, equivalence

The problems of universality [note 1] and of equivalence [note 2] also belong to PTIME, by reduction to the inclusion problem.
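
For universality, the reduction amounts to comparing a UFA with a one-state automaton accepting every word, so it suffices to compare the number of accepted words of each length n with |Σ|ⁿ. A minimal Python sketch, assuming the input is already known to be unambiguous and using a hypothetical dict encoding of transitions; the three-state example automaton is made up for illustration:

```python
def count_words(delta, start, finals, alphabet, length):
    """Number of accepting paths of a given length (= accepted words, for a UFA)."""
    paths = {start: 1}
    for _ in range(length):
        nxt = {}
        for q, k in paths.items():
            for letter in alphabet:
                for r in delta.get((q, letter), ()):
                    nxt[r] = nxt.get(r, 0) + k
        paths = nxt
    return sum(k for q, k in paths.items() if q in finals)


def ufa_is_universal(delta, start, finals, states, alphabet):
    """Decide whether a UFA accepts every word over `alphabet`.

    Lengths 0..len(states) are checked, i.e. up to the size of the product
    with a one-state automaton accepting all words.
    """
    return all(
        count_words(delta, start, finals, alphabet, n) == len(alphabet) ** n
        for n in range(len(states) + 1)
    )


# Illustrative UFA: on each 'a' it guesses whether that letter is the last one ("f") or not ("v").
GUESS_LAST_A = {
    ("u", "a"): {"v", "f"}, ("u", "b"): {"u"},
    ("v", "a"): {"v", "f"}, ("v", "b"): {"u"},
}

if __name__ == "__main__":
    states = {"u", "v", "f"}
    # Accepting in "u" or "f": every word is accepted.
    print(ufa_is_universal(GUESS_LAST_A, "u", {"u", "f"}, states, "ab"))
    # Accepting in "f" only: just the words ending in 'a'.
    print(ufa_is_universal(GUESS_LAST_A, "u", {"f"}, states, "ab"))
```

Equivalence of two UFAs can then be decided with two inclusion tests, one in each direction.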

Checking whether an automaton is unambiguous

For a nondeterministic finite automaton A with n states and an m-letter alphabet, it is decidable in time O(n²m) whether A is unambiguous. [2]

Sketch of the proof of unambiguity

It suffices to use a fixpoint algorithm to compute the set of pairs of states q and q' such that there exists a word w which leads to q and to q' along two distinct paths (q and q' may coincide). The automaton is unambiguous if and only if there is no such pair in which both states are accepting. There are Θ(n²) state pairs, and for each pair there are m letters to consider to resume the fixpoint algorithm, hence the computation time.
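
A minimal Python sketch of such a fixpoint computation follows. It assumes a hypothetical encoding (transitions as a dict from (state, letter) to a set of successors, a single initial state, no ε-transitions). In addition to the pair of states, it records whether the two paths have already differed, so that two distinct paths ending in the same accepting state are also detected; this flag is a choice of this sketch. The sketch makes no attempt to match the running time stated above.

```python
def is_unambiguous(delta, start, finals, alphabet):
    """Return True if no word has two distinct accepting paths."""
    # A triple (p, q, split) means: some word labels two paths from `start`,
    # one ending in p and one in q; `split` is True once the paths have differed.
    first = (start, start, False)
    seen = {first}
    todo = [first]
    while todo:
        p, q, split = todo.pop()
        if split and p in finals and q in finals:
            return False  # two distinct accepting paths for the same word
        for letter in alphabet:
            for p2 in delta.get((p, letter), ()):
                for q2 in delta.get((q, letter), ()):
                    triple = (p2, q2, split or p2 != q2)
                    if triple not in seen:
                        seen.add(triple)
                        todo.append(triple)
    return True


# Illustrative automata: the first accepts "ab" along two paths, the second is a UFA.
TWO_PATHS = {("s", "a"): {"p", "q"}, ("p", "b"): {"f"}, ("q", "b"): {"f"}}
ENDS_IN_AB = {("s", "a"): {"s", "p"}, ("s", "b"): {"s"}, ("p", "b"): {"f"}}

if __name__ == "__main__":
    print(is_unambiguous(TWO_PATHS, "s", {"f"}, "ab"))
    print(is_unambiguous(ENDS_IN_AB, "s", {"f"}, "ab"))
```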

Some properties

State complexity

Mathematical proofs that every UFA for a language needs a certain number of states were pioneered by Schmidt. [4] Leung established the number of states that a DFA equivalent to an n-state UFA requires in the worst case, as well as the number of states that a UFA equivalent to a finitely ambiguous [note 3] n-state NFA requires in the worst case. [5]

Jirásek, Jirásková and Šebej [6] studied the state complexity of basic regular operations on languages represented by UFAs. In particular, they proved an upper bound on the number of states of a UFA accepting the complement of the language of an n-state UFA. This bound was later improved by Indzhev and Kiefer. [7]

Raskin [8] showed that UFAs cannot be complemented in polynomial time, even into NFAs: in the worst case, complementing a UFA with n states into an NFA requires a superpolynomial number of states. This lower bound was later improved by Göös, Kiefer, and Yuan. [9]

For a one-letter alphabet, Okhotin determined the number of states that a DFA equivalent to an n-state UFA requires in the worst case. [10]

Notes

  1. i.e.: given a UFA, does it accept every string of Σ*?
  2. i.e.: given two UFAs, do they accept the same set of strings?
  3. Having finitely many accepting paths for every accepted word.

References

  1. Christof Löding, Unambiguous Finite Automata, Slide 8.
  2. Sakarovitch, Jacques; Thomas, Reuben (October 2009). Elements of Automata Theory. Cambridge: Cambridge University Press. p. 75. ISBN 978-0-521-84425-3.
  3. Christof Löding, Unambiguous Finite Automata, Slide 8.
  4. Schmidt, Erik M. (1978). Succinctness of Description of Context-Free, Regular and Unambiguous Languages (Ph.D.). Cornell University.
  5. Leung, Hing (2005). "Descriptional complexity of NFA of different ambiguity". International Journal of Foundations of Computer Science. 16 (5): 975–984. doi:10.1142/S0129054105003418. ISSN 0129-0541.
  6. Jirásek, Jozef; Jirásková, Galina; Šebej, Juraj (2016). "Operations on Unambiguous Finite Automata". Developments in Language Theory. Lecture Notes in Computer Science. Vol. 9840. pp. 243–255. doi:10.1007/978-3-662-53132-7_20. ISBN 978-3-662-53131-0. ISSN 0302-9743.
  7. Indzhev, Emil; Kiefer, Stefan (2021). "On Complementing Unambiguous Automata and Graphs With Many Cliques and Cocliques". arXiv:2105.07470 [cs.FL].
  8. Raskin, Mikhail (2018). "A Superpolynomial Lower Bound for the Size of Non-Deterministic Complement of an Unambiguous Automaton". Schloss-Dagstuhl - Leibniz Zentrum für Informatik. doi:10.4230/LIPIcs.ICALP.2018.138.
  9. Göös, Mika; Kiefer, Stefan; Yuan, Weiqiang (2022). "Lower Bounds for Unambiguous Automata via Communication Complexity". Schloss-Dagstuhl - Leibniz Zentrum für Informatik. doi:10.4230/LIPIcs.ICALP.2022.126.
  10. Okhotin, Alexander (2012). "Unambiguous finite automata over a unary alphabet". Information and Computation. 212: 15–36. doi:10.1016/j.ic.2012.01.003. ISSN 0890-5401.