Tree stack automaton

Last updated December 21, 2024

A tree stack automaton^[a] (plural: tree stack automata) is a formalism considered in automata theory. It is a finite-state automaton with the additional ability to manipulate a tree-shaped stack. It is an automaton with storage^[2] whose storage roughly resembles the configurations of a thread automaton. A restricted class of tree stack automata recognises exactly the languages generated by multiple context-free grammars^[3] (or linear context-free rewriting systems).

Definition

Tree stack

For a finite and non-empty set $Γ$, a tree stack over $Γ$ is a tuple $(t, p)$ where

$t$ is a partial function from strings of positive integers to the set $Γ \cup \{@\}$ with prefix-closed^[b] domain (called tree),
$@$ (called bottom symbol) is not in $Γ$ and appears exactly at the root of $t$, and
$p$ is an element of the domain of $t$ (called stack pointer).

The set of all tree stacks over $Γ$ is denoted by $TS(Γ)$.
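The three conditions of this definition can be checked mechanically. The following is a minimal sketch, assuming a tree is represented as a Python dictionary from tuples of positive integers to symbols, with the empty tuple standing for the root position $ε$. The names `Position`, `BOTTOM`, and `is_tree_stack` are illustrative and do not come from the literature.

```python
from typing import Dict, Set, Tuple

# Assumption: a position is a tuple of positive integers; () stands for ε (the root).
Position = Tuple[int, ...]
# Assumption: the bottom symbol @ is encoded as the string "@".
BOTTOM = "@"


def is_tree_stack(t: Dict[Position, str], p: Position, gamma: Set[str]) -> bool:
    """Return True if (t, p) is a tree stack over the stack alphabet gamma."""
    # Γ must be non-empty and must not contain the bottom symbol.
    if not gamma or BOTTOM in gamma:
        return False
    for pos, sym in t.items():
        # Positions are strings of positive integers.
        if any(n < 1 for n in pos):
            return False
        # Prefix-closed domain: checking the parent of every position suffices,
        # because the parent is itself checked in another iteration.
        if pos and pos[:-1] not in t:
            return False
        # @ appears at the root and nowhere else.
        if (sym == BOTTOM) != (pos == ()):
            return False
        # Every non-root position carries a symbol from Γ.
        if pos and sym not in gamma:
            return False
    # The root must exist and the stack pointer must lie in the domain of t.
    return () in t and p in t
```

For example, `is_tree_stack({(): "@", (1,): "a", (1, 2): "b"}, (1, 2), {"a", "b"})` holds, whereas dropping the entry for `(1,)` violates prefix-closedness.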

The set of predicates on $TS(Γ)$, denoted by $Pred(Γ)$, contains the following unary predicates:

$true$, which is true for any tree stack over $Γ$,
$bottom$, which is true for tree stacks whose stack pointer points to the bottom symbol, and
$equals(γ)$, which is true for a tree stack $(t, p)$ if $t(p) = γ$,

for every $γ \in Γ$.
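The predicates can be sketched as ordinary Boolean functions. The snippet below assumes the same hypothetical encoding as elsewhere in this article's examples: a tree stack is a pair of a dictionary (the tree) and a tuple of integers (the stack pointer), and `"@"` encodes the bottom symbol. The function names mirror the predicate names but are otherwise illustrative.

```python
from typing import Callable, Dict, Tuple

Position = Tuple[int, ...]                        # () stands for ε
TreeStack = Tuple[Dict[Position, str], Position]  # the pair (t, p)
Predicate = Callable[[TreeStack], bool]
BOTTOM = "@"                                      # assumed encoding of the bottom symbol


def true(c: TreeStack) -> bool:
    """Holds for every tree stack."""
    return True


def bottom(c: TreeStack) -> bool:
    """Holds if the stack pointer points to the bottom symbol."""
    t, p = c
    return t[p] == BOTTOM


def equals(gamma: str) -> Predicate:
    """Return the predicate that holds if t(p) = gamma."""
    def predicate(c: TreeStack) -> bool:
        t, p = c
        return t[p] == gamma
    return predicate
```

Since $@$ appears only at the root, `bottom` is equivalent to testing whether the stack pointer is `()`.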

The set of instructions on $TS(Γ)$, denoted by $Instr(Γ)$, contains the following partial functions:

$id: TS(Γ) \to TS(Γ)$ which is the identity function on $TS(Γ)$,
$push_{n,γ}: TS(Γ) \to TS(Γ)$ which adds for a given tree stack $(t, p)$ a pair $(pn \mapsto γ)$ to the tree $t$ and sets the stack pointer to $pn$ (i.e. it pushes $γ$ to the $n$-th child position) if $pn$ is not yet in the domain of $t$,
$up_n: TS(Γ) \to TS(Γ)$ which replaces the current stack pointer $p$ by $pn$ (i.e. it moves the stack pointer to the $n$-th child position) if $pn$ is in the domain of $t$,
$down: TS(Γ) \to TS(Γ)$ which removes the last symbol from the stack pointer (i.e. it moves the stack pointer to the parent position), and
$set_γ: TS(Γ) \to TS(Γ)$ which replaces the symbol currently under the stack pointer by $γ$,

for every positive integer $n$ and every $γ \in Γ$.
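The instructions can be sketched as partial functions that return `None` where they are undefined. The encoding (dictionary plus pointer tuple) and all function names are assumptions made for illustration. The sketch also assumes that $set_γ$ is undefined at the root, so that $@$ keeps appearing exactly at the root; the definition above does not state this explicitly.

```python
from typing import Callable, Dict, Optional, Tuple

Position = Tuple[int, ...]                        # () stands for ε
TreeStack = Tuple[Dict[Position, str], Position]  # the pair (t, p)
Instruction = Callable[[TreeStack], Optional[TreeStack]]


def ident(c: TreeStack) -> Optional[TreeStack]:
    """The identity instruction id."""
    return c


def push(n: int, gamma: str) -> Instruction:
    """push_{n,gamma}: defined only if the n-th child position is still free."""
    def instr(c: TreeStack) -> Optional[TreeStack]:
        t, p = c
        child = p + (n,)
        if child in t:
            return None
        new_t = dict(t)          # copy, so the argument is left unchanged
        new_t[child] = gamma
        return (new_t, child)
    return instr


def up(n: int) -> Instruction:
    """up_n: defined only if the n-th child position already exists."""
    def instr(c: TreeStack) -> Optional[TreeStack]:
        t, p = c
        child = p + (n,)
        return (t, child) if child in t else None
    return instr


def down(c: TreeStack) -> Optional[TreeStack]:
    """down: undefined at the root, since ε has no last symbol to remove."""
    t, p = c
    return (t, p[:-1]) if p else None


def set_(gamma: str) -> Instruction:
    """set_gamma: assumed undefined at the root to protect the bottom symbol."""
    def instr(c: TreeStack) -> Optional[TreeStack]:
        t, p = c
        if not p:
            return None
        new_t = dict(t)
        new_t[p] = gamma
        return (new_t, p)
    return instr
```

Starting from the tree stack `({(): "@"}, ())`, applying `push(1, "a")` and then `down` leaves the pointer at the root with one child; a second `push(1, "b")` is then undefined, while `up(1)` returns to the child.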

Tree stack automata

A tree stack automaton is a 6-tuple $A = (Q, Γ, Σ, q_i, δ, Q_f)$ where

$Q$, $Γ$, and $Σ$ are finite sets (whose elements are called states, stack symbols, and input symbols, respectively),
$q_i \in Q$ (the initial state),
$δ \subseteq_{\text{fin.}} Q \times (Σ \cup \{ε\}) \times Pred(Γ) \times Instr(Γ) \times Q$ (whose elements are called transitions), and
$Q_f \subseteq Q$ (whose elements are called final states).

A configuration of $A$ is a tuple $(q, c, w)$ where

$q$ is a state (the current state),
$c$ is a tree stack (the current tree stack), and
$w$ is a word over $Σ$ (the remaining word to be read).

A transition $τ = (q_1, u, p, f, q_2)$ is applicable to a configuration $(q, c, w)$ if

$q_1 = q$,
$p$ is true on $c$,
$f$ is defined for $c$, and
$u$ is a prefix of $w$.

The transition relation of $A$ is the binary relation $⊢$ on configurations of $A$ that is the union of the relations $⊢_τ$ over all transitions $τ = (q_1, u, p, f, q_2)$ in $δ$, where $(q, c, w) ⊢_τ (q_2, f(c), v)$ holds whenever $τ$ is applicable to $(q, c, w)$ and $v$ is obtained from $w$ by removing the prefix $u$.

The language of $A$ is the set of all words $w$ for which there is some state $q \in Q_f$ and some tree stack $c$ such that $(q_i, c_i, w) ⊢^* (q, c, ε)$ where

$⊢^*$ is the reflexive transitive closure of $⊢$ and
$c_i = (t_i, ε)$ such that $t_i$ assigns to $ε$ the symbol $@$ and is undefined otherwise.
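The definitions of configurations, applicability, and the language can be combined into a small simulator. The following is a sketch under stated assumptions, not a reference implementation: tree stacks are encoded as pairs of a dictionary and a pointer tuple, $ε$ is the empty string, predicates and instructions are Python callables, and the search is a breadth-first exploration of configurations cut off after `max_steps` configurations (unrestricted tree stack automata need not terminate). The example automaton `DELTA`, intended to recognise $\{a^n b^n c^n \mid n \geq 0\}$, is an illustration devised here and is not taken from the cited literature.

```python
from collections import deque

BOTTOM = "@"  # assumed encoding of the bottom symbol

# Predicates on a tree stack c = (t, p).
def true(c): return True
def bottom(c): return c[0][c[1]] == BOTTOM
def equals(g): return lambda c: c[0][c[1]] == g

# Instructions; None signals "undefined".
def ident(c): return c
def push(n, g):
    def instr(c):
        t, p = c
        child = p + (n,)
        return None if child in t else ({**t, child: g}, child)
    return instr
def up(n):
    return lambda c: (c[0], c[1] + (n,)) if c[1] + (n,) in c[0] else None
def down(c):
    return (c[0], c[1][:-1]) if c[1] else None

def accepts(delta, q_i, final, word, max_steps=10_000):
    """Search for a run (q_i, c_i, word) |-* (q, c, ε) with q final.

    False means no accepting run was found within max_steps configurations.
    """
    queue = deque([(q_i, ({(): BOTTOM}, ()), word)])  # initial configuration
    seen = set()
    steps = 0
    while queue and steps < max_steps:
        q, c, w = queue.popleft()
        steps += 1
        if q in final and w == "":
            return True
        key = (q, frozenset(c[0].items()), c[1], w)
        if key in seen:
            continue
        seen.add(key)
        for q1, u, pred, instr, q2 in delta:
            # Applicability: matching state, u is a prefix of w, predicate holds.
            if q1 == q and w.startswith(u) and pred(c):
                c2 = instr(c)
                if c2 is not None:        # the instruction is defined for c
                    queue.append((q2, c2, w[len(u):]))
    return False

# Hypothetical example automaton: push one node per a, move down per b,
# move back up per c, and finally check via push that the top was reached.
DELTA = [
    ("q0", "a", true, push(1, "x"), "q0"),
    ("q0", "", true, ident, "q1"),
    ("q1", "b", equals("x"), down, "q1"),
    ("q1", "", bottom, ident, "q2"),
    ("q2", "c", true, up(1), "q2"),
    ("q2", "", true, push(1, "y"), "q3"),
]
```

With this encoding, `accepts(DELTA, "q0", {"q3"}, "aabbcc")` returns `True` and `accepts(DELTA, "q0", {"q3"}, "aabbc")` returns `False`. The final `push` transition is defined only when the first child position is free, which serves as a test that the pointer has returned to the top of the chain.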

Related formalisms

Tree stack automata are equivalent to Turing machines.

A tree stack automaton is called $k$-restricted for some positive natural number $k$ if, during any run of the automaton, any position of the tree stack is accessed at most $k$ times from below.
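The restriction can be illustrated by counting accesses along a single run. The sketch below assumes that a position is accessed from below whenever the stack pointer moves to it from its parent, that is, by a push or an up instruction; this reading is an assumption made here. The function name and the encoding of a run as a list of tuples such as `("push", 1, "x")`, `("up", 1)`, or `("down",)` are hypothetical, and the run is assumed to be valid (every instruction is defined where it is applied).

```python
from collections import Counter
from typing import Sequence, Tuple


def max_accesses_from_below(moves: Sequence[Tuple]) -> int:
    """Return the largest number of times any position is entered from its parent."""
    pointer: Tuple[int, ...] = ()   # the run starts at the root ε
    counts: Counter = Counter()
    for move in moves:
        kind = move[0]
        if kind in ("push", "up"):
            # Both instructions move the pointer to the n-th child.
            pointer = pointer + (move[1],)
            counts[pointer] += 1
        elif kind == "down":
            pointer = pointer[:-1]
        # Instructions such as id and set leave the pointer unchanged.
    return max(counts.values(), default=0)
```

A run with moves `push`, `down`, `up`, `push` along the first child enters position 1 twice, so it is compatible with a 2-restricted automaton under this reading but not with a 1-restricted one.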

1-restricted tree stack automata are equivalent to pushdown automata and therefore also to context-free grammars. $k$-restricted tree stack automata are equivalent to linear context-free rewriting systems and multiple context-free grammars of fan-out at most $k$ (for every positive integer $k$).^[3]

Notes

[a] Not to be confused with a device of the same name introduced in 1990 by Wolfgang Golubski and Wolfram-M. Lippe.^[1]
[b] A set of strings is prefix-closed if for every element $w$ in the set, all prefixes of $w$ are also in the set.

References

[1] Golubski, Wolfgang and Lippe, Wolfram-M. (1990). Tree-stack automata. Proceedings of the 15th Symposium on Mathematical Foundations of Computer Science (MFCS 1990). Lecture Notes in Computer Science, Vol. 452, pages 313–321, doi:10.1007/BFb0029624.
[2] Scott, Dana (1967). Some Definitional Suggestions for Automata Theory. Journal of Computer and System Sciences, Vol. 1(2), pages 187–212, doi:10.1016/s0022-0000(67)80014-x.
[3] Denkinger, Tobias (2016). An automata characterisation for multiple context-free languages. Proceedings of the 20th International Conference on Developments in Language Theory (DLT 2016). Lecture Notes in Computer Science, Vol. 9840, pages 138–150, doi:10.1007/978-3-662-53132-7_12.

