Generalized star-height problem

Last updated
Unsolved problem in computer science:

Can all regular languages be expressed using generalized regular expressions with a limited nesting depth of Kleene stars?

Contents

The generalized star-height problem in formal language theory is the open question whether all regular languages can be expressed using generalized regular expressions with a limited nesting depth of Kleene stars. Here, generalized regular expressions are defined like regular expressions, but they have a built-in complement operator. For a regular language, its generalized star height is defined as the minimum nesting depth of Kleene stars needed in order to describe the language by means of a generalized regular expression, hence the name of the problem.

More specifically, it is an open question whether a nesting depth of more than 1 is required, and if so, whether there is an algorithm to determine the minimum required star height. [1]

Regular languages of star-height 0 are also known as star-free languages. The theorem of Schützenberger provides an algebraic characterization of star-free languages by means of aperiodic syntactic monoids. In particular star-free languages are a proper decidable subclass of regular languages.

See also

Related Research Articles

In theoretical computer science and formal language theory, a regular language is a formal language that can be defined by a regular expression, in the strict sense in theoretical computer science.

The star height problem in formal language theory is the question whether all regular languages can be expressed using regular expressions of limited star height, i.e. with a limited nesting depth of Kleene stars. Specifically, is a nesting depth of one always sufficient? If not, is there an algorithm to determine how many are required? The problem was raised by Eggan (1963).

In computational complexity theory, EXPSPACE is the set of all decision problems solvable by a deterministic Turing machine in exponential space, i.e., in space, where is a polynomial function of . Some authors restrict to be a linear function, but most authors instead call the resulting class ESPACE. If we use a nondeterministic machine instead, we get the class NEXPSPACE, which is equal to EXPSPACE by Savitch's theorem.

In mathematics, a Kleene algebra is an idempotent semiring endowed with a closure operator. It generalizes the operations known from regular expressions.

<span class="mw-page-title-main">Semiring</span> Algebraic ring that need not have additive negative elements

In abstract algebra, a semiring is an algebraic structure similar to a ring, but without the requirement that each element must have an additive inverse.

In abstract algebra, the free monoid on a set is the monoid whose elements are all the finite sequences of zero or more elements from that set, with string concatenation as the monoid operation and with the unique sequence of zero elements, often called the empty string and denoted by ε or λ, as the identity element. The free monoid on a set A is usually denoted A. The free semigroup on A is the subsemigroup of A containing all elements except the empty string. It is usually denoted A+.

<span class="mw-page-title-main">Deterministic finite automaton</span> Finite-state machine

In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as deterministic finite acceptor (DFA), deterministic finite-state machine (DFSM), or deterministic finite-state automaton (DFSA)—is a finite-state machine that accepts or rejects a given string of symbols, by running through a state sequence uniquely determined by the string. Deterministic refers to the uniqueness of the computation run. In search of the simplest models to capture finite-state machines, Warren McCulloch and Walter Pitts were among the first researchers to introduce a concept similar to finite automata in 1943.

In theoretical computer science, more precisely in the theory of formal languages, the star height is a measure for the structural complexity of regular expressions and regular languages. The star height of a regular expression equals the maximum nesting depth of stars appearing in that expression. The star height of a regular language is the least star height of any regular expression for that language. The concept of star height was first defined and studied by Eggan (1963).

In mathematics, an aperiodic semigroup is a semigroup S such that every element xS is aperiodic, that is, for each x there exists a positive integer n such that xn = xn + 1. An aperiodic monoid is an aperiodic semigroup which is a monoid.

In mathematics and computer science, the syntactic monoid of a formal language is the smallest monoid that recognizes the language .

A regular language is said to be star-free if it can be described by a regular expression constructed from the letters of the alphabet, the empty set symbol, all boolean operators – including complementation – and concatenation but no Kleene star. For instance, the language of words over the alphabet that do not have consecutive a's can be defined by , where denotes the complement of a subset of . The condition is equivalent to having generalized star height zero.

ACC<sup>0</sup>

ACC0, sometimes called ACC, is a class of computational models and problems defined in circuit complexity, a field of theoretical computer science. The class is defined by augmenting the class AC0 of constant-depth "alternating circuits" with the ability to count; the acronym ACC stands for "AC with counters". Specifically, a problem belongs to ACC0 if it can be solved by polynomial-size, constant-depth circuits of unbounded fan-in gates, including gates that count modulo a fixed integer. ACC0 corresponds to computation in any solvable monoid. The class is very well studied in theoretical computer science because of the algebraic connections and because it is one of the largest concrete computational models for which computational impossibility results, so-called circuit lower bounds, can be proved.

An aperiodic finite-state automaton is a finite-state automaton whose transition monoid is aperiodic.

In graph theory, the cycle rank of a directed graph is a digraph connectivity measure proposed first by Eggan and Büchi. Intuitively, this concept measures how close a digraph is to a directed acyclic graph (DAG), in the sense that a DAG has cycle rank zero, while a complete digraph of order n with a self-loop at each vertex has cycle rank n. The cycle rank of a directed graph is closely related to the tree-depth of an undirected graph and to the star height of a regular language. It has also found use in sparse matrix computations and logic (Rossman 2008).

In mathematics, a factorisation of a free monoid is a sequence of subsets of words with the property that every word in the free monoid can be written as a concatenation of elements drawn from the subsets. The Chen–Fox–Lyndon theorem states that the Lyndon words furnish a factorisation. The Schützenberger theorem relates the definition in terms of a multiplicative property to an additive property.

In computer science, more precisely in automata theory, a recognizable set of a monoid is a subset that can be distinguished by some morphism to a finite monoid. Recognizable sets are useful in automata theory, formal languages and algebra.

In computer science, more precisely in automata theory, a rational set of a monoid is an element of the minimal class of subsets of this monoid that contains all finite subsets and is closed under union, product and Kleene star. Rational sets are useful in automata theory, formal languages and algebra.

In theoretical computer science, in particular in formal language theory, Kleene's algorithm transforms a given nondeterministic finite automaton (NFA) into a regular expression. Together with other conversion algorithms, it establishes the equivalence of several description formats for regular languages. Alternative presentations of the same method include the "elimination method" attributed to Brzozowski and McCluskey, the algorithm of McNaughton and Yamada, and the use of Arden's lemma.

In mathematics, a rational monoid is a monoid, an algebraic structure, for which each element can be represented in a "normal form" that can be computed by a finite transducer: multiplication in such a monoid is "easy", in the sense that it can be described by a rational function.

References

  1. Sakarovitch (2009) p.171