Generalized star-height problem

Last updated December 13, 2022

Unsolved problem in computer science:

Can all regular languages be expressed using generalized regular expressions with a limited nesting depth of Kleene stars?

The generalized star-height problem in formal language theory is the open question whether all regular languages can be expressed using generalized regular expressions with a limited nesting depth of Kleene stars. Here, generalized regular expressions are defined like regular expressions, but they have a built-in complement operator. For a regular language, its generalized star height is defined as the minimum nesting depth of Kleene stars needed in order to describe the language by means of a generalized regular expression, hence the name of the problem.

More specifically, it is an open question whether a nesting depth of more than 1 is required, and if so, whether there is an algorithm to determine the minimum required star height.^[1]

Regular languages of star-height 0 are also known as star-free languages. The theorem of Schützenberger provides an algebraic characterization of star-free languages by means of aperiodic syntactic monoids. In particular star-free languages are a proper decidable subclass of regular languages.

Related Research Articles

In theoretical computer science and formal language theory, a regular language is a formal language that can be defined by a regular expression, in the strict sense in theoretical computer science.

The star height problem in formal language theory is the question whether all regular languages can be expressed using regular expressions of limited star height, i.e. with a limited nesting depth of Kleene stars. Specifically, is a nesting depth of one always sufficient? If not, is there an algorithm to determine how many are required? The problem was raised by Eggan (1963).

In computational complexity theory, EXPSPACE is the set of all decision problems solvable by a deterministic Turing machine in exponential space, i.e., in $space, where is a polynomial function of . Some authors restrict to be a linear function, but most authors instead call the resulting class ESPACE . If we use a nondeterministic machine instead, we get the class NEXPSPACE, which is equal to EXPSPACE by Savitch's theorem.$

In mathematics, a Kleene algebra is an idempotent semiring endowed with a closure operator. It generalizes the operations known from regular expressions.

In abstract algebra, a semiring is an algebraic structure similar to a ring, but without the requirement that each element must have an additive inverse.

In abstract algebra, the free monoid on a set is the monoid whose elements are all the finite sequences of zero or more elements from that set, with string concatenation as the monoid operation and with the unique sequence of zero elements, often called the empty string and denoted by ε or λ, as the identity element. The free monoid on a set A is usually denoted A^∗. The free semigroup on A is the subsemigroup of A^∗ containing all elements except the empty string. It is usually denoted A⁺.

In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as deterministic finite acceptor (DFA), deterministic finite-state machine (DFSM), or deterministic finite-state automaton (DFSA)—is a finite-state machine that accepts or rejects a given string of symbols, by running through a state sequence uniquely determined by the string. Deterministic refers to the uniqueness of the computation run. In search of the simplest models to capture finite-state machines, Warren McCulloch and Walter Pitts were among the first researchers to introduce a concept similar to finite automata in 1943.

In theoretical computer science, more precisely in the theory of formal languages, the star height is a measure for the structural complexity of regular expressions and regular languages. The star height of a regular expression equals the maximum nesting depth of stars appearing in that expression. The star height of a regular language is the least star height of any regular expression for that language. The concept of star height was first defined and studied by Eggan (1963).

In mathematics, an aperiodic semigroup is a semigroup S such that every element x ∈ S is aperiodic, that is, for each x there exists a positive integer n such that xⁿ = x^{n + 1}. An aperiodic monoid is an aperiodic semigroup which is a monoid.

In mathematics and computer science, the syntactic monoid $of a formal language is the smallest monoid that recognizes the language .$

A regular language is said to be star-free if it can be described by a regular expression constructed from the letters of the alphabet, the empty set symbol, all boolean operators – including complementation – and concatenation but no Kleene star. For instance, the language of words over the alphabet $that do not have consecutive a's can be defined by, where denotes the complement of a subset of . The condition is equivalent to having generalized star height zero.$

ACC⁰, sometimes called ACC, is a class of computational models and problems defined in circuit complexity, a field of theoretical computer science. The class is defined by augmenting the class AC⁰ of constant-depth "alternating circuits" with the ability to count; the acronym ACC stands for "AC with counters". Specifically, a problem belongs to ACC⁰ if it can be solved by polynomial-size, constant-depth circuits of unbounded fan-in gates, including gates that count modulo a fixed integer. ACC⁰ corresponds to computation in any solvable monoid. The class is very well studied in theoretical computer science because of the algebraic connections and because it is one of the largest concrete computational models for which computational impossibility results, so-called circuit lower bounds, can be proved.

An aperiodic finite-state automaton is a finite-state automaton whose transition monoid is aperiodic.

In graph theory, the cycle rank of a directed graph is a digraph connectivity measure proposed first by Eggan and Büchi. Intuitively, this concept measures how close a digraph is to a directed acyclic graph (DAG), in the sense that a DAG has cycle rank zero, while a complete digraph of order n with a self-loop at each vertex has cycle rank n. The cycle rank of a directed graph is closely related to the tree-depth of an undirected graph and to the star height of a regular language. It has also found use in sparse matrix computations and logic (Rossman 2008).

In mathematics, a factorisation of a free monoid is a sequence of subsets of words with the property that every word in the free monoid can be written as a concatenation of elements drawn from the subsets. The Chen–Fox–Lyndon theorem states that the Lyndon words furnish a factorisation. The Schützenberger theorem relates the definition in terms of a multiplicative property to an additive property.

In computer science, more precisely in automata theory, a recognizable set of a monoid is a subset that can be distinguished by some morphism to a finite monoid. Recognizable sets are useful in automata theory, formal languages and algebra.

In computer science, more precisely in automata theory, a rational set of a monoid is an element of the minimal class of subsets of this monoid that contains all finite subsets and is closed under union, product and Kleene star. Rational sets are useful in automata theory, formal languages and algebra.

In theoretical computer science, in particular in formal language theory, Kleene's algorithm transforms a given nondeterministic finite automaton (NFA) into a regular expression. Together with other conversion algorithms, it establishes the equivalence of several description formats for regular languages. Alternative presentations of the same method include the "elimination method" attributed to Brzozowski and McCluskey, the algorithm of McNaughton and Yamada, and the use of Arden's lemma.

In mathematics, a rational monoid is a monoid, an algebraic structure, for which each element can be represented in a "normal form" that can be computed by a finite transducer: multiplication in such a monoid is "easy", in the sense that it can be described by a rational function.

References

↑ Sakarovitch (2009) p.171

Janusz A. Brzozowski (1980). "Open problems about regular languages". In Ronald V. Book (ed.). Formal Language Theory: Perspectives and Open Problems. Academic Press. pp. 23–47.
Wolfgang Thomas (1981). "Remark on the star-height-problem". Theoretical Computer Science . 13 (2): 231–237. doi: 10.1016/0304-3975(81)90041-4 . MR 0594062.
Jean-Eric Pin; Howard Straubing; Denis Thérien (1992). "Some results on the generalized star-height problem" (PDF). Information and Computation . 101 (2): 219–250. doi:10.1016/0890-5401(92)90063-L.
Sakarovitch, Jacques (2009). Elements of automata theory. Translated from the French by Reuben Thomas. Cambridge: Cambridge University Press. ISBN 978-0-521-84425-3. Zbl 1188.68177.
Marcel-Paul Schützenberger (1965). "On finite monoids having only trivial subgroups". Information and Control . 8 (2): 190–194. doi: 10.1016/S0019-9958(65)90108-7 . Zbl 0131.02001.

External links

Jean-Eric Pin: The star-height problem

P ≟ NP

This theoretical computer science–related article is a stub. You can help Wikipedia by expanding it.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[Sak171-1] Sakarovitch (2009) p.171

[1]

Generalized star-height problem

Contents

See also

Related Research Articles

References

External links