Arden's rule

In theoretical computer science, Arden's rule, also known as Arden's lemma, is a mathematical statement about a certain form of language equations.

Background

A (formal) language is simply a set of strings. Such sets can be specified by language equations, which are built from operations on languages. Language equations are mathematical statements that resemble numerical equations, but their variables take formal languages rather than numbers as values. Among the most common operations on two languages A and B are the set union A ∪ B and the concatenation AB. As an operation taking a single operand, A* denotes the Kleene star of the language A.
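As an illustration, these operations can be sketched in Python on finite sets of strings. The function names union, concat and star are hypothetical helpers chosen for this sketch. Because A* is usually infinite, star only enumerates the words of A* up to an assumed length bound max_len.

```python
def union(a, b):
    """Set union of two languages given as finite sets of strings."""
    return a | b

def concat(a, b):
    """Concatenation: every word of a followed by every word of b."""
    return {x + y for x in a for y in b}

def star(a, max_len):
    """Words of the Kleene star of a whose length is at most max_len."""
    result = {""}          # the empty word is always in A*
    frontier = {""}
    while frontier:
        # Append one more word of a, keep only new words within the bound.
        frontier = {w for w in concat(frontier, a) if len(w) <= max_len} - result
        result |= frontier
    return result

if __name__ == "__main__":
    words = star({"a", "bb"}, 3)
    print(sorted(words, key=lambda w: (len(w), w)))
    # prints ['', 'a', 'aa', 'bb', 'aaa', 'abb', 'bba']
```

The length bound is only a device for making the infinite set A* finite enough to print; it is not part of the definition of the Kleene star.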

Statement of Arden's rule

Arden's rule states that the set A*B is the smallest language that is a solution for X in the linear equation X = AX ∪ B, where X, A, B are sets of strings. Moreover, if the set A does not contain the empty word, then this solution is unique. [1][2]

Equivalently, the set BA* is the smallest language that is a solution for X in X = XA ∪ B.
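The statement can be sanity-checked on small examples. The following Python sketch assumes languages are finite sets of strings and compares both sides of X = AX ∪ B only on words up to an assumed length bound N, so it is a bounded check, not a proof. All helper names (truncate, star, arden_solution, is_solution) are hypothetical. The second example puts the empty word into A and shows a strictly larger solution, which illustrates why uniqueness needs the side condition.

```python
from itertools import product

def concat(a, b):
    return {x + y for x in a for y in b}

def truncate(lang, n):
    """Keep only the words of length at most n."""
    return {w for w in lang if len(w) <= n}

def star(a, n):
    """Words of A* of length at most n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = truncate(concat(frontier, a), n) - result
        result |= frontier
    return result

def arden_solution(a, b, n):
    """Words of A*B of length at most n."""
    return truncate(concat(star(a, n), b), n)

def is_solution(x, a, b, n):
    """Check X = AX ∪ B on words of length at most n (x already truncated)."""
    return truncate(concat(a, x) | b, n) == x

N = 6

# Case 1: A has no empty word, A*B solves the equation.
A, B = {"ab", "c"}, {"d", ""}
least = arden_solution(A, B, N)
print(is_solution(least, A, B, N))            # prints True

# Case 2: A contains the empty word, so the solution is not unique.
A_eps, B2 = {"", "a"}, {"b"}
least2 = arden_solution(A_eps, B2, N)
everything = {"".join(p) for k in range(N + 1) for p in product("ab", repeat=k)}
print(is_solution(least2, A_eps, B2, N))      # prints True
print(is_solution(everything, A_eps, B2, N))  # prints True
print(least2 < everything)                    # prints True (proper subset)
```

Truncation is exact here because any word of length at most N in AX ∪ B or in A*B is built from pieces of length at most N.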

Application

Arden's rule can be used to help convert some finite automata to regular expressions, as in Kleene's algorithm.
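A small worked example may make this concrete. The automaton below is a hypothetical two-state DFA chosen for illustration, not one taken from the cited sources: it accepts the words over {a, b} with an even number of a's. The comments derive a regular expression from its language equations using the rule in the form X = AX ∪ B, and the code then compares the DFA with that expression on all words up to length 8.

```python
import re
from itertools import product

# Hypothetical DFA: q1 is initial and accepting;
# reading 'a' switches between q1 and q2, reading 'b' stays in place.
DELTA = {
    ("q1", "a"): "q2", ("q1", "b"): "q1",
    ("q2", "a"): "q1", ("q2", "b"): "q2",
}

def dfa_accepts(word):
    state = "q1"
    for ch in word:
        state = DELTA[(state, ch)]
    return state == "q1"

# Language equations (L_i = words accepted when starting in q_i):
#   L1 = b L1 ∪ a L2 ∪ {ε}
#   L2 = b L2 ∪ a L1
# Arden's rule on the second equation (A = {b}, B = a L1):
#   L2 = b* a L1
# Substituting into the first equation:
#   L1 = (b ∪ a b* a) L1 ∪ {ε}
# Arden's rule again (A = b ∪ a b* a, B = {ε}):
#   L1 = (b ∪ a b* a)*
ARDEN_REGEX = re.compile(r"(?:b|ab*a)*")

def regex_accepts(word):
    return ARDEN_REGEX.fullmatch(word) is not None

words = ["".join(p) for k in range(9) for p in product("ab", repeat=k)]
mismatches = [w for w in words if dfa_accepts(w) != regex_accepts(w)]
print(mismatches)  # prints []
```

In both applications of the rule the coefficient language A does not contain the empty word, so each solution is the unique one.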

Notes

  1. Daintith, John (2004). "Arden's Rule". Archived from the original on 13 February 2010. Retrieved 10 March 2010.
  2. Sutner, Klaus. "Algebra of Regular Languages" (PDF). Archived from the original (PDF) on 8 July 2011. Retrieved 15 February 2011.

