Local language (formal language)

Last updated November 02, 2024

In mathematics, a local language is a formal language for which membership of a word in the language can be determined by looking at the first and last symbol and each two-symbol substring of the word.^[1] Equivalently, it is a language recognised by a local automaton, a particular kind of deterministic finite automaton.^[2]

Formally, a language L over an alphabet A is defined to be local if there are subsets R and S of A and a subset F of A×A such that a word w is in L if and only if the first letter of w is in R, the last letter of w is in S and no factor of length 2 in w is in F.^[3] This corresponds to the regular expression^[1]^[4]

(RA^{*} \cap A^{*}S) \setminus A^{*}FA^{*}.
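
The definition translates directly into a membership test. The following is a minimal sketch, not taken from the cited sources: the function name `is_local_member` and the example sets are hypothetical, and it assumes, in line with the regular expression above, that the empty word is never a member.

```python
def is_local_member(word, first, last, forbidden):
    """Return True if `word` satisfies the local-language conditions.

    first     -- the set R of permitted first letters
    last      -- the set S of permitted last letters
    forbidden -- the set F of forbidden two-letter factors, as (x, y) pairs
    """
    if not word:
        # Assumption: the empty word has no first or last letter, so it is
        # rejected, matching (RA* ∩ A*S) \ A*FA*.
        return False
    if word[0] not in first or word[-1] not in last:
        return False
    # Check every factor of length 2 against F.
    return all((x, y) not in forbidden for x, y in zip(word, word[1:]))


# Illustrative parameters over the alphabet {a, b}: words that start with a,
# end with b and never contain the factor ba, i.e. one or more a's followed
# by one or more b's.
FIRST = {"a"}
LAST = {"b"}
FORBIDDEN = {("b", "a")}

accepted = is_local_member("aabb", FIRST, LAST, FORBIDDEN)
rejected = is_local_member("abab", FIRST, LAST, FORBIDDEN)
```

Here `accepted` is true, while `rejected` is false because `abab` contains the forbidden factor `ba`. The test reads the word once from left to right, which reflects why such a language can be recognised by a finite automaton.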

More generally, a k-testable language L is one for which membership of a word w in L depends only on the prefix and suffix of length k and the set of factors of w of length k;^[5] a language is locally testable if it is k-testable for some k.^[6] A local language is 2-testable.^[1]
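
The data that a k-testable language may depend on can be computed explicitly. The sketch below is illustrative only: the name `k_profile` is hypothetical, and the treatment of words shorter than k (the whole word serves as both prefix and suffix, and the factor set is empty) is an assumption, since conventions for short words differ between authors.

```python
def k_profile(word, k):
    """Return (prefix, suffix, factors) of length k for `word`.

    Assumption: for words shorter than k the prefix and suffix are the
    whole word and the set of length-k factors is empty.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    prefix = word[:k]
    suffix = word[-k:]
    factors = frozenset(word[i:i + k] for i in range(len(word) - k + 1))
    return prefix, suffix, factors


# Two different words with the same 2-profile: a 2-testable language
# contains either both of them or neither.
same = k_profile("abab", 2) == k_profile("ababab", 2)
# Changing the suffix changes the profile.
different = k_profile("abab", 2) != k_profile("abba", 2)
```

Both `same` and `different` evaluate to true. A language is k-testable exactly when membership is a function of this profile, so words with equal profiles cannot be separated by it.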

Examples

Over the alphabet {a,b,[,]}, the following languages are local:^[4]

aa^{*}, [ab].
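
As a worked check, one possible choice of the sets R, S and F from the definition is given below for each example. This is a sketch under the assumption that [ab] denotes the language containing the single four-letter word [ab]; the parameters are not taken from the cited source.

```latex
% Alphabet A = \{a, b, [, ]\}; F is given as the complement of the allowed factors.
\[
aa^{*}:\quad R = S = \{a\},\qquad
F = (A \times A) \setminus \{(a,a)\}
\]
\[
\{\mathtt{[ab]}\}:\quad R = \{\mathtt{[}\},\quad S = \{\mathtt{]}\},\qquad
F = (A \times A) \setminus \{(\mathtt{[},a),\ (a,b),\ (b,\mathtt{]})\}
\]
```

In the second case the allowed factors force the letter after [ to be a, the letter after a to be b and the letter after b to be ], and nothing may follow ], so [ab] is the only word that satisfies all three conditions.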

Properties

The family of local languages over A is closed under intersection and Kleene star, but not complement, union or concatenation.^[4]
Every regular language not containing the empty string is the image of a local language under a strictly alphabetic morphism.^[1]^[7]^[8]
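
Two of the closure statements above can be illustrated on small cases. The sketch below is not from the cited sources and all names in it are hypothetical. It assumes the empty word is excluded. It shows the standard parameter construction for intersections, and a counterexample for unions: {ab} and {ba} are each local, but any local language containing both must also contain aba.

```python
from itertools import product


def in_local(word, first, last, forbidden):
    """Membership test for the local language given by (R, S, F)."""
    if not word:
        return False  # assumption: the empty word is excluded
    if word[0] not in first or word[-1] not in last:
        return False
    return all(pair not in forbidden for pair in zip(word, word[1:]))


def intersect(p, q):
    """Parameters (R, S, F) of the intersection of two local languages."""
    (r1, s1, f1), (r2, s2, f2) = p, q
    # A word must pass both tests: intersect R and S, unite the forbidden sets.
    return (r1 & r2, s1 & s2, f1 | f2)


ALPHABET = "ab"
ALL_PAIRS = set(product(ALPHABET, repeat=2))

# {ab}: starts with a, ends with b, and ab is the only allowed factor.
ONLY_AB = ({"a"}, {"b"}, ALL_PAIRS - {("a", "b")})
# {ba}: the mirror image.
ONLY_BA = ({"b"}, {"a"}, ALL_PAIRS - {("b", "a")})

# A local language containing both ab and ba needs a and b in R and in S and
# must allow the factors ab and ba. These are the most restrictive such
# parameters, so every local language containing {ab, ba} contains this one.
SMALLEST_COVER = ({"a", "b"}, {"a", "b"},
                  ALL_PAIRS - {("a", "b"), ("b", "a")})

# aba is not in {ab, ba}, yet it is forced into every local superset.
union_witness = in_local("aba", *SMALLEST_COVER)

# The intersection of {ab} and {ba} is empty, and the combined parameters agree.
both = intersect(ONLY_AB, ONLY_BA)
in_both = in_local("ab", *both)
```

Here `union_witness` is true and `in_both` is false. The first value shows that the union {ab, ba} cannot itself be local. The second is consistent with the intersection construction, since no word starts with a letter from the empty set.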

References

[1] Salomaa (1981) p.97
[2] Lawson (2004) p.130
[3] Lawson (2004) p.129
[4] Sakarovitch (2009) p.228
[5] Caron, Pascal (2000-07-06). "Families of locally testable languages". Theoretical Computer Science. 242 (1): 361–376. doi:10.1016/S0304-3975(98)00332-6. ISSN 0304-3975.
[6] McNaughton & Papert (1971) p.14
[7] Lawson (2004) p.132
[8] McNaughton & Papert (1971) p.18

Lawson, Mark V. (2004). Finite automata. Chapman and Hall/CRC. ISBN 1-58488-255-7. Zbl 1086.68074.
McNaughton, Robert; Papert, Seymour (1971). Counter-free Automata. Research Monograph. Vol. 65. With an appendix by William Henneman. MIT Press. ISBN 0-262-13076-9. Zbl 0232.94024.
Sakarovitch, Jacques (2009). Elements of automata theory. Translated from the French by Reuben Thomas. Cambridge: Cambridge University Press. ISBN 978-0-521-84425-3. Zbl 1188.68177.
Salomaa, Arto (1981). Jewels of Formal Language Theory. Pitman Publishing. ISBN 0-273-08522-0. Zbl 0487.68064.

