Syntactic monoid

Last updated June 12, 2024

In mathematics and computer science, the syntactic monoid $M(L)$ of a formal language $L$ is the smallest monoid that recognizes the language $L$ .

Syntactic quotient

The free monoid on a given set is the monoid whose elements are all the strings of zero or more elements from that set, with string concatenation as the monoid operation and the empty string as the identity element. Given a subset $S$ of a free monoid $M$ , one may define sets that consist of formal left or right inverses of elements in $S$ . These are called quotients, and one may define right or left quotients, depending on which side one is concatenating. Thus, the right quotient of $S$ by an element $m$ from $M$ is the set

S\ /\ m=\{u\in M\;\vert \;um\in S\}.

Similarly, the left quotient is

m\setminus S=\{u\in M\;\vert \;mu\in S\}.

Syntactic equivalence

The syntactic quotient^{[ clarification needed ]} induces an equivalence relation on $M$ , called the syntactic relation, or syntactic equivalence (induced by $S$ ).

The right syntactic equivalence is the equivalence relation

s\sim _{S}t\ \Leftrightarrow \ S\,/\,s\;=\;S\,/\,t\ \Leftrightarrow \ (\forall x\in M\colon \ xs\in S\Leftrightarrow xt\in S)

.

Similarly, the left syntactic equivalence is

s\;{}_{S}{\sim }\;t\ \Leftrightarrow \ s\setminus S\;=\;t\setminus S\ \Leftrightarrow \ (\forall y\in M\colon \ sy\in S\Leftrightarrow ty\in S)

.

Observe that the right syntactic equivalence is a left congruence with respect to string concatenation and vice versa; i.e., $s\sim _{S}t\ \Rightarrow \ xs\sim _{S}xt\$ for all $x\in M$ .

The syntactic congruence or Myhill congruence^[1] is defined as^[2]

s\equiv _{S}t\ \Leftrightarrow \ (\forall x,y\in M\colon \ xsy\in S\Leftrightarrow xty\in S)

.

The definition extends to a congruence defined by a subset $S$ of a general monoid $M$ . A disjunctive set is a subset $S$ such that the syntactic congruence defined by $S$ is the equality relation.^[3]

Let us call $[s]_{S}$ the equivalence class of $s$ for the syntactic congruence. The syntactic congruence is compatible with concatenation in the monoid, in that one has

[s]_{S}[t]_{S}=[st]_{S}

for all $s,t\in M$ . Thus, the syntactic quotient is a monoid morphism, and induces a quotient monoid

M(S)=M\ /\ {\equiv _{S}}

.

This monoid $M(S)$ is called the syntactic monoid of $S$ . It can be shown that it is the smallest monoid that recognizes $S$ ; that is, $M(S)$ recognizes $S$ , and for every monoid $N$ recognizing $S$ , $M(S)$ is a quotient of a submonoid of $N$ . The syntactic monoid of $S$ is also the transition monoid of the minimal automaton of $S$ .^[1]^[2]^[4]

A group language is one for which the syntactic monoid is a group.^[5]

Myhill–Nerode theorem

The Myhill–Nerode theorem states: a language $L$ is regular if and only if the family of quotients $\{m\setminus L\,\vert \;m\in M\}$ is finite, or equivalently, the left syntactic equivalence ${}_{S}{\sim }$ has finite index (meaning it partitions $M$ into finitely many equivalence classes).^[6]

This theorem was first proved by Anil Nerode ( Nerode 1958 ) and the relation ${}_{S}{\sim }$ is thus referred to as Nerode congruence by some authors.^[7]^[8]

Proof

The proof of the "only if" part is as follows. Assume that a finite automaton recognizing $L$ reads input $x$ , which leads to state $p$ . If $y$ is another string read by the machine, also terminating in the same state $p$ , then clearly one has $x\setminus L\,=y\setminus L$ . Thus, the number of elements in $\{m\setminus L\,\vert \;m\in M\}$ is at most equal to the number of states of the automaton and $\{m\setminus L\,\vert \;m\in L\}$ is at most the number of final states.

For a proof of the "if" part, assume that the number of elements in $\{m\setminus L\,\vert \;m\in M\}$ is finite. One can then construct an automaton where $Q=\{m\setminus L\,\vert \;m\in M\}$ is the set of states, $F=\{m\setminus L\,\vert \;m\in L\}$ is the set of final states, the language $L$ is the initial state, and the transition function is given by $\delta _{y}\colon x\setminus L\to y\setminus (x\setminus L)=(xy)\setminus L$ . Clearly, this automaton recognizes $L$ .

Thus, a language $L$ is recognizable if and only if the set $\{m\setminus L\,\vert \;m\in M\}$ is finite. Note that this proof also builds the minimal automaton.

Examples

Let $L$ be the language over $A=\{a,b\}$ of words of even length. The syntactic congruence has two classes, $L$ itself and $L_{1}$ , the words of odd length. The syntactic monoid is the group of order 2 on $\{L,L_{1}\}$ .^[9]
For the language $(ab+ba)^{*}$ , the minimal automaton has 4 states and the syntactic monoid has 15 elements.^[10]
The bicyclic monoid is the syntactic monoid of the Dyck language (the language of balanced sets of parentheses).
The free monoid on $A$ (where $\left|A\right|>1$ ) is the syntactic monoid of the language $\{ww^{R}\mid w\in A^{*}\}$ , where $w^{R}$ is the reversal of the word $w$ . (For $\left|A\right|=1$ , one can use the language of square powers of the letter.)
Every non-trivial finite monoid is homomorphic^{[ clarification needed ]} to the syntactic monoid of some non-trivial language,^[11] but not every finite monoid is isomorphic to a syntactic monoid.^[12]
Every finite group is isomorphic to the syntactic monoid of some regular language.^[11]
The language over $\{a,b\}$ in which the number of occurrences of $a$ and $b$ are congruent modulo $2^{n}$ is a group language with syntactic monoid $\mathbb {Z} /2^{n}\mathbb {Z}$ .^[5]
Trace monoids are examples of syntactic monoids.
Marcel-Paul Schützenberger ^[13] characterized star-free languages as those with finite aperiodic syntactic monoids.^[14]

Related Research Articles

In mathematics, when the elements of some set $have a notion of equivalence, then one may naturally split the set into equivalence classes . These equivalence classes are constructed so that elements and belong to the same equivalence class if, and only if, they are equivalent.$

In abstract algebra, a branch of mathematics, a monoid is a set equipped with an associative binary operation and an identity element. For example, the nonnegative integers with addition form a monoid, the identity element being $0$ .

In theoretical computer science and formal language theory, a regular language is a formal language that can be defined by a regular expression, in the strict sense in theoretical computer science.

In mathematics, a semigroup is an algebraic structure consisting of a set together with an associative internal binary operation on it.

Automata theory is the study of abstract machines and automata, as well as the computational problems that can be solved using them. It is a theory in theoretical computer science with close connections to mathematical logic. The word automata comes from the Greek word αὐτόματος, which means "self-acting, self-willed, self-moving". An automaton is an abstract self-propelled computing device which follows a predetermined sequence of operations automatically. An automaton with a finite number of states is called a finite automaton (FA) or finite-state machine (FSM). The figure on the right illustrates a finite-state machine, which is a well-known type of automaton. This automaton consists of states and transitions. As the automaton sees a symbol of input, it makes a transition to another state, according to its transition function, which takes the previous state and current input symbol as its arguments.

In the theory of formal languages, the Myhill–Nerode theorem provides a necessary and sufficient condition for a language to be regular. The theorem is named for John Myhill and Anil Nerode, who proved it at the University of Chicago in 1957.

In abstract algebra, the free monoid on a set is the monoid whose elements are all the finite sequences of zero or more elements from that set, with string concatenation as the monoid operation and with the unique sequence of zero elements, often called the empty string and denoted by ε or λ, as the identity element. The free monoid on a set A is usually denoted A^∗. The free semigroup on A is the subsemigroup of A^∗ containing all elements except the empty string. It is usually denoted A⁺.

In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as deterministic finite acceptor (DFA), deterministic finite-state machine (DFSM), or deterministic finite-state automaton (DFSA)—is a finite-state machine that accepts or rejects a given string of symbols, by running through a state sequence uniquely determined by the string. Deterministic refers to the uniqueness of the computation run. In search of the simplest models to capture finite-state machines, Warren McCulloch and Walter Pitts were among the first researchers to introduce a concept similar to finite automata in 1943.

In mathematics, Green's relations are five equivalence relations that characterise the elements of a semigroup in terms of the principal ideals they generate. The relations are named for James Alexander Green, who introduced them in a paper of 1951. John Mackintosh Howie, a prominent semigroup theorist, described this work as "so all-pervading that, on encountering a new semigroup, almost the first question one asks is 'What are the Green relations like?'". The relations are useful for understanding the nature of divisibility in a semigroup; they are also valid for groups, but in this case tell us nothing useful, because groups always have divisibility.

In theoretical computer science and mathematical logic a string rewriting system (SRS), historically called a semi-Thue system, is a rewriting system over strings from a alphabet. Given a binary relation $between fixed strings over the alphabet, called rewrite rules, denoted by, an SRS extends the rewriting relation to all strings in which the left- and right-hand side of the rules appear as substrings, that is, where,,, and are strings.$

In theoretical computer science and formal language theory, a regular language is said to be star-free if it can be described by a regular expression constructed from the letters of the alphabet, the empty word, the empty set symbol, all boolean operators – including complementation – and concatenation but no Kleene star. The condition is equivalent to having generalized star height zero.

In algebra, a presentation of a monoid is a description of a monoid in terms of a set $Σ$ of generators and a set of relations on the free monoid $Σ *$ generated by $Σ$ . The monoid is then presented as the quotient of the free monoid by these relations. This is an analogue of a group presentation in group theory.

In computer science, a trace is a set of strings, wherein certain letters in the string are allowed to commute, but others are not. It generalizes the concept of a string, by not forcing the letters to always be in a fixed order, but allowing certain reshufflings to take place. Traces were introduced by Pierre Cartier and Dominique Foata in 1969 to give a combinatorial proof of MacMahon's master theorem. Traces are used in theories of concurrent computation, where commuting letters stand for portions of a job that can execute independently of one another, while non-commuting letters stand for locks, synchronization points or thread joins.

In computer science, in the area of formal language theory, frequent use is made of a variety of string functions; however, the notation used is different from that used for computer programming, and some commonly used functions in the theoretical realm are rarely used when programming. This article defines some of these basic terms.

In automata theory, DFA minimization is the task of transforming a given deterministic finite automaton (DFA) into an equivalent DFA that has a minimum number of states. Here, two DFAs are called equivalent if they recognize the same regular language. Several different algorithms accomplishing this task are known and described in standard textbooks on automata theory.

In algebra, a transformation semigroup is a collection of transformations that is closed under function composition. If it includes the identity function, it is a monoid, called a transformationmonoid. This is the semigroup analogue of a permutation group.

In computer science, more precisely in automata theory, a recognizable set of a monoid is a subset that can be distinguished by some homomorphism to a finite monoid. Recognizable sets are useful in automata theory, formal languages and algebra.

In mathematics, a local language is a formal language for which membership of a word in the language can be determined by looking at the first and last symbol and each two-symbol substring of the word. Equivalently, it is a language recognised by a local automaton, a particular kind of deterministic finite automaton.

In mathematics, a rational monoid is a monoid, an algebraic structure, for which each element can be represented in a "normal form" that can be computed by a finite transducer: multiplication in such a monoid is "easy", in the sense that it can be described by a rational function.

In computer science, in particular in formal language theory, a quotient automaton can be obtained from a given nondeterministic finite automaton by joining some of its states. The quotient recognizes a superset of the given automaton; in some cases, handled by the Myhill–Nerode theorem, both languages are equal.

References

1 2 Holcombe (1982) p.160
1 2 Lawson (2004) p.210
↑ Lawson (2004) p.232
↑ Straubing (1994) p.55
1 2 Sakarovitch (2009) p.342
↑ Nerode, Anil (1958), "Linear Automaton Transformations", Proceedings of the American Mathematical Society , 9 (4): 541–544, doi: 10.1090/S0002-9939-1958-0135681-9 , JSTOR 2033204
↑ Brzozowski, Janusz; Szykuła, Marek; Ye, Yuli (2018), "Syntactic Complexity of Regular Ideals", Theory of Computing Systems, 62 (5): 1175–1202, doi:10.1007/s00224-017-9803-8, hdl: 10012/12499 , S2CID 2238325
↑ Crochemore, Maxime; et al. (2009), "From Nerode's congruence to suffix automata with mismatches", Theoretical Computer Science, 410 (37): 3471–3480, doi: 10.1016/j.tcs.2009.03.011 , S2CID 14277204
↑ Straubing (1994) p.54
↑ Lawson (2004) pp.211-212
1 2 McNaughton, Robert; Papert, Seymour (1971). Counter-free Automata . Research Monograph. Vol. 65. With an appendix by William Henneman. MIT Press. p. 48. ISBN 0-262-13076-9. Zbl 0232.94024.
↑ Lawson (2004) p.233
↑ Marcel-Paul Schützenberger (1965). "On finite monoids having only trivial subgroups" (PDF). Information and Computation. 8 (2): 190–194. doi: 10.1016/s0019-9958(65)90108-7 .
↑ Straubing (1994) p.60

Anderson, James A. (2006). Automata theory with modern applications. With contributions by Tom Head. Cambridge: Cambridge University Press. ISBN 0-521-61324-8. Zbl 1127.68049.
Holcombe, W.M.L. (1982). Algebraic automata theory. Cambridge Studies in Advanced Mathematics. Vol. 1. Cambridge University Press. ISBN 0-521-60492-3. Zbl 0489.68046.
Lawson, Mark V. (2004). Finite automata. Chapman and Hall/CRC. ISBN 1-58488-255-7. Zbl 1086.68074.
Pin, Jean-Éric (1997). "10. Syntactic semigroups". In Rozenberg, G.; Salomaa, A. (eds.). Handbook of Formal Language Theory (PDF). Vol. 1. Springer-Verlag. pp. 679–746. Zbl 0866.68057.
Sakarovitch, Jacques (2009). Elements of automata theory. Translated from the French by Reuben Thomas. Cambridge University Press. ISBN 978-0-521-84425-3. Zbl 1188.68177.
Straubing, Howard (1994). Finite automata, formal logic, and circuit complexity . Progress in Theoretical Computer Science. Basel: Birkhäuser. ISBN 3-7643-3719-2. Zbl 0816.68086.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[Hol160-1] 1 2 Holcombe (1982) p.160

[Law210-2] 1 2 Lawson (2004) p.210

[Law232-3] Lawson (2004) p.232

[S55-4] Straubing (1994) p.55

[Sak342-5] 1 2 Sakarovitch (2009) p.342

[6] Nerode, Anil (1958), "Linear Automaton Transformations", Proceedings of the American Mathematical Society , 9 (4): 541–544, doi: 10.1090/S0002-9939-1958-0135681-9 , JSTOR 2033204

[7] Brzozowski, Janusz; Szykuła, Marek; Ye, Yuli (2018), "Syntactic Complexity of Regular Ideals", Theory of Computing Systems, 62 (5): 1175–1202, doi:10.1007/s00224-017-9803-8, hdl: 10012/12499 , S2CID 2238325

[8] Crochemore, Maxime; et al. (2009), "From Nerode's congruence to suffix automata with mismatches", Theoretical Computer Science, 410 (37): 3471–3480, doi: 10.1016/j.tcs.2009.03.011 , S2CID 14277204

[S54-9] Straubing (1994) p.54

[Law211-10] Lawson (2004) pp.211-212

[MP48-11] 1 2 McNaughton, Robert; Papert, Seymour (1971). Counter-free Automata . Research Monograph. Vol. 65. With an appendix by William Henneman. MIT Press. p. 48. ISBN 0-262-13076-9. Zbl 0232.94024.

[Law233-12] Lawson (2004) p.233

[13] Marcel-Paul Schützenberger (1965). "On finite monoids having only trivial subgroups" (PDF). Information and Computation. 8 (2): 190–194. doi: 10.1016/s0019-9958(65)90108-7 .

[S60-14] Straubing (1994) p.60

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]