Morphic word

Last updated

In mathematics and computer science, a morphic word or substitutive word is an infinite sequence of symbols which is constructed from a particular class of endomorphism of a free monoid.

Contents

Every automatic sequence is morphic. [1]

Definition

Let f be an endomorphism of the free monoid A on an alphabet A with the property that there is a letter a such that f(a) = as for a non-empty string s: we say that f is prolongable at a. The word

is a pure morphic or pure substitutive word. Note that it is the limit of the sequence a, f(a), f(f(a)), f(f(f(a))), ... It is clearly a fixed point of the endomorphism f: the unique such sequence beginning with the letter a. [2] [3] In general, a morphic word is the image of a pure morphic word under a coding, that is, a morphism that maps letter to letter. [1]

If a morphic word is constructed as the fixed point of a prolongable k-uniform morphism on A then the word is k-automatic. The n-th term in such a sequence can be produced by a finite state automaton reading the digits of n in base k. [1]

Examples

D0L system

A D0L system (deterministic context-free Lindenmayer system) is given by a word w of the free monoid A on an alphabet A together with a morphism σ prolongable at w. The system generates the infinite D0L word ω = limn→∞ σn(w). Purely morphic words are D0L words but not conversely. However, if ω = uν is an infinite D0L word with an initial segment u of length |u| ≥ |w|, then zν is a purely morphic word, where z is a letter not in A. [7]

See also

Related Research Articles

Semiring Algebraic ring that need not have additive negative elements

In abstract algebra, a semiring is an algebraic structure similar to a ring, but without the requirement that each element must have an additive inverse.

In mathematics, the Thue–Morse sequence, or Prouhet–Thue–Morse sequence, is the binary sequence obtained by starting with 0 and successively appending the Boolean complement of the sequence obtained thus far. The first few steps of this procedure yield the strings 0 then 01, 0110, 01101001, 0110100110010110, and so on, which are prefixes of the Thue–Morse sequence. The full sequence begins:

In abstract algebra, the free monoid on a set is the monoid whose elements are all the finite sequences of zero or more elements from that set, with string concatenation as the monoid operation and with the unique sequence of zero elements, often called the empty string and denoted by ε or λ, as the identity element. The free monoid on a set A is usually denoted A. The free semigroup on A is the subsemigroup of A containing all elements except the empty string. It is usually denoted A+.

Sturmian word Kind of infinitely long sequence of characters

In mathematics, a Sturmian word, named after Jacques Charles François Sturm, is a certain kind of infinitely long sequence of characters. Such a sequence can be generated by considering a game of English billiards on a square table. The struck ball will successively hit the vertical and horizontal edges labelled 0 and 1 generating a sequence of letters. This sequence is a Sturmian word.

Fibonacci word Infinite binary sequence generated by the Fibonacci recurrence with concatenation in place of addition

A Fibonacci word is a specific sequence of binary digits. The Fibonacci word is formed by repeated concatenation in the same way that the Fibonacci numbers are formed by repeated addition.

In combinatorics, a squarefree word is a word that does not contain any squares. A square is a word of the form XX, where X is not empty. Thus, a squarefree word can also be defined as a word that avoids the pattern XX.

In mathematics and theoretical computer science, an automatic sequence is an infinite sequence of terms characterized by a finite automaton. The n-th term of an automatic sequence a(n) is a mapping of the final state reached in a finite automaton accepting the digits of the number n in some fixed base k.

In computer science, a trace is a set of strings, wherein certain letters in the string are allowed to commute, but others are not. It generalizes the concept of a string, by not forcing the letters to always be in a fixed order, but allowing certain reshufflings to take place. Traces were introduced by Pierre Cartier and Dominique Foata in 1969 to give a combinatorial proof of MacMahon's Master theorem. Traces are used in theories of concurrent computation, where commuting letters stand for portions of a job that can execute independently of one another, while non-commuting letters stand for locks, synchronization points or thread joins.

In mathematics, the plactic monoid is the monoid of all words in the alphabet of positive integers modulo Knuth equivalence. Its elements can be identified with semistandard Young tableaux. It was discovered by Donald Knuth (1970), using an operation given by Craige Schensted (1961) in his study of the longest increasing subsequence of a permutation.

In computer science, the complexity function of a word or string is the function that counts the number of distinct factors of that string. More generally, the complexity function of a formal language counts the number of distinct words of given length.

M. Lothaire is the pseudonym of a group of mathematicians, many of whom were students of Marcel-Paul Schützenberger. The name is used as the author of several of their joint books about combinatorics on words. The group is named for Lothair I.

In mathematics, a shuffle algebra is a Hopf algebra with a basis corresponding to words on some set, whose product is given by the shuffle productXY of two words X, Y: the sum of all ways of interlacing them. The interlacing is given by the riffle shuffle permutation.

In mathematics, a factorisation of a free monoid is a sequence of subsets of words with the property that every word in the free monoid can be written as a concatenation of elements drawn from the subsets. The Chen–Fox–Lyndon theorem states that the Lyndon words furnish a factorisation. The Schützenberger theorem relates the definition in terms of a multiplicative property to an additive property.

In mathematics, a compact semigroup is a semigroup in which the sets of solutions to equations can be described by finite sets of equations. The term "compact" here does not refer to any topology on the semigroup.

In mathematics, a sesquipower or Zimin word is a string over an alphabet with identical prefix and suffix. Sesquipowers are unavoidable patterns, in the sense that all sufficiently long strings contain one.

In mathematics, a recurrent word or sequence is an infinite word over a finite alphabet in which every factor occurs infinitely many times. An infinite word is recurrent if and only if it is a sesquipower.

In mathematics and computer science, the critical exponent of a finite or infinite sequence of symbols over a finite alphabet describes the largest number of times a contiguous subsequence can be repeated. For example, the critical exponent of "Mississippi" is 7/3, as it contains the string "ississi", which is of length 7 and period 3.

Gesine Reinert Professor of Statistics

Gesine Reinert is a University Professor in Statistics at the University of Oxford. She is a Fellow of Keble College, Oxford, a Fellow of the Alan Turing Institute, and a Fellow of the Institute of Mathematical Statistics. Her research concerns the probability theory and statistics of biological sequences and biological networks. Gesine is also a huge fan of the cricketer Donald Gordon.

Valérie Berthé French mathematician

Valérie Berthé is a French mathematician who works as a director of research for the Centre national de la recherche scientifique (CNRS) at the Institut de Recherche en Informatique Fondamentale (IRIF), a joint project between CNRS and Paris Diderot University. Her research involves symbolic dynamics, combinatorics on words, discrete geometry, numeral systems, tessellations, and fractals.

Cobham's theorem is a theorem in combinatorics on words that has important connections with number theory, notably transcendental numbers, and automata theory. Informally, the theorem gives the condition for the members of a set S of natural numbers written in bases b1 and base b2 to be recognised by finite automata. Specifically, consider bases b1 and b2 such that they are not powers of the same integer. Cobham's theorem states that S written in bases b1 and b2 is recognised by finite automata if and only if S is a finite union of arithmetic progressions. The theorem was proved by Alan Cobham in 1969 and has since given rise to many extensions and generalisations.

References

  1. 1 2 3 4 Lothaire (2005) p.524
  2. Lothaire (2011) p. 10
  3. Honkala (2010) p.505
  4. 1 2 Lothaire (2011) p. 11
  5. 1 2 3 Lothaire (2005) p.525
  6. Lothaire (2005) p.526
  7. Honkala (2010) p.506

Further reading