Thue–Morse sequence

This graphic demonstrates the repeating and complementary makeup of the Thue–Morse sequence.

In mathematics, the Thue–Morse sequence, or Prouhet–Thue–Morse sequence, is the binary sequence (an infinite sequence of 0s and 1s) obtained by starting with 0 and successively appending the Boolean complement of the sequence obtained thus far. The first few steps of this procedure yield the strings 0, then 01, 0110, 01101001, 0110100110010110, and so on, which are prefixes of the Thue–Morse sequence. The full sequence begins:

01101001100101101001011001101001... (sequence A010060 in the OEIS)

The sequence is named after Axel Thue and Marston Morse.

Definition

There are several equivalent ways of defining the Thue–Morse sequence.

Direct definition

When counting in binary, the digit sum modulo 2 is the Thue-Morse sequence

To compute the nth element $t_n$, write the number n in binary. If the number of ones in this binary expansion is odd then $t_n = 1$; if it is even then $t_n = 0$. [1] For this reason John H. Conway et al. call numbers n satisfying $t_n = 1$ odious (for odd) numbers and numbers for which $t_n = 0$ evil (for even) numbers. In other words, $t_n = 0$ if n is an evil number and $t_n = 1$ if n is an odious number.
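The direct definition translates almost literally into code. The following is a minimal Python sketch; the function name `thue_morse` is chosen here for illustration and does not come from the cited sources.

```python
def thue_morse(n: int) -> int:
    """Return t_n: the parity of the number of 1 bits in the binary expansion of n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return bin(n).count("1") % 2


# First 16 terms, for comparison with the prefix 0110100110010110 quoted above.
prefix = "".join(str(thue_morse(n)) for n in range(16))
print(prefix)
```

In this sketch an odious number is simply an n for which `thue_morse(n)` returns 1, and an evil number one for which it returns 0.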

Fast sequence generation

This definition leads to a fast method for computing the Thue–Morse sequence: start with $t_0 = 0$, and then, for each n, find the highest-order bit in the binary representation of n that is different from the same bit in the representation of n − 1. (This bit can be isolated by letting x be the bitwise exclusive or of n and n − 1, shifting x right by one bit, and computing the exclusive or of this shifted value with x.) If this bit is at an even index, $t_n$ differs from $t_{n-1}$, and otherwise it is the same as $t_{n-1}$.

In pseudo-code form:

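    generateSequence(seqLength):
        value = 0
        for n = 0 to seqLength-1 by 1:
            x = n ^ (n-1)                        // mask of the bits that changed from n-1 to n
            if ((x ^ (x>>1)) & 0x55555555):      // highest changed bit is at an even index
                value = 1 - value
            output value                         // emit t_n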

The resulting algorithm takes constant time to generate each sequence element, using only a logarithmic number of bits (constant number of words) of memory. [2]
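A runnable variant of this incremental method can be sketched in Python. As an assumption of this sketch, and not part of the method as described above, it finds the changed bit with `n & -n` (the lowest set bit of n, which is also the highest bit that differs between n and n − 1) instead of the 32-bit mask, so that it works for integers of any size.

```python
from itertools import islice


def generate_sequence():
    """Yield t_0, t_1, t_2, ... by toggling a running value."""
    value = 0
    yield value  # t_0 = 0
    n = 1
    while True:
        # n & -n isolates the lowest set bit of n, which is the highest
        # bit that differs between n and n - 1.
        index = (n & -n).bit_length() - 1
        if index % 2 == 0:
            value = 1 - value  # even index: t_n differs from t_(n-1)
        yield value
        n += 1


first_32 = "".join(str(t) for t in islice(generate_sequence(), 32))
print(first_32)
```

Each step only touches the current value of n and the running bit, which is what keeps the memory use small.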

Recurrence relation

The Thue–Morse sequence is the sequence $t_n$ satisfying the recurrence relation

$$t_0 = 0, \qquad t_{2n} = t_n, \qquad t_{2n+1} = 1 - t_n$$

for all non-negative integers n. [1]
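As an illustration, the standard form of this recurrence ($t_0 = 0$, $t_{2n} = t_n$, $t_{2n+1} = 1 - t_n$) can be evaluated recursively. The Python sketch below assumes that form; the function name `t` is illustrative.

```python
def t(n: int) -> int:
    """Evaluate t_n from t_0 = 0, t_(2n) = t_n, t_(2n+1) = 1 - t_n."""
    if n == 0:
        return 0
    half, bit = divmod(n, 2)
    return t(half) if bit == 0 else 1 - t(half)


terms = [t(n) for n in range(8)]
print(terms)
```

The recursion depth equals the number of binary digits of n, so the sketch is adequate for any integer that fits comfortably in memory.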

L-system

Thue–Morse sequence generated by an L-System

The Thue–Morse sequence is a morphic word: [3] it is the output of the following Lindenmayer system:

Variables 0, 1
Constants None
Start 0
Rules (0 → 01), (1 → 10)
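The rewriting process of this L-system can be sketched in a few lines of Python. The names `RULES` and `rewrite` are illustrative; the rules and the start symbol are the ones listed above.

```python
RULES = {"0": "01", "1": "10"}


def rewrite(word: str) -> str:
    """Apply the production rules to every symbol of the word in parallel."""
    return "".join(RULES[symbol] for symbol in word)


word = "0"  # the start symbol
steps = [word]
for _ in range(4):
    word = rewrite(word)
    steps.append(word)

print(steps)
```

Each generation doubles the length of the word, and every generation is a prefix of the next one.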

Characterization using bitwise negation

The Thue–Morse sequence in the form given above, as a sequence of bits, can be defined recursively using the operation of bitwise negation. The first element is 0. Once the first $2^n$ elements have been specified, forming a string s, the next $2^n$ elements must form the bitwise negation of s. This defines the first $2^{n+1}$ elements, and the process recurses.
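This doubling construction can be written as a short Python sketch. The helper names `negate` and `prefix` are illustrative.

```python
def negate(bits: str) -> str:
    """Interchange 0s and 1s."""
    return bits.translate(str.maketrans("01", "10"))


def prefix(n: int) -> str:
    """Return the first 2**n elements by n rounds of appending the negation."""
    s = "0"
    for _ in range(n):
        s += negate(s)
    return s


print(prefix(3))
```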

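Spelling out the first few steps in detail:

We start with 0.
The bitwise negation of 0 is 1.
Combining these, the first 2 elements are 01.
The bitwise negation of 01 is 10.
Combining these, the first 4 elements are 0110.
The bitwise negation of 0110 is 1001.
Combining these, the first 8 elements are 01101001.
And so on.

So

$T_0$ = 0.
$T_1$ = 01.
$T_2$ = 0110.
$T_3$ = 01101001.
$T_4$ = 0110100110010110.
And so on.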

Infinite product

The sequence can also be defined by:

$$\prod_{i=0}^{\infty} \left(1 - x^{2^i}\right) = \sum_{j=0}^{\infty} (-1)^{t_j} x^j,$$

where $t_j$ is the jth element if we start at j = 0.
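The product in question is usually stated as $\prod_{i \ge 0} (1 - x^{2^i}) = \sum_{j \ge 0} (-1)^{t_j} x^j$. Assuming that form, a finite number of factors can be expanded numerically and compared with the signs $(-1)^{t_j}$. The function name `product_coefficients` in this Python sketch is illustrative.

```python
def product_coefficients(factors: int) -> list[int]:
    """Expand (1 - x)(1 - x^2)(1 - x^4)... using the given number of factors.

    The result lists the coefficients of x^0, x^1, ... in order.
    """
    coeffs = [1]
    for i in range(factors):
        step = 2 ** i
        padded = coeffs + [0] * step                 # coeffs * 1
        shifted = [0] * step + [-c for c in coeffs]  # coeffs * (-x^step)
        coeffs = [a + b for a, b in zip(padded, shifted)]
    return coeffs


coeffs = product_coefficients(5)
signs = [(-1) ** (bin(j).count("1") % 2) for j in range(32)]
print(coeffs == signs)
```

With k factors the expansion has exactly $2^k$ coefficients, so the comparison covers $t_0$ through $t_{2^k - 1}$.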

Some properties

Because each new block in the Thue–Morse sequence is defined by forming the bitwise negation of the beginning, and this is repeated at the beginning of the next block, the Thue–Morse sequence is filled with squares: consecutive strings that are repeated. That is, there are many instances of XX, where X is some string. Indeed, X is such a string if and only if $X = A$ or $X = A\overline{A}A$, or the bitwise negation of one of these, where $A = T_k$ (the prefix of T of length $2^k$) for some $k \ge 0$ and $\overline{A}$ denotes the bitwise negation of A (interchange 0s and 1s). [4] For instance, with $k = 0$, we have $A = T_0 = 0$, and the square $A\overline{A}AA\overline{A}A = 010010$ appears in T starting at the 16th bit. (Thus, squares in T have length either a power of 2 or 3 times a power of 2.) However, there are no cubes: instances of XXX. There are also no overlapping squares: instances of 0X0X0 or 1X1X1. [5] [6] The critical exponent is 2. [7]
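These claims can be checked by brute force on a finite prefix. The Python sketch below searches the first 256 terms for squares and cubes; all function names are illustrative, and a finite search only illustrates the statements, it does not prove them.

```python
def thue_morse_prefix(length: int) -> str:
    """First `length` terms, computed from the parity of binary digit sums."""
    return "".join(str(bin(n).count("1") % 2) for n in range(length))


def square_periods(word: str) -> set[int]:
    """Return every length |X| such that some square XX occurs in word."""
    n = len(word)
    return {
        p
        for p in range(1, n // 2 + 1)
        for i in range(n - 2 * p + 1)
        if word[i:i + p] == word[i + p:i + 2 * p]
    }


def has_cube(word: str) -> bool:
    """Return True if some non-empty XXX occurs in word."""
    n = len(word)
    return any(
        word[i:i + p] == word[i + p:i + 2 * p] == word[i + 2 * p:i + 3 * p]
        for p in range(1, n // 3 + 1)
        for i in range(n - 3 * p + 1)
    )


def is_power_of_two_or_triple(p: int) -> bool:
    """True if p = 2**a or p = 3 * 2**a for some a >= 0."""
    while p % 2 == 0:
        p //= 2
    return p in (1, 3)


prefix = thue_morse_prefix(256)
periods = square_periods(prefix)
print(sorted(periods))
print(has_cube(prefix), all(is_power_of_two_or_triple(p) for p in periods))
```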

The word $T_{2n}$ is a palindrome for any n. Further, let $q_n$ be the word obtained from $T_{2n}$ by counting the ones between consecutive zeros. For instance, $q_1 = 2$ and $q_2 = 2102012$, and so on. Because the words $T_n$ do not contain overlapping squares, the words $q_n$ are palindromic squarefree words.

The statement above that the Thue–Morse sequence is "filled with squares" can be made precise: it is a uniformly recurrent word, meaning that given any finite string X in the sequence, there is some length $n_X$ (often much longer than the length of X) such that X appears in every block of length $n_X$. [8] [9] The easiest way to make a recurrent sequence is to form a periodic sequence, one where the sequence repeats entirely after a given number m of steps. Then $n_X$ can be set to any multiple of m that is larger than twice the length of X. But the Thue–Morse sequence is uniformly recurrent without being periodic, or even eventually periodic (meaning periodic after some nonperiodic initial segment). [10]

We define the Thue–Morse morphism to be the function f from the set of binary sequences to itself that replaces every 0 in a sequence with 01 and every 1 with 10. [11] If T is the Thue–Morse sequence, then f(T) is T again; that is, T is a fixed point of f. The function f is a prolongable morphism on the free monoid {0,1}* with T as fixed point: indeed, T is essentially the only fixed point of f; the only other fixed point is the bitwise negation of T, which is simply the Thue–Morse sequence on (1,0) instead of on (0,1). This property may be generalized to the concept of an automatic sequence.

The generating series of T over the binary field is the formal power series

$$t(Z) = \sum_{n=0}^{\infty} t_n Z^n.$$

This power series is algebraic over the field of rational functions, satisfying the equation [12]

$$Z + (1+Z)^2 t + (1+Z)^3 t^2 = 0.$$
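The equation is usually written $Z + (1+Z)^2 t + (1+Z)^3 t^2 = 0$ over the field with two elements. Assuming that form, it can be checked numerically to any finite order by working with truncated polynomials. In the Python sketch below, the truncation order `N` and the helper names are illustrative choices.

```python
N = 64  # truncation order: all polynomials are kept modulo Z^N

# Coefficients t_0 .. t_(N-1) of the generating series, as elements of GF(2).
t = [bin(n).count("1") % 2 for n in range(N)]


def poly(*coeffs: int) -> list[int]:
    """Pad a short coefficient list (lowest degree first) to length N."""
    return list(coeffs) + [0] * (N - len(coeffs))


def add(a: list[int], b: list[int]) -> list[int]:
    """Add two polynomials over GF(2)."""
    return [x ^ y for x, y in zip(a, b)]


def mul(a: list[int], b: list[int]) -> list[int]:
    """Multiply two polynomials over GF(2), discarding terms of degree >= N."""
    out = [0] * N
    for i, a_i in enumerate(a):
        if a_i:
            for j in range(N - i):
                out[i + j] ^= b[j]
    return out


z = poly(0, 1)
one_plus_z = poly(1, 1)
square = mul(one_plus_z, one_plus_z)  # (1 + Z)^2
cube = mul(square, one_plus_z)        # (1 + Z)^3

# Left-hand side Z + (1 + Z)^2 t + (1 + Z)^3 t^2, modulo Z^N.
lhs = add(add(z, mul(square, t)), mul(cube, mul(t, t)))
print(any(lhs))  # False means every coefficient below Z^N vanishes
```

Truncation is harmless here because the coefficients of a product below $Z^N$ depend only on the coefficients of the factors below $Z^N$.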

In combinatorial game theory

The set of evil numbers (numbers n with $t_n = 0$) forms a subspace of the nonnegative integers under nim-addition (bitwise exclusive or). For the game of Kayles, evil nim-values occur for few (finitely many) positions in the game, with all remaining positions having odious nim-values.
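The closure of the evil numbers under nim-addition can be illustrated with a small exhaustive check. This Python sketch tests all pairs below 64, a bound chosen arbitrarily for illustration; `is_evil` is an illustrative name.

```python
def is_evil(n: int) -> bool:
    """True if n has an even number of 1 bits, that is, t_n = 0."""
    return bin(n).count("1") % 2 == 0


evil = [n for n in range(64) if is_evil(n)]

# Nim-addition is bitwise exclusive or; the nim-sum of two evil numbers
# should again be evil.
closed = all(is_evil(a ^ b) for a in evil for b in evil)
print(evil[:8], closed)
```

The check succeeds because the parity of the number of 1 bits of `a ^ b` is the sum, modulo 2, of the parities for `a` and `b`.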

The Prouhet–Tarry–Escott problem

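The Prouhet–Tarry–Escott problem can be defined as: given a positive integer N and a non-negative integer k, partition the set S = { 0, 1, ..., N − 1 } into two disjoint subsets $S_0$ and $S_1$ that have equal sums of powers up to k, that is:

$$\sum_{x \in S_0} x^i = \sum_{x \in S_1} x^i$$

for all integers i from 1 to k.

This has a solution if N is a multiple of $2^{k+1}$, given by:

$S_0$ consists of the integers n in S for which $t_n = 0$,
$S_1$ consists of the integers n in S for which $t_n = 1$.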

For example, for N = 8 and k = 2,

0 + 3 + 5 + 6 = 1 + 2 + 4 + 7,
$0^2 + 3^2 + 5^2 + 6^2 = 1^2 + 2^2 + 4^2 + 7^2$.

The condition requiring that N be a multiple of $2^{k+1}$ is not strictly necessary: there are some further cases for which a solution exists. However, it guarantees a stronger property: if the condition is satisfied, then the set of kth powers of any set of N numbers in arithmetic progression can be partitioned into two sets with equal sums. This follows directly from the expansion given by the binomial theorem applied to the binomial representing the nth element of an arithmetic progression.
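The partition and the equal-power-sum property can be checked directly. The Python sketch below assumes the usual solution, in which $S_0$ collects the n with $t_n = 0$ and $S_1$ those with $t_n = 1$. The function names, and the arithmetic progression 5, 8, 11, ..., are illustrative.

```python
def pte_partition(N: int) -> tuple[list[int], list[int]]:
    """Split {0, ..., N-1} by the Thue-Morse value of each element."""
    s0 = [n for n in range(N) if bin(n).count("1") % 2 == 0]
    s1 = [n for n in range(N) if bin(n).count("1") % 2 == 1]
    return s0, s1


def equal_power_sums(a: list[int], b: list[int], k: int) -> bool:
    """True if the sums of i-th powers agree for every i from 1 to k."""
    return all(sum(x ** i for x in a) == sum(x ** i for x in b)
               for i in range(1, k + 1))


s0, s1 = pte_partition(8)
print(s0, s1, equal_power_sums(s0, s1, 2))

# The same index sets applied to an arithmetic progression (start 5, step 3).
p0 = [5 + 3 * n for n in s0]
p1 = [5 + 3 * n for n in s1]
print(equal_power_sums(p0, p1, 2))
```

For N = 8 the agreement stops at k = 2: the sums of cubes of the two halves differ, consistent with 8 not being a multiple of $2^{3+1}$.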

For generalizations of the Thue–Morse sequence and the Prouhet–Tarry–Escott problem to partitions into more than two parts, see Bolker, Offner, Richman and Zara, "The Prouhet–Tarry–Escott problem and generalized Thue–Morse sequences". [13]

Fractals and turtle graphics

Using turtle graphics, a curve can be generated if an automaton is programmed with a sequence. When Thue–Morse sequence members are used in order to select program states:

If $t_n = 0$, move ahead by one unit,
If $t_n = 1$, rotate by an angle of π/3 radians (60°).

The resulting curve converges to the Koch curve, a fractal curve of infinite length containing a finite area. This illustrates the fractal nature of the Thue–Morse sequence. [14]

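It is also possible to draw the curve precisely using the following instructions: [15]

If $t_n = 0$, rotate by an angle of π radians (180°),
If $t_n = 1$, move ahead by one unit, then rotate by an angle of π/3 radians.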

Equitable sequencing

In their book on the problem of fair division, Steven Brams and Alan Taylor invoked the Thue–Morse sequence but did not identify it as such. When allocating a contested pile of items between two parties who agree on the items' relative values, Brams and Taylor suggested a method they called balanced alternation, or taking turns taking turns taking turns . . . , as a way to circumvent the favoritism inherent when one party chooses before the other. An example showed how a divorcing couple might reach a fair settlement in the distribution of jointly-owned items. The parties would take turns to be the first chooser at different points in the selection process: Ann chooses one item, then Ben does, then Ben chooses one item, then Ann does. [16]

Lionel Levine and Katherine Stange, in their discussion of how to fairly apportion a shared meal such as an Ethiopian dinner, proposed the Thue–Morse sequence as a way to reduce the advantage of moving first. They suggested that “it would be interesting to quantify the intuition that the Thue-Morse order tends to produce a fair outcome.” [17]

Robert Richman addressed this problem, but he too did not identify the Thue–Morse sequence as such at the time of publication. [18] He presented the sequences $T_n$ as step functions on the interval [0,1] and described their relationship to the Walsh and Rademacher functions. He showed that the nth derivative can be expressed in terms of $T_n$. As a consequence, the step function arising from $T_n$ is orthogonal to polynomials of order n − 1. A further consequence is that a resource whose value is expressed as a monotonically decreasing continuous function is most fairly allocated using a sequence that converges to Thue–Morse as the function becomes flatter. An example showed how to pour cups of coffee of equal strength from a carafe with a nonlinear concentration gradient, prompting a whimsical article in the popular press. [19]

Joshua Cooper and Aaron Dutle showed why the Thue–Morse order provides a fair outcome for discrete events. [20] They considered the fairest way to stage a Galois duel, in which each of the shooters has equally poor shooting skills. Cooper and Dutle postulated that each dueler would demand a chance to fire as soon as the other’s a priori probability of winning exceeded their own. They proved that, as the duelers’ hitting probability approaches zero, the firing sequence converges to the Thue–Morse sequence. In so doing, they demonstrated that the Thue–Morse order produces a fair outcome not only for sequences $T_n$ of length $2^n$, but for sequences of any length.

Thus the mathematics supports using the Thue–Morse sequence instead of alternating turns when the goal is fairness but earlier turns differ monotonically from later turns in some meaningful quality, whether that quality varies continuously [18] or discretely. [20]

Sports competitions form an important class of equitable sequencing problems, because strict alternation often gives an unfair advantage to one team. Ignacio Palacios-Huerta proposed changing the sequential order to Thue–Morse to improve the ex post fairness of various tournament competitions, such as the kicking sequence of a penalty shoot-out in soccer (for which UEFA experimented with the ABBA, or $T_2$, kicking sequence in 2017), the rotation of color of pieces in a chess match, and the serving order in a tennis tie-break. [21] In competitive rowing, $T_2$ is the only arrangement of port- and starboard-rowing crew members that eliminates transverse forces (and hence sideways wiggle) on a four-membered coxless racing boat, while $T_3$ is one of only four rigs to avoid wiggle on an eight-membered boat. [22]

Fairness is especially important in player drafts. Many professional sports leagues attempt to achieve competitive parity by giving earlier selections in each round to weaker teams. By contrast, fantasy football leagues have no pre-existing imbalance to correct, so they often use a “snake” draft (forward, backward, etc.; or $T_1$). [23] Ian Allan argued that a “third-round reversal” (forward, backward, backward, forward, etc.; or $T_2$) would be even more fair. [24] Richman suggested that the fairest way for “captain A” and “captain B” to choose sides for a pick-up game of basketball mirrors $T_3$: captain A has the first, fourth, sixth, and seventh choices, while captain B has the second, third, fifth, and eighth choices. [18]
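The pick order in the basketball example can be read off the first eight terms of the sequence. The following Python sketch does so; the function name `pick_order` and the labels "A" and "B" are illustrative.

```python
def pick_order(num_picks: int, first: str = "A", second: str = "B") -> list[str]:
    """Assign pick n (counting from 0) to `first` if t_n = 0, else to `second`."""
    return [first if bin(n).count("1") % 2 == 0 else second
            for n in range(num_picks)]


order = pick_order(8)
picks_a = [i + 1 for i, captain in enumerate(order) if captain == "A"]
picks_b = [i + 1 for i, captain in enumerate(order) if captain == "B"]
print("".join(order), picks_a, picks_b)
# prints: ABBABAAB [1, 4, 6, 7] [2, 3, 5, 8]
```

Truncating the same order to two or four picks gives the AB and ABBA patterns mentioned above.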

History

The Thue–Morse sequence was first studied by Eugène Prouhet in 1851, [25] who applied it to number theory. However, Prouhet did not mention the sequence explicitly; this was left to Axel Thue in 1906, who used it to found the study of combinatorics on words. The sequence was only brought to worldwide attention with the work of Marston Morse in 1921, when he applied it to differential geometry. The sequence has been discovered independently many times, not always by professional research mathematicians; for example, Max Euwe, a chess grandmaster, who held the World Championship title from 1935 to 1937, and mathematics teacher, discovered it in 1929 in an application to chess: by using its cube-free property (see above), he showed how to circumvent a rule aimed at preventing infinitely protracted games by declaring repetition of moves a draw.

Notes

  1. Allouche & Shallit (2003, p. 15).
  2. Arndt (2011).
  3. Lothaire (2011, p. 11).
  4. Brlek (1989).
  5. Lothaire (2011, p. 113).
  6. Pytheas Fogg (2002, p. 103).
  7. Krieger (2006).
  8. Lothaire (2011, p. 30).
  9. Berthé & Rigo (2010).
  10. Lothaire (2011, p. 31).
  11. Berstel et al. (2009, p. 70).
  12. Berstel et al. (2009, p. 80).
  13. Bolker, Ethan; Offner, Carl; Richman, Robert; Zara, Catalin (2016). "The Prouhet–Tarry–Escott problem and generalized Thue–Morse sequences". Journal of Combinatorics. 7 (1): 117–133. arXiv:1304.6756. doi:10.4310/JOC.2016.v7.n1.a5.
  14. Ma & Holdener (2005).
  15. Abel, Zachary (January 23, 2012). "Thue-Morse Navigating Turtles". Three-Cornered Things.
  16. Brams & Taylor (1999).
  17. Levine & Stange (2012).
  18. Richman (2001).
  19. Abrahams (2010).
  20. Cooper & Dutle (2013).
  21. Palacios-Huerta (2012).
  22. Barrow (2010).
  23. "Fantasy Draft Types". NFL.com. Archived from the original on October 12, 2018.
  24. Allan, Ian (July 16, 2014). "Third-Round Reversal Drafts". Fantasy Index. Retrieved September 1, 2020.
  25. Allouche, Jean-Paul; Shallit, Jeffrey. "The ubiquitous Prouhet-Thue-Morse sequence".
