# Thue–Morse sequence

Last updated

In mathematics, the Thue–Morse sequence, or Prouhet–Thue–Morse sequence, is the binary sequence (an infinite sequence of 0s and 1s) obtained by starting with 0 and successively appending the Boolean complement of the sequence obtained thus far. The first few steps of this procedure yield the strings 0 then 01, 0110, 01101001, 0110100110010110, and so on, which are prefixes of the Thue–Morse sequence. The full sequence begins:

## Contents

01101001100101101001011001101001.... (sequence in the OEIS )

The sequence is named after Axel Thue and Marston Morse.

## Definition

There are several equivalent ways of defining the Thue–Morse sequence.

### Direct definition

To compute the nth element tn, write the number n in binary. If the number of ones in this binary expansion is odd then tn = 1, if even then tn = 0. [1] For this reason John H. Conway et al. called numbers n satisfying tn = 1 odious (for odd) numbers and numbers for which tn = 0 evil (for even) numbers. In other words, tn = 0 if n is an evil number and tn = 1 if n is an odious number.

### Fast sequence generation

This method leads to a fast method for computing the Thue–Morse sequence: start with t0 = 0, and then, for each n, find the highest-order bit in the binary representation of n that is different from the same bit in the representation of n  1. If this bit is at an even index, tn differs from tn  1, and otherwise it is the same as tn  1.

In pseudo-code form:

generateSequence(seqLength):value=0forn=0toseqLength-1by1:# Note: assumes an even number of bits in the word size, and two's complement arithmetic so that when n == 0, x is odd (e.g. 31 or 63)x=indexOfHighestOneBit(n^(n-1))if((x&1)==0):# bit index is even, so toggle valuevalue=1-valueyieldvalue

The resulting algorithm takes constant time to generate each sequence element, using only a logarithmic number of bits (constant number of words) of memory. [2]

### Recurrence relation

The Thue–Morse sequence is the sequence tn satisfying the recurrence relation

{\displaystyle {\begin{aligned}t_{0}&=0,\\t_{2n}&=t_{n},\\t_{2n+1}&=1-t_{n},\end{aligned}}}

for all non-negative integers n. [1]

### L-system

The Thue–Morse sequence is a morphic word: [3] it is the output of the following Lindenmayer system:

Variables 0, 1 None 0 (0 → 01), (1 → 10)

### Characterization using bitwise negation

The Thue–Morse sequence in the form given above, as a sequence of bits, can be defined recursively using the operation of bitwise negation. So, the first element is 0. Then once the first 2n elements have been specified, forming a string s, then the next 2n elements must form the bitwise negation of s. Now we have defined the first 2n+1 elements, and we recurse.

Spelling out the first few steps in detail:

• We start with 0.
• The bitwise negation of 0 is 1.
• Combining these, the first 2 elements are 01.
• The bitwise negation of 01 is 10.
• Combining these, the first 4 elements are 0110.
• The bitwise negation of 0110 is 1001.
• Combining these, the first 8 elements are 01101001.
• And so on.

So

• T0 = 0.
• T1 = 01.
• T2 = 0110.
• T3 = 01101001.
• T4 = 0110100110010110.
• T5 = 01101001100101101001011001101001.
• T6 = 0110100110010110100101100110100110010110011010010110100110010110.
• And so on.

### Infinite product

The sequence can also be defined by:

${\displaystyle \prod _{i=0}^{\infty }\left(1-x^{2^{i}}\right)=\sum _{j=0}^{\infty }(-1)^{t_{j}}x^{j},}$

where tj is the jth element if we start at j = 0.

## Properties

The Thue–Morse sequence contains many squares: instances of the string ${\displaystyle XX}$, where ${\displaystyle X}$ denotes the string ${\displaystyle A}$, ${\displaystyle {\overline {A}}}$, ${\displaystyle A{\overline {A}}A}$, or ${\displaystyle {\overline {A}}A{\overline {A}}}$, where ${\displaystyle A=T_{k}}$ for some ${\displaystyle k\geq 0}$ and ${\displaystyle {\overline {A}}}$ is the bitwise negation of ${\displaystyle A}$. [4] For instance, if ${\displaystyle k=0}$, then ${\displaystyle A=T_{0}=0}$. The square ${\displaystyle A{\overline {A}}AA{\overline {A}}A=010010}$ appears in ${\displaystyle T}$ starting at the 16th bit. Since all squares in ${\displaystyle T}$ are obtained by repeating one of these 4 strings, they all have length ${\displaystyle 2^{n}}$ or ${\displaystyle 3\cdot 2^{n}}$ for some ${\displaystyle n\geq 0}$. ${\displaystyle T}$ contains no cubes: instances of ${\displaystyle XXX}$. There are also no overlapping squares: instances of ${\displaystyle 0X0X0}$ or ${\displaystyle 1X1X1}$. [5] [6] The critical exponent of ${\displaystyle T}$ is 2. [7]

The Thue–Morse sequence is a uniformly recurrent word: given any finite string X in the sequence, there is some length nX (often much longer than the length of X) such that X appears in every block of length nX. [8] [9] Notably, the Thue-Morse sequence is uniformly recurrent without being either periodic or eventually periodic (i.e., periodic after some initial nonperiodic segment). [10]

The sequence T2n is a palindrome for any n. Furthermore, let qn be a word obtained by counting the ones between consecutive zeros in T2n . For instance, q1 = 2 and q2 = 2102012. Since Tn does not contain overlapping squares, the words qn are palindromic squarefree words.

The Thue–Morse morphism μ is defined on alphabet {0,1} by the substitution map μ(0) = 01, μ(1) = 10: every 0 in a sequence is replaced with 01 and every 1 with 10. [11] If T is the Thue–Morse sequence, then μ(T) is also T. Thus, T is a fixed point of μ. The morphism μ is a prolongable morphism on the free monoid {0,1} with T as fixed point: T is essentially the only fixed point of μ; the only other fixed point is the bitwise negation of T, which is simply the Thue–Morse sequence on (1,0) instead of on (0,1). This property may be generalized to the concept of an automatic sequence.

The generating series of T over the binary field is the formal power series

${\displaystyle t(Z)=\sum _{n=0}^{\infty }T(n)Z^{n}\ .}$

This power series is algebraic over the field of rational functions, satisfying the equation [12]

${\displaystyle Z+(1+Z)^{2}t+(1+Z)^{3}t^{2}=0}$

### In combinatorial game theory

The set of evil numbers (numbers ${\displaystyle n}$ with ${\displaystyle t_{n}=0}$) forms a subspace of the nonnegative integers under nim-addition (bitwise exclusive or). For the game of Kayles, evil nim-values occur for few (finitely many) positions in the game, with all remaining positions having odious nim-values.

### The Prouhet–Tarry–Escott problem

The Prouhet–Tarry–Escott problem can be defined as: given a positive integer N and a non-negative integer k, partition the set S = { 0, 1, ..., N-1 } into two disjoint subsets S0 and S1 that have equal sums of powers up to k, that is:

${\displaystyle \sum _{x\in S_{0}}x^{i}=\sum _{x\in S_{1}}x^{i}}$ for all integers i from 1 to k.

This has a solution if N is a multiple of 2k+1, given by:

• S0 consists of the integers n in S for which tn = 0,
• S1 consists of the integers n in S for which tn = 1.

For example, for N = 8 and k = 2,

0 + 3 + 5 + 6 = 1 + 2 + 4 + 7,
02 + 32 + 52 + 62 = 12 + 22 + 42 + 72.

The condition requiring that N be a multiple of 2k+1 is not strictly necessary: there are some further cases for which a solution exists. However, it guarantees a stronger property: if the condition is satisfied, then the set of kth powers of any set of N numbers in arithmetic progression can be partitioned into two sets with equal sums. This follows directly from the expansion given by the binomial theorem applied to the binomial representing the nth element of an arithmetic progression.

For generalizations of the Thue–Morse sequence and the Prouhet–Tarry–Escott problem to partitions into more than two parts, see Bolker, Offner, Richman and Zara, "The Prouhet–Tarry–Escott problem and generalized Thue–Morse sequences". [13]

### Fractals and turtle graphics

Using turtle graphics, a curve can be generated if an automaton is programmed with a sequence. When Thue–Morse sequence members are used in order to select program states:

• If t(n) = 0, move ahead by one unit,
• If t(n) = 1, rotate by an angle of π/3 radians (60°)

The resulting curve converges to the Koch curve, a fractal curve of infinite length containing a finite area. This illustrates the fractal nature of the Thue–Morse Sequence. [14]

It is also possible to draw the curve precisely using the following instructions: [15]

• If t(n) = 0, rotate by an angle of π radians (180°),
• If t(n) = 1, move ahead by one unit, then rotate by an angle of π/3 radians.

### Equitable sequencing

In their book on the problem of fair division, Steven Brams and Alan Taylor invoked the Thue–Morse sequence but did not identify it as such. When allocating a contested pile of items between two parties who agree on the items' relative values, Brams and Taylor suggested a method they called balanced alternation, or taking turns taking turns taking turns . . . , as a way to circumvent the favoritism inherent when one party chooses before the other. An example showed how a divorcing couple might reach a fair settlement in the distribution of jointly-owned items. The parties would take turns to be the first chooser at different points in the selection process: Ann chooses one item, then Ben does, then Ben chooses one item, then Ann does. [16]

Lionel Levine and Katherine E. Stange, in their discussion of how to fairly apportion a shared meal such as an Ethiopian dinner, proposed the Thue–Morse sequence as a way to reduce the advantage of moving first. They suggested that “it would be interesting to quantify the intuition that the Thue–Morse order tends to produce a fair outcome.” [17]

Robert Richman addressed this problem, but he too did not identify the Thue–Morse sequence as such at the time of publication. [18] He presented the sequences Tn as step functions on the interval [0,1] and described their relationship to the Walsh and Rademacher functions. He showed that the nth derivative can be expressed in terms of Tn. As a consequence, the step function arising from Tn is orthogonal to polynomials of order n  1. A consequence of this result is that a resource whose value is expressed as a monotonically decreasing continuous function is most fairly allocated using a sequence that converges to Thue–Morse as the function becomes flatter. An example showed how to pour cups of coffee of equal strength from a carafe with a nonlinear concentration gradient, prompting a whimsical article in the popular press. [19]

Joshua Cooper and Aaron Dutle showed why the Thue–Morse order provides a fair outcome for discrete events. [20] They considered the fairest way to stage a Galois duel, in which each of the shooters has equally poor shooting skills. Cooper and Dutle postulated that each dueler would demand a chance to fire as soon as the other's a priori probability of winning exceeded their own. They proved that, as the duelers’ hitting probability approaches zero, the firing sequence converges to the Thue–Morse sequence. In so doing, they demonstrated that the Thue–Morse order produces a fair outcome not only for sequences Tn of length 2n, but for sequences of any length.

Thus the mathematics supports using the Thue–Morse sequence instead of alternating turns when the goal is fairness but earlier turns differ monotonically from later turns in some meaningful quality, whether that quality varies continuously [18] or discretely. [20]

Sports competitions form an important class of equitable sequencing problems, because strict alternation often gives an unfair advantage to one team. Ignacio Palacios-Huerta proposed changing the sequential order to Thue–Morse to improve the ex post fairness of various tournament competitions, such as the kicking sequence of a penalty shoot-out in soccer. [21] He did a set of field experiments with pro players and found that the team kicking first won 60% of games using ABAB (or T1), 54% using ABBA (or T2), and 51% using full Thue–Morse (or Tn).  As a result, ABBA is undergoing extensive trials in FIFA (European and World Championships) and English Federation professional soccer (EFL Cup). [22] An ABBA serving pattern has also been found to improve the fairness of tennis tie-breaks. [23] In competitive rowing, T2 is the only arrangement of port- and starboard-rowing crew members that eliminates transverse forces (and hence sideways wiggle) on a four-membered coxless racing boat, while T3 is one of only four rigs to avoid wiggle on an eight-membered boat. [24]

Fairness is especially important in player drafts. Many professional sports leagues attempt to achieve competitive parity by giving earlier selections in each round to weaker teams. By contrast, fantasy football leagues have no pre-existing imbalance to correct, so they often use a “snake” draft (forward, backward, etc.; or T1). [25] Ian Allan argued that a “third-round reversal” (forward, backward, backward, forward, etc.; or T2) would be even more fair. [26] Richman suggested that the fairest way for “captain A” and “captain B” to choose sides for a pick-up game of basketball mirrors T3: captain A has the first, fourth, sixth, and seventh choices, while captain B has the second, third, fifth, and eighth choices. [18]

### Hash collisions

The initial 2k bits of the Thue–Morse sequence are mapped to 0 by a wide class of polynomial hash functions modulo a power of two, which can lead to hash collisions. [27]

## History

The Thue–Morse sequence was first studied by Eugène Prouhet  [ fr ] in 1851, [28] who applied it to number theory. However, Prouhet did not mention the sequence explicitly; this was left to Axel Thue in 1906, who used it to found the study of combinatorics on words. The sequence was only brought to worldwide attention with the work of Marston Morse in 1921, when he applied it to differential geometry. The sequence has been discovered independently many times, not always by professional research mathematicians; for example, Max Euwe, a chess grandmaster and mathematics teacher, discovered it in 1929 in an application to chess: by using its cube-free property (see above), he showed how to circumvent the threefold repetition rule aimed at preventing infinitely protracted games by declaring repetition of moves a draw. At the time, consecutive identical board states were required to trigger the rule; the rule was later amended to the same board position being repeated three times in a row at any point, as the sequence shows that the consecutive criterion can be evaded forever.

## Notes

1. Allouche & Shallit (2003 , p. 15)
2. Lothaire (2011 , p. 11)
3. Lothaire (2011 , p. 113)
4. Pytheas Fogg (2002 , p. 103)
5. Lothaire (2011 , p. 30)
6. Lothaire (2011 , p. 31)
7. Berstel et al. (2009 , p. 70)
8. Berstel et al. (2009 , p. 80)
9. Abel, Zachary (January 23, 2012). "Thue-Morse Navigating Turtles". Three-Cornered Things.
10. "Fantasy Draft Types". NFL.com . Archived from the original on October 12, 2018.
11. Allan, Ian (July 16, 2014). "Third-Round Reversal Drafts". Fantasy Index. Retrieved September 1, 2020.
12. Pachocki, Jakub; Radoszewski, Jakub (2013). "Where to Use and How not to Use Polynomial String Hashing" (PDF). Olympiads in Informatics. 7: 90–100.
13. Fredricksen, Harold (1992). "Gray codes and the Thue-Morse-Hedlund sequence". Journal of Combinatorial Mathematics and Combinatorial Computing (JCMCC). Naval Postgraduate School, Department of Mathematics, Monterey, California, USA. 11: 3–11.
14. Erickson, John (2018-10-30). "On the Asymptotic Relative Change for Sequences of Permutations" . Retrieved 2021-01-31.
15. Plousos, George (2020-06-21). "What is the relationship between the Gray code and the Thue–Morse sequence". Quora. Archived from the original on 2020-12-17. Retrieved 2021-01-31.

## Related Research Articles

In abstract algebra, a semiring is an algebraic structure similar to a ring, but without the requirement that each element must have an additive inverse.

In abstract algebra, the free monoid on a set is the monoid whose elements are all the finite sequences of zero or more elements from that set, with string concatenation as the monoid operation and with the unique sequence of zero elements, often called the empty string and denoted by ε or λ, as the identity element. The free monoid on a set A is usually denoted A. The free semigroup on A is the subsemigroup of A containing all elements except the empty string. It is usually denoted A+.

In mathematics, the Champernowne constantC10 is a transcendental real constant whose decimal expansion has important properties. It is named after economist and mathematician D. G. Champernowne, who published it as an undergraduate in 1933.

In mathematics, the Prouhet–Thue–Morse constant, named for Eugène Prouhet, Axel Thue, and Marston Morse, is the number—denoted by τ—whose binary expansion 0.01101001100101101001011001101001... is given by the Thue–Morse sequence. That is,

In mathematics, a Sturmian word, named after Jacques Charles François Sturm, is a certain kind of infinitely long sequence of characters. Such a sequence can be generated by considering a game of English billiards on a square table. The struck ball will successively hit the vertical and horizontal edges labelled 0 and 1 generating a sequence of letters. This sequence is a Sturmian word.

In number theory, an odious number is a positive integer that has an odd number of 1s in its binary expansion.

In combinatorics, a squarefree word is a word that does not contain any squares. A square is a word of the form XX, where X is not empty. Thus, a squarefree word can also be defined as a word that avoids the pattern XX.

In mathematics and theoretical computer science, an automatic sequence is an infinite sequence of terms characterized by a finite automaton. The n-th term of an automatic sequence a(n) is a mapping of the final state reached in a finite automaton accepting the digits of the number n in some fixed base k.

Combinatorics on words is a fairly new field of mathematics, branching from combinatorics, which focuses on the study of words and formal languages. The subject looks at letters or symbols, and the sequences they form. Combinatorics on words affects various areas of mathematical study, including algebra and computer science. There have been a wide range of contributions to the field. Some of the first work was on square-free words by Axel Thue in the early 1900s. He and colleagues observed patterns within words and tried to explain them. As time went on, combinatorics on words became useful in the study of algorithms and coding. It led to developments in abstract algebra and answering open questions.

In the mathematical theory of non-standard positional numeral systems, the Komornik–Loreti constant is a mathematical constant that represents the smallest base q for which the number 1 has a unique representation, called its q-development. The constant is named after Vilmos Komornik and Paola Loreti, who defined it in 1998.

In computer science, the complexity function of a word or string is the function that counts the number of distinct factors of that string. More generally, the complexity function of a formal language counts the number of distinct words of given length.

In mathematics, a factorisation of a free monoid is a sequence of subsets of words with the property that every word in the free monoid can be written as a concatenation of elements drawn from the subsets. The Chen–Fox–Lyndon theorem states that the Lyndon words furnish a factorisation. The Schützenberger theorem relates the definition in terms of a multiplicative property to an additive property.

In mathematics and computer science, a morphic word or substitutive word is an infinite sequence of symbols which is constructed from a particular class of endomorphism of a free monoid.

In mathematics, a sesquipower or Zimin word is a string over an alphabet with identical prefix and suffix. Sesquipowers are unavoidable patterns, in the sense that all sufficiently long strings contain one.

In mathematics and theoretical computer science, a pattern is an unavoidable pattern if it is unavoidable on any finite alphabet.

In mathematics, a recurrent word or sequence is an infinite word over a finite alphabet in which every factor occurs infinitely many times. An infinite word is recurrent if and only if it is a sesquipower.

In mathematics and computer science, the critical exponent of a finite or infinite sequence of symbols over a finite alphabet describes the largest number of times a contiguous subsequence can be repeated. For example, the critical exponent of "Mississippi" is 7/3, as it contains the string "ississi", which is of length 7 and period 3.

In mathematics, Ostrowski numeration, named after Alexander Ostrowski, is either of two related numeration systems based on continued fractions: a non-standard positional numeral system for integers and a non-integer representation of real numbers.

In mathematics and theoretical computer science, a k-regular sequence is a sequence satisfying linear recurrence equations that reflect the base-k representations of the integers. The class of k-regular sequences generalizes the class of k-automatic sequences to alphabets of infinite size.

Gesine Reinert is a University Professor in Statistics at the University of Oxford. She is a Fellow of Keble College, Oxford, a Fellow of the Alan Turing Institute, and a Fellow of the Institute of Mathematical Statistics. Her research concerns the probability theory and statistics of biological sequences and biological networks.