Does the Collatz sequence eventually reach 1 for all positive integer initial values?
The Collatz conjecture is one of the most famous unsolved problems in mathematics. The conjecture asks whether repeating two simple arithmetic operations will eventually transform every positive integer into 1. It concerns sequences of integers in which each term is obtained from the previous term as follows: if the previous term is even, the next term is one half of the previous term. If the previous term is odd, the next term is 3 times the previous term plus 1. The conjecture is that these sequences always reach 1, no matter which positive integer is chosen to start the sequence.
It is named after mathematician Lothar Collatz, who introduced the idea in 1937, two years after receiving his doctorate. 3n + 1 problem, the 3n + 1 conjecture, the Ulam conjecture (after Stanisław Ulam), Kakutani's problem (after Shizuo Kakutani), the Thwaites conjecture (after Sir Bryan Thwaites), Hasse's algorithm (after Helmut Hasse), or the Syracuse problem. The sequence of numbers involved is sometimes referred to as the hailstone sequence, hailstone numbers or hailstone numerals (because the values are usually subject to multiple descents and ascents like hailstones in a cloud), or as wondrous numbers.It is also known as the
Paul Erdős said about the Collatz conjecture: "Mathematics may not be ready for such problems."He also offered US$500 for its solution. Jeffrey Lagarias stated in 2010 that the Collatz conjecture "is an extraordinarily difficult problem, completely out of reach of present day mathematics".
Consider the following operation on an arbitrary positive integer:
In modular arithmetic notation, define the function f as follows:
Now form a sequence by performing this operation repeatedly, beginning with any positive integer, and taking the result at each step as the input at the next.
(that is: ai is the value of f applied to n recursively i times; ai = fi(n)).
The Collatz conjecture is: This process will eventually reach the number 1, regardless of which positive integer is chosen initially.
If the conjecture is false, it can only be because there is some starting number which gives rise to a sequence that does not contain 1. Such a sequence would either enter a repeating cycle that excludes 1, or increase without bound. No such sequence has been found.
The smallest i such that ai < a0 is called the stopping time of n. Similarly, the smallest k such that ak = 1 is called the total stopping time of n. If one of the indexes i or k doesn't exist, we say that the stopping time or the total stopping time, respectively, is infinite.
The Collatz conjecture asserts that the total stopping time of every n is finite. It is also equivalent to saying that every n ≥ 2 has a finite stopping time.
Since 3n + 1 is even whenever n is odd, one may instead use the "shortcut" form of the Collatz function:
This definition yields smaller values for the stopping time and total stopping time without changing the overall dynamics of the process.
For instance, starting with n = 12 and applying the function f without "shortcut", one gets the sequence 12, 6, 3, 10, 5, 16, 8, 4, 2, 1.
The number n = 19 takes longer to reach 1: 19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1.
The sequence for n = 27, listed and graphed below, takes 111 steps (41 steps through odd numbers, in bold), climbing as high as 9232 before descending to 1.
Numbers with a total stopping time longer than that of any smaller starting value form a sequence beginning with:
The starting values whose maximum trajectory point is greater than that of any smaller starting value are as follows:
Number of steps for n to reach 1 are
The starting value having the largest total stopping time while being
These numbers are the lowest ones with the indicated step count, but not necessarily the only ones below the given limit. As an example, 9780657631 has 1132 steps, as does 9780657630.
The starting values having the smallest total stopping time with respect to their number of digits (in base 2) are the powers of two since 2n is halved n times to reach 1, and is never increased.
Although the conjecture has not been proven, most mathematicians who have looked into the problem think the conjecture is true because experimental evidence and heuristic arguments support it.
As of 2020 [update] , the conjecture has been checked by computer for all starting values up to 268 ≈ 2.95×1020. All initial values tested so far eventually end in the repeating cycle (4; 2; 1) of period 3.
This computer evidence is not sufficient to prove that the conjecture is true for all starting values. As in the case of some disproved conjectures, like the Pólya conjecture, counterexamples might be found when considering very large numbers.
However, such verifications may have other implications. For example, one can derive additional constraints on the period and structural form of a non-trivial cycle.
If one considers only the odd numbers in the sequence generated by the Collatz process, then each odd number is on average 3/4 of the previous one. (More precisely, the geometric mean of the ratios of outcomes is 3/4.) This yields a heuristic argument that every Hailstone sequence should decrease in the long run, although this is not evidence against other cycles, only against divergence. The argument is not a proof because it assumes that Hailstone sequences are assembled from uncorrelated probabilistic events. (It does rigorously establish that the 2-adic extension of the Collatz process has two division steps for every multiplication step for almost all 2-adic starting values.)
As proven by Riho Terras, almost every positive integer has a finite stopping time.In other words, almost every Collatz sequence reaches a point that is strictly below its initial value. The proof is based on the distribution of parity vectors and uses the central limit theorem.
In 2019, Terence Tao improved this result by showing, using logarithmic density, that almost all Collatz orbits are descending below any given function of the starting point, provided that this function diverges to infinity, no matter how slowly. Responding to this work, Quanta Magazine wrote that Tao "came away with one of the most significant results on the Collatz conjecture in decades".
In a computer-aided proof, Krasikov and Lagarias showed that the number of integers in the interval [1,x] that eventually reach 1 is at least equal to x0.84 for all sufficiently large x.
In this part, consider the shortcut form of the Collatz function
A cycle is a sequence (a0, a1, ..., aq) of distinct positive integers where f(a0) = a1, f(a1) = a2, ..., and f(aq) = a0.
The only known cycle is (1,2) of period 2, called the trivial cycle.
The length of a non-trivial cycle is known to be at least 17087915. In fact, Eliahou (1993) proved that the period p of any non-trivial cycle is of the form
where a, b and c are non-negative integers, b ≥ 1 and ac = 0. This result is based on the continued fraction expansion of ln 3/ln 2.
A k-cycle is a cycle that can be partitioned into 2k contiguous subsequences: k increasing sequences of odd numbers alternating with k decreasing sequences of even numbers. For instance, if the cycle consists of a single increasing sequence of odd numbers followed by a decreasing sequence of even numbers, it is called a 1-cycle.
Steiner (1977) proved that there is no 1-cycle other than the trivial (1; 2). Simons (2005) used Steiner's method to prove that there is no 2-cycle. Simons & de Weger (2005) extended this proof up to 68-cycles: there is no k-cycle up to k = 68. For each k beyond 68, this method gives an upper bound for the smallest term of a k-cycle: for example, if there is a 77-cycle, then at least one element of the cycle is less than 38137×250. Along with the verification of the conjecture up to 268, this implies the nonexistence of a non-trivial k-cycle up to k = 77. As exhaustive computer searches continue, larger k values may be ruled out. To state the argument more intuitively: we need not look for cycles that have at most 77 circuits, where each circuit consists of consecutive ups followed by consecutive downs.
There is another approach to prove the conjecture, which considers the bottom-up method of growing the so-called Collatz graph. The Collatz graph is a graph defined by the inverse relation
So, instead of proving that all positive integers eventually lead to 1, we can try to prove that 1 leads backwards to all positive integers. For any integer n, n ≡ 1 (mod 2) if and only if 3n + 1 ≡ 4 (mod 6). Equivalently, n − 1/3 ≡ 1 (mod 2) if and only if n ≡ 4 (mod 6). Conjecturally, this inverse relation forms a tree except for the 1–2–4 loop (the inverse of the 4–2–1 loop of the unaltered function f defined in the Statement of the problem section of this article).
When the relation 3n + 1 of the function f is replaced by the common substitute "shortcut" relation 3n + 1/2, the Collatz graph is defined by the inverse relation,
For any integer n, n ≡ 1 (mod 2) if and only if 3n + 1/2 ≡ 2 (mod 3). Equivalently, 2n − 1/3 ≡ 1 (mod 2) if and only if n ≡ 2 (mod 3). Conjecturally, this inverse relation forms a tree except for a 1–2 loop (the inverse of the 1–2 loop of the function f(n) revised as indicated above).
Alternatively, replace the 3n + 1 with n′/H(n′) where n′ = 3n + 1 and H(n′) is the highest power of 2 that divides n′ (with no remainder). The resulting function f maps from odd numbers to odd numbers. Now suppose that for some odd number n, applying this operation k times yields the number 1 (that is, fk(n) = 1). Then in binary, the number n can be written as the concatenation of strings wkwk−1 ... w1 where each wh is a finite and contiguous extract from the representation of 1/3h. The representation of n therefore holds the repetends of 1/3h, where each repetend is optionally rotated and then replicated up to a finite number of bits. It is only in binary that this occurs. Conjecturally, every binary string s that ends with a '1' can be reached by a representation of this form (where we may add or delete leading '0's to s).
Repeated applications of the Collatz function can be represented as an abstract machine that handles strings of bits. The machine will perform the following three steps on any odd number until only one 1 remains:
The starting number 7 is written in base two as 111. The resulting Collatz sequence is:
111 1111 1011
010111 10001 0100011 1101 0011011 101 0001011 1 0000
For this section, consider the Collatz function in the slightly modified form
This can be done because when n is odd, 3n + 1 is always even.
If P(...) is the parity of a number, that is P(2n) = 0 and P(2n + 1) = 1, then we can define the Collatz parity sequence (or parity vector) for a number n as pi = P(ai), where a0 = n, and ai+1 = f(ai).
Which operation is performed, 3n + 1/2 or n/2, depends on the parity. The parity sequence is the same as the sequence of operations.
Using this form for f(n), it can be shown that the parity sequences for two numbers m and n will agree in the first k terms if and only if m and n are equivalent modulo 2k. This implies that every number is uniquely identified by its parity sequence, and moreover that if there are multiple Hailstone cycles, then their corresponding parity cycles must be different.
Applying the f function k times to the number n = 2ka + b will give the result 3ca + d, where d is the result of applying the f function k times to b, and c is how many increases were encountered during that sequence. For example, for 25a + 1 there are 3 increases as 1 iterates to 2, 1, 2, 1, and finally to 2 so the result is 33a + 2; for 22a + 1 there is only 1 increase as 1 rises to 2 and falls to 1 so the result is 3a + 1. When b is 2k − 1 then there will be k rises and the result will be 2 × 3ka − 1. The factor of 3 multiplying a is independent of the value of a; it depends only on the behavior of b. This allows one to predict that certain forms of numbers will always lead to a smaller number after a certain number of iterations: for example, 4a + 1 becomes 3a + 1 after two applications of f and 16a + 3 becomes 9a + 2 after 4 applications of f. Whether those smaller numbers continue to 1, however, depends on the value of a.
For the Collatz function in the form
Hailstone sequences can be computed by the extremely simple 2-tag system with production rules
In this system, the positive integer n is represented by a string of n copies of a, and iteration of the tag operation halts on any word of length less than 2. (Adapted from De Mol.)
The Collatz conjecture equivalently states that this tag system, with an arbitrary finite string of a as the initial word, eventually halts (see Tag system#Example: Computation of Collatz sequences for a worked example).
An extension to the Collatz conjecture is to include all integers, not just positive integers. Leaving aside the cycle 0 → 0 which cannot be entered from outside, there are a total of four known cycles, which all nonzero integers seem to eventually fall into under iteration of f. These cycles are listed here, starting with the well-known cycle for positive n:
Odd values are listed in large bold. Each cycle is listed with its member of least absolute value (which is always odd) first.
|Cycle||Odd-value cycle length||Full cycle length|
|1 → 4 → 2 → 1...||1||3|
|−1 → −2 → −1...||1||2|
|−5 → −14 → −7 → −20 → −10 → −5...||2||5|
|−17 → −50 → −25 → −74 → −37 → −110 → −55 → −164 → −82 → −41 → −122 → −61 → −182 → −91 → −272 → −136 → −68 → −34 → −17...||7||18|
The generalized Collatz conjecture is the assertion that every integer, under iteration by f, eventually falls into one of the four cycles above or the cycle 0 → 0. (The 0 → 0 cycle is only included for the sake of completeness.)
The Collatz map can be extended to (positive or negative) rational numbers which have odd denominators when written in lowest terms. The number is taken to be 'odd' or 'even' according to whether its numerator is odd or even. Then the formula for the map is exactly the same as when the domain is the integers: an 'even' such rational is divided by 2; an 'odd' such rational is multiplied by 3 and then 1 is added. A closely related fact is that the Collatz map extends to the ring of 2-adic integers, which contains the ring of rationals with odd denominators as a subring.
When using the "shortcut" definition of the Collatz map, it is known that any periodic parity sequence is generated by exactly one rational.Conversely, it is conjectured that every rational with an odd denominator has an eventually cyclic parity sequence (Periodicity Conjecture ).
If a parity cycle has length n and includes odd numbers exactly m times at indices k0 < ⋯ < km−1, then the unique rational which generates immediately and periodically this parity cycle is
For example, the parity cycle (1 0 1 1 0 0 1) has length 7 and four odd terms at indices 0, 2, 3, and 6. It is repeatedly generated by the fraction
as the latter leads to the rational cycle
Any cyclic permutation of (1 0 1 1 0 0 1) is associated to one of the above fractions. For instance, the cycle (0 1 1 0 0 1 1) is produced by the fraction
For a one-to-one correspondence, a parity cycle should be irreducible, that is, not partitionable into identical sub-cycles. As an illustration of this, the parity cycle (1 1 0 0 1 1 0 0) and its sub-cycle (1 1 0 0) are associated to the same fraction 5/7 when reduced to lowest terms.
In this context, assuming the validity of the Collatz conjecture implies that (1 0) and (0 1) are the only parity cycles generated by positive whole numbers (1 and 2, respectively).
If the odd denominator d of a rational is not a multiple of 3, then all the iterates have the same denominator and the sequence of numerators can be obtained by applying the "3n + d " generalization of the Collatz function
is well-defined on the ring of 2-adic integers, where it is continuous and measure-preserving with respect to the 2-adic measure. Moreover, its dynamics is known to be ergodic.
Define the parity vector function Q acting on as
The function Q is a 2-adic isometry. Consequently, every infinite parity sequence occurs for exactly one 2-adic integer, so that almost all trajectories are acyclic in .
An equivalent formulation of the Collatz conjecture is that
The Collatz map (with shortcut) can be viewed as the restriction to the integers of the smooth map
The iterations of this map on the real line lead to a dynamical system, further investigated by Chamberland. f has two attracting cycles of period 2, (1; 2) and (1.1925...; 2.1386...). Moreover, the set of unbounded orbits is conjectured to be of measure 0.He showed that the conjecture does not hold for positive real numbers since there are infinitely many fixed points, as well as orbits escaping monotonically to infinity. The function
Letherman, Schleicher, and Wood extended the study to the complex plane, where most of the points have orbits that diverge to infinity (colored region on the illustration). f, is a fractal pattern, sometimes called the "Collatz fractal".The boundary between the colored region and the black components, namely the Julia set of
The section As a parity sequence above gives a way to speed up simulation of the sequence. To jump ahead k steps on each iteration (using the f function from that section), break up the current number into two parts, b (the k least significant bits, interpreted as an integer), and a (the rest of the bits as an integer). The result of jumping ahead k steps can be found as:
The c (or better 3c) and d arrays are precalculated for all possible k-bit numbers b, where d(b) is the result of applying the f function k times to b, and c(b) is the number of odd numbers encountered on the way. For example, if k = 5, one can jump ahead 5 steps on each iteration by separating out the 5 least significant bits of a number and using:
This requires 2k precomputation and storage to speed up the resulting calculation by a factor of k, a space–time tradeoff.
For the special purpose of searching for a counterexample to the Collatz conjecture, this precomputation leads to an even more important acceleration, used by Tomás Oliveira e Silva in his computational confirmations of the Collatz conjecture up to large values of n. If, for some given b and k, the inequality
holds for all a, then the first counterexample, if it exists, cannot be b modulo 2k. For instance, the first counterexample must be odd because f(2n) = n, smaller than 2n; and it must be 3 mod 4 because f2(4n + 1) = 3n + 1, smaller than 4n + 1. For each starting value a which is not a counterexample to the Collatz conjecture, there is a k for which such an inequality holds, so checking the Collatz conjecture for one starting value is as good as checking an entire congruence class. As k increases, the search only needs to check those residues b that are not eliminated by lower values of k. Only an exponentially small fraction of the residues survive. For example, the only surviving residues mod 32 are 7, 15, 27, and 31.
If k is an odd integer, then 3k + 1 is even, so 3k + 1 = 2ak′ with k′ odd and a ≥ 1. The Syracuse function is the function f from the set I of odd integers into itself, for which f(k) = k′(sequence A075677 in the OEIS ).
Some properties of the Syracuse function are:
The Collatz conjecture is equivalent to the statement that, for all k in I, there exists an integer n ≥ 1 such that fn(k) = 1.
In 1972, John Horton Conway proved that a natural generalization of the Collatz problem is algorithmically undecidable.
Specifically, he considered functions of the form
and a0, b0, ..., aP − 1, bP − 1 are rational numbers which are so chosen that g(n) is always an integer.
The standard Collatz function is given by P = 2, a0 = 1/2, b0 = 0, a1 = 3, b1 = 1. Conway proved that the problem:
is undecidable, by representing the halting problem in this way.
Closer to the Collatz problem is the following universally quantified problem:
Modifying the condition in this way can make a problem either harder or easier to solve (intuitively, it is harder to justify a positive answer but might be easier to justify a negative one). Kurtz and Simon Π0
2-complete. This hardness result holds even if one restricts the class of functions g by fixing the modulus P to 6480.
In the movie Incendies , a graduate student in pure mathematics explains the Collatz conjecture to a group of undergraduates. She puts her studies on hold for a time to address some unresolved questions about her family's past. Late in the movie, the Collatz conjecture turns out to have foreshadowed a disturbing and difficult discovery that she makes about her family.
In mathematics, the Chinese remainder theorem states that if one knows the remainders of the Euclidean division of an integer n by several integers, then one can determine uniquely the remainder of the division of n by the product of these integers, under the condition that the divisors are pairwise coprime.
In mathematics, the Fibonacci numbers, commonly denoted Fn , form a sequence, the Fibonacci sequence, in which each number is the sum of the two preceding ones. The sequence commonly starts from 0 and 1, although some authors omit the initial terms and start the sequence from 1 and 1 or from 1 and 2. Starting from 0 and 1, the next few values in the sequence are:
In number theory, the Legendre symbol is a multiplicative function with values 1, −1, 0 that is a quadratic character modulo an odd prime number p: its value at a (nonzero) quadratic residue mod p is 1 and at a non-quadratic residue (non-residue) is −1. Its value at zero is 0.
In mathematics, the Euler numbers are a sequence En of integers defined by the Taylor series expansion
In mathematics, a Fermat number, named after Pierre de Fermat, who first studied them, is a positive integer of the form
The Jacobi symbol is a generalization of the Legendre symbol. Introduced by Jacobi in 1837, it is of theoretical interest in modular arithmetic and other branches of number theory, but its main use is in computational number theory, especially primality testing and integer factorization; these in turn are important in cryptography.
In number theory, Euler's criterion is a formula for determining whether an integer is a quadratic residue modulo a prime. Precisely,
In modular arithmetic, a number g is a primitive root modulo n if every number a coprime to n is congruent to a power of g modulo n. That is, g is a primitive root modulo n if for every integer a coprime to n, there is some integer k for which gk ≡ a. Such a value k is called the index or discrete logarithm of a to the base g modulo n. So g is a primitive root modulo n if and only if g is a generator of the multiplicative group of integers modulo n.
In theoretical computer science and cryptography, a trapdoor function is a function that is easy to compute in one direction, yet difficult to compute in the opposite direction without special information, called the "trapdoor". Trapdoor functions are a special case of one-way functions and are widely used in public-key cryptography.
In number theory, a Wall–Sun–Sun prime or Fibonacci–Wieferich prime is a certain kind of prime number which is conjectured to exist, although none are known.
In mathematics, the Lucas–Lehmer test (LLT) is a primality test for Mersenne numbers. The test was originally developed by Édouard Lucas in 1876 and subsequently improved by Derrick Henry Lehmer in the 1930s.
In mathematics a polydivisible number is a number in a given number base with digits abcde... that has the following properties:
In arithmetic and algebra, the cube of a number n is its third power, that is, the result of multiplying three instances of n together. The cube of a number or any other mathematical expression is denoted by a superscript 3, for example 23 = 8 or (x + 1)3.
In mathematics, Wolstenholme's theorem states that for a prime number , the congruence
In number theory, the nth Pisano period, written as π(n), is the period with which the sequence of Fibonacci numbers taken modulo n repeats. Pisano periods are named after Leonardo Pisano, better known as Fibonacci. The existence of periodic functions in Fibonacci numbers was noted by Joseph Louis Lagrange in 1774.
In additive number theory, Fermat's theorem on sums of two squares states that an odd prime p can be expressed as:
Gauss's lemma in number theory gives a condition for an integer to be a quadratic residue. Although it is not useful computationally, it has theoretical significance, being involved in some proofs of quadratic reciprocity.
Quartic or biquadratic reciprocity is a collection of theorems in elementary and algebraic number theory that state conditions under which the congruence x4 ≡ p is solvable; the word "reciprocity" comes from the form of some of these theorems, in that they relate the solvability of the congruence x4 ≡ p to that of x4 ≡ q.
In algebra, the 3x + 1 semigroup is a special subsemigroup of the multiplicative semigroup of all positive rational numbers. The elements of a generating set of this semigroup are related to the sequence of numbers involved in the still open Collatz conjecture or the "3x + 1 problem". The 3x + 1 semigroup has been used to prove a weaker form of the Collatz conjecture. In fact, it was in such context the concept of the 3x + 1 semigroup was introduced by H. Farkas in 2005. Various generalizations of the 3x + 1 semigroup have been constructed and their properties have been investigated.
In mathematics, specifically in number theory, Newman's conjecture is a conjecture about the behavior of the partition function modulo any integer. Specifically, it states that for any integers m and r such that , the value of the partition function satisfies the congruence for infinitely many non-negative integers n. It was formulated by mathematician Morris Newman in 1960. It is unsolved as of 2020.
The problem is also known by several other names, including: Ulam's conjecture, the Hailstone problem, the Syracuse problem, Kakutani's problem, Hasse's algorithm, and the Collatz problem.