In mathematics, the Cantor set is a set of points lying on a single line segment that has a number of unintuitive properties. It was discovered in 1874 by Henry John Stephen Smithand introduced by German mathematician Georg Cantor in 1883.
Through consideration of this set, Cantor and others helped lay the foundations of modern point-set topology. Although Cantor himself defined the set in a general, abstract way, the most common modern construction is the Cantor ternary set, built by removing the middle third of a line segment and then repeating the process with the remaining shorter segments. Cantor himself mentioned the ternary construction only in passing, as an example of a more general idea, that of a perfect set that is nowhere dense.
The Cantor ternary set is created by iteratively deleting the open middle third from a set of line segments. One starts by deleting the open middle third from the interval , leaving two line segments: . Next, the open middle third of each of these remaining segments is deleted, leaving four line segments: . This process is continued ad infinitum, where the nth set is
The Cantor ternary set contains all points in the interval that are not deleted at any step in this infinite process:
The first six steps of this process are illustrated below.
Using the idea of self-similar transformations, and the explicit closed formulas for the Cantor set are
where every middle third is removed as the open interval from the closed interval surrounding it, or
where the middle third of the foregoing closed interval is removed by intersecting with
This process of removing middle thirds is a simple example of a finite subdivision rule. The Cantor ternary set is an example of a fractal string.
In arithmetical terms, the Cantor set consists of all real numbers of the unit interval that do not require the digit 1 in order to be expressed as a ternary (base 3) fraction. As the above diagram illustrates, each point in the Cantor set is uniquely located by a path through an infinitely deep binary tree, where the path turns left or right at each level according to which side of a deleted segment the point lies on. Representing each left turn with 0 and each right turn with 2 yields the ternary fraction for a point.
Since the Cantor set is defined as the set of points not excluded, the proportion (i.e., measure) of the unit interval remaining can be found by total length removed. This total is the geometric progression
So that the proportion left is 1 − 1 = 0.
This calculation suggests that the Cantor set cannot contain any interval of non-zero length. It may seem surprising that there should be anything left—after all, the sum of the lengths of the removed intervals is equal to the length of the original interval. However, a closer look at the process reveals that there must be something left, since removing the "middle third" of each interval involved removing open sets (sets that do not include their endpoints). So removing the line segment (1/3, 2/3) from the original interval [0, 1] leaves behind the points 1/3 and 2/3. Subsequent steps do not remove these (or other) endpoints, since the intervals removed are always internal to the intervals remaining. So the Cantor set is not empty, and in fact contains an uncountably infinite number of points (as follows from the above description in terms of paths in an infinite binary tree).
It may appear that only the endpoints of the construction segments are left, but that is not the case either. The number 1/4, for example, has the unique ternary form 0.020202... = 0.02. It is in the bottom third, and the top third of that third, and the bottom third of that top third, and so on. Since it is never in one of the middle segments, it is never removed. Yet it is also not an endpoint of any middle segment, because it is not a multiple of any power of 1/3. All endpoints of segments are terminating ternary fractions and are contained in the set
which is a countably infinite set. As to cardinality, almost all elements of the Cantor set are not endpoints of intervals, nor rational points like 1/4. The whole Cantor set is in fact not countable.
It can be shown that there are as many points left behind in this process as there were to begin with, and that therefore, the Cantor set is uncountable. To see this, we show that there is a function f from the Cantor set to the closed interval [0,1] that is surjective (i.e. f maps from onto [0,1]) so that the cardinality of is no less than that of [0,1]. Since is a subset of [0,1], its cardinality is also no greater, so the two cardinalities must in fact be equal, by the Cantor–Bernstein–Schröder theorem.
To construct this function, consider the points in the [0, 1] interval in terms of base 3 (or ternary) notation. Recall that the proper ternary fractions, more precisely: the elements of , admit more than one representation in this notation, as for example 1/3, that can be written as 0.13 = 0.103, but also as 0.0222...3 = 0.023, and 2/3, that can be written as 0.23 = 0.203 but also as 0.1222...3 = 0.123. When we remove the middle third, this contains the numbers with ternary numerals of the form 0.1xxxxx...3 where xxxxx...3 is strictly between 00000...3 and 22222...3. So the numbers remaining after the first step consist of
This can be summarized by saying that those numbers with a ternary representation such that the first digit after the radix point is not 1 are the ones remaining after the first step.
The second step removes numbers of the form 0.01xxxx...3 and 0.21xxxx...3, and (with appropriate care for the endpoints) it can be concluded that the remaining numbers are those with a ternary numeral where neither of the first two digits is 1.
Continuing in this way, for a number not to be excluded at step n, it must have a ternary representation whose nth digit is not 1. For a number to be in the Cantor set, it must not be excluded at any step, it must admit a numeral representation consisting entirely of 0s and 2s.
It is worth emphasizing that numbers like 1, 1/3 = 0.13 and 7/9 = 0.213 are in the Cantor set, as they have ternary numerals consisting entirely of 0s and 2s: 1 = 0.222...3 = 0.23, 1/3 = 0.0222...3 = 0.023 and 7/9 = 0.20222...3 = 0.2023. All the latter numbers are “endpoints”, and these examples are right limit points of . The same is true for the left limit points of , e.g. 2/3 = 0.1222...3 = 0.123 = 0.203 and 8/9 = 0.21222...3 = 0.2123 = 0.2203. All these endpoints are proper ternary fractions (elements of ) of the form p/q, where denominator q is a power of 3 when the fraction is in its irreducible form. The ternary representation of these fractions terminates (i.e., is finite) or — recall from above that proper ternary fractions each have 2 representations — is infinite and “ends” in either infinitely many recurring 0s or infinitely many recurring 2s. Such a fraction is a left limit point of if its ternary representation contains no 1's and “ends” in infinitely many recurring 0s. Similarly, a proper ternary fraction is a right limit point of if it again its ternary expansion contains no 1's and “ends” in infinitely many recurring 2s.
This set of endpoints is dense in (but not dense in [0, 1]) and makes up a countably infinite set. The numbers in which are not endpoints also have only 0s and 2s in their ternary representation, but they cannot end in an infinite repetition of the digit 0, nor of the digit 2, because then it would be an endpoint.
The function from to [0,1] is defined by taking the ternary numerals that do consist entirely of 0s and 2s, replacing all the 2s by 1s, and interpreting the sequence as a binary representation of a real number. In a formula,
For any number y in [0,1], its binary representation can be translated into a ternary representation of a number x in by replacing all the 1s by 2s. With this, f(x) = y so that y is in the range of f. For instance if y = 3/5 = 0.100110011001...2 = 0.1001, we write x = 0.2002 = 0.200220022002...3 = 7/10. Consequently, f is surjective. However, f is not injective — the values for which f(x) coincides are those at opposing ends of one of the middle thirds removed. For instance, take
Thus there are as many points in the Cantor set as there are in the interval [0, 1] (which has the uncountable cardinality ). However, the set of endpoints of the removed intervals is countable, so there must be uncountably many numbers in the Cantor set which are not interval endpoints. As noted above, one example of such a number is 1/4, which can be written as 0.020202...3 = 0.02 in ternary notation. In fact, given any , there exist such that . This was first demonstrated by Steinhaus in 1917, who proved, via a geometric argument, the equivalent assertion that for every . Since this construction provides an injection from to , we have as an immediate corollary. Assuming that for any infinite set (a statement shown to be equivalent to the axiom of choice by Tarski), this provides another demonstration that .
The Cantor set contains as many points as the interval from which it is taken, yet itself contains no interval of nonzero length. The irrational numbers have the same property, but the Cantor set has the additional property of being closed, so it is not even dense in any interval, unlike the irrational numbers which are dense in every interval.
It has been conjectured that all algebraic irrational numbers are normal. Since members of the Cantor set are not normal, this would imply that all members of the Cantor set are either rational or transcendental.
The Cantor set is the prototype of a fractal. It is self-similar, because it is equal to two copies of itself, if each copy is shrunk by a factor of 3 and translated. More precisely, the Cantor set is equal to the union of two functions, the left and right self-similarity transformations of itself, and , which leave the Cantor set invariant up to homeomorphism:
Repeated iteration of and can be visualized as an infinite binary tree. That is, at each node of the tree, one may consider the subtree to the left or to the right. Taking the set together with function composition forms a monoid, the dyadic monoid.
The automorphisms of the binary tree are its hyperbolic rotations, and are given by the modular group. Thus, the Cantor set is a homogeneous space in the sense that for any two points and in the Cantor set , there exists a homeomorphism with . An explicit construction of can be described more easily if we see the Cantor set as a product space of countably many copies of the discrete space . Then the map defined by is an involutive homeomorphism exchanging and .
It has been found that some form of conservation law is always responsible behind scaling and self-similarity. In the case of Cantor set it can be seen that the th moment (where is the fractal dimension) of all the surviving intervals at any stage of the construction process is equal to constant which is equal to one in the case of Cantor set. We know that there are intervals of size present in the system at the th step of its construction. Then if we label the surviving intervals as then the th moment is since .
The Hausdorff dimension of the Cantor set is equal to ln(2)/ln(3) ≈ 0.631.
Although "the" Cantor set typically refers to the original, middle-thirds Cantor described above, topologists often talk about "a" Cantor set, which means any topological space that is homeomorphic (topologically equivalent) to it.
As the above summation argument shows, the Cantor set is uncountable but has Lebesgue measure 0. Since the Cantor set is the complement of a union of open sets, it itself is a closed subset of the reals, and therefore a complete metric space. Since it is also totally bounded, the Heine–Borel theorem says that it must be compact.
For any point in the Cantor set and any arbitrarily small neighborhood of the point, there is some other number with a ternary numeral of only 0s and 2s, as well as numbers whose ternary numerals contain 1s. Hence, every point in the Cantor set is an accumulation point (also called a cluster point or limit point) of the Cantor set, but none is an interior point. A closed set in which every point is an accumulation point is also called a perfect set in topology, while a closed subset of the interval with no interior points is nowhere dense in the interval.
Every point of the Cantor set is also an accumulation point of the complement of the Cantor set.
For any two points in the Cantor set, there will be some ternary digit where they differ — one will have 0 and the other 2. By splitting the Cantor set into "halves" depending on the value of this digit, one obtains a partition of the Cantor set into two closed sets that separate the original two points. In the relative topology on the Cantor set, the points have been separated by a clopen set. Consequently, the Cantor set is totally disconnected. As a compact totally disconnected Hausdorff space, the Cantor set is an example of a Stone space.
As a topological space, the Cantor set is naturally homeomorphic to the product of countably many copies of the space , where each copy carries the discrete topology. This is the space of all sequences in two digits
which can also be identified with the set of 2-adic integers. The basis for the open sets of the product topology are cylinder sets; the homeomorphism maps these to the subspace topology that the Cantor set inherits from the natural topology on the real number line. This characterization of the Cantor space as a product of compact spaces gives a second proof that Cantor space is compact, via Tychonoff's theorem.
From the above characterization, the Cantor set is homeomorphic to the p-adic integers, and, if one point is removed from it, to the p-adic numbers.
The Cantor set is a subset of the reals, which are a metric space with respect to the ordinary distance metric; therefore the Cantor set itself is a metric space, by using that same metric. Alternatively, one can use the p-adic metric on : given two sequences , the distance between them is , where is the smallest index such that ; if there is no such index, then the two sequences are the same, and one defines the distance to be zero. These two metrics generate the same topology on the Cantor set.
We have seen above that the Cantor set is a totally disconnected perfect compact metric space. Indeed, in a sense it is the only one: every nonempty totally disconnected perfect compact metric space is homeomorphic to the Cantor set. See Cantor space for more on spaces homeomorphic to the Cantor set.
The Cantor set is sometimes regarded as "universal" in the category of compact metric spaces, since any compact metric space is a continuous image of the Cantor set; however this construction is not unique and so the Cantor set is not universal in the precise categorical sense. The "universal" property has important applications in functional analysis, where it is sometimes known as the representation theorem for compact metric spaces.
For any integer q ≥ 2, the topology on the group G=Zqω (the countable direct sum) is discrete. Although the Pontrjagin dual Γ is also Zqω, the topology of Γ is compact. One can see that Γ is totally disconnected and perfect - thus it is homeomorphic to the Cantor set. It is easiest to write out the homeomorphism explicitly in the case q=2. (See Rudin 1962 p 40.)
The geometric mean of the Cantor set is approximately 0.274974. [ unreliable source? ]
The Cantor set can be seen as the compact group of binary sequences, and as such, it is endowed with a natural Haar measure. When normalized so that the measure of the set is 1, it is a model of an infinite sequence of coin tosses. Furthermore, one can show that the usual Lebesgue measure on the interval is an image of the Haar measure on the Cantor set, while the natural injection into the ternary set is a canonical example of a singular measure. It can also be shown that the Haar measure is an image of any probability, making the Cantor set a universal probability space in some ways.
In Lebesgue measure theory, the Cantor set is an example of a set which is uncountable and has zero measure.
If we define a Cantor number as a member of the Cantor set, then
The Cantor set is a meager set (or a set of first category) as a subset of [0,1] (although not as a subset of itself, since it is a Baire space). The Cantor set thus demonstrates that notions of "size" in terms of cardinality, measure, and (Baire) category need not coincide. Like the set , the Cantor set is "small" in the sense that it is a null set (a set of measure zero) and it is a meager subset of [0,1]. However, unlike , which is countable and has a "small" cardinality, , the cardinality of is the same as that of [0,1], the continuum , and is "large" in the sense of cardinality. In fact, it is also possible to construct a subset of [0,1] that is meager but of positive measure and a subset that is non-meager but of measure zero: By taking the countable union of "fat" Cantor sets of measure (see Smith–Volterra–Cantor set below for the construction), we obtain a set which has a positive measure (equal to 1) but is meager in [0,1], since each is nowhere dense. Then consider the set . Since , cannot be meager, but since , must have measure zero.
Instead of repeatedly removing the middle third of every piece as in the Cantor set, we could also keep removing any other fixed percentage (other than 0% and 100%) from the middle. In the case where the middle 8/10 of the interval is removed, we get a remarkably accessible case — the set consists of all numbers in [0,1] that can be written as a decimal consisting entirely of 0s and 9s. If a fixed percentage is removed at each stage, then the limiting set will have measure zero, since the length of the remainder as for any f such that .
On the other hand, "fat Cantor sets" of positive measure can be generated by removal of smaller fractions of the middle of the segment in each iteration. Thus, one can construct sets homeomorphic to the Cantor set that have positive Lebesgue measure while still being nowhere dense. If an interval of length () is removed from the middle of each segment at the nth iteration, then the total length removed is , and the limiting set will have a Lebesgue measure of . Thus, in a sense, the middle-thirds Cantor set is a limiting case with . If , then the remainder will have positive measure with . The case is known as the Smith–Volterra–Cantor set, which has a Lebesgue measure of .
One can modify the construction of the Cantor set by dividing randomly instead of equally. Besides, to incorporate time we can divide only one of the available intervals at each step instead of dividing all the available intervals. In the case of stochastic triadic Cantor set the resulting process can be described by the following rate equation
and for the stochastic dyadic Cantor set
where is the number of intervals of size between and . In the case of triadic Cantor set the fractal dimension is which is less than its deterministic counterpart . In the case of stochastic dyadic Cantor set the fractal dimension is which is again less than that of its deterministic counterpart . In the case of stochastic dyadic Cantor set the solution for exhibits dynamic scaling as its solution in the long-time limit is where the fractal dimension of the stochastic dyadic Cantor set . In either case, like triadic Cantor set, the th moment () of stochastic triadic and dyadic Cantor set too are conserved quantities.
Cantor dust is a multi-dimensional version of the Cantor set. It can be formed by taking a finite Cartesian product of the Cantor set with itself, making it a Cantor space. Like the Cantor set, Cantor dust has zero measure.
A different 2D analogue of the Cantor set is the Sierpinski carpet, where a square is divided up into nine smaller squares, and the middle one removed. The remaining squares are then further divided into nine each and the middle removed, and so on ad infinitum.One 3D analogue of this is the Menger sponge.
Cantor himself defined the set in a general, abstract way, and mentioned the ternary construction only in passing, as an example of a more general idea, that of a perfect set that is nowhere dense. The original paper provides several different constructions of the abstract concept.
This set would have been considered abstract at the time when Cantor devised it. Cantor himself was led to it by practical concerns about the set of points where a trigonometric series might fail to converge. The discovery did much to set him on the course for developing an abstract, general theory of infinite sets.
In probability and statistics, a random variable, random quantity, aleatory variable, or stochastic variable is described informally as a variable whose values depend on outcomes of a random phenomenon. The formal mathematical treatment of random variables is a topic in probability theory. In that context, a random variable is understood as a measurable function defined on a probability space that maps from the sample space to the real numbers.
In mathematics, real analysis is the branch of mathematical analysis that studies the behavior of real numbers, sequences and series of real numbers, and real functions. Some particular properties of real-valued sequences and functions that real analysis studies include convergence, limits, continuity, smoothness, differentiability and integrability.
In mathematical analysis and in probability theory, a σ-algebra on a set X is a collection of subsets of X that includes X itself, is closed under complement, and is closed under countable unions.
In probability theory, the central limit theorem (CLT) establishes that, in many situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution even if the original variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applicable to many problems involving other types of distributions. This theorem has seen many changes during the formal development of probability theory. Previous versions of the theorem date back to 1811, but in its modern general form, this fundamental result in probability theory was precisely stated as late as 1920, thereby serving as a bridge between classical and modern probability theory.
In mathematics, a (real) interval is a set of real numbers that contains all real numbers lying between any two numbers of the set. For example, the set of numbers x satisfying 0 ≤ x ≤ 1 is an interval which contains 0, 1, and all numbers in between. Other examples of intervals are the set of numbers such that 0 < x < 1, the set of all real numbers , the set of nonnegative real numbers, the set of positive real numbers, the empty set, and any singleton.
In mathematics, an infinite series of numbers is said to converge absolutely if the sum of the absolute values of the summands is finite. More precisely, a real or complex series is said to converge absolutely if for some real number . Similarly, an improper integral of a function, , is said to converge absolutely if the integral of the absolute value of the integrand is finite—that is, if
In probability and statistics, a Bernoulli process is a finite or infinite sequence of binary random variables, so it is a discrete-time stochastic process that takes only two values, canonically 0 and 1. The component Bernoulli variablesXi are identically distributed and independent. Prosaically, a Bernoulli process is a repeated coin flipping, possibly with an unfair coin. Every variable Xi in the sequence is associated with a Bernoulli trial or experiment. They all have the same Bernoulli distribution. Much of what can be said about the Bernoulli process can also be generalized to more than two outcomes ; this generalization is known as the Bernoulli scheme.
In the mathematical field of real analysis, the monotone convergence theorem is any of a number of related theorems proving the convergence of monotonic sequences that are also bounded. Informally, the theorems state that if a sequence is increasing and bounded above by a supremum, then the sequence will converge to the supremum; in the same way, if a sequence is decreasing and is bounded below by an infimum, it will converge to the infimum.
In mathematics, the Cantor function is an example of a function that is continuous, but not absolutely continuous. It is a notorious counterexample in analysis, because it challenges naive intuitions about continuity, derivative, and measure. Though it is continuous everywhere and has zero derivative almost everywhere, its value still goes from 0 to 1 as its argument reaches from 0 to 1. Thus, in one sense the function seems very much like a constant one which cannot grow, and in another, it does indeed monotonically grow, by construction.
In mathematics, the limit of a sequence of sets A1, A2, … is a set whose elements are determined by the sequence in either of two equivalent ways: (1) by upper and lower bounds on the sequence that converge monotonically to the same set and (2) by convergence of a sequence of indicator functions which are themselves real-valued. As is the case with sequences of other objects, convergence is not necessary or even usual.
In mathematical analysis, an improper integral is the limit of a definite integral as an endpoint of the interval(s) of integration approaches either a specified real number, , , or in some instances as both endpoints approach limits. Such an integral is often written symbolically just like a standard definite integral, in some cases with infinity as a limit of integration.
In mathematical analysis, a space-filling curve is a curve whose range contains the entire 2-dimensional unit square. Because Giuseppe Peano (1858–1932) was the first to discover one, space-filling curves in the 2-dimensional plane are sometimes called Peano curves, but that phrase also refers to the Peano curve, the specific example of a space-filling curve found by Peano.
The dyadic transformation is the mapping
In set theory, the cardinality of the continuum is the cardinality or "size" of the set of real numbers , sometimes called the continuum. It is an infinite cardinal number and is denoted by or .
In mathematics, a π-system on a set is a collection of certain subsets of such that
In mathematics, ergodicity expresses the idea that a point of a moving system, either a dynamical system or a stochastic process, will eventually visit all parts of the space that the system moves in, in a uniform and random sense. This implies that the average behavior of the system can be deduced from the trajectory of a "typical" point. Equivalently, a sufficiently large collection of random samples from a process can represent the average statistical properties of the entire process. Ergodicity is a property of the system; it is a statement that the system cannot be reduced or factored into smaller components. Ergodic theory is the study of systems possessing ergodicity.
In probability theory, a standard probability space, also called Lebesgue–Rokhlin probability space or just Lebesgue space is a probability space satisfying certain assumptions introduced by Vladimir Rokhlin in 1940. Informally, it is a probability space consisting of an interval and/or a finite or countable number of atoms.
In mathematics, a line integral is an integral where the function to be integrated is evaluated along a curve. The terms path integral, curve integral, and curvilinear integral are also used; contour integral is used as well, although that is typically reserved for line integrals in the complex plane.
A Stein discrepancy is a statistical divergence between two probability measures that is rooted in Stein's method. It was first formulated as a tool to assess the quality of Markov chain Monte Carlo samplers, but has since been used in diverse settings in statistics, machine learning and computer science.