Natural density

Last updated October 17, 2023

In number theory, natural density, also referred to as asymptotic density or arithmetic density, is one method to measure how "large" a subset of the set of natural numbers is. It relies chiefly on the probability of encountering members of the desired subset when combing through the interval $[1, n]$ as $n$ grows large.

Intuitively, there seem to be more positive integers than perfect squares, since every perfect square is a positive integer and many other positive integers exist besides. However, the set of positive integers is not in fact larger than the set of perfect squares: both sets are infinite and countable and can therefore be put in one-to-one correspondence. Nevertheless, if one goes through the natural numbers, the squares become increasingly scarce. The notion of natural density makes this intuition precise for many, but not all, subsets of the naturals (see Schnirelmann density, which is similar to natural density but defined for all subsets of $\mathbb {N}$).

Let $A$ be a subset of the positive integers. If an integer is selected uniformly at random from the interval $[1, n]$, then the probability that it belongs to $A$ is the ratio of the number of elements of $A$ in $[1, n]$ to the total number of elements in $[1, n]$. If this probability tends to some limit as $n$ tends to infinity, then this limit is referred to as the asymptotic density of $A$. This notion can be understood as a kind of probability of choosing a number from the set $A$. Indeed, the asymptotic density (as well as some other types of densities) is studied in probabilistic number theory.

Definition

A subset $A$ of positive integers has natural density $α$ if the proportion of elements of $A$ among all natural numbers from 1 to $n$ converges to $α$ as $n$ tends to infinity.

More explicitly, if one defines for any natural number $n$ the counting function $a (n)$ as the number of elements of $A$ less than or equal to $n$ , then the natural density of $A$ being $α$ exactly means that^[1]

$a(n)/n \to α$ as $n \to \infty$.

It follows from the definition that if a set $A$ has natural density $α$ then $0 \leq α \leq 1$ .
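As a rough numerical illustration of the definition, the following Python sketch computes the ratio $a(n)/n$ by brute force for two sets, the even numbers and the perfect squares. The helper names (`density_ratio`, `is_even`, `is_square`) are illustrative choices and not part of any library, and a finite computation can only suggest the limiting behaviour, not prove it.

```python
from math import isqrt


def density_ratio(indicator, n):
    """Return a(n)/n, where a(n) counts the k in [1, n] with indicator(k) true."""
    a_n = sum(1 for k in range(1, n + 1) if indicator(k))
    return a_n / n


def is_even(k):
    return k % 2 == 0


def is_square(k):
    r = isqrt(k)
    return r * r == k


# The ratio for the even numbers stays at 1/2, while the ratio for the
# squares shrinks roughly like 1/sqrt(n).
for n in (10, 1_000, 100_000):
    print(n, density_ratio(is_even, n), density_ratio(is_square, n))
```

Any other membership test can be passed as `indicator` to explore further sets in the same way.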

Upper and lower asymptotic density

Let $A$ be a subset of the set of natural numbers $\mathbb {N} =\{1,2,\ldots \}.$ For any $n\in \mathbb {N}$ , define $A(n)$ to be the intersection $A(n)=\{1,2,\ldots ,n\}\cap A,$ and let $a(n)=|A(n)|$ be the number of elements of $A$ less than or equal to $n$ .

Define the upper asymptotic density ${\overline {d}}(A)$ of $A$ (also called the "upper density") by

{\overline {d}}(A)=\limsup _{n\rightarrow \infty }{\frac {a(n)}{n}}

where lim sup is the limit superior.

Similarly, define the lower asymptotic density ${\underline {d}}(A)$ of $A$ (also called the "lower density") by

{\underline {d}}(A)=\liminf _{n\rightarrow \infty }{\frac {a(n)}{n}}

where lim inf is the limit inferior. One may say $A$ has asymptotic density $d(A)$ if ${\underline {d}}(A)={\overline {d}}(A)$ , in which case $d(A)$ is equal to this common value.

This definition can be restated in the following way:

d(A)=\lim _{n\rightarrow \infty }{\frac {a(n)}{n}}

if this limit exists.^[2]

These definitions may equivalently be expressed in the following way. Given a subset $A$ of $\mathbb {N}$ , write it as an increasing sequence indexed by the natural numbers:

A=\{a_{1}<a_{2}<\ldots \}.

Then

{\underline {d}}(A)=\liminf _{n\rightarrow \infty }{\frac {n}{a_{n}}},

{\overline {d}}(A)=\limsup _{n\rightarrow \infty }{\frac {n}{a_{n}}}

and

d(A)=\lim _{n\rightarrow \infty }{\frac {n}{a_{n}}}

if the limit exists.
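The restatement in terms of the $n$-th element can also be checked numerically. The sketch below, with the hypothetical helper `nth_element_ratios`, enumerates a set in increasing order and reports $n/a_{n}$. The example set $\{1,4,7,\ldots \}$ of integers congruent to 1 modulo 3 is an assumption chosen for illustration; here $a_{n}=3n-2$, so $n/a_{n}$ should approach $1/3$.

```python
from itertools import count, islice


def nth_element_ratios(indicator, how_many):
    """Yield (n, a_n, n / a_n) for the first `how_many` elements a_1 < a_2 < ... of A."""
    members = (k for k in count(1) if indicator(k))
    for n, a_n in enumerate(islice(members, how_many), start=1):
        yield n, a_n, n / a_n


def is_one_mod_three(k):
    return k % 3 == 1


# Print every 250th triple; the last column drifts towards 1/3.
for n, a_n, ratio in nth_element_ratios(is_one_mod_three, 1000):
    if n % 250 == 0:
        print(n, a_n, ratio)
```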

A somewhat weaker notion of density is the upper Banach density $d^{*}(A)$ of a set $A\subseteq \mathbb {N} .$ This is defined as

d^{*}(A)=\limsup _{N-M\rightarrow \infty }{\frac {|A\cap \{M,M+1,\ldots ,N\}|}{N-M+1}}.
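To see how the upper Banach density can differ from the natural density, consider the set $\bigcup _{k\geq 1}\{2^{k},\ldots ,2^{k}+k-1\}$. This set is an assumption introduced here for illustration: it contains arbitrarily long runs of consecutive integers, so its upper Banach density is 1, while $a(n)$ grows only like $(\log n)^{2}$, so its natural density is 0. The Python sketch below computes finite approximations only; the function names are hypothetical.

```python
def block_set(limit):
    """Elements <= limit of the illustrative set: union over k >= 1 of {2^k, ..., 2^k + k - 1}."""
    members = set()
    k = 1
    while 2 ** k <= limit:
        members.update(m for m in range(2 ** k, 2 ** k + k) if m <= limit)
        k += 1
    return members


def max_window_fraction(members, limit, length):
    """Largest |A ∩ {M, ..., M + length - 1}| / length over windows inside [1, limit]."""
    prefix = [0] * (limit + 1)  # prefix[m] = number of members in [1, m]
    for m in range(1, limit + 1):
        prefix[m] = prefix[m - 1] + (m in members)
    best = max(
        prefix[m + length - 1] - prefix[m - 1]
        for m in range(1, limit - length + 2)
    )
    return best / length


limit = 5000
A = block_set(limit)
print("a(n)/n at n = 5000:", len(A) / limit)
for length in (5, 10, 12):
    print("window length", length, "->", max_window_fraction(A, limit, length))
```

The overall proportion is small, yet some window of each tested length lies entirely inside the set, which is the behaviour the limit superior over $N-M\to \infty$ captures.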

Properties and examples

For any finite set F of positive integers, d(F) = 0.

If d(A) exists for some set A and A^c denotes its complement set with respect to $\mathbb {N}$, then d(A^c) = 1 − d(A).
- Corollary: If $F\subset \mathbb {N}$ is finite (including the case $F=\emptyset$ ), $d(\mathbb {N} \setminus F)=1.$

If $d(A),d(B),$ and $d(A\cup B)$ exist, then $\max\{d(A),d(B)\}\leq d(A\cup B)\leq \min\{d(A)+d(B),1\}.$

If $A=\{n^{2}:n\in \mathbb {N} \}$ is the set of all squares, then d(A) = 0.

If $A=\{2n:n\in \mathbb {N} \}$ is the set of all even numbers, then d(A) = 0.5. Similarly, for any arithmetical progression $A=\{an+b:n\in \mathbb {N} \}$ we get $d(A)={\tfrac {1}{a}}.$

For the set P of all primes we get from the prime number theorem that d(P) = 0.

The set of all square-free integers has density ${\tfrac {6}{\pi ^{2}}}.$ More generally, the set of all $n$th-power-free numbers for any natural number $n\geq 2$ has density ${\tfrac {1}{\zeta (n)}},$ where $\zeta (n)$ is the Riemann zeta function.
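The value $6/\pi ^{2}\approx 0.6079$ for the square-free integers can be compared against a direct count. The following Python sketch uses a simple sieve; `squarefree_count` is a hypothetical helper name, and the agreement at a finite bound is only suggestive.

```python
from math import pi


def squarefree_count(n):
    """Count the square-free integers in [1, n] with a simple sieve."""
    squarefree = [True] * (n + 1)
    d = 2
    while d * d <= n:
        # Every multiple of d^2 has a square factor.
        for multiple in range(d * d, n + 1, d * d):
            squarefree[multiple] = False
        d += 1
    return sum(squarefree[1:])


n = 100_000
print(squarefree_count(n) / n, "vs", 6 / pi ** 2)
```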

The set of abundant numbers has non-zero density.^[3] Marc Deléglise showed in 1998 that the density of the set of abundant numbers is between 0.2474 and 0.2480.^[4]
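A brute-force count gives a feel for these bounds. The sketch below assumes the usual definition of an abundant number (the sum of its proper divisors exceeds the number itself) and uses a hypothetical helper `abundant_fraction`. It is a numerical illustration only and is unrelated to the method used to obtain the rigorous bounds.

```python
def abundant_fraction(n):
    """Fraction of k in [1, n] whose proper divisors sum to more than k."""
    proper_divisor_sum = [0] * (n + 1)
    for d in range(1, n // 2 + 1):
        # d is a proper divisor of 2d, 3d, 4d, ...
        for multiple in range(2 * d, n + 1, d):
            proper_divisor_sum[multiple] += d
    abundant = sum(1 for k in range(1, n + 1) if proper_divisor_sum[k] > k)
    return abundant / n


for n in (100, 10_000, 100_000):
    print(n, abundant_fraction(n))
```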

The set

A=\bigcup _{n=0}^{\infty }\left\{2^{2n},\ldots ,2^{2n+1}-1\right\}

of numbers whose binary expansion contains an odd number of digits is an example of a set which does not have an asymptotic density, since the upper density of this set is

{\overline {d}}(A)=\lim _{m\to \infty }{\frac {1+2^{2}+\cdots +2^{2m}}{2^{2m+1}-1}}=\lim _{m\to \infty }{\frac {2^{2m+2}-1}{3(2^{2m+1}-1)}}={\frac {2}{3}},

whereas its lower density is

{\underline {d}}(A)=\lim _{m\to \infty }{\frac {1+2^{2}+\cdots +2^{2m}}{2^{2m+2}-1}}=\lim _{m\to \infty }{\frac {2^{2m+2}-1}{3(2^{2m+2}-1)}}={\frac {1}{3}}.
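The oscillation can be observed directly. In the Python sketch below (function names are illustrative), the ratio $a(n)/n$ is evaluated at the end of each block of numbers with an odd number of binary digits, $n=2^{2m+1}-1$, and at the end of the following block with an even number of digits, $n=2^{2m+2}-1$.

```python
def has_odd_bit_length(k):
    """True if the binary expansion of k has an odd number of digits."""
    return k.bit_length() % 2 == 1


def ratio_at(n):
    """Return a(n)/n for the set of integers with an odd number of binary digits."""
    return sum(1 for k in range(1, n + 1) if has_odd_bit_length(k)) / n


for m in range(1, 8):
    high = ratio_at(2 ** (2 * m + 1) - 1)  # just after a block of members
    low = ratio_at(2 ** (2 * m + 2) - 1)   # just after a block of non-members
    print(m, high, low)
```

The first column of ratios approaches $2/3$ and the second equals $1/3$, matching the two limits above.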

The set of numbers whose decimal expansion begins with the digit 1 similarly has no natural density: the lower density is 1/9 and the upper density is 5/9.^[1] (See Benford's law.)
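The two values can be recovered by counting along two subsequences, in the same way as for the binary example. The following worked computation is a sketch: it evaluates the counting function $a(n)$ just before a block of numbers with leading digit 1 begins ($n=10^{k}-1$) and just after such a block ends ($n=2\cdot 10^{k}-1$), which is where the ratio is smallest and largest.

```latex
% Integers in [1, n] with leading digit 1, counted block by block.
a(10^{k}-1) = 1 + 10 + \cdots + 10^{k-1} = \frac{10^{k}-1}{9},
\qquad
\frac{a(10^{k}-1)}{10^{k}-1} = \frac{1}{9},

a(2\cdot 10^{k}-1) = \frac{10^{k}-1}{9} + 10^{k},
\qquad
\frac{a(2\cdot 10^{k}-1)}{2\cdot 10^{k}-1} \to \frac{1/9 + 1}{2} = \frac{5}{9}
\quad (k \to \infty).
```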

Consider an equidistributed sequence $\{\alpha _{n}\}_{n\in \mathbb {N} }$ in $[0,1]$ and define a monotone family $\{A_{x}\}_{x\in [0,1]}$ of sets:

A_{x}:=\{n\in \mathbb {N} :\alpha _{n}<x\}.

Then, by definition, $d(A_{x})=x$ for all $x$.
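As a numerical sketch of this construction, take $\alpha _{n}$ to be the fractional part of $n{\sqrt {2}}$, a standard example of an equidistributed sequence, and read $A_{x}$ as the set of indices $n$ with $\alpha _{n}<x$. Both the choice of sequence and the names `alpha` and `empirical_density` are assumptions made for illustration; floating-point arithmetic is adequate at this scale but not for very large $n$.

```python
from math import sqrt


def alpha(n):
    """Fractional part of n * sqrt(2); this sequence is equidistributed in [0, 1)."""
    return (n * sqrt(2)) % 1.0


def empirical_density(x, n_max):
    """Proportion of n in [1, n_max] with alpha(n) < x."""
    return sum(1 for n in range(1, n_max + 1) if alpha(n) < x) / n_max


for x in (0.1, 0.25, 0.5, 0.9):
    print(x, empirical_density(x, 100_000))
```

Each printed proportion should be close to the corresponding $x$.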

If S is a set of positive upper density then Szemerédi's theorem states that S contains arbitrarily large finite arithmetic progressions, and the Furstenberg–Sárközy theorem states that some two members of S differ by a square number.

Other density functions

Other density functions on subsets of the natural numbers may be defined analogously. For example, the logarithmic density of a set A is defined as the limit (if it exists)

\mathbf {\delta } (A)=\lim _{x\rightarrow \infty }{\frac {1}{\log x}}\sum _{n\in A,n\leq x}{\frac {1}{n}}\ .

Upper and lower logarithmic densities are defined analogously as well.
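The following Python sketch compares the ratio used for natural density with the ratio used for logarithmic density, at finite cut-offs $x$. It uses the set of numbers with leading decimal digit 1, which has no natural density; a standard fact is that its logarithmic density is $\log _{10}2\approx 0.301$, although convergence is slow (of order $1/\log x$), so the printed values only narrow towards it. All function names are hypothetical.

```python
from math import log


def natural_ratio(indicator, x):
    """Return |{n <= x : indicator(n)}| / x."""
    return sum(1 for n in range(1, x + 1) if indicator(n)) / x


def log_density_ratio(indicator, x):
    """Return (1 / log x) * sum of 1/n over n <= x with indicator(n) true."""
    return sum(1.0 / n for n in range(1, x + 1) if indicator(n)) / log(x)


def is_even(n):
    return n % 2 == 0


def leads_with_one(n):
    return str(n)[0] == "1"


# The natural ratio swings between about 1/9 and 5/9, while the
# logarithmic ratio moves within a much narrower band.
for x in (10**4 - 1, 2 * 10**4 - 1, 10**5 - 1, 2 * 10**5 - 1):
    print(x, natural_ratio(leads_with_one, x), log_density_ratio(leads_with_one, x))

# For the even numbers both ratios are close to 1/2.
print(natural_ratio(is_even, 10**5), log_density_ratio(is_even, 10**5))
```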

For the set of multiples of an integer sequence, the Davenport–Erdős theorem states that the natural density, when it exists, is equal to the logarithmic density.^[5]

Notes

[1] Tenenbaum (1995) p. 261
[2] Nathanson (2000) pp. 256–257
[3] Hall, Richard R.; Tenenbaum, Gérald (1988). Divisors. Cambridge Tracts in Mathematics. Vol. 90. Cambridge: Cambridge University Press. p. 95. ISBN 978-0-521-34056-4. Zbl 0653.10001.
[4] Deléglise, Marc (1998). "Bounds for the density of abundant integers". Experimental Mathematics. 7 (2): 137–143. CiteSeerX 10.1.1.36.8272. doi:10.1080/10586458.1998.10504363. ISSN 1058-6458. MR 1677091. Zbl 0923.11127.
[5] Hall, Richard R. (1996), Sets of multiples, Cambridge Tracts in Mathematics, vol. 118, Cambridge University Press, Cambridge, Theorem 0.2, p. 5, doi:10.1017/CBO9780511566011, ISBN 978-0-521-40424-2, MR 1414678

References

Nathanson, Melvyn B. (2000). Elementary Methods in Number Theory. Graduate Texts in Mathematics. Vol. 195. Springer-Verlag. ISBN 978-0387989129. Zbl 0953.11002.
Niven, Ivan (1951). "The asymptotic density of sequences". Bulletin of the American Mathematical Society. 57 (6): 420–434. doi:10.1090/s0002-9904-1951-09543-9. MR 0044561. Zbl 0044.03603.
Steuding, Jörn (2002). "Probabilistic number theory" (PDF). Archived from the original (PDF) on December 22, 2011. Retrieved 2014-11-16.
Tenenbaum, Gérald (1995). Introduction to analytic and probabilistic number theory. Cambridge Studies in Advanced Mathematics. Vol. 46. Cambridge University Press. Zbl 0831.11001.

This article incorporates material from Asymptotic density on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.

