Bernoulli's inequality

[Figure] An illustration of Bernoulli's inequality, with the graphs of $y=(1+x)^r$ and $y=1+rx$ shown in red and blue respectively. Here, $r=3$.

In mathematics, Bernoulli's inequality (named after Jacob Bernoulli) is an inequality that approximates exponentiations of $1+x$. It is often employed in real analysis. It has several useful variants: [1]

Integer exponent

Case 1: $(1+x)^r \geq 1+rx$ for every integer $r \geq 1$ and real number $x \geq -1$. The inequality is strict if $x \neq 0$ and $r \geq 2$.

Case 2: $(1+x)^r \geq 1+rx$ for every integer $r \geq 0$ and real number $x \geq -2$. [2]

Case 3: $(1+x)^r \geq 1+rx$ for every even integer $r \geq 0$ and every real number $x$.

Real exponent

$(1+x)^r \geq 1+rx$ for every real number $r \geq 1$ and $x \geq -1$. The inequality is strict if $x \neq 0$ and $r \neq 1$.

$(1+x)^r \leq 1+rx$ for every real number $0 \leq r \leq 1$ and $x \geq -1$.
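
These variants can be spot-checked numerically. The following minimal Python sketch compares both sides of the inequality over small grids of values; the helper name and the sampled grids are chosen purely for illustration.

    def bernoulli_holds(x, r):
        """Check (1 + x)**r >= 1 + r*x for a single pair (x, r)."""
        return (1 + x) ** r >= 1 + r * x - 1e-12  # tolerance for rounding

    # Case 1: integer r >= 1, real x >= -1
    assert all(bernoulli_holds(x / 10, r)
               for r in range(1, 8) for x in range(-10, 31))

    # Case 2: integer r >= 0, real x >= -2 (Python evaluates 0**0 as 1)
    assert all(bernoulli_holds(x / 10, r)
               for r in range(0, 8) for x in range(-20, 31))

    # Case 3: even integer r >= 0, any real x
    assert all(bernoulli_holds(x / 10, r)
               for r in (0, 2, 4, 6) for x in range(-50, 31))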

History

Jacob Bernoulli first published the inequality in his treatise "Positiones Arithmeticae de Seriebus Infinitis" (Basel, 1689), where he used the inequality often. [3]

According to Joseph E. Hofmann, Über die Exercitatio Geometrica des M. A. Ricci (1963), p. 177, the inequality is actually due to Sluse in his Mesolabum (1668 edition), Chapter IV "De maximis & minimis". [3]

Proof for integer exponent

The first case has a simple inductive proof:

Suppose the statement is true for $r = k$:

$$(1+x)^k \geq 1+kx.$$

Then it follows that

$$(1+x)^{k+1} = (1+x)(1+x)^k \geq (1+x)(1+kx) = 1+(k+1)x+kx^2 \geq 1+(k+1)x,$$

where the first inequality uses $1+x \geq 0$ and the last step uses $kx^2 \geq 0$.

Bernoulli's inequality can be proved for case 2, in which $r$ is a non-negative integer and $x \geq -2$, using mathematical induction in the following form: we prove the inequality for $r \in \{0, 1\}$, and from validity for some $r = k$ we deduce validity for $r = k+2$.

For $r = 0$,

$$(1+x)^0 \geq 1+0 \cdot x$$

is equivalent to $1 \geq 1$, which is true.

Similarly, for $r = 1$ we have

$$(1+x)^1 = 1+x \geq 1+x.$$

Now suppose the statement is true for $r = k$:

$$(1+x)^k \geq 1+kx.$$

Then it follows that

$$(1+x)^{k+2} = (1+x)^k (1+x)^2 \geq (1+kx)(1+2x+x^2) = 1+(k+2)x+kx^2(x+2)+x^2 \geq 1+(k+2)x,$$

since $kx^2(x+2) \geq 0$ (because $x \geq -2$) as well as $x^2 \geq 0$. By the modified induction we conclude the statement is true for every non-negative integer $r$.

By noting that if $x < -2$ and $r \geq 2$ is even, then $1+rx$ is negative while $(1+x)^r \geq 0$, case 3 follows from case 2.
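
The algebraic identity behind the $r = k+2$ induction step can be verified symbolically; the following brief sketch uses the sympy library (the variable names are illustrative).

    from sympy import symbols, expand, simplify

    k, x = symbols('k x')

    # Induction step for case 2: expanding (1 + k*x)*(1 + x)**2 and regrouping.
    lhs = expand((1 + k*x) * (1 + x)**2)
    rhs = expand(1 + (k + 2)*x + k*x**2*(x + 2) + x**2)

    assert simplify(lhs - rhs) == 0  # the two forms agree identically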

Generalizations

Generalization of exponent

The exponent $r$ can be generalized to an arbitrary real number as follows: if $x > -1$, then

$$(1+x)^r \geq 1+rx$$

for $r \leq 0$ or $r \geq 1$, and

$$(1+x)^r \leq 1+rx$$

for $0 \leq r \leq 1$.

This generalization can be proved by comparing derivatives. The strict versions of these inequalities require $x \neq 0$ and $r \neq 0, 1$.
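
Both directions of the generalized inequality can be spot-checked numerically; a minimal sketch, with the exponents and grid chosen for illustration:

    def generalized_holds(r, x):
        """Compare (1 + x)**r with 1 + r*x for x > -1 and real r."""
        lhs, rhs = (1 + x) ** r, 1 + r * x
        if r <= 0 or r >= 1:
            return lhs >= rhs - 1e-12
        return lhs <= rhs + 1e-12  # reversed direction for 0 <= r <= 1

    xs = [k / 10 for k in range(-9, 31)]
    assert all(generalized_holds(r, x)
               for r in (-2.0, -0.5, 0.3, 0.7, 1.0, 2.5) for x in xs)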

Generalization of base

Instead of $(1+x)^n$ the inequality holds also in the form

$$(1+x_1)(1+x_2)\cdots(1+x_n) \geq 1+x_1+x_2+\cdots+x_n,$$

where $x_1, x_2, \dots, x_n$ are real numbers, all greater than $-1$, all with the same sign. Bernoulli's inequality is the special case $x_1 = x_2 = \cdots = x_n = x$. This generalized inequality can be proved by mathematical induction.

Proof

In the first step we take $n = 1$. In this case the inequality $1+x_1 \geq 1+x_1$ is obviously true.

In the second step we assume validity of the inequality for $n$ numbers and deduce validity for $n+1$ numbers.

We assume that

$$(1+x_1)(1+x_2)\cdots(1+x_n) \geq 1+x_1+x_2+\cdots+x_n$$

is valid. After multiplying both sides with the positive number $(1+x_{n+1})$ we get:

$$(1+x_1)(1+x_2)\cdots(1+x_n)(1+x_{n+1}) \geq (1+x_1+x_2+\cdots+x_n)(1+x_{n+1}) = 1+x_1+\cdots+x_n+x_{n+1}+x_1 x_{n+1}+x_2 x_{n+1}+\cdots+x_n x_{n+1}.$$

As $x_1, x_2, \dots, x_{n+1}$ all have the same sign, the products $x_1 x_{n+1}, x_2 x_{n+1}, \dots, x_n x_{n+1}$ are all positive numbers. So the quantity on the right-hand side can be bounded as follows:

$$(1+x_1+x_2+\cdots+x_n)(1+x_{n+1}) \geq 1+x_1+x_2+\cdots+x_n+x_{n+1},$$

which was to be shown.
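
The generalized form can be spot-checked on random same-sign inputs; a minimal sketch (the seed and ranges are arbitrary):

    import random

    def generalized_base_holds(xs):
        """Check prod(1 + x_i) >= 1 + sum(x_i) for same-sign x_i > -1."""
        prod = 1.0
        for x in xs:
            prod *= 1 + x
        return prod >= 1 + sum(xs) - 1e-9

    random.seed(0)
    for _ in range(1000):
        n = random.randint(1, 6)
        if random.random() < 0.5:
            xs = [random.uniform(0, 2) for _ in range(n)]      # all positive
        else:
            xs = [random.uniform(-0.99, 0) for _ in range(n)]  # all in (-1, 0)
        assert generalized_base_holds(xs)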

The following inequality estimates the $r$-th power of $1+x$ from the other side. For any real numbers $x$ and $r$ with $r > 0$, one has

$$(1+x)^r \leq e^{rx},$$

where $e = 2.718\ldots$. This may be proved using the inequality $1+x \leq e^x$.
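
A quick numerical comparison of the lower and upper bounds together (grid values chosen for illustration):

    import math

    # Lower bound 1 + r*x <= (1 + x)**r <= exp(r*x) for x > -1 and r >= 1.
    for r in (1.0, 2.0, 5.0):
        for k in range(-9, 31):
            x = k / 10
            assert 1 + r * x - 1e-12 <= (1 + x) ** r <= math.exp(r * x) + 1e-12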

Alternative form

An alternative form of Bernoulli's inequality for $t \geq 1$ and $0 \leq x \leq 1$ is:

$$(1-x)^t \geq 1-xt.$$

This can be proved (for any integer $t$) by using the formula for geometric series: (using $y = 1-x$)

$$t = 1+1+\cdots+1 \geq 1+y+y^2+\cdots+y^{t-1} = \frac{1-y^t}{1-y},$$

or equivalently

$$xt \geq 1-(1-x)^t.$$
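
The alternative form can be spot-checked numerically as well; a minimal sketch:

    # (1 - x)**t >= 1 - x*t for t >= 1 and 0 <= x <= 1.
    assert all((1 - x / 20) ** t >= 1 - (x / 20) * t - 1e-12
               for t in range(1, 10) for x in range(0, 21))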

Alternative proofs

Arithmetic and geometric means

An elementary proof for $0 \leq r \leq 1$ and $x \geq -1$ can be given using weighted AM-GM.

Let $\lambda_1, \lambda_2$ be two non-negative real constants. By weighted AM-GM on $1, 1+x$ with weights $\lambda_1, \lambda_2$ respectively, we get

$$\frac{\lambda_1 \cdot 1 + \lambda_2 \cdot (1+x)}{\lambda_1+\lambda_2} \geq \left(1^{\lambda_1}(1+x)^{\lambda_2}\right)^{\frac{1}{\lambda_1+\lambda_2}}.$$

Note that

$$\frac{\lambda_1 \cdot 1 + \lambda_2 \cdot (1+x)}{\lambda_1+\lambda_2} = \frac{\lambda_1+\lambda_2+\lambda_2 x}{\lambda_1+\lambda_2} = 1+\frac{\lambda_2}{\lambda_1+\lambda_2}x$$

and

$$\left(1^{\lambda_1}(1+x)^{\lambda_2}\right)^{\frac{1}{\lambda_1+\lambda_2}} = (1+x)^{\frac{\lambda_2}{\lambda_1+\lambda_2}},$$

so our inequality is equivalent to

$$1+\frac{\lambda_2}{\lambda_1+\lambda_2}x \geq (1+x)^{\frac{\lambda_2}{\lambda_1+\lambda_2}}.$$

After substituting $r = \frac{\lambda_2}{\lambda_1+\lambda_2}$ (bearing in mind that this implies $0 \leq r \leq 1$) our inequality turns into

$$1+rx \geq (1+x)^r,$$

which is Bernoulli's inequality.
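
The weighted AM-GM step itself can be spot-checked; a minimal sketch with a few illustrative weight pairs:

    # Weighted AM-GM on the pair (1, 1+x): arithmetic mean >= geometric mean.
    for l1, l2 in ((1.0, 1.0), (2.0, 3.0), (0.5, 4.5)):
        for k in range(-10, 31):
            x = k / 10
            am = (l1 * 1 + l2 * (1 + x)) / (l1 + l2)
            gm = (1 + x) ** (l2 / (l1 + l2))
            assert am >= gm - 1e-12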

Geometric series

Bernoulli's inequality

$$(1+x)^r \geq 1+rx$$

is equivalent to

$$(1+x)^r - 1 - rx \geq 0,$$

and by the formula for geometric series (using $y = 1+x$) we get

$$(1+x)^r - 1 = y^r - 1 = (y-1)(y^{r-1}+y^{r-2}+\cdots+y+1) = x\sum_{k=0}^{r-1} y^k,$$

which leads to

$$(1+x)^r - 1 - rx = x\left(\sum_{k=0}^{r-1} y^k - r\right) = x\sum_{k=0}^{r-1}(y^k-1) \geq 0. \qquad (4)$$

Now if $x \geq 0$, then by monotony of the powers each summand $y^k - 1 = (1+x)^k - 1 \geq 0$, and therefore their sum is non-negative and hence so is the product on the left-hand side of (4).

If $-2 \leq x \leq 0$, then $-1 \leq y \leq 1$, so $y^k \leq 1$ by the same arguments, and thus all addends $y^k - 1$ are non-positive and hence so is their sum. Since the product of two non-positive numbers is non-negative, we get again (4).
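
The geometric-series factorization used above can be verified symbolically for a fixed exponent; a brief sympy sketch (the choice $r = 5$ is arbitrary):

    from sympy import symbols, expand

    y = symbols('y')
    r = 5

    # y**r - 1 == (y - 1) * (y**(r-1) + ... + y + 1), the geometric-series identity.
    geometric_sum = sum(y**k for k in range(r))
    assert expand((y - 1) * geometric_sum - (y**r - 1)) == 0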

Binomial theorem

One can prove Bernoulli's inequality for $x \geq 0$ using the binomial theorem. It is true trivially for $r = 0$, so suppose $r$ is a positive integer. Then

$$(1+x)^r = 1+rx+\binom{r}{2}x^2+\cdots+\binom{r}{r}x^r.$$

Clearly

$$\binom{r}{2}x^2+\cdots+\binom{r}{r}x^r \geq 0,$$

and hence $(1+x)^r \geq 1+rx$, as required.
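
The non-negative tail of the expansion can be exhibited directly; a minimal sketch (the values of r and x are arbitrary):

    from math import comb

    r, x = 5, 0.7

    # (1 + x)**r = 1 + r*x + tail, where the tail collects the k >= 2 terms.
    tail = sum(comb(r, k) * x**k for k in range(2, r + 1))
    assert tail >= 0
    assert abs((1 + x)**r - (1 + r*x + tail)) < 1e-9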

Using convexity

For $r > 1$ the function $h(x) = (1+x)^r$ is strictly convex on $x > -1$. Therefore for $x > -1$, $x \neq 0$,

$$(1+x)^r > 1+rx$$

holds, since the right-hand side is the tangent line of $h$ at $x = 0$ and a strictly convex function lies strictly above its tangent; the reversed inequality is valid for $0 < r < 1$ and $x > -1$, $x \neq 0$, where $h$ is strictly concave.

Another way of using convexity is to re-cast the desired inequality to

$$\log(1+x) \geq \frac{1}{r}\log(1+rx)$$

for real $r \geq 1$ and real $x > -\frac{1}{r}$. This inequality can be proved using the fact that the $\log$ function is concave, and then using Jensen's inequality in the form

$$\log\!\left(\frac{u+(r-1)v}{r}\right) \geq \frac{\log u+(r-1)\log v}{r}$$

to give:

$$\log(1+x) = \log\!\left(\frac{(1+rx)+(r-1)\cdot 1}{r}\right) \geq \frac{\log(1+rx)+(r-1)\log 1}{r} = \frac{\log(1+rx)}{r},$$

which is the desired inequality.
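
The logarithmic form can be spot-checked numerically; a minimal sketch sweeping $x$ just above $-1/r$ (the grid is illustrative):

    import math

    # log(1 + x) >= log(1 + r*x) / r for r >= 1 and x > -1/r.
    for r in (1.0, 2.0, 3.5, 10.0):
        for k in range(1, 40):
            x = -1 / r + k * (3 + 1 / r) / 40  # sweep the interval (-1/r, 3)
            assert math.log(1 + x) >= math.log(1 + r * x) / r - 1e-12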

Notes

  1. Brannan, D. A. (2006). A First Course in Mathematical Analysis. Cambridge University Press. p. 20. ISBN 9781139458955.
  2. Excluding the case $r = 0$ and $x = -1$, or assuming that $0^0 = 1$.
  3. "First use of Bernoulli's inequality and its name". History of Science and Mathematics Stack Exchange.

