Bernoulli's inequality

[Figure] An illustration of Bernoulli's inequality, with the graphs of $y=(1+x)^r$ and $y=1+rx$ shown in red and blue respectively. Here, $r=3$.

In mathematics, Bernoulli's inequality (named after Jacob Bernoulli) is an inequality that approximates exponentiations of $1+x$. It is often employed in real analysis. It has several useful variants: [1]

Integer exponent

Case 1: $(1+x)^r \geq 1+rx$ for every integer $r \geq 1$ and real number $x \geq -1$. The inequality is strict if $x \neq 0$ and $r \geq 2$.

Case 2: $(1+x)^r \geq 1+rx$ for every integer $r \geq 0$ and real number $x \geq -2$. [2]

Case 3: $(1+x)^r \geq 1+rx$ for every even integer $r \geq 0$ and every real number $x$.

Real exponent

$(1+x)^r \geq 1+rx$ for every real number $r \geq 1$ and $x \geq -1$. The inequality is strict if $x \neq 0$ and $r \neq 1$.

$(1+x)^r \leq 1+rx$ for every real number $0 \leq r \leq 1$ and $x \geq -1$.
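
These variants can be spot-checked numerically. The following minimal Python sketch compares both sides of the inequality over small grids of values; the helper name and the sampled grids are chosen purely for illustration.

    def bernoulli_holds(x, r):
        """Check (1 + x)**r >= 1 + r*x for a single pair (x, r)."""
        return (1 + x) ** r >= 1 + r * x - 1e-12  # tolerance for rounding

    # Case 1: integer r >= 1, real x >= -1
    assert all(bernoulli_holds(x / 10, r)
               for r in range(1, 8) for x in range(-10, 31))

    # Case 2: integer r >= 0, real x >= -2 (Python evaluates 0**0 as 1)
    assert all(bernoulli_holds(x / 10, r)
               for r in range(0, 8) for x in range(-20, 31))

    # Case 3: even integer r >= 0, any real x
    assert all(bernoulli_holds(x / 10, r)
               for r in (0, 2, 4, 6) for x in range(-50, 31))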

History

Jacob Bernoulli first published the inequality in his treatise "Positiones Arithmeticae de Seriebus Infinitis" (Basel, 1689), where he used the inequality often. [3]

According to Joseph E. Hofmann, Über die Exercitatio Geometrica des M. A. Ricci (1963), p. 177, the inequality is actually due to Sluse in his Mesolabum (1668 edition), Chapter IV "De maximis & minimis". [3]

Proof for integer exponent

The first case has a simple inductive proof:

Suppose the statement is true for $r = k$:

$$(1+x)^k \geq 1+kx.$$

Then it follows that

$$(1+x)^{k+1} = (1+x)(1+x)^k \geq (1+x)(1+kx) = 1+(k+1)x+kx^2 \geq 1+(k+1)x,$$

where the first inequality uses $1+x \geq 0$ and the last step uses $kx^2 \geq 0$.

Bernoulli's inequality can be proved for case 2, in which $r$ is a non-negative integer and $x \geq -2$, using mathematical induction in the following form: we prove the inequality for $r \in \{0, 1\}$, and from validity for some $r = k$ we deduce validity for $r = k+2$.

For $r = 0$,

$$(1+x)^0 \geq 1+0 \cdot x$$

is equivalent to $1 \geq 1$, which is true.

Similarly, for $r = 1$ we have

$$(1+x)^1 = 1+x \geq 1+x.$$

Now suppose the statement is true for $r = k$:

$$(1+x)^k \geq 1+kx.$$

Then it follows that

$$(1+x)^{k+2} = (1+x)^k (1+x)^2 \geq (1+kx)(1+2x+x^2) = 1+(k+2)x+kx^2(x+2)+x^2 \geq 1+(k+2)x,$$

since $kx^2(x+2) \geq 0$ (because $x \geq -2$) as well as $x^2 \geq 0$. By the modified induction we conclude the statement is true for every non-negative integer $r$.

By noting that if $x < -2$ and $r \geq 2$ is even, then $1+rx$ is negative while $(1+x)^r \geq 0$, case 3 follows from case 2.
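
The algebraic identity behind the $r = k+2$ induction step can be verified symbolically; the following brief sketch uses the sympy library (the variable names are illustrative).

    from sympy import symbols, expand, simplify

    k, x = symbols('k x')

    # Induction step for case 2: expanding (1 + k*x)*(1 + x)**2 and regrouping.
    lhs = expand((1 + k*x) * (1 + x)**2)
    rhs = expand(1 + (k + 2)*x + k*x**2*(x + 2) + x**2)

    assert simplify(lhs - rhs) == 0  # the two forms agree identically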

Generalizations

Generalization of exponent

The exponent $r$ can be generalized to an arbitrary real number as follows: if $x > -1$, then

$$(1+x)^r \geq 1+rx$$

for $r \leq 0$ or $r \geq 1$, and

$$(1+x)^r \leq 1+rx$$

for $0 \leq r \leq 1$.

This generalization can be proved by comparing derivatives. The strict versions of these inequalities require $x \neq 0$ and $r \neq 0, 1$.
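
Both directions of the generalized inequality can be spot-checked numerically; a minimal sketch, with the exponents and grid chosen for illustration:

    def generalized_holds(r, x):
        """Compare (1 + x)**r with 1 + r*x for x > -1 and real r."""
        lhs, rhs = (1 + x) ** r, 1 + r * x
        if r <= 0 or r >= 1:
            return lhs >= rhs - 1e-12
        return lhs <= rhs + 1e-12  # reversed direction for 0 <= r <= 1

    xs = [k / 10 for k in range(-9, 31)]
    assert all(generalized_holds(r, x)
               for r in (-2.0, -0.5, 0.3, 0.7, 1.0, 2.5) for x in xs)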

Generalization of base

Instead of $(1+x)^n$ the inequality holds also in the form

$$(1+x_1)(1+x_2)\cdots(1+x_n) \geq 1+x_1+x_2+\cdots+x_n,$$

where $x_1, x_2, \dots, x_n$ are real numbers, all greater than $-1$, all with the same sign. Bernoulli's inequality is the special case $x_1 = x_2 = \cdots = x_n = x$. This generalized inequality can be proved by mathematical induction.

Proof

In the first step we take $n = 1$. In this case the inequality $1+x_1 \geq 1+x_1$ is obviously true.

In the second step we assume validity of the inequality for $n$ numbers and deduce validity for $n+1$ numbers.

We assume that

$$(1+x_1)(1+x_2)\cdots(1+x_n) \geq 1+x_1+x_2+\cdots+x_n$$

is valid. After multiplying both sides with the positive number $(1+x_{n+1})$ we get:

$$(1+x_1)(1+x_2)\cdots(1+x_n)(1+x_{n+1}) \geq (1+x_1+x_2+\cdots+x_n)(1+x_{n+1}) = 1+x_1+\cdots+x_n+x_{n+1}+x_1 x_{n+1}+x_2 x_{n+1}+\cdots+x_n x_{n+1}.$$

As $x_1, x_2, \dots, x_{n+1}$ all have the same sign, the products $x_1 x_{n+1}, x_2 x_{n+1}, \dots, x_n x_{n+1}$ are all positive numbers. So the quantity on the right-hand side can be bounded as follows:

$$(1+x_1+x_2+\cdots+x_n)(1+x_{n+1}) \geq 1+x_1+x_2+\cdots+x_n+x_{n+1},$$

which was to be shown.
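
The generalized form can be spot-checked on random same-sign inputs; a minimal sketch (the seed and ranges are arbitrary):

    import random

    def generalized_base_holds(xs):
        """Check prod(1 + x_i) >= 1 + sum(x_i) for same-sign x_i > -1."""
        prod = 1.0
        for x in xs:
            prod *= 1 + x
        return prod >= 1 + sum(xs) - 1e-9

    random.seed(0)
    for _ in range(1000):
        n = random.randint(1, 6)
        if random.random() < 0.5:
            xs = [random.uniform(0, 2) for _ in range(n)]      # all positive
        else:
            xs = [random.uniform(-0.99, 0) for _ in range(n)]  # all in (-1, 0)
        assert generalized_base_holds(xs)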

The following inequality estimates the $r$-th power of $1+x$ from the other side. For any real numbers $x$ and $r$ with $r > 0$, one has

$$(1+x)^r \leq e^{rx},$$

where $e = 2.718\ldots$. This may be proved using the inequality $1+x \leq e^x$.
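
A quick numerical comparison of the lower and upper bounds together (grid values chosen for illustration):

    import math

    # Lower bound 1 + r*x <= (1 + x)**r <= exp(r*x) for x > -1 and r >= 1.
    for r in (1.0, 2.0, 5.0):
        for k in range(-9, 31):
            x = k / 10
            assert 1 + r * x - 1e-12 <= (1 + x) ** r <= math.exp(r * x) + 1e-12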

Alternative form

An alternative form of Bernoulli's inequality for $t \geq 1$ and $0 \leq x \leq 1$ is:

$$(1-x)^t \geq 1-xt.$$

This can be proved (for any integer $t$) by using the formula for geometric series: (using $y = 1-x$)

$$t = 1+1+\cdots+1 \geq 1+y+y^2+\cdots+y^{t-1} = \frac{1-y^t}{1-y},$$

or equivalently

$$xt \geq 1-(1-x)^t.$$
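
The alternative form can be spot-checked numerically as well; a minimal sketch:

    # (1 - x)**t >= 1 - x*t for t >= 1 and 0 <= x <= 1.
    assert all((1 - x / 20) ** t >= 1 - (x / 20) * t - 1e-12
               for t in range(1, 10) for x in range(0, 21))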

Alternative proofs

Arithmetic and geometric means

An elementary proof for $0 \leq r \leq 1$ and $x \geq -1$ can be given using weighted AM-GM.

Let $\lambda_1, \lambda_2$ be two non-negative real constants. By weighted AM-GM on $1, 1+x$ with weights $\lambda_1, \lambda_2$ respectively, we get

$$\frac{\lambda_1 \cdot 1 + \lambda_2 \cdot (1+x)}{\lambda_1+\lambda_2} \geq \left(1^{\lambda_1}(1+x)^{\lambda_2}\right)^{\frac{1}{\lambda_1+\lambda_2}}.$$

Note that

$$\frac{\lambda_1 \cdot 1 + \lambda_2 \cdot (1+x)}{\lambda_1+\lambda_2} = \frac{\lambda_1+\lambda_2+\lambda_2 x}{\lambda_1+\lambda_2} = 1+\frac{\lambda_2}{\lambda_1+\lambda_2}x$$

and

$$\left(1^{\lambda_1}(1+x)^{\lambda_2}\right)^{\frac{1}{\lambda_1+\lambda_2}} = (1+x)^{\frac{\lambda_2}{\lambda_1+\lambda_2}},$$

so our inequality is equivalent to

$$1+\frac{\lambda_2}{\lambda_1+\lambda_2}x \geq (1+x)^{\frac{\lambda_2}{\lambda_1+\lambda_2}}.$$

After substituting $r = \frac{\lambda_2}{\lambda_1+\lambda_2}$ (bearing in mind that this implies $0 \leq r \leq 1$) our inequality turns into

$$1+rx \geq (1+x)^r,$$

which is Bernoulli's inequality.
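
The weighted AM-GM step itself can be spot-checked; a minimal sketch with a few illustrative weight pairs:

    # Weighted AM-GM on the pair (1, 1+x): arithmetic mean >= geometric mean.
    for l1, l2 in ((1.0, 1.0), (2.0, 3.0), (0.5, 4.5)):
        for k in range(-10, 31):
            x = k / 10
            am = (l1 * 1 + l2 * (1 + x)) / (l1 + l2)
            gm = (1 + x) ** (l2 / (l1 + l2))
            assert am >= gm - 1e-12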

Geometric series

Bernoulli's inequality

$$(1+x)^r \geq 1+rx$$

is equivalent to

$$(1+x)^r - 1 - rx \geq 0,$$

and by the formula for geometric series (using $y = 1+x$) we get

$$(1+x)^r - 1 = y^r - 1 = (y-1)(y^{r-1}+y^{r-2}+\cdots+y+1) = x\sum_{k=0}^{r-1} y^k,$$

which leads to

$$(1+x)^r - 1 - rx = x\left(\sum_{k=0}^{r-1} y^k - r\right) = x\sum_{k=0}^{r-1}(y^k-1) \geq 0. \qquad (4)$$

Now if $x \geq 0$, then by monotony of the powers each summand $y^k - 1 = (1+x)^k - 1 \geq 0$, and therefore their sum is non-negative and hence so is the product on the left-hand side of (4).

If $-2 \leq x \leq 0$, then $-1 \leq y \leq 1$, so $y^k \leq 1$ by the same arguments, and thus all addends $y^k - 1$ are non-positive and hence so is their sum. Since the product of two non-positive numbers is non-negative, we get again (4).
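
The geometric-series factorization used above can be verified symbolically for a fixed exponent; a brief sympy sketch (the choice $r = 5$ is arbitrary):

    from sympy import symbols, expand

    y = symbols('y')
    r = 5

    # y**r - 1 == (y - 1) * (y**(r-1) + ... + y + 1), the geometric-series identity.
    geometric_sum = sum(y**k for k in range(r))
    assert expand((y - 1) * geometric_sum - (y**r - 1)) == 0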

Binomial theorem

One can prove Bernoulli's inequality for $x \geq 0$ using the binomial theorem. It is true trivially for $r = 0$, so suppose $r$ is a positive integer. Then

$$(1+x)^r = 1+rx+\binom{r}{2}x^2+\cdots+\binom{r}{r}x^r.$$

Clearly

$$\binom{r}{2}x^2+\cdots+\binom{r}{r}x^r \geq 0,$$

and hence $(1+x)^r \geq 1+rx$, as required.
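
The non-negative tail of the expansion can be exhibited directly; a minimal sketch (the values of r and x are arbitrary):

    from math import comb

    r, x = 5, 0.7

    # (1 + x)**r = 1 + r*x + tail, where the tail collects the k >= 2 terms.
    tail = sum(comb(r, k) * x**k for k in range(2, r + 1))
    assert tail >= 0
    assert abs((1 + x)**r - (1 + r*x + tail)) < 1e-9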

Using convexity

For $r > 1$ the function $h(x) = (1+x)^r$ is strictly convex on $x > -1$. Therefore for $x > -1$, $x \neq 0$,

$$(1+x)^r > 1+rx$$

holds, since the right-hand side is the tangent line of $h$ at $x = 0$ and a strictly convex function lies strictly above its tangent; the reversed inequality is valid for $0 < r < 1$ and $x > -1$, $x \neq 0$, where $h$ is strictly concave.

Another way of using convexity is to re-cast the desired inequality to

$$\log(1+x) \geq \frac{1}{r}\log(1+rx)$$

for real $r \geq 1$ and real $x > -\frac{1}{r}$. This inequality can be proved using the fact that the $\log$ function is concave, and then using Jensen's inequality in the form

$$\log\!\left(\frac{u+(r-1)v}{r}\right) \geq \frac{\log u+(r-1)\log v}{r}$$

to give:

$$\log(1+x) = \log\!\left(\frac{(1+rx)+(r-1)\cdot 1}{r}\right) \geq \frac{\log(1+rx)+(r-1)\log 1}{r} = \frac{\log(1+rx)}{r},$$

which is the desired inequality.
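
The logarithmic form can be spot-checked numerically; a minimal sketch sweeping $x$ just above $-1/r$ (the grid is illustrative):

    import math

    # log(1 + x) >= log(1 + r*x) / r for r >= 1 and x > -1/r.
    for r in (1.0, 2.0, 3.5, 10.0):
        for k in range(1, 40):
            x = -1 / r + k * (3 + 1 / r) / 40  # sweep the interval (-1/r, 3)
            assert math.log(1 + x) >= math.log(1 + r * x) / r - 1e-12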

Notes

  1. Brannan, D. A. (2006). A First Course in Mathematical Analysis. Cambridge University Press. p. 20. ISBN 9781139458955.
  2. Excluding the case $r = 0$ and $x = -1$, or assuming that $0^0 = 1$.
  3. "First use of Bernoulli's inequality and its name". History of Science and Mathematics Stack Exchange.

