Nesbitt's inequality

Last updated

In mathematics, Nesbitt's inequality, named after Alfred Nesbitt, states that for positive real numbers a, b and c,

Contents

with equality only when (i. e. in an equilateral triangle).

There is no corresponding upper bound as any of the 3 fractions in the inequality can be made arbitrarily large.

It is the three-variable case of the rather more difficult Shapiro inequality, and was published at least 50 years earlier.

Proof

First proof: AM-HM inequality

By the AM-HM inequality on ,

Clearing denominators yields

from which we obtain

by expanding the product and collecting like denominators. This then simplifies directly to the final result.

Second proof: Rearrangement

Supposing , we have that

Define

and .

By the rearrangement inequality, the dot product of the two sequences is maximized when the terms are arranged to be both increasing or both decreasing. The order here is both decreasing. Let and be the vector cyclically shifted by one and by two places; then

Addition then yields Nesbitt's inequality.

Third proof: Sum of Squares

The following identity is true for all

This clearly proves that the left side is no less than for positive a, b and c.

Note: every rational inequality can be demonstrated by transforming it to the appropriate sum-of-squares identity—see Hilbert's seventeenth problem.

Fourth proof: Cauchy–Schwarz

Invoking the Cauchy–Schwarz inequality on the vectors yields

which can be transformed into the final result as we did in the AM-HM proof.

Fifth proof: AM-GM

Let . We then apply the AM-GM inequality to obtain

because

Substituting out the in favor of yields

which then simplifies to the final result.

Sixth proof: Titu's lemma

Titu's lemma, a direct consequence of the Cauchy–Schwarz inequality, states that for any sequence of real numbers and any sequence of positive numbers ,

We use the lemma on and . This gives

which results in

i.e.,

Seventh proof: Using homogeneity

As the left side of the inequality is homogeneous, we may assume . Now define , , and . The desired inequality turns into , or, equivalently, . This is clearly true by Titu's Lemma.

Eighth proof: Jensen's inequality

Let and consider the function . This function can be shown to be convex in and, invoking Jensen's inequality, we get

A straightforward computation then yields

Ninth proof: Reduction to a two-variable inequality

By clearing denominators,

It therefore suffices to prove that for , as summing this three times for and completes the proof.

As we are done.

Related Research Articles

<span class="mw-page-title-main">Arithmetic–geometric mean</span> Mathematical function of two positive real arguments

In mathematics, the arithmetic–geometric mean of two positive real numbers x and y is the mutual limit of a sequence of arithmetic means and a sequence of geometric means. The arithmetic–geometric mean is used in fast algorithms for exponential, trigonometric functions, and other special functions, as well as some mathematical constants, in particular, computing π.

<span class="mw-page-title-main">Binomial distribution</span> Probability distribution

In probability theory and statistics, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question, and each with its own Boolean-valued outcome: success or failure. A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment, and a sequence of outcomes is called a Bernoulli process; for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.

<span class="mw-page-title-main">Binomial coefficient</span> Number of subsets of a given size

In mathematics, the binomial coefficients are the positive integers that occur as coefficients in the binomial theorem. Commonly, a binomial coefficient is indexed by a pair of integers nk ≥ 0 and is written It is the coefficient of the xk term in the polynomial expansion of the binomial power (1 + x)n; this coefficient can be computed by the multiplicative formula

<span class="mw-page-title-main">Maxwell–Boltzmann distribution</span> Specific probability distribution function, important in physics

In physics, the Maxwell–Boltzmann distribution, or Maxwell(ian) distribution, is a particular probability distribution named after James Clerk Maxwell and Ludwig Boltzmann.

<span class="mw-page-title-main">Pauli matrices</span> Matrices important in quantum mechanics and the study of spin

In mathematical physics and mathematics, the Pauli matrices are a set of three 2 × 2 complex matrices that are traceless, Hermitian, involutory and unitary. Usually indicated by the Greek letter sigma, they are occasionally denoted by tau when used in connection with isospin symmetries.

<span class="mw-page-title-main">Square root</span> Number whose square is a given number

In mathematics, a square root of a number x is a number y such that ; in other words, a number y whose square is x. For example, 4 and −4 are square roots of 16 because .

<span class="mw-page-title-main">Inequality (mathematics)</span> Mathematical relation expressed with < or ≤

In mathematics, an inequality is a relation which makes a non-equal comparison between two numbers or other mathematical expressions. It is used most often to compare two numbers on the number line by their size. The main types of inequality are less than and greater than.

In mathematics, a generating function is a representation of an infinite sequence of numbers as the coefficients of a formal power series. Unlike an ordinary series, the formal power series is not required to converge: in fact, the generating function is not actually regarded as a function, and the "variable" remains an indeterminate. Generating functions were first introduced by Abraham de Moivre in 1730, in order to solve the general linear recurrence problem. One can generalize to formal power series in more than one indeterminate, to encode information about infinite multi-dimensional arrays of numbers.

<span class="mw-page-title-main">Error function</span> Sigmoid shape special function

In mathematics, the error function, often denoted by erf, is a function defined as:

In probability theory, the Azuma–Hoeffding inequality gives a concentration result for the values of martingales that have bounded differences.

In probability theory, the central limit theorem states that, under certain circumstances, the probability distribution of the scaled mean of a random sample converges to a normal distribution as the sample size increases to infinity. Under stronger assumptions, the Berry–Esseen theorem, or Berry–Esseen inequality, gives a more quantitative result, because it also specifies the rate at which this convergence takes place by giving a bound on the maximal error of approximation between the normal distribution and the true distribution of the scaled sample mean. The approximation is measured by the Kolmogorov–Smirnov distance. In the case of independent samples, the convergence rate is n−1/2, where n is the sample size, and the constant is estimated in terms of the third absolute normalized moment.

<span class="mw-page-title-main">AM–GM inequality</span> Arithmetic mean is greater than or equal to geometric mean

In mathematics, the inequality of arithmetic and geometric means, or more briefly the AM–GM inequality, states that the arithmetic mean of a list of non-negative real numbers is greater than or equal to the geometric mean of the same list; and further, that the two means are equal if and only if every number in the list is the same.

In mathematics, Muirhead's inequality, named after Robert Franklin Muirhead, also known as the "bunching" method, generalizes the inequality of arithmetic and geometric means.

In the field of mathematics, norms are defined for elements within a vector space. Specifically, when the vector space comprises matrices, such norms are referred to as matrix norms. Matrix norms differ from vector norms in that they must also interact with matrix multiplication.

In statistics and information theory, a maximum entropy probability distribution has entropy that is at least as great as that of all other members of a specified class of probability distributions. According to the principle of maximum entropy, if nothing is known about a distribution except that it belongs to a certain class, then the distribution with the largest entropy should be chosen as the least-informative default. The motivation is twofold: first, maximizing entropy minimizes the amount of prior information built into the distribution; second, many physical systems tend to move towards maximal entropy configurations over time.

Methods of computing square roots are algorithms for approximating the non-negative square root of a positive real number . Since all square roots of natural numbers, other than of perfect squares, are irrational, square roots can usually only be computed to some finite precision: these methods typically construct a series of increasingly accurate approximations.

In probability theory, concentration inequalities provide mathematical bounds on the probability of a random variable deviating from some value.

In pure and applied mathematics, quantum mechanics and computer graphics, a tensor operator generalizes the notion of operators which are scalars and vectors. A special class of these are spherical tensor operators which apply the notion of the spherical basis and spherical harmonics. The spherical basis closely relates to the description of angular momentum in quantum mechanics and spherical harmonic functions. The coordinate-free generalization of a tensor operator is known as a representation operator.

In mathematics, a transformation of a sequence's generating function provides a method of converting the generating function for one sequence into a generating function enumerating another. These transformations typically involve integral formulas applied to a sequence generating function or weighted sums over the higher-order derivatives of these functions.

References