Boole's inequality

Last updated

In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events. This inequality provides an upper bound on the probability of occurrence of at least one of a countable number of events in terms of the individual probabilities of the events. Boole's inequality is named for its discoverer, George Boole. [1]

Contents

Formally, for a countable set of events A1, A2, A3, ..., we have

In measure-theoretic terms, Boole's inequality follows from the fact that a measure (and certainly any probability measure) is σ-sub-additive.

Proof

Proof using induction

Boole's inequality may be proved for finite collections of events using the method of induction.

For the case, it follows that

For the case , we have

Since and because the union operation is associative, we have

Since

by the first axiom of probability, we have

and therefore

Proof without using induction

For any events in in our probability space we have

One of the axioms of a probability space is that if are disjoint subsets of the probability space then

this is called countable additivity.

If we modify the sets , so they become disjoint,

we can show that

by proving both directions of inclusion.

Suppose . Then for some minimum such that . Therefore . So the first inclusion is true: .

Next suppose that . It follows that for some . And so , and we have the other inclusion: .

By construction of each , . For it is the case that

So, we can conclude that the desired inequality is true:

Bonferroni inequalities

Boole's inequality may be generalized to find upper and lower bounds on the probability of finite unions of events. [2] These bounds are known as Bonferroni inequalities, after Carlo Emilio Bonferroni; see Bonferroni (1936).

Let

for all integers k in {1, ..., n}.

When n is odd, the sequence of inequalities:

and

both hold. When n is even, then:

and

both hold. The chains of inequalities among the partial sums follow from the observation that the events Sk form a decreasing sequence of sets. [3] The equalities follow from the inclusion–exclusion principle, and Boole's inequality is the special case of the extremal upper bounds.

Example

Suppose that you are estimating 5 parameters based on a random sample, and you can control each parameter separately. If you want your estimations of all five parameters to be good with a chance 95%, what should you do to each parameter?

Tuning each parameter's chance to be good to within 95% is not enough because "all are good" is a subset of each event "Estimate i is good". We can use Boole's Inequality to solve this problem. By finding the complement of event "all fives are good", we can change this question into another condition:

P( at least one estimation is bad) = 0.05 ≤ P( A1 is bad) + P( A2 is bad) + P( A3 is bad) + P( A4 is bad) + P( A5 is bad)

One way is to make each of them equal to 0.05/5 = 0.01, that is 1%. In another word, you have to guarantee each estimate good to 99%( for example, by constructing a 99% confidence interval) to make sure the total estimation to be good with a chance 95%. This is called Bonferroni Method of simultaneous inference.

See also

Related Research Articles

<span class="mw-page-title-main">Entropy (information theory)</span> Expected amount of information needed to specify the output of a stochastic data source

In information theory, the entropy of a random variable is the average level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes. Given a discrete random variable , which takes values in the alphabet and is distributed according to :

<span class="mw-page-title-main">Measure (mathematics)</span> Generalization of mass, length, area and volume

In mathematics, the concept of a measure is a generalization and formalization of geometrical measures and other common notions, such as magnitude, mass, and probability of events. These seemingly distinct concepts have many similarities and can often be treated together in a single mathematical context. Measures are foundational in probability theory, integration theory, and can be generalized to assume negative values, as with electrical charge. Far-reaching generalizations of measure are widely used in quantum physics and physics in general.

In mathematical analysis and in probability theory, a σ-algebra on a set X is a nonempty collection Σ of subsets of X closed under complement, countable unions, and countable intersections. The ordered pair is called a measurable space.

In mathematics, an infinite series of numbers is said to converge absolutely if the sum of the absolute values of the summands is finite. More precisely, a real or complex series is said to converge absolutely if for some real number Similarly, an improper integral of a function, is said to converge absolutely if the integral of the absolute value of the integrand is finite—that is, if

In the mathematical field of real analysis, the monotone convergence theorem is any of a number of related theorems proving the convergence of monotonic sequences that are also bounded. Informally, the theorems state that if a sequence is increasing and bounded above by a supremum, then the sequence will converge to the supremum; in the same way, if a sequence is decreasing and is bounded below by an infimum, it will converge to the infimum.

In mathematical analysis, Hölder's inequality, named after Otto Hölder, is a fundamental inequality between integrals and an indispensable tool for the study of Lp spaces.

In mathematics, Fatou's lemma establishes an inequality relating the Lebesgue integral of the limit inferior of a sequence of functions to the limit inferior of integrals of these functions. The lemma is named after Pierre Fatou.

<span class="mw-page-title-main">Vapnik–Chervonenkis theory</span> Branch of statistical computational learning theory

Vapnik–Chervonenkis theory was developed during 1960–1990 by Vladimir Vapnik and Alexey Chervonenkis. The theory is a form of computational learning theory, which attempts to explain the learning process from a statistical point of view.

<span class="mw-page-title-main">Inclusion–exclusion principle</span> Counting technique in combinatorics

In combinatorics, a branch of mathematics, the inclusion–exclusion principle is a counting technique which generalizes the familiar method of obtaining the number of elements in the union of two finite sets; symbolically expressed as

In mathematics, the limit of a sequence of sets is a set whose elements are determined by the sequence in either of two equivalent ways: (1) by upper and lower bounds on the sequence that converge monotonically to the same set and (2) by convergence of a sequence of indicator functions which are themselves real-valued. As is the case with sequences of other objects, convergence is not necessary or even usual.

In probability theory, a pairwise independent collection of random variables is a set of random variables any two of which are independent. Any collection of mutually independent random variables is pairwise independent, but some pairwise independent collections are not mutually independent. Pairwise independent random variables with finite variance are uncorrelated.

In functional analysis and related areas of mathematics, a sequence space is a vector space whose elements are infinite sequences of real or complex numbers. Equivalently, it is a function space whose elements are functions from the natural numbers to the field K of real or complex numbers. The set of all such functions is naturally identified with the set of all possible infinite sequences with elements in K, and can be turned into a vector space under the operations of pointwise addition of functions and pointwise scalar multiplication. All sequence spaces are linear subspaces of this space. Sequence spaces are typically equipped with a norm, or at least the structure of a topological vector space.

In mathematics, a positive (or signed) measure μ defined on a σ-algebra Σ of subsets of a set X is called a finite measure if μ(X) is a finite real number (rather than ∞). A set A in Σ is of finite measure if μ(A) < ∞. The measure μ is called σ-finite if X is a countable union of measurable sets each with finite measure. A set in a measure space is said to have σ-finite measure if it is a countable union of measurable sets with finite measure. A measure being σ-finite is a weaker condition than being finite, i.e. all finite measures are σ-finite but there are (many) σ-finite measures that are not finite.

In the mathematical theory of probability, a Doob martingale is a stochastic process that approximates a given random variable and has the martingale property with respect to the given filtration. It may be thought of as the evolving sequence of best approximations to the random variable based on information accumulated up to a certain time.

In probability theory, Bernstein inequalities give bounds on the probability that the sum of random variables deviates from its mean. In the simplest case, let X1, ..., Xn be independent Bernoulli random variables taking values +1 and −1 with probability 1/2, then for every positive ,

In probability theory and theoretical computer science, McDiarmid's inequality is a concentration inequality which bounds the deviation between the sampled value and the expected value of certain functions when they are evaluated on independent random variables. McDiarmid's inequality applies to functions that satisfy a bounded differences property, meaning that replacing a single argument to the function while leaving all other arguments unchanged cannot cause too large of a change in the value of the function.

In mathematics, the Grothendieck inequality states that there is a universal constant with the following property. If Mij is an n × n matrix with

In statistics, the Holm–Bonferroni method, also called the Holm method or Bonferroni–Holm method, is used to counteract the problem of multiple comparisons. It is intended to control the family-wise error rate (FWER) and offers a simple test uniformly more powerful than the Bonferroni correction. It is named after Sture Holm, who codified the method, and Carlo Emilio Bonferroni.

In mathematics, especially measure theory, a set function is a function whose domain is a family of subsets of some given set and that (usually) takes its values in the extended real number line which consists of the real numbers and

In probabilistic logic, the Fréchet inequalities, also known as the Boole–Fréchet inequalities, are rules implicit in the work of George Boole and explicitly derived by Maurice Fréchet that govern the combination of probabilities about logical propositions or events logically linked together in conjunctions or disjunctions as in Boolean expressions or fault or event trees common in risk assessments, engineering design and artificial intelligence. These inequalities can be considered rules about how to bound calculations involving probabilities without assuming independence or, indeed, without making any dependence assumptions whatsoever. The Fréchet inequalities are closely related to the Boole–Bonferroni–Fréchet inequalities, and to Fréchet bounds.

References

  1. Boole, George (1847). The Mathematical Analysis of Logic. Philosophical Library. ISBN   9780802201546.
  2. Casella, George; Berger, Roger L. (2002). Statistical Inference. Duxbury. pp. 11–13. ISBN   0-534-24312-6.
  3. Venkatesh, Santosh (2012). The Theory of Probability. Cambridge University Press. pp. 94–99, 113–115. ISBN   978-0-534-24312-8.

This article incorporates material from Bonferroni inequalities on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.