Boole's inequality

Last updated

In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events. This inequality provides an upper bound on the probability of occurrence of at least one of a countable number of events in terms of the individual probabilities of the events. Boole's inequality is named for its discoverer, George Boole. [1]

Contents

Formally, for a countable set of events A1, A2, A3, ..., we have

In measure-theoretic terms, Boole's inequality follows from the fact that a measure (and certainly any probability measure) is σ-sub-additive.

Proof

Proof using induction

Boole's inequality may be proved for finite collections of events using the method of induction.

For the case, it follows that

For the case , we have

Since and because the union operation is associative, we have

Since

by the first axiom of probability, we have

and therefore

Proof without using induction

For any events in in our probability space we have

One of the axioms of a probability space is that if are disjoint subsets of the probability space then

this is called countable additivity.

If we modify the sets , so they become disjoint,

we can show that

by proving both directions of inclusion.

Suppose . Then for some minimum such that . Therefore . So the first inclusion is true: .

Next suppose that . It follows that for some . And so , and we have the other inclusion: .

By construction of each , . For it is the case that

So, we can conclude that the desired inequality is true:

Bonferroni inequalities

Boole's inequality may be generalized to find upper and lower bounds on the probability of finite unions of events. [2] These bounds are known as Bonferroni inequalities, after Carlo Emilio Bonferroni; see Bonferroni (1936).

Let

for all integers k in {1, ..., n}.

Then, when is odd:

holds, and when is even:

holds.

The equalities follow from the inclusion–exclusion principle, and Boole's inequality is the special case of .

Proof for odd K

Let , where for each . These such partition the sample space, and for each and every , is either contained in or disjoint from it.

If , then contributes 0 to both sides of the inequality.

Otherwise, assume is contained in exactly of the . Then contributes exactly to the right side of the inequality, while it contributes

to the left side of the inequality. However, by Pascal's rule, this is equal to

which telescopes to

Thus, the inequality holds for all events , and so by summing over , we obtain the desired inequality:

The proof for even is nearly identical. [3]

Example

Suppose that you are estimating 5 parameters based on a random sample, and you can control each parameter separately. If you want your estimations of all five parameters to be good with a chance 95%, what should you do to each parameter?

Tuning each parameter's chance to be good to within 95% is not enough because "all are good" is a subset of each event "Estimate i is good". We can use Boole's Inequality to solve this problem. By finding the complement of event "all five are good", we can change this question into another condition:

P( at least one estimation is bad) = 0.05 ≤ P( A1 is bad) + P( A2 is bad) + P( A3 is bad) + P( A4 is bad) + P( A5 is bad)

One way is to make each of them equal to 0.05/5 = 0.01, that is 1%. In another word, you have to guarantee each estimate good to 99%( for example, by constructing a 99% confidence interval) to make sure the total estimation to be good with a chance 95%. This is called the Bonferroni Method of simultaneous inference.

See also

Related Research Articles

<span class="mw-page-title-main">Measure (mathematics)</span> Generalization of mass, length, area and volume

In mathematics, the concept of a measure is a generalization and formalization of geometrical measures and other common notions, such as magnitude, mass, and probability of events. These seemingly distinct concepts have many similarities and can often be treated together in a single mathematical context. Measures are foundational in probability theory, integration theory, and can be generalized to assume negative values, as with electrical charge. Far-reaching generalizations of measure are widely used in quantum physics and physics in general.

In mathematical analysis and in probability theory, a σ-algebra on a set X is a nonempty collection Σ of subsets of X closed under complement, countable unions, and countable intersections. The ordered pair is called a measurable space.

In probability theory, the Borel–Cantelli lemma is a theorem about sequences of events. In general, it is a result in measure theory. It is named after Émile Borel and Francesco Paolo Cantelli, who gave statement to the lemma in the first decades of the 20th century. A related result, sometimes called the second Borel–Cantelli lemma, is a partial converse of the first Borel–Cantelli lemma. The lemma states that, under certain conditions, an event will have probability of either zero or one. Accordingly, it is the best-known of a class of similar theorems, known as zero-one laws. Other examples include Kolmogorov's zero–one law and the Hewitt–Savage zero–one law.

Distributions, also known as Schwartz distributions or generalized functions, are objects that generalize the classical notion of functions in mathematical analysis. Distributions make it possible to differentiate functions whose derivatives do not exist in the classical sense. In particular, any locally integrable function has a distributional derivative.

In the mathematical field of real analysis, the monotone convergence theorem is any of a number of related theorems proving the good convergence behaviour of monotonic sequences, i.e. sequences that are non-increasing, or non-decreasing. In its simplest form, it says that a non-decreasing bounded-above sequence of real numbers converges to its smallest upper bound, its supremum. Likewise, a non-increasing bounded-below sequence converges to its largest lower bound, its infimum. In particular, infinite sums of non-negative numbers converge to the supremum of the partial sums if and only if the partial sums are bounded.

In mathematical analysis, Hölder's inequality, named after Otto Hölder, is a fundamental inequality between integrals and an indispensable tool for the study of Lp spaces.

In mathematics, Fatou's lemma establishes an inequality relating the Lebesgue integral of the limit inferior of a sequence of functions to the limit inferior of integrals of these functions. The lemma is named after Pierre Fatou.

Vapnik–Chervonenkis theory was developed during 1960–1990 by Vladimir Vapnik and Alexey Chervonenkis. The theory is a form of computational learning theory, which attempts to explain the learning process from a statistical point of view.

<span class="mw-page-title-main">Inclusion–exclusion principle</span> Counting technique in combinatorics

In combinatorics, a branch of mathematics, the inclusion–exclusion principle is a counting technique which generalizes the familiar method of obtaining the number of elements in the union of two finite sets; symbolically expressed as

In mathematics, the limit of a sequence of sets is a set whose elements are determined by the sequence in either of two equivalent ways: (1) by upper and lower bounds on the sequence that converge monotonically to the same set and (2) by convergence of a sequence of indicator functions which are themselves real-valued. As is the case with sequences of other objects, convergence is not necessary or even usual.

In probability theory, the Azuma–Hoeffding inequality gives a concentration result for the values of martingales that have bounded differences.

In probability theory, a pairwise independent collection of random variables is a set of random variables any two of which are independent. Any collection of mutually independent random variables is pairwise independent, but some pairwise independent collections are not mutually independent. Pairwise independent random variables with finite variance are uncorrelated.

In mathematics, a positive (or signed) measure μ defined on a σ-algebra Σ of subsets of a set X is called a finite measure if μ(X) is a finite real number (rather than ∞). A set A in Σ is of finite measure if μ(A) < ∞. The measure μ is called σ-finite if X is a countable union of measurable sets each with finite measure. A set in a measure space is said to have σ-finite measure if it is a countable union of measurable sets with finite measure. A measure being σ-finite is a weaker condition than being finite, i.e. all finite measures are σ-finite but there are (many) σ-finite measures that are not finite.

In mathematics, the Grothendieck inequality states that there is a universal constant with the following property. If Mij is an n × n matrix with

In mathematics, in particular in measure theory, a content is a real-valued function defined on a collection of subsets such that

In mathematics, the Vitali covering lemma is a combinatorial and geometric result commonly used in measure theory of Euclidean spaces. This lemma is an intermediate step, of independent interest, in the proof of the Vitali covering theorem. The covering theorem is credited to the Italian mathematician Giuseppe Vitali. The theorem states that it is possible to cover, up to a Lebesgue-negligible set, a given subset E of Rd by a disjoint family extracted from a Vitali covering of E.

In mathematics, the Brascamp–Lieb inequality is either of two inequalities. The first is a result in geometry concerning integrable functions on n-dimensional Euclidean space . It generalizes the Loomis–Whitney inequality and Hölder's inequality. The second is a result of probability theory which gives a concentration inequality for log-concave probability distributions. Both are named after Herm Jan Brascamp and Elliott H. Lieb.

In mathematics, especially measure theory, a set function is a function whose domain is a family of subsets of some given set and that (usually) takes its values in the extended real number line which consists of the real numbers and

In probabilistic logic, the Fréchet inequalities, also known as the Boole–Fréchet inequalities, are rules implicit in the work of George Boole and explicitly derived by Maurice Fréchet that govern the combination of probabilities about logical propositions or events logically linked together in conjunctions or disjunctions as in Boolean expressions or fault or event trees common in risk assessments, engineering design and artificial intelligence. These inequalities can be considered rules about how to bound calculations involving probabilities without assuming independence or, indeed, without making any dependence assumptions whatsoever. The Fréchet inequalities are closely related to the Boole–Bonferroni–Fréchet inequalities, and to Fréchet bounds.

In probability theory, Kolmogorov's two-series theorem is a result about the convergence of random series. It follows from Kolmogorov's inequality and is used in one proof of the strong law of large numbers.

References

  1. Boole, George (1847). The Mathematical Analysis of Logic. Philosophical Library. ISBN   9780802201546.
  2. Casella, George; Berger, Roger L. (2002). Statistical Inference. Duxbury. pp. 11–13. ISBN   0-534-24312-6.
  3. Venkatesh, Santosh (2012). The Theory of Probability. Cambridge University Press. pp. 94–99, 113–115. ISBN   978-0-534-24312-8.

This article incorporates material from Bonferroni inequalities on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.