Upper and lower probabilities

Last updated September 08, 2023

Upper and lower probabilities are representations of imprecise probability. Whereas probability theory uses a single number, the probability, to describe how likely an event is to occur, this method uses two numbers: the upper probability of the event and the lower probability of the event.

Because frequentist statistics disallows metaprobabilities,[citation needed] frequentists have had to propose new solutions. Cedric Smith and Arthur Dempster each developed a theory of upper and lower probabilities. Glenn Shafer developed Dempster's theory further, and it is now known as Dempster–Shafer theory (see also Choquet (1953)). More precisely, in the work of these authors one considers, on a power set $P(S)$, a mass function $m:P(S)\rightarrow \mathbb{R}$ satisfying the conditions

$$m(\varnothing)=0;\qquad m(A)\geq 0;\qquad \sum_{A\in P(S)}m(A)=1.$$

In turn, a mass is associated with two non-additive continuous measures called belief and plausibility defined as follows:

$$\operatorname{bel}(A)=\sum_{B\mid B\subseteq A}m(B);\qquad \operatorname{pl}(A)=\sum_{B\mid B\cap A\neq\varnothing}m(B)$$
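For a finite $S$, the two sums above can be evaluated directly. The following is a minimal sketch rather than a reference implementation: the function names, the dictionary representation of $m$, and the example mass values are all assumptions made for illustration.

```python
def is_mass_function(mass, tol=1e-12):
    """Check the three conditions: m(empty) = 0, m(A) >= 0, total mass 1."""
    no_empty_mass = mass.get(frozenset(), 0.0) == 0.0
    nonnegative = all(m >= 0.0 for m in mass.values())
    normalized = abs(sum(mass.values()) - 1.0) <= tol
    return no_empty_mass and nonnegative and normalized


def belief(mass, event):
    """bel(A): total mass of the sets B that are contained in A."""
    event = frozenset(event)
    return sum(m for focal, m in mass.items() if focal <= event)


def plausibility(mass, event):
    """pl(A): total mass of the sets B that intersect A."""
    event = frozenset(event)
    return sum(m for focal, m in mass.items() if focal & event)


# Hypothetical mass function on S = {"a", "b", "c"}.
# Subsets that are not listed carry mass 0.
mass = {
    frozenset({"a"}): 0.5,
    frozenset({"a", "b"}): 0.25,
    frozenset({"a", "b", "c"}): 0.25,
}

assert is_mass_function(mass)
print(belief(mass, {"a"}), plausibility(mass, {"a"}))
print(belief(mass, {"b"}), plausibility(mass, {"b"}))
```

For this example, $\operatorname{bel}(\{a\})=0.5$ and $\operatorname{pl}(\{a\})=1$, while $\operatorname{bel}(\{b\})=0$ and $\operatorname{pl}(\{b\})=0.5$. In each case the belief does not exceed the plausibility, which is what allows the pair to be read as a lower and an upper probability.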

When $S$ is infinite, there can be a belief function $\operatorname{bel}$ that has no associated mass function; see p. 36 of Halpern (2003). Probability measures are a special case of belief functions in which the mass function assigns positive mass only to singletons of the event space.
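The special case of probability measures can be checked against the definitions of belief and plausibility. As a short sketch, assume $S$ is finite and $m$ is positive only on singletons. A singleton $\{x\}$ is contained in $A$ exactly when $x\in A$, and it intersects $A$ exactly when $x\in A$, so the two sums coincide:

```latex
\begin{align*}
\operatorname{bel}(A) &= \sum_{B \mid B \subseteq A} m(B)
                       = \sum_{x \in A} m(\{x\}), \\
\operatorname{pl}(A)  &= \sum_{B \mid B \cap A \neq \varnothing} m(B)
                       = \sum_{x \in A} m(\{x\}).
\end{align*}
```

Hence $\operatorname{bel}(A)=\operatorname{pl}(A)$ for every $A$, and this common value is additive over disjoint events, as a probability measure requires.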

A different notion of upper and lower probabilities is given by the lower and upper envelopes of a class $C$ of probability distributions, defined by setting

$$\operatorname{env}_{1}(A)=\inf_{p\in C}p(A);\qquad \operatorname{env}_{2}(A)=\sup_{p\in C}p(A)$$
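When the class of distributions is finite, the infimum and supremum reduce to a minimum and a maximum. The sketch below illustrates this under that assumption; the function names and the two example distributions are hypothetical.

```python
def probability(p, event):
    """p(A) for a distribution p given as a dict from outcomes to probabilities."""
    return sum(p.get(x, 0.0) for x in event)


def lower_envelope(distributions, event):
    """env_1(A): smallest probability any distribution in the class gives A."""
    return min(probability(p, event) for p in distributions)


def upper_envelope(distributions, event):
    """env_2(A): largest probability any distribution in the class gives A."""
    return max(probability(p, event) for p in distributions)


# Hypothetical finite class C of two distributions on {"a", "b", "c"}.
C = [
    {"a": 0.5, "b": 0.25, "c": 0.25},
    {"a": 0.25, "b": 0.5, "c": 0.25},
]

print(lower_envelope(C, {"a"}), upper_envelope(C, {"a"}))
print(lower_envelope(C, {"a", "c"}), upper_envelope(C, {"a", "c"}))
```

Here the event $\{a\}$ receives the interval $[0.25, 0.5]$. The two envelopes are conjugate: the upper envelope of an event equals one minus the lower envelope of its complement, so either one determines the other.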

Upper and lower probabilities are also related to probabilistic logic: see Gerla (1994).

Observe also that a necessity measure can be seen as a lower probability and a possibility measure can be seen as an upper probability.
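The article does not define necessity and possibility measures, so the sketch below assumes the usual definitions from possibility theory: a possibility distribution $\pi$ on a finite set with maximum value 1, the possibility of an event as the largest value of $\pi$ on it, and the necessity of an event as one minus the possibility of its complement. The names and example values are hypothetical.

```python
def possibility(pi, event):
    """Possibility of A: largest value of pi on A (0 for the empty event)."""
    return max((pi[x] for x in event), default=0.0)


def necessity(pi, event):
    """Necessity of A: one minus the possibility of the complement of A."""
    complement = set(pi) - set(event)
    return 1.0 - possibility(pi, complement)


# Hypothetical possibility distribution on {"a", "b", "c"}; its maximum is 1.
pi = {"a": 1.0, "b": 0.5, "c": 0.25}

for event in [{"a"}, {"b", "c"}, {"a", "b"}]:
    print(sorted(event), necessity(pi, event), possibility(pi, event))
```

For each event in this example the necessity is at most the possibility, so the pair behaves like a lower and an upper probability for that event.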

References

Choquet, G. (1953). "Theory of Capacities". Annales de l'Institut Fourier. 5: 131–295. doi:10.5802/aif.53.
Gerla, G. (1994). "Inferences in Probability Logic". Artificial Intelligence. 70 (1–2): 33–52. doi:10.1016/0004-3702(94)90102-3.
Halpern, J. Y. (2003). Reasoning about Uncertainty. MIT Press. ISBN 978-0-262-08320-1.
Halpern, J. Y.; Fagin, R. (1992). "Two views of belief: Belief as generalized probability and belief as evidence". Artificial Intelligence. 54 (3): 275–317. CiteSeerX 10.1.1.70.6130. doi:10.1016/0004-3702(92)90048-3. S2CID 11339219.
Huber, P. J. (1980). Robust Statistics. New York: Wiley. ISBN 978-0-471-41805-4.
Saffiotti, A. (1992). "A Belief-Function Logic". Proceedings of the 10th AAAI Conference. San Jose, CA. pp. 642–647. ISBN 978-0-262-51063-9.
Shafer, G. (1976). A Mathematical Theory of Evidence. Princeton: Princeton University Press. ISBN 978-0-691-08175-5.
Walley, P.; Fine, T. L. (1982). "Towards a frequentist theory of upper and lower probability". Annals of Statistics. 10 (3): 741–761. doi:10.1214/aos/1176345868. JSTOR 2240901.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
