Imprecise probability

Imprecise probability generalizes probability theory to allow for partial probability specifications, and is applicable when information is scarce, vague, or conflicting, in which case a unique probability distribution may be hard to identify. The theory thereby aims to represent the available knowledge more accurately. Imprecision is also useful in expert elicitation, because experts are often able to provide only partial probability specifications.

Introduction

Uncertainty is traditionally modelled by a probability distribution, as developed by Kolmogorov, [1] Laplace, de Finetti, [2] Ramsey, Cox, Lindley, and many others. However, this has not been unanimously accepted by scientists, statisticians, and probabilists: it has been argued that some modification or broadening of probability theory is required, because one may not always be able to provide a probability for every event, particularly when only little information or data is available (an early example of such criticism is Boole's critique [3] of Laplace's work), or when we wish to model probabilities that a group agrees with, rather than those of a single individual.

Perhaps the most common generalization is to replace a single probability specification with an interval specification. Lower and upper probabilities, denoted by $\underline{P}$ and $\overline{P}$, or more generally, lower and upper expectations (previsions), [4] [5] [6] [7] aim to fill this gap. A lower probability function is superadditive but not necessarily additive, whereas an upper probability is subadditive. To get a general understanding of the theory, consider:

- the special case with $\underline{P}(A) = \overline{P}(A)$ for all events $A$, which is equivalent to a precise probability, and
- the vacuous case with $\underline{P}(A) = 0$ and $\overline{P}(A) = 1$ for all non-trivial events $A$, which represents complete ignorance, i.e. no constraint at all on the probability of $A$.

We then have a flexible continuum of more or less precise models in between.

Some approaches, summarized under the name nonadditive probabilities, [8] directly use one of these set functions, assuming the other one to be naturally defined such that $\overline{P}(A) = 1 - \underline{P}(A^c)$, with $A^c$ the complement of $A$. Other related concepts understand the corresponding intervals for all events as the basic entity. [9] [10]
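As a rough illustration of these notions, the short Python sketch below (a hypothetical example; the three-outcome credal set is invented, not taken from the cited literature) computes lower and upper probabilities as the minimum and maximum of the ordinary probability over a finite set of candidate distributions, and checks the conjugacy relation stated above.

# A minimal sketch of lower and upper probabilities derived from a finite
# credal set, i.e. a set of candidate probability distributions over a
# three-element space {a, b, c}. The numbers are purely illustrative.

credal_set = [
    {"a": 0.2, "b": 0.5, "c": 0.3},
    {"a": 0.4, "b": 0.4, "c": 0.2},
    {"a": 0.3, "b": 0.6, "c": 0.1},
]

def prob(dist, event):
    """Probability of an event (a set of outcomes) under one distribution."""
    return sum(dist[x] for x in event)

def lower_prob(event):
    """Lower probability: the minimum over all distributions in the credal set."""
    return min(prob(p, event) for p in credal_set)

def upper_prob(event):
    """Upper probability: the maximum over all distributions in the credal set."""
    return max(prob(p, event) for p in credal_set)

event = {"a", "b"}
complement = {"c"}

# Conjugacy: the upper probability of an event equals one minus the
# lower probability of its complement.
assert abs(upper_prob(event) - (1 - lower_prob(complement))) < 1e-12

print(lower_prob(event), upper_prob(event))  # roughly 0.7 and 0.9

In this example every distribution in the credal set agrees that the event {a, b} has probability at least 0.7 and at most 0.9; a singleton credal set would give equal bounds and thus a precise probability.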

History

The idea to use imprecise probability has a long history. The first formal treatment dates back at least to the middle of the nineteenth century, by George Boole, [3] who aimed to reconcile the theories of logic and probability. In the 1920s, in A Treatise on Probability , Keynes [11] formulated and applied an explicit interval estimate approach to probability. Work on imprecise probability models proceeded fitfully throughout the 20th century, with important contributions by Bernard Koopman, C.A.B. Smith, I.J. Good, Arthur Dempster, Glenn Shafer, Peter M. Williams, Henry Kyburg, Isaac Levi, and Teddy Seidenfeld. [12] At the start of the 1990s, the field started to gather some momentum, with the publication of Peter Walley's book Statistical Reasoning with Imprecise Probabilities [7] (which is also where the term "imprecise probability" originates). The 1990s also saw important works by Kuznetsov, [13] and by Weichselberger, [9] [10] who both use the term interval probability. Walley's theory extends the traditional subjective probability theory via buying and selling prices for gambles, whereas Weichselberger's approach generalizes Kolmogorov's axioms without imposing an interpretation.

Standard consistency conditions relate upper and lower probability assignments to non-empty closed convex sets of probability distributions. Therefore, as a welcome by-product, the theory also provides a formal framework for models used in robust statistics [14] and non-parametric statistics. [15] Included are also concepts based on Choquet integration, [16] and so-called two-monotone and totally monotone capacities, [17] which have become very popular in artificial intelligence under the name (Dempster–Shafer) belief functions. [18] [19] Moreover, there is a strong connection [20] to Shafer and Vovk's notion of game-theoretic probability. [21]

Mathematical models

The term "imprecise probability" is somewhat misleading in that precision is often mistaken for accuracy, whereas an imprecise representation may be more accurate than a spuriously precise representation. In any case, the term appears to have become established in the 1990s, and covers a wide range of extensions of the theory of probability, including:

- lower and upper probabilities, or interval probabilities [9] [10]
- (Dempster–Shafer) belief functions [18] [19]
- possibility and necessity measures [22] [23] [24]
- lower and upper previsions [7] [25]
- comparative probability orderings [26] [27] [28]
- probability boxes (p-boxes) [29]
- robust Bayes methods [30]

Interpretation of imprecise probabilities

A unification of many of the above-mentioned imprecise probability theories was proposed by Walley, [7] although this was by no means the first attempt to formalize imprecise probabilities. In terms of probability interpretations, Walley's formulation of imprecise probabilities is based on the subjective variant of the Bayesian interpretation of probability. Walley defines upper and lower probabilities as special cases of upper and lower previsions, using the gambling framework advanced by Bruno de Finetti. In simple terms, a decision maker's lower prevision is the highest price at which the decision maker is sure he or she would buy a gamble, and the upper prevision is the lowest price at which the decision maker is sure he or she would buy the opposite of the gamble (which is equivalent to selling the original gamble). If the upper and lower previsions are equal, then they jointly represent the decision maker's fair price for the gamble, the price at which the decision maker is willing to take either side of the gamble. The existence of a fair price leads to precise probabilities.
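The following sketch continues the same hypothetical Python illustration (it is not Walley's formal apparatus): the lower and upper previsions of a gamble are taken as the worst-case and best-case expectations over a set of candidate distributions, so that a singleton set yields a single fair price.

# Lower and upper previsions of a gamble over a hypothetical credal set.
# A gamble maps outcomes to (possibly negative) rewards.

credal_set = [
    {"win": 0.3, "lose": 0.7},
    {"win": 0.5, "lose": 0.5},
]

gamble = {"win": 10.0, "lose": -4.0}

def expectation(dist, g):
    """Expected reward of gamble g under a single distribution."""
    return sum(dist[x] * g[x] for x in g)

# Lower prevision: the supremum acceptable buying price, realised here as the
# worst-case expectation over the credal set.
lower_prevision = min(expectation(p, gamble) for p in credal_set)

# Upper prevision: the infimum acceptable selling price, realised here as the
# best-case expectation over the credal set.
upper_prevision = max(expectation(p, gamble) for p in credal_set)

print(lower_prevision, upper_prevision)  # roughly 0.2 and 3.0

A decision maker described by this credal set would buy the gamble for any price below about 0.2 and sell it for any price above about 3.0, while remaining undecided in between; collapsing the set to a single distribution closes that gap.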

The allowance for imprecision, or a gap between a decision maker's upper and lower previsions, is the primary difference between precise and imprecise probability theories. Such gaps arise naturally in betting markets that happen to be financially illiquid due to asymmetric information. Henry Kyburg repeatedly gave this gap as a motivation for his interval probabilities, though he and Isaac Levi also give other reasons for intervals, or sets of distributions, representing states of belief.

Issues with imprecise probabilities

One issue with imprecise probabilities is that there is often an independent degree of caution or boldness inherent in using an interval of one width rather than a wider or narrower one. This may be a degree of confidence, a degree of fuzzy membership, or a threshold of acceptance. This is less of a problem for intervals that are lower and upper bounds derived from a set of probability distributions, e.g., a set of priors followed by conditionalization on each member of the set. However, it then raises the question of why some distributions are included in the set of priors and some are not.

Another issue is why one can be precise about two numbers, a lower bound and an upper bound, rather than a single number, a point probability. This issue may be merely rhetorical, as the robustness of a model with intervals is inherently greater than that of a model with point-valued probabilities. It does raise concerns about inappropriate claims of precision at endpoints, as well as for point values.

A more practical issue is what kind of decision theory can make use of imprecise probabilities. [31] For fuzzy measures, there is the work of Ronald R. Yager. [32] For convex sets of distributions, Levi's works are instructive. [33] Another approach asks whether the threshold controlling the boldness of the interval matters more to a decision than simply taking the average or using a Hurwicz decision rule. [34] Other approaches appear in the literature. [35] [36] [37] [38]
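As a hedged illustration of the last point (the acts, intervals, and parameter value below are invented for the example), two simple decision rules can be compared on interval-valued expected utilities: Γ-maximin, which ranks acts by their lower expectation, and a Hurwicz-style rule, which mixes lower and upper expectations through an optimism parameter.

# Comparing two decision rules on hypothetical interval-valued expectations.
# Each act is mapped to a pair (lower expectation, upper expectation).

acts = {
    "act_A": (1.0, 6.0),
    "act_B": (2.0, 3.0),
}

def gamma_maximin(acts):
    """Pick the act whose lower (worst-case) expectation is largest."""
    return max(acts, key=lambda a: acts[a][0])

def hurwicz(acts, alpha):
    """Pick the act maximizing alpha*upper + (1 - alpha)*lower expectation."""
    return max(acts, key=lambda a: alpha * acts[a][1] + (1 - alpha) * acts[a][0])

print(gamma_maximin(acts))       # act_B: the cautious choice, with the better lower bound
print(hurwicz(acts, alpha=0.8))  # act_A: favoured by an optimistic decision maker

The two rules can disagree, which is precisely the point raised above: how bold or cautious a rule should be is an extra modelling choice that precise expected utility does not require.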

See also

Related Research Articles

Probability – branch of mathematics concerning chance and uncertainty

In science, the probability of an event is a number that indicates how likely the event is to occur. It is expressed as a number in the range from 0 to 1, or, using percentage notation, in the range from 0% to 100%. The more likely it is that the event will occur, the higher its probability. The probability of an impossible event is 0; that of an event that is certain to occur is 1. The probabilities of two complementary events A and B – either A occurs or B occurs – add up to 1. A simple example is the tossing of a fair (unbiased) coin. If a coin is fair, the two possible outcomes are equally likely; since these two outcomes are complementary and the probability of "heads" equals the probability of "tails", the probability of each of the two outcomes equals 1/2.

Dempster–Shafer theory – mathematical framework to model epistemic uncertainty

The theory of belief functions, also referred to as evidence theory or Dempster–Shafer theory (DST), is a general framework for reasoning with uncertainty, with understood connections to other frameworks such as probability, possibility and imprecise probability theories. First introduced by Arthur P. Dempster in the context of statistical inference, the theory was later developed by Glenn Shafer into a general framework for modeling epistemic uncertainty—a mathematical theory of evidence. The theory allows one to combine evidence from different sources and arrive at a degree of belief that takes into account all the available evidence.

Decision theory is a branch of applied probability theory and analytic philosophy concerned with the theory of making decisions based on assigning probabilities to various factors and assigning numerical consequences to the outcome.

In statistical inference, specifically predictive inference, a prediction interval is an estimate of an interval in which a future observation will fall, with a certain probability, given what has already been observed. Prediction intervals are often used in regression analysis.

Possibility theory is a mathematical theory for dealing with certain types of uncertainty and is an alternative to probability theory. It uses measures of possibility and necessity between 0 and 1, ranging from impossible to possible and unnecessary to necessary, respectively. Professor Lotfi Zadeh first introduced possibility theory in 1978 as an extension of his theory of fuzzy sets and fuzzy logic. Didier Dubois and Henri Prade further contributed to its development. Earlier, in the 1950s, economist G. L. S. Shackle proposed the min/max algebra to describe degrees of potential surprise.

In computer science, a rough set, first described by Polish computer scientist Zdzisław I. Pawlak, is a formal approximation of a crisp set in terms of a pair of sets which give the lower and the upper approximation of the original set. In the standard version of rough set theory, the lower- and upper-approximation sets are crisp sets, but in other variations, the approximating sets may be fuzzy sets.

In number theory, natural density is one method to measure how "large" a subset of the set of natural numbers is. It relies chiefly on the probability of encountering members of the desired subset when combing through the interval [1, n] as n grows large.

Upper and lower probabilities are representations of imprecise probability. Whereas probability theory uses a single number, the probability, to describe how likely an event is to occur, this method uses two numbers: the upper probability of the event and the lower probability of the event.

Probabilistic logic involves the use of probability and logic to deal with uncertain situations. Probabilistic logic extends traditional logic truth tables with probabilistic expressions. A difficulty of probabilistic logics is their tendency to multiply the computational complexities of their probabilistic and logical components. Other difficulties include the possibility of counter-intuitive results, such as in case of belief fusion in Dempster–Shafer theory. Source trust and epistemic uncertainty about the probabilities they provide, such as defined in subjective logic, are additional elements to consider. The need to deal with a broad variety of contexts and issues has led to many different proposals.

The transferable belief model (TBM) is an elaboration on the Dempster–Shafer theory (DST), which is a mathematical model used to evaluate the probability that a given proposition is true from other propositions that are assigned probabilities. It was developed by Philippe Smets who proposed his approach as a response to Zadeh’s example against Dempster's rule of combination. In contrast to the original DST the TBM propagates the open-world assumption that relaxes the assumption that all possible outcomes are known. Under the open world assumption Dempster's rule of combination is adapted such that there is no normalization. The underlying idea is that the probability mass pertaining to the empty set is taken to indicate an unexpected outcome, e.g. the belief in a hypothesis outside the frame of discernment. This adaptation violates the probabilistic character of the original DST and also Bayesian inference. Therefore, the authors substituted notation such as probability masses and probability update with terms such as degrees of belief and transfer giving rise to the name of the method: The transferable belief model.

The Society for Imprecise Probability: Theories and Applications (SIPTA) was created in February 2002 with the aim of promoting research on imprecise probability. This is done through a series of activities for bringing together researchers from different groups, creating resources for information dissemination and documentation, and making other people aware of the potential of imprecise probability models.

Bayes linear statistics is a subjectivist statistical methodology and framework. Traditional subjective Bayesian analysis is based upon fully specified probability distributions, which are very difficult to specify at the necessary level of detail. Bayes linear analysis attempts to solve this problem by developing theory and practice for using partially specified probability models. Bayes linear in its current form has been primarily developed by Michael Goldstein. Mathematically and philosophically it extends Bruno de Finetti's Operational Subjective approach to probability and statistics.

Interval finite element

In numerical analysis, the interval finite element method is a finite element method that uses interval parameters. Interval FEM can be applied in situations where it is not possible to get reliable probabilistic characteristics of the structure. This is important in concrete structures, wood structures, geomechanics, composite structures, biomechanics and in many other areas. The goal of the interval finite element method is to find upper and lower bounds on different characteristics of the model and use these results in the design process. This is so-called worst-case design, which is closely related to limit state design.

Probability box – characterization of uncertain numbers consisting of both aleatoric and epistemic uncertainties

A probability box is a characterization of uncertain numbers consisting of both aleatoric and epistemic uncertainties that is often used in risk analysis or quantitative uncertainty modeling where numerical calculations must be performed. Probability bounds analysis is used to make arithmetic and logical calculations with p-boxes.

In mathematics, a credal set is a set of probability distributions or, more generally, a set of probability measures. A credal set is often assumed or constructed to be a closed convex set. It is intended to express uncertainty or doubt about the probability model that should be used, or to convey the beliefs of a Bayesian agent about the possible states of the world.

In statistics, robust Bayesian analysis, also called Bayesian sensitivity analysis, is a type of sensitivity analysis applied to the outcome from Bayesian inference or Bayesian optimal decisions.

Probability bounds analysis (PBA) is a collection of methods of uncertainty propagation for making qualitative and quantitative calculations in the face of uncertainties of various kinds. It is used to project partial information about random variables and other quantities through mathematical expressions. For instance, it computes sure bounds on the distribution of a sum, product, or more complex function, given only sure bounds on the distributions of the inputs. Such bounds are called probability boxes, and constrain cumulative probability distributions.

In probability theory and statistics, the Dirichlet process (DP) is one of the most popular Bayesian nonparametric models. It was introduced by Thomas Ferguson as a prior over probability distributions.

In measurements, the measurement obtained can suffer from two types of uncertainty. The first is random uncertainty, which is due to noise in the process and the measurement. The second contribution is systematic uncertainty, which may be present in the measuring instrument. Systematic errors, if detected, can be easily compensated for, as they are usually constant throughout the measurement process as long as the measuring instrument and the measurement process are not changed. However, while using the instrument one cannot know accurately whether a systematic error is present and, if so, how large it is. Hence, systematic uncertainty can be considered a contribution of a fuzzy nature.

In regression analysis, an interval predictor model (IPM) is an approach to regression where bounds on the function to be approximated are obtained. This differs from other techniques in machine learning, where usually one wishes to estimate point values or an entire probability distribution. Interval predictor models are sometimes referred to as a nonparametric regression technique, because a potentially infinite set of functions is contained by the IPM, and no specific distribution is implied for the regressed variables.

References

  1. Kolmogorov, A. N. (1950). Foundations of the Theory of Probability. New York: Chelsea Publishing Company.
  2. de Finetti, Bruno (1974). Theory of Probability. New York: Wiley.
  3. Boole, George (1854). An Investigation of the Laws of Thought on Which Are Founded the Mathematical Theories of Logic and Probabilities. London: Walton and Maberly.
  4. Smith, Cedric A. B. (1961). "Consistency in statistical inference and decision". Journal of the Royal Statistical Society, Series B. 23: 1–37.
  5. Williams, Peter M. (1975). Notes on Conditional Previsions. School of Math. and Phys. Sci., Univ. of Sussex.
  6. Williams, Peter M. (2007). "Notes on conditional previsions". International Journal of Approximate Reasoning. 44 (3): 366–383. doi:10.1016/j.ijar.2006.07.019.
  7. Walley, Peter (1991). Statistical Reasoning with Imprecise Probabilities. London: Chapman and Hall. ISBN 978-0-412-28660-5.
  8. Denneberg, Dieter (1994). Non-additive Measure and Integral. Dordrecht: Kluwer.
  9. Weichselberger, Kurt (2000). "The theory of interval probability as a unifying concept for uncertainty". International Journal of Approximate Reasoning. 24 (2–3): 149–170. doi:10.1016/S0888-613X(00)00032-3.
  10. Weichselberger, K. (2001). Elementare Grundbegriffe einer allgemeineren Wahrscheinlichkeitsrechnung I - Intervallwahrscheinlichkeit als umfassendes Konzept. Heidelberg: Physica.
  11. Keynes, John Maynard (1921). A Treatise on Probability. London: Macmillan and Co.
  12. "Imprecise Probabilities > Historical appendix: Theories of imprecise belief (Stanford Encyclopedia of Philosophy)".
  13. Kuznetsov, Vladimir P. (1991). Interval Statistical Models. Moscow: Radio i Svyaz Publ.
  14. Ruggeri, Fabrizio (2000). Robust Bayesian Analysis. D. Ríos Insua. New York: Springer.
  15. Augustin, T.; Coolen, F. P. A. (2004). "Nonparametric predictive inference and interval probability" (PDF). Journal of Statistical Planning and Inference. 124 (2): 251–272. doi:10.1016/j.jspi.2003.07.003.
  16. de Cooman, G.; Troffaes, M. C. M.; Miranda, E. (2008). "n-Monotone exact functionals". Journal of Mathematical Analysis and Applications . 347 (1): 143–156. arXiv: 0801.1962 . Bibcode:2008JMAA..347..143D. doi:10.1016/j.jmaa.2008.05.071. S2CID   6561656.
  17. Huber, P. J.; V. Strassen (1973). "Minimax tests and the Neyman-Pearson lemma for capacities". The Annals of Statistics . 1 (2): 251–263. doi: 10.1214/aos/1176342363 .
  18. Dempster, A. P. (1967). "Upper and lower probabilities induced by a multivalued mapping". The Annals of Mathematical Statistics. 38 (2): 325–339. doi:10.1214/aoms/1177698950. JSTOR 2239146.
  19. Shafer, Glenn (1976). A Mathematical Theory of Evidence. Princeton University Press. ISBN 978-0-691-08175-5.
  20. de Cooman, G.; Hermans, F. (2008). "Imprecise probability trees: Bridging two theories of imprecise probability". Artificial Intelligence . 172 (11): 1400–1427. arXiv: 0801.1196 . doi:10.1016/j.artint.2008.03.001. S2CID   14060218.
  21. Shafer, Glenn; Vladimir Vovk (2001). Probability and Finance: It's Only a Game!. Wiley.
  22. Zadeh, L. A. (1978). "Fuzzy sets as a basis for a theory of possibility". Fuzzy Sets and Systems . 1: 3–28. doi:10.1016/0165-0114(78)90029-5. hdl: 10338.dmlcz/135193 .
  23. Dubois, Didier; Henri Prade (1985). Théorie des possibilité. Paris: Masson.
  24. Dubois, Didier; Henri Prade (1988). Possibility Theory - An Approach to Computerized Processing of Uncertainty . New York: Plenum Press. ISBN   978-0-306-42520-2.
  25. Troffaes, Matthias C. M.; de Cooman, Gert (2014). Lower previsions. Wiley. doi:10.1002/9781118762622. ISBN   978-0-470-72377-7.
  26. de Finetti, Bruno (1931). "Sul significato soggettivo della probabilità". Fundamenta Mathematicae . 17: 298–329. doi: 10.4064/fm-17-1-298-329 .
  27. Fine, Terrence L. (1973). Theories of Probability . New York: Academic Press. ISBN   978-0-12-256450-5.
  28. Fishburn, P. C. (1986). "The axioms of subjective probability". Statistical Science. 1 (3): 335–358. doi: 10.1214/ss/1177013611 .
  29. Ferson, Scott; Vladik Kreinovich; Lev Ginzburg; David S. Myers; Kari Sentz (2003). "Constructing Probability Boxes and Dempster-Shafer Structures". SAND2002-4015. Sandia National Laboratories. Archived from the original on 2011-07-22. Retrieved 2009-09-23.
  30. Berger, James O. (1984). "The robust Bayesian viewpoint". In Kadane, J. B. (ed.). Robustness of Bayesian Analyses . Elsevier Science. pp.  63–144. ISBN   978-0-444-86209-9.
  31. Seidenfeld, Teddy (1983). "Decisions with indeterminate probabilities". Behavioral and Brain Sciences. 6 (2): 259–261. doi:10.1017/S0140525X0001582X. S2CID   145583756.
  32. Yager, R. R. (1978). "Fuzzy decision making including unequal objectives". Fuzzy Sets and Systems. 1 (2): 87–95. doi:10.1016/0165-0114(78)90010-6.
  33. Levi, I. (1990). Hard choices: Decision making under unresolved conflict. Cambridge University Press. ISBN   0-521-38630-6.
  34. Loui, R. P. (1986). "Decisions with indeterminate probabilities". Theory and Decision. 21 (3): 283–309. doi:10.1007/BF00134099. S2CID   121036131.
  35. Guo, P.; Tanaka, H. (2010). "Decision making with interval probabilities". European Journal of Operational Research . 203 (2): 444–454. doi:10.1016/j.ejor.2009.07.020. S2CID   10582873.
  36. Caselton, W. F.; Luo, W. (1992). "Decision making with imprecise probabilities: Dempster‐Shafer theory and application". Water Resources Research. 28 (12): 3071–3083. doi:10.1029/92WR01818.
  37. Breese, J. S.; Fertig, K. W. (2013). "Decision making with interval influence diagrams". arXiv: 1304.1096 .
  38. Gärdenfors, P.; Sahlin, N. E. (1982). "Unreliable probabilities, risk taking, and decision making". Synthese . 53 (3): 361–386. doi:10.1007/BF00486156. S2CID   36194904.