Possibility theory

Possibility theory is a mathematical theory for dealing with certain types of uncertainty and is an alternative to probability theory. It uses measures of possibility and necessity between 0 and 1, ranging from impossible to possible and unnecessary to necessary, respectively. Professor Lotfi Zadeh first introduced possibility theory in 1978 as an extension of his theory of fuzzy sets and fuzzy logic. Didier Dubois and Henri Prade further contributed to its development. Earlier, in the 1950s, economist G. L. S. Shackle proposed the min/max algebra to describe degrees of potential surprise.

Formalization of possibility

For simplicity, assume that the universe of discourse Ω is a finite set. A possibility measure is a function pos from the power set of Ω (the collection of all subsets of Ω) to [0, 1] such that:

Axiom 1: pos(∅) = 0
Axiom 2: pos(Ω) = 1
Axiom 3: pos(U ∪ V) = max(pos(U), pos(V)) for any disjoint subsets U and V. [1]

It follows that, like probability on finite probability spaces, the possibility measure is determined by its behavior on singletons:

pos(U) = max{ pos({ω}) : ω ∈ U } for every non-empty subset U of Ω.

Axiom 1 can be interpreted as the assumption that Ω is an exhaustive description of future states of the world, because it means that no belief weight is given to elements outside Ω.

Axiom 2 could be interpreted as the assumption that the evidence from which pos was constructed is free of any contradiction. Technically, it implies that there is at least one element in Ω with possibility 1.

Axiom 3 corresponds to the additivity axiom in probability theory. However, there is an important practical difference: possibility theory is computationally more convenient because Axioms 1–3 imply that

pos(U ∪ V) = max(pos(U), pos(V))

for any subsets U and V, whether or not they are disjoint.

Because one can know the possibility of the union from the possibility of each component, possibility is said to be compositional with respect to the union operator. Note however that it is not compositional with respect to the intersection operator. Generally:

pos(U ∩ V) ≤ min(pos(U), pos(V)) ≤ max(pos(U), pos(V)) = pos(U ∪ V).
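
These properties can be illustrated with a minimal sketch. The following Python snippet is only an illustration of the definitions above; the three-element universe and its singleton possibilities are hypothetical, not taken from the cited sources.

```python
# Illustrative sketch only: a possibility measure on a finite universe,
# determined by (made-up) singleton possibilities via pos(U) = max over U.

def make_pos(singleton_pos):
    """Return a possibility measure pos(U) = max of the singleton possibilities in U."""
    def pos(U):
        return max((singleton_pos[w] for w in U), default=0.0)  # pos(empty set) = 0
    return pos

# Hypothetical universe {a, b, c}; at least one singleton has possibility 1 (Axiom 2).
pos = make_pos({"a": 1.0, "b": 0.7, "c": 0.3})

U, V = {"a", "b"}, {"b", "c"}
assert pos(U | V) == max(pos(U), pos(V))   # compositional with respect to union
assert pos(U & V) <= min(pos(U), pos(V))   # only an upper bound for intersection
print(pos(U), pos(V), pos(U & V), pos(U | V))   # 1.0 0.7 0.7 1.0
```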

When Ω is not finite, Axiom 3 can be replaced by:

For all index sets I, if the subsets U_i (i ∈ I) are pairwise disjoint, then

pos(⋃_{i ∈ I} U_i) = sup_{i ∈ I} pos(U_i).

Necessity

Whereas probability theory uses a single number, the probability, to describe how likely an event is to occur, possibility theory uses two concepts, the possibility and the necessity of the event. For any set U, the necessity measure nec is defined by

nec(U) = 1 − pos(U^c).

In the above formula, U^c denotes the complement of U, that is, the elements of Ω that do not belong to U. It is straightforward to show that:

nec(U) ≤ pos(U) for any U

and that:

nec(U) + pos(U^c) = 1.
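
As a small numerical illustration of these identities, the following sketch computes necessity from the possibility of the complement. It reuses the same hypothetical singleton possibilities as above and is not drawn from the cited sources.

```python
# Illustrative sketch only: necessity as the dual of possibility on a finite universe.

OMEGA = {"a", "b", "c"}
SINGLETON_POS = {"a": 1.0, "b": 0.7, "c": 0.3}   # made-up possibility distribution

def pos(U):
    return max((SINGLETON_POS[w] for w in U), default=0.0)

def nec(U):
    return 1.0 - pos(OMEGA - U)   # nec(U) = 1 - pos(U^c)

U = {"a", "b"}
assert nec(U) <= pos(U)                             # necessity never exceeds possibility
assert abs(nec(U) + pos(OMEGA - U) - 1.0) < 1e-9    # the duality identity, up to rounding
print(nec(U), pos(U))                               # roughly 0.7 and 1.0
```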

Note that contrary to probability theory, possibility is not self-dual. That is, for any event U, we only have the inequality:

pos(U) + pos(U^c) ≥ 1.

However, the following duality rule holds:

For any event U, either nec(U) = 0 or pos(U) = 1.

Accordingly, beliefs about an event can be represented by a number and a bit.
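
A hypothetical illustration of this compact representation (a sketch, not part of the theory's standard presentation): since the duality rule guarantees that at least one of nec(U) and pos(U) is trivial, the pair can be packed into one number and one bit.

```python
# Hypothetical illustration: storing beliefs about an event as one number plus one bit.
# By the duality rule, nec(U) > 0 forces pos(U) = 1, and pos(U) < 1 forces nec(U) = 0.

def encode(pos_u, nec_u):
    """bit = 1: the number is nec(U) and pos(U) = 1; bit = 0: the number is pos(U) and nec(U) = 0."""
    if nec_u > 0:
        return 1, nec_u
    return 0, pos_u

def decode(bit, number):
    """Recover the pair (pos(U), nec(U))."""
    return (1.0, number) if bit == 1 else (number, 0.0)

print(decode(*encode(1.0, 0.4)))   # (1.0, 0.4): U is somewhat certain
print(decode(*encode(0.6, 0.0)))   # (0.6, 0.0): U is only somewhat possible
```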

Interpretation

There are four cases that can be interpreted as follows:

nec(U) = 1 means that U is necessary. U is certainly true. It implies that pos(U) = 1.

pos(U) = 0 means that U is impossible. U is certainly false. It implies that nec(U) = 0.

pos(U) = 1 means that U is possible. I would not be surprised at all if U occurs. It leaves nec(U) unconstrained.

nec(U) = 0 means that U is unnecessary. I would not be surprised at all if U does not occur. It leaves pos(U) unconstrained.

The intersection of the last two cases is nec(U) = 0 and pos(U) = 1, meaning that I believe nothing at all about U. Because it allows for indeterminacy like this, possibility theory relates to the graduation of a many-valued logic, such as intuitionistic logic, rather than the classical two-valued logic.

Note that unlike possibility, fuzzy logic is compositional with respect to both the union and the intersection operator. The relationship with fuzzy theory can be explained with the following classical example.

Fuzzy logic: When a bottle is half full, the proposition "the bottle is full" can be assigned a truth degree of 0.5, because "full" is treated as a fuzzy predicate describing the (precisely known) amount of liquid in the bottle.

Possibility theory: The bottle is either completely full or completely empty, but we do not know which. The proposition "the bottle is full" can then be assigned a possibility of 0.5, which expresses uncertainty about which crisp state holds rather than a partial degree of truth.

Possibility theory as an imprecise probability theory

There is an extensive formal correspondence between probability and possibility theories, where the addition operator corresponds to the maximum operator.

A possibility measure can be seen as a consonant plausibility measure in the Dempster–Shafer theory of evidence. The operators of possibility theory can be seen as a hyper-cautious version of the operators of the transferable belief model, a modern development of the theory of evidence.

Possibility can be seen as an upper probability: any possibility distribution defines a unique credal set of admissible probability distributions by

K = { P : P(S) ≤ pos(S) for every measurable set S }.

This allows one to study possibility theory using the tools of imprecise probabilities.
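
As a brief sketch of this reading (with a made-up possibility distribution and candidate probability, not taken from the cited sources), a probability distribution P belongs to the credal set exactly when P(S) ≤ pos(S) for every event S:

```python
# Illustrative sketch only: testing membership of a probability distribution in the
# credal set induced by a possibility distribution on a small finite universe.

from itertools import chain, combinations

OMEGA = ("a", "b", "c")
SINGLETON_POS = {"a": 1.0, "b": 0.7, "c": 0.3}   # made-up possibility distribution
P = {"a": 0.6, "b": 0.3, "c": 0.1}               # candidate probability distribution

def pos(S):
    return max((SINGLETON_POS[w] for w in S), default=0.0)

def prob(S):
    return sum(P[w] for w in S)

def events(universe):
    """All subsets of the universe."""
    return chain.from_iterable(combinations(universe, r) for r in range(len(universe) + 1))

in_credal_set = all(prob(S) <= pos(S) for S in events(OMEGA))
print(in_credal_set)   # True: P is dominated by pos on every event
```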

Necessity logic

We call a generalized possibility every function satisfying Axiom 1 and Axiom 3, and a generalized necessity the dual of a generalized possibility. The generalized necessities are related to a very simple and interesting fuzzy logic called necessity logic. In the deduction apparatus of necessity logic the logical axioms are the usual classical tautologies, and the only inference rule is a fuzzy extension of the usual modus ponens: if α and α → β are proved at degrees λ and μ, respectively, then we can assert β at degree min{λ, μ}. It is easy to see that the theories of such a logic are the generalized necessities and that the completely consistent theories coincide with the necessities (see for example Gerla 2001).
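
A minimal sketch of the graded inference rule just described (illustration only, not drawn from Gerla 2001):

```python
# Illustration of necessity logic's fuzzy modus ponens: proof degrees propagate with min.

def graded_modus_ponens(lam, mu):
    """If alpha is proved at degree lam and alpha -> beta at degree mu,
    then beta can be asserted at degree min(lam, mu)."""
    return min(lam, mu)

# Example: alpha proved at 0.8, alpha -> beta proved at 0.6, so beta is asserted at 0.6.
print(graded_modus_ponens(0.8, 0.6))   # 0.6
```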

References

Citations

  1. Dubois, D.; Prade, H.: Possibility Theory: An Approach to Computerized Processing of Uncertainty. Plenum Press, 1988

Sources

  Gerla, G.: Fuzzy Logic: Mathematical Tools for Approximate Reasoning. Kluwer Academic Publishers, 2001