Conditional event algebra

A standard Boolean algebra of events is a set of events related to one another by the familiar operations and, or, and not. A conditional event algebra (CEA) contains not only ordinary events but also conditional events, which have the form "if A, then B". The usual purpose of a CEA is to support the definition of a probability function, P, that satisfies the equation P(if A then B) = P(A and B) / P(A).

Motivation

In standard probability theory, an event is a set of outcomes, any one of which would be an occurrence of the event. P(A), the probability of event A, is the sum of the probabilities of all A-outcomes, P(B) is the sum of the probabilities of all B-outcomes, and P(A and B) is the sum of the probabilities of all outcomes that are both A-outcomes and B-outcomes. In other words, and, customarily represented by the logical symbol ∧, is interpreted as set intersection: P(A ∧ B) = P(A ∩ B). In the same vein, or, ∨, becomes set union, ∪, and not, ¬, becomes set complementation, ′. Any combination of events using the operations and, or, and not is also an event, and assigning probabilities to all outcomes generates a probability for every event. In technical terms, this means that the set of events and the three operations together constitute a Boolean algebra of sets, with an associated probability function.

P(if A, then B) is normally interpreted not as an ordinary probability—not, specifically, as P(A′ ∨ B)—but as the conditional probability of B given A, P(B | A) = P(A ∧ B) / P(A). This raises a question: what about a probability like P(if A, then B, and if C, then D)? For this, there is no standard answer. What would be needed, for consistency, is a treatment of if-then as a binary operation, →, such that for conditional events A → B and C → D, P(A → B) = P(B | A), P(C → D) = P(D | C), and P((A → B) ∧ (C → D)) is well-defined and reasonable. This treatment is what conditional event algebras try to provide. [1]
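
The outcome-summing reading of P(B | A) can be made concrete with a small computation. The sketch below uses a hypothetical fair-die sample space; the events A and B are invented for illustration and are not from the cited literature:

```python
from fractions import Fraction

# A fair six-sided die: each outcome has probability 1/6 (illustrative only).
outcomes = {o: Fraction(1, 6) for o in range(1, 7)}

A = {2, 4, 6}          # hypothetical event: "even"
B = {4, 5, 6}          # hypothetical event: "greater than 3"

def prob(event):
    """P(E) = sum of the probabilities of all E-outcomes."""
    return sum(outcomes[o] for o in event)

def cond_prob(b, a):
    """P(B | A) = P(A and B) / P(A), with 'and' read as set intersection."""
    return prob(a & b) / prob(a)

print(cond_prob(B, A))   # P(B | A) = (2/6) / (3/6) = 2/3
```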

Types of conditional event algebra

Ideally, a conditional event algebra, or CEA, would support a probability function that meets three conditions:

1. The probability function satisfies the usual axioms.
2. For any two ordinary events A and B, if P(A) > 0, then P(A → B) = P(B | A) = P(A ∧ B) / P(A).
3. For any ordinary event A and acceptable probability function P, if P(A) > 0, then P_A = P( · | A), the function produced by conditioning on A, is also an acceptable probability function.

However, David Lewis (1976) showed that those conditions can be met only when there are just two possible outcomes—as with, say, a single coin flip. With three or more possible outcomes, constructing a probability function requires choosing which of the three conditions to violate. Interpreting A → B as A′ ∪ B produces an ordinary Boolean algebra that violates condition 2. With CEAs, the choice is between conditions 1 and 3.

Tri-event CEAs

Tri-event CEAs take their inspiration from three-valued logic, where the identification of logical conjunction, disjunction, and negation with simple set operations no longer applies. For ordinary events A and B, the tri-event A → B occurs when A and B both occur, fails to occur when A occurs but B does not, and is undecided when A fails to occur. (The term “tri-event” comes from de Finetti (1935): triévénement.) Ordinary events, which are never undecided, are incorporated into the algebra as tri-events conditional on Ω, the vacuous event represented by the entire sample space of outcomes; thus, A becomes Ω → A.

Since there are many three-valued logics, there are many possible tri-event algebras. Two types, however, have attracted more interest than the others. In one type, A ∧ B and A ∨ B are each undecided only when both A and B are undecided; when just one of them is, the conjunction or disjunction follows the other conjunct or disjunct. When negation is handled in the obvious way, with ¬A undecided just in case A is, this type of tri-event algebra corresponds to a three-valued logic proposed by Sobociński (1952) and favored by Belnap (1973), and it is also implied by Adams’s (1975) “quasi-conjunction” for conditionals. Schay (1968) was the first to propose an algebraic treatment, which Calabrese (1987) developed more fully. [2]

The other type of tri-event CEA treats negation the same way as the first, but it treats conjunction and disjunction as min and max functions, respectively, with occurrence as the high value, failure as the low value, and undecidedness in between. This type of tri-event algebra corresponds to a three-valued logic proposed by Łukasiewicz (1920) and also favored by de Finetti (1935). Goodman, Nguyen and Walker (1991) eventually provided the algebraic formulation.
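
The two families of connectives can be sketched as truth functions on the three values. The numeric encoding below (1 for occurs, 0 for undecided, −1 for fails) is an assumption made for convenience, not notation from the sources:

```python
T, U, F = 1, 0, -1   # occurs, undecided, fails (encoding is an assumption)

def neg(a):
    """Negation swaps occurrence and failure; undecided stays undecided."""
    return -a

# Sobociński-type connectives: U acts as an identity element, so the
# conjunction or disjunction follows the decided operand.
def and_sob(a, b):
    if a == U: return b
    if b == U: return a
    return min(a, b)

def or_sob(a, b):
    if a == U: return b
    if b == U: return a
    return max(a, b)

# Łukasiewicz / de Finetti-type connectives: plain min and max over F < U < T.
def and_luk(a, b): return min(a, b)
def or_luk(a, b):  return max(a, b)

# The two types differ exactly when one operand is undecided:
print(and_sob(U, T), and_luk(U, T))   # T versus U
```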

The probability of any tri-event is defined as the probability that it occurs divided by the probability that it either occurs or fails to occur. [3] With this convention, conditions 2 and 3 above are satisfied by the two leading tri-event CEA types. Condition 1, however, fails. In a Sobociński-type algebra, ∧ does not distribute over ∨, so P(A ∧ (B ∨ C)) and P((A ∧ B) ∨ (A ∧ C)) need not be equal. [4] In a Łukasiewicz-type algebra, ∧ distributes over ∨ but not over exclusive or, ⊕ (where A ⊕ B = (A ∧ ¬B) ∨ (¬A ∧ B)). [5] Also, tri-event CEAs are not complemented lattices, only pseudocomplemented, because in general (A → B) ∧ ¬(A → B) cannot occur but can be undecided and therefore is not identical to Ω → ∅, the bottom element of the lattice. This means that P(C) and P(C ∨ ((A → B) ∧ ¬(A → B))) can differ, when classically they would not.
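
As a concrete check of the distributivity failure, the sketch below builds three tri-events over a hypothetical four-outcome space (the ordinary events X, Y, Z and their probabilities are invented for illustration), applies the Sobociński-style connectives, and evaluates both sides with the occurrence-over-decided probability convention just described:

```python
from fractions import Fraction

T, U, F = 1, 0, -1   # occurs, undecided, fails (encoding is an assumption)

def and_sob(a, b):
    if a == U: return b
    if b == U: return a
    return min(a, b)

def or_sob(a, b):
    if a == U: return b
    if b == U: return a
    return max(a, b)

# Four equiprobable ordinary outcomes; X, Y, Z are hypothetical ordinary events.
outcomes = [0, 1, 2, 3]
p = {o: Fraction(1, 4) for o in outcomes}
X, Y, Z = {0, 1}, {0, 2}, {1, 3}

def tri(antecedent, consequent):
    """The tri-event antecedent -> consequent, as a map outcome -> {T, U, F}."""
    def value(o):
        if o not in antecedent: return U
        return T if o in consequent else F
    return {o: value(o) for o in outcomes}

def prob(e):
    """P(e occurs) / P(e occurs or fails)."""
    occ = sum(p[o] for o in outcomes if e[o] == T)
    dec = sum(p[o] for o in outcomes if e[o] != U)
    return occ / dec

def lift(op, e1, e2):
    return {o: op(e1[o], e2[o]) for o in outcomes}

A, B, C = tri(X, Y), tri(Y, Z), tri(Z, X)
lhs = lift(and_sob, A, lift(or_sob, B, C))                     # A ∧ (B ∨ C)
rhs = lift(or_sob, lift(and_sob, A, B), lift(and_sob, A, C))   # (A ∧ B) ∨ (A ∧ C)
print(prob(lhs), prob(rhs))   # the two probabilities differ
```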

Product-space CEAs

If P(if A, then B) is thought of as the probability of A-and-B occurring before A-and-not-B in a series of trials, this can be calculated as an infinite sum of simple probabilities: the probability of A-and-B on the first trial, plus the probability of not-A (and either B or not-B) on the first trial and A-and-B on the second, plus the probability of not-A on the first two trials and A-and-B on the third, and so on—that is, P(A ∧ B) + P(¬A)P(A ∧ B) + P(¬A)²P(A ∧ B) + …, or, in factored form, P(A ∧ B)[1 + P(¬A) + P(¬A)² + …]. Since the second factor is the Maclaurin series expansion of 1 / [1 − P(¬A)] = 1 / P(A), the infinite sum equals P(A ∧ B) / P(A) = P(B | A).
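
The convergence of this geometric series can be checked numerically; the values P(A ∧ B) = 0.2 and P(A) = 0.5 below are arbitrary illustrations:

```python
# Numeric check that P(A∧B) + P(¬A)P(A∧B) + P(¬A)²P(A∧B) + ... = P(B | A),
# using the illustrative values P(A∧B) = 0.2 and P(A) = 0.5.
p_ab, p_a = 0.2, 0.5
p_not_a = 1 - p_a

# Truncate the infinite sum at 200 terms; the tail is negligible.
total = sum(p_not_a**n * p_ab for n in range(200))
print(total, p_ab / p_a)   # both values approach 0.4
```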

The infinite sum is itself a simple probability, but one whose sample space contains not ordinary outcomes of single trials but infinite sequences of ordinary outcomes. Thus the conditional probability P(B | A) is turned into the simple probability P(A → B) by replacing Ω, the sample space of all ordinary outcomes, with Ω*, the sample space of all sequences of ordinary outcomes, and by identifying conditional event A → B with the set of sequences where the first (A ∧ B)-outcome comes before the first (A ∧ ¬B)-outcome. In Cartesian-product notation, Ω* = Ω × Ω × Ω × …, and A → B is the infinite union [(A ∧ B) × Ω × Ω × …] ∪ [A′ × (A ∧ B) × Ω × Ω × …] ∪ [A′ × A′ × (A ∧ B) × Ω × Ω × …] ∪ …. Unconditional event A is, again, represented by conditional event Ω → A. [6] Unlike tri-event CEAs, this type of CEA supports the identification of ∧, ∨, and ¬ with the familiar operations ∩, ∪, and ′ not just for ordinary, unconditional events but for conditional ones as well. Because Ω* is a space defined by an infinitely long Cartesian product, the Boolean algebra of conditional-event subsets of Ω* is called a product-space CEA. This type of CEA was introduced by van Fraassen (1976), in response to Lewis’s result, and was later discovered independently by Goodman and Nguyen (1994).
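
The series-of-trials reading also lends itself to direct simulation: run trials until the conditional event is decided one way or the other, and record whether it occurred. The events A and B below are hypothetical, chosen only so that P(B | A) has a known value:

```python
import random

random.seed(0)

def trial():
    """One ordinary trial over a hypothetical six-outcome space.
    Returns (A occurred?, B occurred?), with P(A) = 1/2, P(A∧B) = 1/6."""
    o = random.randrange(6)
    return (o in {0, 1, 2}, o in {0, 3})

def conditional_event_occurs():
    """Run trials until an A∧B- or A∧¬B-outcome appears;
    True means A∧B came first, i.e. the sequence belongs to A -> B."""
    while True:
        a, b = trial()
        if a:
            return b

n = 100_000
freq = sum(conditional_event_occurs() for _ in range(n)) / n
print(freq)   # should be near P(B | A) = (1/6) / (1/2) = 1/3
```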

The probability functions associated with product-space CEAs satisfy conditions 1 and 2 above. However, given probability function P that satisfies conditions 1 and 2, if P(A) > 0, it can be shown that P_A(C | B) = P(C | A ∧ B) and P_A(B → C) = P(B ∧ C | A) + P(B′ | A)P(C | B). [7] If A, B and C are pairwise compatible but P(A ∧ B ∧ C) = 0, then P(C | A ∧ B) = P(B ∧ C | A) = 0 but P(B′ | A)P(C | B) > 0. Therefore, P_A(B → C) does not reliably equal P_A(C | B). Since P_A fails condition 2, P fails condition 3.
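
This counterexample can be made numeric. The seven-outcome space below is invented so that A, B, and C overlap pairwise but have empty triple intersection; each outcome's label records which of the three events it realizes:

```python
from fractions import Fraction

# Seven equiprobable outcomes, labelled by which of A, B, C they realize
# (a hypothetical space with pairwise overlaps but no triple overlap).
outcomes = ["ab", "ac", "bc", "a", "b", "c", "none"]
p = {o: Fraction(1, 7) for o in outcomes}
A = {o for o in outcomes if "a" in o}
B = {o for o in outcomes if "b" in o}
C = {o for o in outcomes if "c" in o}

def prob(e):
    return sum(p[o] for o in e)

def cond(x, given):
    return prob(x & given) / prob(given)

# Conditioning on A, then taking a conditional probability:
lhs = cond(C, A & B)                 # P(C | A ∧ B)
# Probability, given A, of the product-space conditional event B -> C:
rhs = cond(B & C, A) + cond(set(outcomes) - B, A) * cond(C, B)
print(lhs, rhs)   # 0 versus 2/9
```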

Nested if–thens

What about nested conditional constructions? In a tri-event CEA, right-nested constructions are handled more or less automatically, since it is natural to say that A → (B → C) takes the value of B → C (possibly undecided) when A is true and is undecided when A is false. Left-nesting, however, requires a more deliberate choice: when A → B is undecided, should (A → B) → C be undecided, or should it take the value of C? Opinions vary. Calabrese adopts the latter view, identifying (A → B) → (C → D) with ((¬A ∨ B) ∧ C) → D. [8]

With a product-space CEA, nested conditionals call for nested sequence-constructions: evaluating P((A → B) → (C → D)) requires a sample space of metasequences of sequences of ordinary outcomes. The probabilities of the ordinary sequences are calculated as before. Given a series of trials whose outcomes are sequences of ordinary outcomes, P((A → B) → (C → D)) is P(C → D | A → B) = P((A → B) ∧ (C → D)) / P(A → B), the probability that an ((A → B) ∧ (C → D))-sequence will be encountered before an ((A → B) ∧ ¬(C → D))-sequence. Higher-order iterations of conditionals require higher-order metasequential constructions. [9]

In either of the two leading types of tri-event CEA, A → (B → C) = (A ∧ B) → C. [10] Product-space CEAs, on the other hand, do not support this identity. The latter fact can be inferred from the failure, already noted, of P_A(B → C) to equal P_A(C | B), since P_A(C | B) = P((A ∧ B) → C) and P_A(B → C) = P(A → (B → C)). For a direct analysis, however, consider a metasequence whose first member-sequence starts with an (A ∧ ¬B ∧ C)-outcome, followed by a (¬A ∧ B ∧ C)-outcome, followed by an (A ∧ B ∧ ¬C)-outcome. That metasequence will belong to the event A → (B → C), because the first member-sequence is an (A ∧ (B → C))-sequence, but the metasequence will not belong to the event (A ∧ B) → C, because the first member-sequence is an ((A ∧ B) ∧ ¬C)-sequence.
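
The tri-event half of this claim, the import-export identity A → (B → C) = (A ∧ B) → C for ordinary events A, B, C, can be confirmed by a brute-force check over all membership patterns of a single outcome, using the right-nesting rule described earlier (the string encoding of the three values is an assumption):

```python
from itertools import product

T, U, F = "T", "U", "F"   # occurs, undecided, fails (encoding is an assumption)

def nested(a, b, c):
    """A -> (B -> C): undecided unless A holds; then behaves like B -> C."""
    if not a: return U
    if not b: return U
    return T if c else F

def flat(a, b, c):
    """(A ∧ B) -> C: undecided unless both A and B hold."""
    if not (a and b): return U
    return T if c else F

# Check every combination of an outcome's membership in A, B, and C.
all_equal = all(nested(*o) == flat(*o) for o in product([True, False], repeat=3))
print(all_equal)   # True: import-export holds for tri-events
```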

Applications

The initial impetus for CEAs is theoretical—namely, the challenge of responding to the Lewis result—but practical applications have been proposed. If, for instance, events A and C involve signals emitted by military radar stations, and events B and D involve missile launches, an opposing military force with an AI-controlled missile defense system may want the system to be able to calculate P((A → B) ∧ (C → D)) and/or P((A → B) → (C → D)). [11] Other applications range from image interpretation [12] to the detection of denial-of-service attacks on computer networks. [13]

Notes

  1. The CEA literature actually uses (B | A) to mean “if A, then B,” but this convention makes certain points harder to state clearly. For that reason, and for improved legibility, the present article uses the more familiar A → B.
  2. Schay actually specified two algebras, one associated with ∧ and the other with ∨. This line of development has not been followed by others.
  3. De Finetti 1935, p. 184. Technically, there are two probability functions: P, which ranges over ordinary events, and P*, which is determined by P and ranges over conditional events. That notational subtlety will be ignored here.
  4. Consider the case where A is true, B is undecided, and C is false.
  5. With A ⊕ B undecided when either A or B is, compare A ∧ (B ⊕ C) and (A ∧ B) ⊕ (A ∧ C) when A is undecided and B and C are both true.
  6. Since Ω ∩ A = A and Ω′ = ∅, the infinite union representing Ω → A reduces to A × Ω × Ω × Ω × ….
  7. Goodman, Mahler and Nguyen 1999, p. 7, provides the formula needed for the latter result: P((A → B) ∧ (C → D)) = [P(A ∧ B ∧ C ∧ D) + P(A′ ∧ C ∧ D)P(B | A) + P(C′ ∧ A ∧ B)P(D | C)] / P(A ∨ C). The special case of interest is P((Ω → A) ∧ (B → C)).
  8. Calabrese 1987, p. 217.
  9. Goodman and Nguyen 1995, pp. 281-283.
  10. This identity corresponds in logic to the law of import-export, as it is called.
  11. Goodman, Mahler and Nguyen 1999.
  12. Kelly, Derin and Gong 1999.
  13. Sun et al. 2014.

References

Adams, E. W. 1975. The Logic of Conditionals. D. Reidel, Dordrecht.

Bamber, D., Goodman, I. R. and Nguyen, H. T. 2004. "Deduction from Conditional Knowledge". Soft Computing 8: 247–255.

Belnap, N. D. 1973. "Restricted quantification and conditional assertion", in H. Leblanc (ed.), Truth, Syntax and Modality. North-Holland, Amsterdam, pp. 48–75.

Calabrese, P. 1987. "An algebraic synthesis of the foundations of logic and probability". Information Sciences 42:187-237.

de Finetti, Bruno. 1935. "La logique de la probabilité". Actes du Congrès International de Philosophie Scientifique. Paris.

van Fraassen, Bas C. 1976. "Probabilities of conditionals", in W. L. Harper and C. A. Hooker (eds.), Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, Vol. I. D. Reidel, Dordrecht, pp. 261–308.

Goodman, I. R., Mahler, R. P. S. and Nguyen, H. T. 1999. "What is conditional event algebra and why should you care?" SPIE Proceedings, Vol. 3720.

Goodman, I. R., Nguyen, H. T. and Walker, E .A. 1991. Conditional Inference and Logic for Intelligent Systems: A Theory of Measure-Free Conditioning. Office of Chief of Naval Research, Arlington, Virginia.

Goodman, I. R. and Nguyen, H. T. 1994. "A theory of conditional information for probabilistic inference in intelligent systems: II, Product space approach; III Mathematical appendix". Information Sciences 76:13-42; 75: 253-277.

Goodman, I. R. and Nguyen, H. T. 1995. "Mathematical foundations of conditionals and their probabilistic assignments". International Journal of Uncertainty, Fuzziness and Knowledge-based Systems 3(3): 247–339.

Kelly, P. A., Derin, H., and Gong, W.-B. 1999. "Some applications of conditional events and random sets for image estimation and system modeling". SPIE Proceedings 3720: 14-24.

Łukasiewicz, J. 1920. "O logice trójwartościowej" (in Polish). Ruch Filozoficzny 5:170–171. English translation: "On three-valued logic", in L. Borkowski (ed.), Selected works by Jan Łukasiewicz, North–Holland, Amsterdam, 1970, pp. 87–88. ISBN 0-7204-2252-3

Schay, Geza. 1968. "An algebra of conditional events". Journal of Mathematical Analysis and Applications 24: 334-344.

Sobociński, B. 1952. "Axiomatization of a partial system of three-valued calculus of propositions". Journal of Computing Systems 1(1):23-55.

Sun, D., Yang, K., Jing, X., Lv, B., and Wang, Y. 2014. "Abnormal network traffic detection based on conditional event algebra". Applied Mechanics and Materials 644-650: 1093-1099.