Stochastic ordering

Last updated December 07, 2024

In probability theory and statistics, a stochastic order quantifies the concept of one random variable being "bigger" than another. These are usually partial orders, so that one random variable $A$ may be neither stochastically greater than, less than, nor equal to another random variable $B$ . Many different orders exist, which have different applications.

Usual stochastic order

A real random variable $A$ is less than a random variable $B$ in the "usual stochastic order" if

\Pr(A>x)\leq \Pr(B>x){\text{ for all }}x\in (-\infty ,\infty ),

where $\Pr(\cdot )$ denotes the probability of an event. This is sometimes denoted $A\preceq B$ or $A\leq _{\mathrm {st} }B$ .

If additionally $\Pr(A>x)<\Pr(B>x)$ for some $x$ , then $A$ is stochastically strictly less than $B$ , sometimes denoted $A\prec B$ . In decision theory, under this circumstance, $B$ is said to be first-order stochastically dominant over $A$ .
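For finitely supported distributions the definition can be checked directly, because both survival functions are step functions that change only at support points. The following is a minimal sketch under that assumption; the helper names `survival`, `st_leq` and `st_less` and the example distributions are illustrative and not taken from any library.

```python
from fractions import Fraction

def survival(dist, x):
    """Pr(X > x) for a finite distribution given as {value: probability}."""
    return sum(p for v, p in dist.items() if v > x)

def st_leq(a, b):
    """True if a is less than or equal to b in the usual stochastic order.

    Both survival functions are constant between consecutive support
    points, so comparing them at the support points is sufficient.
    """
    points = set(a) | set(b)
    return all(survival(a, x) <= survival(b, x) for x in points)

def st_less(a, b):
    """True if a is stochastically strictly less than b."""
    points = set(a) | set(b)
    return st_leq(a, b) and any(survival(a, x) < survival(b, x) for x in points)

half = Fraction(1, 2)
A = {0: half, 1: half}
B = {1: half, 2: half}
C = {-1: half, 3: half}  # more spread out than A

a_below_b = st_less(A, B)
a_c_comparable = st_leq(A, C) or st_leq(C, A)
```

In this sketch `A` is strictly below `B`, while `A` and `C` are not comparable in either direction, which illustrates that the order is only partial.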

Characterizations

The following rules describe situations in which one random variable is stochastically less than or equal to another. Strict versions of some of these rules also exist.

$A\preceq B$ if and only if for all non-decreasing functions $u$ , $\operatorname {E} [u(A)]\leq \operatorname {E} [u(B)]$ .
If $u$ is non-decreasing and $A\preceq B$ , then $u(A)\preceq u(B)$ .
If $u:\mathbb {R} ^{n}\to \mathbb {R}$ is increasing in each variable, and $A_{1},\dots ,A_{n}$ and $B_{1},\dots ,B_{n}$ are each collections of independent random variables with $A_{i}\preceq B_{i}$ for each $i$ , then $u(A_{1},\dots ,A_{n})\preceq u(B_{1},\dots ,B_{n})$ and in particular $\sum _{i=1}^{n}A_{i}\preceq \sum _{i=1}^{n}B_{i}$ . Moreover, the $i$ th order statistics satisfy $A_{(i)}\preceq B_{(i)}$ .
If two sequences of random variables $A_{i}$ and $B_{i}$ , with $A_{i}\preceq B_{i}$ for all $i$ , each converge in distribution, then their limits satisfy $A\preceq B$ .
If $A$ , $B$ and $C$ are random variables such that $\sum _{c}\Pr(C=c)=1$ and $\Pr(A>u\mid C=c)\leq \Pr(B>u\mid C=c)$ for all $u$ and $c$ such that $\Pr(C=c)>0$ , then $A\preceq B$ .
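Two of these rules can be spot-checked numerically on small discrete distributions. The sketch below is only an illustration: the distributions `A1`, `A2`, `B1`, `B2`, the handful of non-decreasing test functions, and all helper names are assumptions chosen for the example, and checking a few functions does not prove the "for all $u$" statement.

```python
import math
from fractions import Fraction
from itertools import product

def survival(dist, x):
    """Pr(X > x) for a finite distribution given as {value: probability}."""
    return sum(p for v, p in dist.items() if v > x)

def st_leq(a, b):
    """Usual stochastic order, checked at the support points."""
    return all(survival(a, x) <= survival(b, x) for x in set(a) | set(b))

def expect(dist, u):
    """E[u(X)] for a finite distribution."""
    return sum(p * u(v) for v, p in dist.items())

def independent_sum(d1, d2):
    """Distribution of X1 + X2 when X1 and X2 are independent."""
    out = {}
    for (v1, p1), (v2, p2) in product(d1.items(), d2.items()):
        out[v1 + v2] = out.get(v1 + v2, 0) + p1 * p2
    return out

A1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
B1 = {1: Fraction(1, 2), 2: Fraction(1, 2)}
A2 = {0: Fraction(1, 3), 2: Fraction(2, 3)}
B2 = {0: Fraction(1, 4), 2: Fraction(3, 4)}

# A few non-decreasing functions (identity, cube, arctangent, indicator).
tests = [lambda x: x, lambda x: x ** 3, math.atan, lambda x: 1 if x >= 1 else 0]

expectations_ordered = all(expect(A1, u) <= expect(B1, u) for u in tests)
sums_ordered = st_leq(independent_sum(A1, A2), independent_sum(B1, B2))
```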

Other properties

If $A\preceq B$ and $\operatorname {E} [A]=\operatorname {E} [B]$ then $A\mathrel {\overset {d}{=}} B$ (the random variables are equal in distribution).
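One way to see why this holds, sketched here under the added assumption that both expectations are finite: the difference of the means equals the integral of the difference of the survival functions.

```latex
\operatorname{E}[B]-\operatorname{E}[A]
  = \int_{-\infty}^{\infty}\bigl[\Pr(B>x)-\Pr(A>x)\bigr]\,dx
  = 0 .
```

Since $A\preceq B$ makes the integrand non-negative, it must vanish for almost every $x$ . Survival functions are right-continuous, so $\Pr(A>x)=\Pr(B>x)$ for every $x$ , which is equality in distribution.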

Stochastic dominance

Stochastic dominance relations are a family of stochastic orderings used in decision theory:^[1]

Zeroth-order stochastic dominance: $A\prec _{(0)}B$ if and only if $A\leq B$ for all realizations of these random variables and $A<B$ for at least one realization.
First-order stochastic dominance: $A\prec _{(1)}B$ if and only if $\Pr(A>x)\leq \Pr(B>x)$ for all $x$ and there exists $x$ such that $\Pr(A>x)<\Pr(B>x)$ .
Second-order stochastic dominance: $A\prec _{(2)}B$ if and only if $\int _{-\infty }^{x}[\Pr(B>t)-\Pr(A>t)]\,dt\geq 0$ for all $x$ , with strict inequality at some $x$ .

There also exist higher-order notions of stochastic dominance. With the definitions above, we have $A\prec _{(i)}B\implies A\prec _{(i+1)}B$ .
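The first-order and second-order definitions can be compared on finitely supported distributions. In the sketch below (helper names and example distributions are assumptions made for illustration), the integral in the second-order condition is piecewise linear in $x$ with kinks only at support points, so it is enough to evaluate it there.

```python
from fractions import Fraction

def survival(dist, x):
    """Pr(X > x) for a finite distribution given as {value: probability}."""
    return sum(p for v, p in dist.items() if v > x)

def fsd_strict(a, b):
    """True if b strictly first-order dominates a."""
    points = sorted(set(a) | set(b))
    gaps = [survival(b, x) - survival(a, x) for x in points]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)

def ssd_strict(a, b):
    """True if b strictly second-order dominates a.

    Accumulates the integral of Pr(B>t) - Pr(A>t) over each interval
    between consecutive support points, where the integrand is constant.
    """
    points = sorted(set(a) | set(b))
    total, integrals = 0, []
    for left, right in zip(points, points[1:]):
        total += (survival(b, left) - survival(a, left)) * (right - left)
        integrals.append(total)
    return all(v >= 0 for v in integrals) and any(v > 0 for v in integrals)

risky = {0: Fraction(1, 2), 2: Fraction(1, 2)}  # mean 1
safe = {1: Fraction(1)}                         # mean 1, no spread
low = {0: Fraction(1, 2), 1: Fraction(1, 2)}
high = {1: Fraction(1, 2), 2: Fraction(1, 2)}
```

Here `safe` dominates `risky` in the second-order sense but not in the first-order sense, while `high` dominates `low` in both, consistent with the implication stated above.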

Multivariate stochastic order

An $\mathbb {R} ^{d}$ -valued random variable $A$ is less than an $\mathbb {R} ^{d}$ -valued random variable $B$ in the "usual stochastic order" if

\operatorname {E} [f(A)]\leq \operatorname {E} [f(B)]{\text{ for all bounded, increasing functions }}f\colon \mathbb {R} ^{d}\longrightarrow \mathbb {R}

Other types of multivariate stochastic orders exist, for instance the upper and lower orthant orders, which are similar to the usual one-dimensional stochastic order. $A$ is said to be smaller than $B$ in upper orthant order if

\Pr(A>\mathbf {x} )\leq \Pr(B>\mathbf {x} ){\text{ for all }}\mathbf {x} \in \mathbb {R} ^{d}

and $A$ is smaller than $B$ in lower orthant order if^[2]

\Pr(A\leq \mathbf {x} )\leq \Pr(B\leq \mathbf {x} ){\text{ for all }}\mathbf {x} \in \mathbb {R} ^{d}
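For finitely supported random vectors, both orthant conditions as written above can be checked on a finite grid, because each orthant probability only changes when a coordinate of $\mathbf {x}$ crosses a support coordinate. The sketch below uses hypothetical helper names and compares an independent pair of fair coin flips with a perfectly dependent pair that has the same marginals.

```python
from fractions import Fraction
from itertools import product

def upper_prob(dist, x):
    """Pr(V > x) componentwise, for dist given as {tuple: probability}."""
    return sum(p for v, p in dist.items()
               if all(vi > xi for vi, xi in zip(v, x)))

def lower_prob(dist, x):
    """Pr(V <= x) componentwise."""
    return sum(p for v, p in dist.items()
               if all(vi <= xi for vi, xi in zip(v, x)))

def check_points(a, b):
    """Grid of support coordinates, plus one point below the minimum per axis."""
    dim = len(next(iter(a)))
    axes = []
    for i in range(dim):
        coords = sorted({v[i] for v in a} | {v[i] for v in b})
        axes.append([coords[0] - 1] + coords)
    return list(product(*axes))

def upper_orthant_leq(a, b):
    return all(upper_prob(a, x) <= upper_prob(b, x) for x in check_points(a, b))

def lower_orthant_leq(a, b):
    # Follows the inequality direction used in the text above.
    return all(lower_prob(a, x) <= lower_prob(b, x) for x in check_points(a, b))

q = Fraction(1, 4)
independent = {(0, 0): q, (0, 1): q, (1, 0): q, (1, 1): q}
comonotone = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}
```

With these definitions the independent pair is smaller than the comonotone pair in both orthant orders, while the reverse comparison fails.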

All three order types also have integral representations: for a particular order, $A$ is smaller than $B$ if and only if $\operatorname {E} [f(A)]\leq \operatorname {E} [f(B)]$ for all $f\colon \mathbb {R} ^{d}\longrightarrow \mathbb {R}$ in a class of functions ${\mathcal {G}}$ .^[3] ${\mathcal {G}}$ is then called the generator of the respective order.

Other dominance orders

The following stochastic orders are useful in the theory of random social choice. They are used to compare the outcomes of random social choice functions, in order to check them for efficiency or other desirable criteria.^[4] The dominance orders below are ordered from the most conservative to the least conservative. They are exemplified on random variables over the finite support {30,20,10}.

Deterministic dominance, denoted $A\succeq _{\mathrm {dd} }B$ , means that every possible outcome of $A$ is at least as good as every possible outcome of $B$ : for all x < y, $\Pr[A=x]\cdot \Pr[B=y]=0$ . In other words: $\Pr[A\geq B]=1$ . For example, $0.6\times 30+0.4\times 20\succeq _{\mathrm {dd} }0.5\times 20+0.5\times 10$ .

Bilinear dominance, denoted $A\succeq _{\mathrm {bd} }B$ , means that, for every pair of possible outcomes, the probability that $A$ yields the better one and $B$ yields the worse one is at least as large as the probability the other way around: for all x<y, $\Pr[A=x]\cdot \Pr[B=y]\leq \Pr[A=y]\cdot \Pr[B=x]$ . For example, $0.5\times 30+0.5\times 20\succeq _{\mathrm {bd} }0.33\times 30+0.33\times 20+0.34\times 10$ .

Stochastic dominance (already mentioned above), denoted $A\succeq _{\mathrm {sd} }B$ , means that, for every possible outcome x, the probability that $A$ yields at least x is at least as large as the probability that $B$ yields at least x: for all x, $\Pr[A\geq x]\geq \Pr[B\geq x]$ . For example, $0.5\times 30+0.5\times 10\succeq _{\mathrm {sd} }0.5\times 20+0.5\times 10$ .

Pairwise-comparison dominance, denoted $A\succeq _{\mathrm {pc} }B$ , means that the probability that $A$ yields an outcome at least as good as that of $B$ is at least as large as the probability the other way around: $\Pr[A\geq B]\geq \Pr[B\geq A]$ . For example, $0.67\times 30+0.33\times 10\succeq _{\mathrm {pc} }1.0\times 20$ .

Downward-lexicographic dominance, denoted $A\succeq _{\mathrm {dl} }B$ , means that $A$ has a larger probability than $B$ of returning the best outcome, or both $A$ and $B$ have the same probability of returning the best outcome but $A$ has a larger probability than $B$ of returning the second-best outcome, etc. Upward-lexicographic dominance is defined analogously based on the probability of returning the worst outcomes. See lexicographic dominance.
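The first four dominance orders can be evaluated directly on the example lotteries over {30,20,10}. The following sketch uses hypothetical function names (`dd`, `bd`, `sd`, `pc`) and additionally assumes that the two lotteries are independent when computing the pairwise-comparison probabilities.

```python
from fractions import Fraction as F
from itertools import combinations

OUTCOMES = (10, 20, 30)  # ascending, so combinations() yields pairs with x < y

def pr(lottery, x):
    """Probability that the lottery returns outcome x."""
    return lottery.get(x, F(0))

def at_least(lottery, x):
    """Probability that the lottery returns an outcome >= x."""
    return sum(pr(lottery, v) for v in OUTCOMES if v >= x)

def dd(a, b):
    return all(pr(a, x) * pr(b, y) == 0 for x, y in combinations(OUTCOMES, 2))

def bd(a, b):
    return all(pr(a, x) * pr(b, y) <= pr(a, y) * pr(b, x)
               for x, y in combinations(OUTCOMES, 2))

def sd(a, b):
    return all(at_least(a, x) >= at_least(b, x) for x in OUTCOMES)

def pc(a, b):
    # Assumes a and b are drawn independently.
    a_wins = sum(pr(a, x) * pr(b, y) for x in OUTCOMES for y in OUTCOMES if x >= y)
    b_wins = sum(pr(a, x) * pr(b, y) for x in OUTCOMES for y in OUTCOMES if y >= x)
    return a_wins >= b_wins

# The example pairs (A, B) from the text.
dd_pair = ({30: F("0.6"), 20: F("0.4")}, {20: F("0.5"), 10: F("0.5")})
bd_pair = ({30: F("0.5"), 20: F("0.5")},
           {30: F("0.33"), 20: F("0.33"), 10: F("0.34")})
sd_pair = ({30: F("0.5"), 10: F("0.5")}, {20: F("0.5"), 10: F("0.5")})
pc_pair = ({30: F("0.67"), 10: F("0.33")}, {20: F("1")})
```

Each example pair satisfies the order it illustrates but fails the next more conservative one, for instance `bd(*bd_pair)` holds while `dd(*bd_pair)` does not.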

Other stochastic orders

Hazard rate order

The hazard rate of a non-negative random variable $X$ with absolutely continuous distribution function $F$ and density function $f$ is defined as

r(t)={\frac {d}{dt}}(-\log(1-F(t)))={\frac {f(t)}{1-F(t)}}.

Given two non-negative random variables $X$ and $Y$ with absolutely continuous distributions $F$ and $G$ , and with hazard rate functions $r$ and $q$ , respectively, $X$ is said to be smaller than $Y$ in the hazard rate order (denoted as $X\preceq _{\mathrm {hr} }Y$ ) if

r(t)\geq q(t){\text{ for all }}t\geq 0,

or equivalently if ${\frac {1-F(t)}{1-G(t)}}$ is decreasing in $t$ .
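As an illustration, exponential distributions have constant hazard rates equal to their rate parameters, so the one with the larger rate is smaller in the hazard rate order. The sketch below checks both equivalent conditions on a finite grid; the rates 2 and 1, the grid, and the helper names are assumptions made for the example, and a grid check is a numerical spot check rather than a proof.

```python
import math

def exp_sf(rate, t):
    """Survival function 1 - F(t) of an exponential distribution."""
    return math.exp(-rate * t)

def exp_pdf(rate, t):
    """Density f(t) of an exponential distribution."""
    return rate * math.exp(-rate * t)

def hazard(rate, t):
    """Hazard rate f(t) / (1 - F(t)); constant for the exponential."""
    return exp_pdf(rate, t) / exp_sf(rate, t)

rate_x, rate_y = 2.0, 1.0            # X ~ Exp(2), Y ~ Exp(1)
grid = [0.1 * k for k in range(51)]  # t in [0, 5]

# Condition 1: r(t) >= q(t) on the grid.
hazard_condition = all(hazard(rate_x, t) >= hazard(rate_y, t) for t in grid)

# Condition 2: (1 - F(t)) / (1 - G(t)) is decreasing on the grid.
ratios = [exp_sf(rate_x, t) / exp_sf(rate_y, t) for t in grid]
ratio_condition = all(r1 >= r2 for r1, r2 in zip(ratios, ratios[1:]))
```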

Likelihood ratio order

Let $X$ and $Y$ be two continuous (or discrete) random variables with densities (or discrete densities) $f(t)$ and $g(t)$ , respectively. If ${\frac {g(t)}{f(t)}}$ is increasing in $t$ over the union of the supports of $X$ and $Y$ , then $X$ is said to be smaller than $Y$ in the likelihood ratio order (denoted $X\preceq _{\mathrm {lr} }Y$ ).
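For discrete distributions the monotone-ratio condition can be tested without dividing, by cross-multiplying: $f(s)\,g(t)\geq f(t)\,g(s)$ for all $s\leq t$ . The sketch below uses this reformulation, which also handles points where $f$ vanishes; the function name `lr_leq` and the two probability mass functions are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def lr_leq(f, g):
    """True if the pmf f is smaller than the pmf g in likelihood ratio order.

    Checks f(s) * g(t) >= f(t) * g(s) for every pair s < t in the union
    of the supports, which is the ratio condition without division.
    """
    points = sorted(set(f) | set(g))
    return all(f.get(s, 0) * g.get(t, 0) >= f.get(t, 0) * g.get(s, 0)
               for s, t in combinations(points, 2))

f = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
g = {0: Fraction(1, 6), 1: Fraction(1, 3), 2: Fraction(1, 2)}

# The ratio g/f takes the values 1/3, 1, 3 on the points 0, 1, 2.
ratio = [g[t] / f[t] for t in (0, 1, 2)]
x_below_y = lr_leq(f, g)
```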

Variability orders

If two variables have the same mean, they can still be compared by how "spread out" their distributions are. This is captured to a limited extent by the variance, but more fully by a range of stochastic orders.

Convex order

Convex order is a special kind of variability order. Under the convex ordering, $A$ is less than $B$ if and only if for all convex $u$ , $\operatorname {E} [u(A)]\leq \operatorname {E} [u(B)]$ .
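A small numerical illustration, with distributions and test functions chosen only for this example: a point mass at 1 is compared with a fair coin flip between 0 and 2. Both have mean 1, and by Jensen's inequality the point mass is smaller in the convex order. Evaluating a few convex functions is a spot check of a necessary condition, not a proof.

```python
import math
from fractions import Fraction

def expect(dist, u):
    """E[u(X)] for a finite distribution given as {value: probability}."""
    return sum(p * u(v) for v, p in dist.items())

A = {1: Fraction(1)}                        # point mass at the mean
B = {0: Fraction(1, 2), 2: Fraction(1, 2)}  # same mean, more spread

# A few convex functions: square, absolute deviation, exponential, call payoff.
convex_tests = [
    lambda x: x * x,
    lambda x: abs(x - 1),
    math.exp,
    lambda x: max(x - Fraction(3, 2), 0),
]

same_mean = expect(A, lambda x: x) == expect(B, lambda x: x)
convex_ordered = all(expect(A, u) <= expect(B, u) for u in convex_tests)
```

Because $u(x)=x$ and $u(x)=-x$ are both convex, two random variables with finite means that are comparable in the convex order necessarily have equal means, which is why the sketch checks `same_mean` as well.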

Laplace transform order

Laplace transform order compares both size and variability of two random variables. Similar to convex order, Laplace transform order is established by comparing the expectation of a function of the random variable where the function is from a special class: $u(x)=-\exp(-\alpha x)$ . This makes the Laplace transform order an integral stochastic order with the generator set given by the function set defined above with $\alpha$ a positive real number.
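As a worked example (the exponential family is chosen here purely for illustration), let $X$ be exponentially distributed with rate $\lambda$ and $Y$ exponentially distributed with rate $\mu$ . For $u(x)=-\exp(-\alpha x)$ with $\alpha >0$ :

```latex
\operatorname{E}[u(X)]
  = -\int_{0}^{\infty} e^{-\alpha x}\,\lambda e^{-\lambda x}\,dx
  = -\frac{\lambda}{\lambda+\alpha},
\qquad
\operatorname{E}[u(Y)] = -\frac{\mu}{\mu+\alpha}.
```

Since $\lambda /(\lambda +\alpha )$ is increasing in $\lambda$ , the inequality $\operatorname {E} [u(X)]\leq \operatorname {E} [u(Y)]$ holds for every $\alpha >0$ exactly when $\lambda \geq \mu$ , so in this family the variable with the larger rate is the smaller one in the Laplace transform order.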

Realizable monotonicity

Consider a family of probability distributions $({P}_{\alpha })_{\alpha \in F}$ on a partially ordered space $(E,\preceq )$ , indexed by $\alpha \in F$ (where $(F,\preceq )$ is another partially ordered space). The family is said to be completely or realizably monotone if there exists a family of random variables $(X_{\alpha })_{\alpha }$ on the same probability space such that the distribution of $X_{\alpha }$ is ${P}_{\alpha }$ and $X_{\alpha }\preceq X_{\beta }$ almost surely whenever $\alpha \preceq \beta$ . In other words, a monotone coupling exists.^[5]
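For real-valued distributions, one common way to build such a coupling is to feed a single uniform random number through each quantile function. The sketch below does this for exponential distributions indexed by their mean; the family, the parameter values, and the helper name `exp_quantile` are assumptions made for illustration, and the general partially ordered setting of the cited paper is not covered by this construction.

```python
import math
import random

def exp_quantile(mean, u):
    """Quantile function of an exponential distribution with the given mean."""
    return -mean * math.log(1.0 - u)

rng = random.Random(0)   # fixed seed so the run is reproducible
means = [0.5, 1.0, 2.0]  # index set, ordered increasingly

samples = []
for _ in range(1000):
    u = rng.random()     # one shared uniform draw in [0, 1)
    samples.append([exp_quantile(m, u) for m in means])

# On every draw the coupled variables are ordered like their indices.
monotone = all(row[i] <= row[i + 1]
               for row in samples
               for i in range(len(means) - 1))
```

Each column of `samples` has the intended exponential marginal, while the rows are ordered on every draw rather than only in distribution.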

References

[1] Perrakis, Stylianos (2019). Stochastic Dominance Option Pricing. Palgrave Macmillan, Cham. doi:10.1007/978-3-030-11590-6_1. ISBN 978-3-030-11589-0.
[2] Definition 2.3 in Thibaut Lux, Antonin Papapantoleon: "Improved Fréchet-Hoeffding bounds for d-copulas and applications in model-free finance." Annals of Applied Probability 27, 3633-3671, 2017.
[3] Alfred Müller, Dietrich Stoyan: Comparison methods for stochastic models and risks. Wiley, Chichester 2002, ISBN 0-471-49446-1, p. 2.
[4] Felix Brandt (2017-10-26). "Rolling the Dice: Recent Results in Probabilistic Social Choice". In Endriss, Ulle (ed.). Trends in Computational Social Choice. Lulu.com. ISBN 978-1-326-91209-3.
[5] Fill, James Allen; Machida, Motoya (2001). "Stochastic Monotonicity and Realizable Monotonicity". The Annals of Probability. 29 (2): 938–978. ISSN 0091-1798.

Bibliography

M. Shaked and J. G. Shanthikumar, Stochastic Orders and their Applications, Academic Press, 1994.
E. L. Lehmann. Ordered families of distributions. The Annals of Mathematical Statistics, 26:399–419, 1955.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
