Algebra of random variables

Last updated May 11, 2024

The algebra of random variables, in statistics, provides rules for the symbolic manipulation of random variables while avoiding the more mathematically sophisticated machinery of probability theory. Its symbolism allows the treatment of sums, products, ratios and general functions of random variables, and of operations such as finding the probability distributions, expectations (or expected values), variances and covariances of such combinations.

In principle, the elementary algebra of random variables is equivalent to that of conventional non-random (or deterministic) variables. However, the way an algebraic operation changes the probability distribution of a random variable is not straightforward. Therefore, operators on the probability distribution, such as expected values, variances, covariances and moments, may behave differently from what symbolic algebra on the random variable itself would suggest. It is possible to identify some key rules for each of those operators, giving rise to different types of algebra for random variables in addition to the elementary symbolic algebra: expectation algebra, variance algebra, covariance algebra, moment algebra, etc.

Elementary symbolic algebra of random variables

Considering two random variables $X$ and $Y$ , the following algebraic operations are possible:

Addition: $Z=X+Y=Y+X$
Subtraction: $Z=X-Y=-Y+X$
Multiplication: $Z=XY=YX$
Division: $Z=X/Y=X\cdot (1/Y)=(1/Y)\cdot X$
Exponentiation: $Z=X^{Y}=e^{Y\ln(X)}$ (for $X>0$)

In all cases, the variable $Z$ resulting from each operation is also a random variable. All commutative and associative properties of conventional algebraic operations are also valid for random variables. If any of the random variables is replaced by a deterministic variable or by a constant value, all the previous properties remain valid.
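As a minimal sketch of the statement that $Z$ is itself a random variable, the following Python snippet builds the exact distribution of $Z$ from two discrete distributions. The helper name `combine` and the choice of two fair dice are illustrative assumptions, and the helper additionally assumes that $X$ and $Y$ are independent, which the rules above do not require.

```python
from fractions import Fraction
from itertools import product
import operator

def combine(pmf_x, pmf_y, op):
    """Exact pmf of Z = op(X, Y), ASSUMING X and Y are independent.

    Each pmf is a dict {value: probability}.
    """
    pmf_z = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        z = op(x, y)
        pmf_z[z] = pmf_z.get(z, Fraction(0)) + px * py
    return pmf_z

# Hypothetical example: two fair six-sided dice.
die = {face: Fraction(1, 6) for face in range(1, 7)}

total = combine(die, die, operator.add)    # Z = X + Y
product_xy = combine(die, die, operator.mul)  # Z = XY

print(total[7])        # 1/6
print(product_xy[6])   # 1/9
```

The result of each operation is again a dictionary of values and probabilities that sum to one, so it can be fed back into `combine` to build longer expressions.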

Expectation algebra for random variables

The expected value $E$ of the random variable $Z$ resulting from an algebraic operation between two random variables can be calculated using the following set of rules:

Addition: $E[Z]=E[X+Y]=E[X]+E[Y]=E[Y]+E[X]$
Subtraction: $E[Z]=E[X-Y]=E[X]-E[Y]=-E[Y]+E[X]$
Multiplication: $E[Z]=E[XY]=E[YX]$ . Particularly, if $X$ and $Y$ are independent from each other, then: $E[XY]=E[X]\cdot E[Y]=E[Y]\cdot E[X]$ .
Division: $E[Z]=E[X/Y]=E[X\cdot (1/Y)]=E[(1/Y)\cdot X]$ . Particularly, if $X$ and $Y$ are independent from each other, then: $E[X/Y]=E[X]\cdot E[1/Y]=E[1/Y]\cdot E[X]$ .
Exponentiation: $E[Z]=E[X^{Y}]=E[e^{Y\ln(X)}]$

If any of the random variables is replaced by a deterministic variable or by a constant value ( $k$ ), the previous properties remain valid considering that $P[X=k]=1$ and, therefore, $E[X]=k$ .
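The addition and multiplication rules above can be checked by exact enumeration on a small joint distribution. The sketch below is only an illustration: the two joint probability tables and the helper `expect` are assumptions made for this example. It shows that the addition rule holds with or without independence, while the product rule $E[XY]=E[X]\cdot E[Y]$ holds only in the independent case.

```python
from fractions import Fraction

def expect(joint, g):
    """E[g(X, Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

# Hypothetical dependent pair: Y tends to be large when X is large.
dependent = {(1, 1): Fraction(2, 5), (1, 2): Fraction(1, 10),
             (2, 1): Fraction(1, 10), (2, 2): Fraction(2, 5)}
# Independent pair with the same marginals (uniform on {1, 2}).
independent = {(x, y): Fraction(1, 4) for x in (1, 2) for y in (1, 2)}

for name, joint in (("dependent", dependent), ("independent", independent)):
    ex = expect(joint, lambda x, y: x)
    ey = expect(joint, lambda x, y: y)
    e_sum = expect(joint, lambda x, y: x + y)
    e_prod = expect(joint, lambda x, y: x * y)
    # First flag: E[X+Y] == E[X]+E[Y]; second flag: E[XY] == E[X]E[Y].
    print(name, e_sum == ex + ey, e_prod == ex * ey)
# dependent True False
# independent True True
```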

If $Z$ is defined as a general non-linear algebraic function $f$ of a random variable $X$ , then, in general:

$E[Z]=E[f(X)]\neq f(E[X])$

Some examples of this property include:

$E[X^{2}]\neq E[X]^{2}$
$E[1/X]\neq 1/E[X]$
$E[e^{X}]\neq e^{E[X]}$
$E[\ln(X)]\neq \ln(E[X])$

The exact value of the expectation of the non-linear function will depend on the particular probability distribution of the random variable $X$ .
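A small numerical illustration of these inequalities, assuming a fair six-sided die as the distribution of $X$ (the die and the helper `expect` are choices made for this sketch, not part of the rules above):

```python
from fractions import Fraction
import math

# Hypothetical distribution: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expect(pmf, f=lambda x: x):
    """E[f(X)] for a pmf given as {value: probability}."""
    return sum(p * f(x) for x, p in pmf.items())

mean = expect(die)                                   # E[X]   = 7/2
mean_square = expect(die, lambda x: x * x)           # E[X^2] = 91/6
mean_recip = expect(die, lambda x: Fraction(1, x))   # E[1/X] = 49/120
mean_exp = expect(die, math.exp)                     # E[e^X], a float
mean_log = expect(die, math.log)                     # E[ln X], a float

print(mean_square, mean ** 2)          # 91/6 49/4
print(mean_recip, 1 / mean)            # 49/120 2/7
print(mean_exp > math.exp(mean))       # True
print(mean_log < math.log(mean))       # True
```

In this example the direction of each inequality agrees with Jensen's inequality: the convex functions $x^{2}$, $1/x$ (for $x>0$) and $e^{x}$ give $E[f(X)]>f(E[X])$, while the concave function $\ln(x)$ gives the reverse.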

Variance algebra for random variables

The variance $\mathrm {Var}$ of the random variable $Z$ resulting from an algebraic operation between random variables can be calculated using the following set of rules:

Addition: $\mathrm {Var} [Z]=\mathrm {Var} [X+Y]=\mathrm {Var} [X]+2\mathrm {Cov} [X,Y]+\mathrm {Var} [Y]$ . Particularly, if $X$ and $Y$ are independent from each other, then: $\mathrm {Var} [X+Y]=\mathrm {Var} [X]+\mathrm {Var} [Y]$ .
Subtraction: $\mathrm {Var} [Z]=\mathrm {Var} [X-Y]=\mathrm {Var} [X]-2\mathrm {Cov} [X,Y]+\mathrm {Var} [Y]$ . Particularly, if $X$ and $Y$ are independent from each other, then: $\mathrm {Var} [X-Y]=\mathrm {Var} [X]+\mathrm {Var} [Y]$ . That is, for independent random variables the variance is the same for additions and subtractions: $\mathrm {Var} [X+Y]=\mathrm {Var} [X-Y]=\mathrm {Var} [Y-X]=\mathrm {Var} [-X-Y]$
Multiplication: $\mathrm {Var} [Z]=\mathrm {Var} [XY]=\mathrm {Var} [YX]$ . Particularly, if $X$ and $Y$ are independent from each other, then: $\mathrm {Var} [XY]=E[X^{2}]\cdot E[Y^{2}]-(E[X]\cdot E[Y])^{2}=\mathrm {Var} [X]\cdot \mathrm {Var} [Y]+\mathrm {Var} [X]\cdot (E[Y])^{2}+\mathrm {Var} [Y]\cdot (E[X])^{2}$ .
Division: $\mathrm {Var} [Z]=\mathrm {Var} [X/Y]=\mathrm {Var} [X\cdot (1/Y)]=\mathrm {Var} [(1/Y)\cdot X]$ . Particularly, if $X$ and $Y$ are independent from each other, then: $\mathrm {Var} [X/Y]=E[X^{2}]\cdot E[1/Y^{2}]-(E[X]\cdot E[1/Y])^{2}=\mathrm {Var} [X]\cdot \mathrm {Var} [1/Y]+\mathrm {Var} [X]\cdot (E[1/Y])^{2}+\mathrm {Var} [1/Y]\cdot (E[X])^{2}$ .
Exponentiation: $\mathrm {Var} [Z]=\mathrm {Var} [X^{Y}]=\mathrm {Var} [e^{Y\ln(X)}]$

where $\mathrm {Cov} [X,Y]=\mathrm {Cov} [Y,X]$ represents the covariance operator between random variables $X$ and $Y$ .

The variance of a random variable can also be expressed directly in terms of the covariance or in terms of the expected value:

$\mathrm {Var} [X]=\mathrm {Cov} [X,X]=E[X^{2}]-E[X]^{2}$

If any of the random variables is replaced by a deterministic variable or by a constant value ( $k$ ), the previous properties remain valid considering that $P[X=k]=1$ and $E[X]=k$ , $\mathrm {Var} [X]=0$ and $\mathrm {Cov} [Y,k]=0$ . Special cases are the addition and multiplication of a random variable with a deterministic variable or a constant, where:

$\mathrm {Var} [k+Y]=\mathrm {Var} [Y]$
$\mathrm {Var} [kY]=k^{2}\mathrm {Var} [Y]$
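The variance rules can be verified exactly on small discrete distributions. The following sketch assumes two illustrative joint probability tables (one dependent, one independent) and hypothetical helpers `expect`, `var` and `cov`; it checks the sum, difference and product rules as well as the two special cases with a constant $k$.

```python
from fractions import Fraction

def expect(joint, g):
    """E[g(X, Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def var(joint, g):
    """Var[g] = E[g^2] - E[g]^2."""
    return expect(joint, lambda x, y: g(x, y) ** 2) - expect(joint, g) ** 2

def cov(joint, g, h):
    """Cov[g, h] = E[g h] - E[g] E[h]."""
    return (expect(joint, lambda x, y: g(x, y) * h(x, y))
            - expect(joint, g) * expect(joint, h))

X = lambda x, y: x
Y = lambda x, y: y

# Hypothetical dependent pair on {1, 2} x {1, 2}.
dependent = {(1, 1): Fraction(2, 5), (1, 2): Fraction(1, 10),
             (2, 1): Fraction(1, 10), (2, 2): Fraction(2, 5)}
# Hypothetical independent pair: X uniform on {1, 2}, Y uniform on {1, 2, 3}.
independent = {(x, y): Fraction(1, 6) for x in (1, 2) for y in (1, 2, 3)}

# Sum and difference need the covariance term when X and Y are dependent.
var_sum = var(dependent, lambda x, y: x + y)
var_diff = var(dependent, lambda x, y: x - y)
rhs_sum = var(dependent, X) + 2 * cov(dependent, X, Y) + var(dependent, Y)
rhs_diff = var(dependent, X) - 2 * cov(dependent, X, Y) + var(dependent, Y)

# Product rule for independent variables.
var_prod = var(independent, lambda x, y: x * y)
rhs_prod = (var(independent, X) * var(independent, Y)
            + var(independent, X) * expect(independent, Y) ** 2
            + var(independent, Y) * expect(independent, X) ** 2)

# Adding a constant leaves the variance unchanged; scaling multiplies it by k^2.
k = 3
var_shift = var(independent, lambda x, y: k + y)
var_scale = var(independent, lambda x, y: k * y)

print(var_sum == rhs_sum, var_diff == rhs_diff, var_prod == rhs_prod)
print(var_shift == var(independent, Y),
      var_scale == k ** 2 * var(independent, Y))
```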

If $Z$ is defined as a general non-linear algebraic function $f$ of a random variable $X$ , then, in general:

$\mathrm {Var} [Z]=\mathrm {Var} [f(X)]\neq f(\mathrm {Var} [X])$

The exact value of the variance of the non-linear function will depend on the particular probability distribution of the random variable $X$ .

Covariance algebra for random variables

The covariance ( $\mathrm {Cov}$ ) between the random variable $Z$ resulting from an algebraic operation and the random variable $X$ can be calculated using the following set of rules:

Addition: $\mathrm {Cov} [Z,X]=\mathrm {Cov} [X+Y,X]=\mathrm {Var} [X]+\mathrm {Cov} [X,Y]$ . If $X$ and $Y$ are independent from each other, then: $\mathrm {Cov} [X+Y,X]=\mathrm {Var} [X]$ .
Subtraction: $\mathrm {Cov} [Z,X]=\mathrm {Cov} [X-Y,X]=\mathrm {Var} [X]-\mathrm {Cov} [X,Y]$ . If $X$ and $Y$ are independent from each other, then: $\mathrm {Cov} [X-Y,X]=\mathrm {Var} [X]$ .
Multiplication: $\mathrm {Cov} [Z,X]=\mathrm {Cov} [XY,X]=E[X^{2}Y]-E[XY]E[X]$ . If $X$ and $Y$ are independent from each other, then: $\mathrm {Cov} [XY,X]=\mathrm {Var} [X]\cdot E[Y]$ .
Division (covariance with respect to the numerator): $\mathrm {Cov} [Z,X]=\mathrm {Cov} [X/Y,X]=E[X^{2}/Y]-E[X/Y]E[X]$ . If $X$ and $Y$ are independent from each other, then: $\mathrm {Cov} [X/Y,X]=\mathrm {Var} [X]\cdot E[1/Y]$ .
Division (covariance with respect to the denominator): $\mathrm {Cov} [Z,X]=\mathrm {Cov} [Y/X,X]=E[Y]-E[Y/X]E[X]$ . If $X$ and $Y$ are independent from each other, then: $\mathrm {Cov} [Y/X,X]=E[Y]\cdot (1-E[X]\cdot E[1/X])$ .
Exponentiation (covariance with respect to the base): $\mathrm {Cov} [Z,X]=\mathrm {Cov} [X^{Y},X]=E[X^{Y+1}]-E[X^{Y}]E[X]$ .
Exponentiation (covariance with respect to the power): $\mathrm {Cov} [Z,X]=\mathrm {Cov} [Y^{X},X]=E[XY^{X}]-E[Y^{X}]E[X]$ .

The covariance between two random variables can also be expressed directly in terms of expected values:

$\mathrm {Cov} [X,Y]=E[XY]-E[X]E[Y]$

If any of the random variables is replaced by a deterministic variable or by a constant value ( $k$ ), the previous properties remain valid considering that $E[k]=k$ , $\mathrm {Var} [k]=0$ and $\mathrm {Cov} [X,k]=0$ .

If $Z$ is defined as a general non-linear algebraic function $f$ of a random variable $X$ , then:

$\mathrm {Cov} [Z,X]=\mathrm {Cov} [f(X),X]=E[Xf(X)]-E[f(X)]E[X]$

The exact value of the covariance between the non-linear function and $X$ will depend on the particular probability distribution of the random variable $X$ .
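The covariance rules for independent variables can be checked in the same exact way. In this sketch the distributions ( $X$ uniform on {1, 2}, $Y$ uniform on {1, 2, 3}) and the helper names are assumptions chosen only to keep the arithmetic small.

```python
from fractions import Fraction

def expect(joint, g):
    """E[g(X, Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def cov(joint, g, h):
    """Cov[g, h] = E[g h] - E[g] E[h]."""
    return (expect(joint, lambda x, y: g(x, y) * h(x, y))
            - expect(joint, g) * expect(joint, h))

X = lambda x, y: Fraction(x)
Y = lambda x, y: Fraction(y)

# Hypothetical independent pair: X uniform on {1, 2}, Y uniform on {1, 2, 3}.
joint = {(x, y): Fraction(1, 6) for x in (1, 2) for y in (1, 2, 3)}

e_x, e_y = expect(joint, X), expect(joint, Y)
e_inv_x = expect(joint, lambda x, y: Fraction(1, x))
e_inv_y = expect(joint, lambda x, y: Fraction(1, y))
var_x = cov(joint, X, X)

cov_prod = cov(joint, lambda x, y: Fraction(x * y), X)   # Cov[XY, X]
cov_num = cov(joint, lambda x, y: Fraction(x, y), X)     # Cov[X/Y, X]
cov_den = cov(joint, lambda x, y: Fraction(y, x), X)     # Cov[Y/X, X]
cov_square = cov(joint, lambda x, y: Fraction(x * x), X)  # Cov[f(X), X], f(x) = x^2

print(cov_prod == var_x * e_y)                  # True
print(cov_num == var_x * e_inv_y)               # True
print(cov_den == e_y * (1 - e_x * e_inv_x))     # True
print(cov_square)                               # 3/4
```

Note that $\mathrm {Cov} [Y/X,X]$ is negative here even though $X$ and $Y$ are independent, because $Y/X$ decreases as $X$ grows.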

Approximations by Taylor series expansions of moments

If the moments of a certain random variable $X$ are known (or can be determined by integration if the probability density function is known), then it is possible to approximate the expected value of any general non-linear function $f(X)$ as a Taylor series expansion of the moments, as follows:

$f(X)=\displaystyle \sum _{n=0}^{\infty }\displaystyle {\frac {1}{n!}}{\biggl (}{d^{n}f \over dX^{n}}{\biggr )}_{X=\mu }(X-\mu )^{n}$ , where $\mu =E[X]$ is the mean value of $X$ .

$E[f(X)]=E{\biggl (}\textstyle \sum _{n=0}^{\infty }\displaystyle {1 \over n!}{\biggl (}{d^{n}f \over dX^{n}}{\biggr )}_{X=\mu }(X-\mu )^{n}{\biggr )}=\displaystyle \sum _{n=0}^{\infty }\displaystyle {1 \over n!}{\biggl (}{d^{n}f \over dX^{n}}{\biggr )}_{X=\mu }E[(X-\mu )^{n}]=\textstyle \sum _{n=0}^{\infty }\displaystyle {\frac {1}{n!}}{\biggl (}{d^{n}f \over dX^{n}}{\biggr )}_{X=\mu }\mu _{n}(X)$ , where $\mu _{n}(X)=E[(X-\mu )^{n}]$ is the n-th moment of $X$ about its mean. Note that by their definition, $\mu _{0}(X)=1$ and $\mu _{1}(X)=0$ . The first order term always vanishes but was kept to obtain a closed form expression.

Then,

$E[f(X)]\approx \textstyle \sum _{n=0}^{n_{max}}\displaystyle {1 \over n!}{\biggl (}{d^{n}f \over dX^{n}}{\biggr )}_{X=\mu }\mu _{n}(X)$ , where the Taylor expansion is truncated after the $n_{max}$ -th moment.
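As an illustration of the truncated expansion, the sketch below approximates $E[1/X]$ for a fair six-sided die, an example chosen for this note rather than taken from the text. For $f(x)=1/x$ the derivatives are $(-1)^{n}n!/x^{n+1}$, so each term reduces to $(-1)^{n}\mu _{n}(X)/\mu ^{n+1}$. The series converges in this case because every outcome satisfies $|X-\mu |<\mu$ ; that is an assumption of the example, not a general guarantee.

```python
from fractions import Fraction

# Hypothetical distribution: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def mean(pmf):
    return sum(p * x for x, p in pmf.items())

def central_moment(pmf, n):
    """mu_n(X) = E[(X - mu)^n]."""
    mu = mean(pmf)
    return sum(p * (x - mu) ** n for x, p in pmf.items())

def taylor_mean_reciprocal(pmf, n_max):
    """Truncated Taylor approximation of E[1/X] about mu = E[X].

    Each term is (1/n!) f^(n)(mu) mu_n(X) = (-1)^n mu_n(X) / mu^(n+1).
    """
    mu = mean(pmf)
    return sum((-1) ** n * central_moment(pmf, n) / mu ** (n + 1)
               for n in range(n_max + 1))

exact = sum(p / x for x, p in die.items())   # E[1/X] = 49/120

for n_max in (0, 2, 4, 10, 60):
    approx = taylor_mean_reciprocal(die, n_max)
    print(n_max, float(approx), float(exact - approx))
```

With `n_max = 0` the approximation is just $1/E[X]=2/7$ ; the second-order term adds $\mathrm {Var} [X]/\mu ^{3}$ , and higher orders close the remaining gap to the exact value $49/120$.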

Particularly for functions of normal random variables, it is possible to obtain a Taylor expansion in terms of the standard normal distribution:^[1]

$f(X)=\textstyle \sum _{n=0}^{\infty }\displaystyle {\sigma ^{n} \over n!}{\biggl (}{d^{n}f \over dX^{n}}{\biggr )}_{X=\mu }Z^{n}$ , where $X\sim N(\mu ,\sigma ^{2})$ is a normal random variable, and $Z=(X-\mu )/\sigma \sim N(0,1)$ is the corresponding standard normal random variable. Thus, taking expectations term by term and truncating the series,

$E[f(X)]\approx \textstyle \sum _{n=0}^{n_{max}}\displaystyle {\sigma ^{n} \over n!}{\biggl (}{d^{n}f \over dX^{n}}{\biggr )}_{X=\mu }\mu _{n}(Z)$ , where the moments of the standard normal distribution are given by:

$\mu _{n}(Z)={\begin{cases}\prod _{i=1}^{n/2}(2i-1),&{\text{if }}n{\text{ is even}}\\0,&{\text{if }}n{\text{ is odd}}\end{cases}}$
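A minimal sketch of the normal-variable expansion, using $f(x)=e^{x}$ as an example chosen here because every derivative at $\mu$ equals $e^{\mu }$ and the exact answer, $E[e^{X}]=e^{\mu +\sigma ^{2}/2}$ , is known. The function names and the parameter values are illustrative assumptions.

```python
import math

def std_normal_moment(n):
    """mu_n(Z) for Z ~ N(0, 1): product of (2i - 1), i = 1..n/2, if n is even, else 0."""
    if n % 2:
        return 0
    return math.prod(2 * i - 1 for i in range(1, n // 2 + 1))

def taylor_mean_exp(mu, sigma, n_max):
    """Truncated expansion of E[exp(X)] for X ~ N(mu, sigma^2).

    Every derivative of exp evaluated at mu equals exp(mu).
    """
    return sum(sigma ** n / math.factorial(n) * math.exp(mu) * std_normal_moment(n)
               for n in range(n_max + 1))

mu, sigma = 0.5, 0.8                      # hypothetical parameters
exact = math.exp(mu + sigma ** 2 / 2)     # mean of a log-normal variable

for n_max in (0, 2, 4, 8, 20):
    approx = taylor_mean_exp(mu, sigma, n_max)
    print(n_max, approx, exact - approx)
```

Only even orders contribute, since the odd moments of the standard normal distribution vanish.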

Similarly for normal random variables, it is also possible to approximate the variance of the non-linear function as a Taylor series expansion as:

$\mathrm {Var} [f(X)]\approx \textstyle \sum _{n=1}^{n_{max}}\displaystyle {\biggl (}{\sigma ^{n} \over n!}{\biggl (}{d^{n}f \over dX^{n}}{\biggr )}_{X=\mu }{\biggr )}^{2}\mathrm {Var} [Z^{n}]+\textstyle \sum _{n=1}^{n_{max}}\displaystyle \textstyle \sum _{m\neq n}\displaystyle {\sigma ^{n+m} \over {n!m!}}{\biggl (}{d^{n}f \over dX^{n}}{\biggr )}_{X=\mu }{\biggl (}{d^{m}f \over dX^{m}}{\biggr )}_{X=\mu }\mathrm {Cov} [Z^{n},Z^{m}]$ , where

$\mathrm {Var} [Z^{n}]={\begin{cases}\prod _{i=1}^{n}(2i-1)-\prod _{i=1}^{n/2}(2i-1)^{2},&{\text{if }}n{\text{ is even}}\\\prod _{i=1}^{n}(2i-1),&{\text{if }}n{\text{ is odd}}\end{cases}}$ , and

$\mathrm {Cov} [Z^{n},Z^{m}]={\begin{cases}\prod _{i=1}^{(n+m)/2}(2i-1)-\prod _{i=1}^{n/2}(2i-1)\prod _{j=1}^{m/2}(2j-1),&{\text{if }}n{\text{ and }}m{\text{ are even}}\\\prod _{i=1}^{(n+m)/2}(2i-1),&{\text{if }}n{\text{ and }}m{\text{ are odd}}\\0,&{\text{otherwise}}\end{cases}}$
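The variance expansion can be sketched in a few lines. The code below is an illustration under stated assumptions: the caller supplies the derivatives of $f$ at $\mu$ as a list (a convention invented for this example), and the covariances of powers of $Z$ are computed from $E[Z^{n+m}]-E[Z^{n}]E[Z^{m}]$ , which reproduces the three cases listed above. Summing over all pairs $(n,m)$ , including $m=n$ , covers both the variance and the covariance terms of the formula.

```python
import math

def std_normal_moment(n):
    """E[Z^n] for Z ~ N(0, 1)."""
    if n % 2:
        return 0
    return math.prod(2 * i - 1 for i in range(1, n // 2 + 1))

def cov_z_powers(n, m):
    """Cov[Z^n, Z^m]; with m == n this is Var[Z^n]."""
    return std_normal_moment(n + m) - std_normal_moment(n) * std_normal_moment(m)

def taylor_variance(derivs, sigma):
    """Truncated expansion of Var[f(X)] for X ~ N(mu, sigma^2).

    derivs[n] is the n-th derivative of f evaluated at mu, n = 0..n_max
    (a calling convention assumed for this sketch).
    """
    coeff = [sigma ** n / math.factorial(n) * d for n, d in enumerate(derivs)]
    orders = range(1, len(derivs))
    return sum(coeff[n] * coeff[m] * cov_z_powers(n, m)
               for n in orders for m in orders)

# Example: f(x) = x^2, whose Taylor series stops at n = 2.
mu, sigma = 1.5, 0.5
derivs = [mu ** 2, 2 * mu, 2.0]
approx = taylor_variance(derivs, sigma)
exact = 4 * mu ** 2 * sigma ** 2 + 2 * sigma ** 4   # Var[X^2] for a normal X
print(approx, exact)   # 2.375 2.375
```

Because $f(x)=x^{2}$ is a polynomial, the truncated expansion is exact in this example; for other functions the result is only an approximation whose quality depends on $n_{max}$ and $\sigma$.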

Algebra of complex random variables

In the algebraic axiomatization of probability theory, the primary concept is not that of probability of an event, but rather that of a random variable. Probability distributions are determined by assigning an expectation to each random variable. The measurable space and the probability measure arise from the random variables and expectations by means of well-known representation theorems of analysis. One of the important features of the algebraic approach is that apparently infinite-dimensional probability distributions are not harder to formalize than finite-dimensional ones.

Random variables are assumed to have the following properties:

complex constants are possible realizations of a random variable;
the sum of two random variables is a random variable;
the product of two random variables is a random variable;
addition and multiplication of random variables are both commutative; and
there is a notion of conjugation of random variables, satisfying $(XY)^{*}=Y^{*}X^{*}$ and $X^{**}=X$ for all random variables $X,Y$ and coinciding with complex conjugation if $X$ is a constant.

This means that random variables form complex commutative *-algebras. If $X=X^{*}$ then the random variable $X$ is called "real".

An expectation $E$ on an algebra $A$ of random variables is a normalized, positive linear functional. What this means is that

$E [k] = k$ where $k$ is a constant;
$E[X^{*}X]\geq 0$ for all random variables $X$ ;
$E [X + Y] = E [X] + E [Y]$ for all random variables $X$ and $Y$ ; and
$E [kX] = kE [X]$ if $k$ is a constant.
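A concrete, finite instance of these axioms can be sketched by taking random variables to be complex-valued tuples on a three-point sample space, with pointwise operations and a probability-weighted sum as the expectation. The sample space, probabilities and function names below are assumptions made for illustration; the algebraic approach itself does not start from a sample space.

```python
# Hypothetical finite sample space with three outcomes and their probabilities.
PROBS = (0.25, 0.25, 0.5)

def const(k):
    """A constant viewed as a random variable."""
    return tuple(complex(k) for _ in PROBS)

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def mul(x, y):
    return tuple(a * b for a, b in zip(x, y))

def star(x):
    """Conjugation X -> X*."""
    return tuple(a.conjugate() for a in x)

def expectation(x):
    """Normalized, positive linear functional on this algebra."""
    return sum(p * a for p, a in zip(PROBS, x))

X = (1 + 2j, -1j, 3 + 0j)
Y = (2 + 0j, 1 + 1j, -1 - 1j)
k = 2 - 3j

print(star(mul(X, Y)) == mul(star(Y), star(X)))    # (XY)* = Y* X*
print(star(star(X)) == X)                           # X** = X
print(expectation(const(k)) == k)                   # E[k] = k
print(expectation(mul(star(X), X)))                 # E[X* X], real and non-negative
print(expectation(add(X, Y)) == expectation(X) + expectation(Y))
```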

One may generalize this setup, allowing the algebra to be noncommutative. This leads to other areas of noncommutative probability such as quantum probability, random matrix theory, and free probability.


References

[1] Hernandez, Hugo (2016). "Modelling the effect of fluctuation in nonlinear systems using variance algebra - Application to light scattering of ideal gases". ForsChem Research Reports. 2016–1. doi:10.13140/rg.2.2.36501.52969.
