Zero bias transform

Last updated

The zero-bias transform is a transform from one probability distribution to another. The transform arises in applications of Stein's method in probability and statistics.

In probability theory and statistics, a probability distribution is a mathematical function that provides the probabilities of occurrence of different possible outcomes in an experiment. In more technical terms, the probability distribution is a description of a random phenomenon in terms of the probabilities of events. For instance, if the random variable X is used to denote the outcome of a coin toss, then the probability distribution of X would take the value 0.5 for X = heads, and 0.5 for X = tails. Examples of random phenomena can include the results of an experiment or survey.

Stein's method is a general method in probability theory to obtain bounds on the distance between two probability distributions with respect to a probability metric. It was introduced by Charles Stein, who first published it in 1972, to obtain a bound between the distribution of a sum of -dependent sequence of random variables and a standard normal distribution in the Kolmogorov (uniform) metric and hence to prove not only a central limit theorem, but also bounds on the rates of convergence for the given metric.

Contents

Formal definition

The zero bias transform may be applied to both discrete and continuous random variables. The zero bias transform of a density function f(t), defined for all real numbers t ≥ 0, is the function g(s), defined by

Probability density function Function whose integral over a region describes the probability of an event occurring in that region

In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample in the sample space can be interpreted as providing a relative likelihood that the value of the random variable would equal that sample. In other words, while the absolute likelihood for a continuous random variable to take on any particular value is 0, the value of the PDF at two different samples can be used to infer, in any particular draw of the random variable, how much more likely it is that the random variable would equal one sample compared to the other sample.

Real number Number representing a continuous quantity

In mathematics, a real number is a value of a continuous quantity that can represent a distance along a line. The adjective real in this context was introduced in the 17th century by René Descartes, who distinguished between real and imaginary roots of polynomials. The real numbers include all the rational numbers, such as the integer −5 and the fraction 4/3, and all the irrational numbers, such as 2. Included within the irrationals are the transcendental numbers, such as π (3.14159265...). In addition to measuring distance, real numbers can be used to measure quantities such as time, mass, energy, velocity, and many more.

where s and t are real numbers and f(t) is the density or mass function of the random variable T. [1]

An equivalent but alternative approach is to deduce the nature of the transformed random variable by evaluating the expected value

where the right-side superscript denotes a zero biased random variable whereas the left hand side expectation represents the original random variable. An example from each approach is given in the examples section beneath.

If the random variable is discrete the integral becomes a sum from positive infinity to s. The zero bias transform is taken for a mean zero, variance 1 random variable which may require a location-scale transform to the random variable.

Applications

The zero bias transformation arises in applications where a normal approximation is desired. Similar to Stein's method the zero bias transform is often applied to sums of random variables with each summand having finite variance an mean zero.

The zero bias transform has been applied to CDO tranche pricing. [2]

Examples

1. Consider a Bernoulli(p) random variable B with Pr(B = 0) = 1  p. The zero bias transform of T = (B  p) is:

where h is the derivative of H. From there it follows that the random variable S is a continuous uniform random variable on the support (−p, 1  p). This example shows how the zero bias transform smooths a discrete distribution into a continuous distribution.

2. Consider the continuous uniform on the support .

This example shows that the zero bias transform takes continuous symmetric distributions and makes them unimodular.

Related Research Articles

Lorenz curve graphical representation of the distribution of income or wealth; shows, for the bottom x% of households, what percentage (y%) of the total income they have

In economics, the Lorenz curve is a graphical representation of the distribution of income or of wealth. It was developed by Max O. Lorenz in 1905 for representing inequality of the wealth distribution.

In mathematics, the Laplace transform is an integral transform named after its inventor Pierre-Simon Laplace. It transforms a function of a real variable t to a function of a complex variable s. The transform has many applications in science and engineering.

Normal distribution probability distribution

In probability theory, the normaldistribution is a very common continuous probability distribution. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known. A random variable with a Gaussian distribution is said to be normally distributed and is called a normal deviate.

Probability theory is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set of axioms. Typically these axioms formalise probability in terms of a probability space, which assigns a measure taking values between 0 and 1, termed the probability measure, to a set of outcomes called the sample space. Any specified subset of these outcomes is called an event.

Random variable variable whose possible values are numerical outcomes of a random phenomenon

In probability and statistics, a random variable, random quantity, aleatory variable, or stochastic variable is described informally as a variable whose values depend on outcomes of a random phenomenon. The formal mathematical treatment of random variables is a topic in probability theory. In that context, a random variable is understood as a measurable function defined on a probability space whose outcomes are typically real numbers.

Dirac delta function pseudo-function δ such that an integral of δ(x-c)f(x) always takes the value of f(c)

In mathematics, the Dirac delta function is a generalized function or distribution introduced by the physicist Paul Dirac. It is used to model the density of an idealized point mass or point charge as a function equal to zero everywhere except for zero and whose integral over the entire real line is equal to one. As there is no function that has these properties, the computations made by the theoretical physicists appeared to mathematicians as nonsense until the introduction of distributions by Laurent Schwartz to formalize and validate the computations. As a distribution, the Dirac delta function is a linear functional that maps every function to its value at zero. The Kronecker delta function, which is usually defined on a discrete domain and takes values 0 and 1, is a discrete analog of the Dirac delta function.

In probability theory, the central limit theorem (CLT) establishes that, in some situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution even if the original variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applicable to many problems involving other types of distributions.

Heaviside step function discontinuous function whose value is zero for negative numbers and one for positive numbers

The Heaviside step function, or the unit step function, usually denoted by H or θ, is a discontinuous function, named after Oliver Heaviside (1850–1925), whose value is zero for negative arguments and one for positive arguments. It is an example of the general class of step functions, all of which can be represented as linear combinations of translations of this one.

Probability mass function Discrete-variable probability distribution

In probability and statistics, a probability mass function (PMF) is a function that gives the probability that a discrete random variable is exactly equal to some value. The probability mass function is often the primary means of defining a discrete probability distribution, and such functions exist for either scalar or multivariate random variables whose domain is discrete.

In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli, is the discrete probability distribution of a random variable which takes the value 1 with probability and the value 0 with probability that is, the probability distribution of any single experiment that asks a yes–no question; the question results in a boolean-valued outcome, a single bit whose value is success/yes/true/one with probability p and failure/no/false/zero with probability q. It can be used to represent a coin toss where 1 and 0 would represent "heads" and "tails", respectively, and p would be the probability of the coin landing on heads or tails, respectively. In particular, unfair coins would have

In mathematics, a Gaussian function, often simply referred to as a Gaussian, is a function of the form:

Rayleigh distribution

In probability theory and statistics, the Rayleigh distribution is a continuous probability distribution for nonnegative-valued random variables. It is essentially a chi distribution with two degrees of freedom.

Stable distribution

In probability theory, a distribution is said to be stable if a linear combination of two independent random variables with this distribution has the same distribution, up to location and scale parameters. A random variable is said to be stable if its distribution is stable. The stable distribution family is also sometimes referred to as the Lévy alpha-stable distribution, after Paul Lévy, the first mathematician to have studied it.

Uniform distribution (continuous) uniform distribution on an interval

In probability theory and statistics, the continuous uniform distribution or rectangular distribution is a family of symmetric probability distributions such that for each member of the family, all intervals of the same length on the distribution's support are equally probable. The support is defined by the two parameters, a and b, which are its minimum and maximum values. The distribution is often abbreviated U(a,b). It is the maximum entropy probability distribution for a random variable X under no constraint other than that it is contained in the distribution's support.

Characteristic function (probability theory) real-valued random variable completely defines its probability distribution

In probability theory and statistics, the characteristic function of any real-valued random variable completely defines its probability distribution. If a random variable admits a probability density function, then the characteristic function is the Fourier transform of the probability density function. Thus it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or cumulative distribution functions. There are particularly simple results for the characteristic functions of distributions defined by the weighted sums of random variables.

Differential entropy is a concept in information theory that began as an attempt by Shannon to extend the idea of (Shannon) entropy, a measure of average surprisal of a random variable, to continuous probability distributions. Unfortunately, Shannon did not derive this formula, and rather just assumed it was the correct continuous analogue of discrete entropy, but it is not. The actual continuous version of discrete entropy is the limiting density of discrete points (LDDP). Differential entropy is commonly encountered in the literature, but it is a limiting case of the LDDP, and one that loses its fundamental association with discrete entropy.

Continuous wavelets of compact support can be built [1], which are related to the beta distribution. The process is derived from probability distributions using blur derivative. These new wavelets have just one cycle, so they are termed unicycle wavelets. They can be viewed as a soft variety of Haar wavelets whose shape is fine-tuned by two parameters and . Closed-form expressions for beta wavelets and scale functions as well as their spectra are derived. Their importance is due to the Central Limit Theorem by Gnedenko and Kolmogorov applied for compactly supported signals [2].

In statistics, the probability integral transform or transformation relates to the result that data values that are modelled as being random variables from any given continuous distribution can be converted to random variables having a standard uniform distribution. This holds exactly provided that the distribution being used is the true distribution of the random variables; if the distribution is one fitted to the data, the result will hold approximately in large samples.

Skew normal distribution

In probability theory and statistics, the skew normal distribution is a continuous probability distribution that generalises the normal distribution to allow for non-zero skewness.

References

  1. Goldstein, Larry; Reinert, Gesine (1997), "Stein's Method and the Zero Bias Transformation with Application to Simple Random Sampling" (PDF), The Annals of Applied Probability, 7 (4): 935–952
  2. Karoui, N. El; Jiao, Y. (2009). "Stein's method and zero bias transformation for CDO tranche pricing". Finance and Stochastics. 13 (2): 151–180. doi:10.1007/s00780-008-0084-6.