Group family

In probability theory, especially as that field is used in statistics, a group family of probability distributions is a family obtained by subjecting a random variable with a fixed distribution to a suitable family of transformations such as a location-scale family, or otherwise a family of probability distributions acted upon by a group. [1]
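For example, the family of all normal distributions is a group family: every member can be generated from a single standard normal random variable by a location-scale transformation,

$$Z \sim N(0,1), \qquad Y = \mu + \sigma Z \sim N(\mu, \sigma^2), \qquad \mu \in \mathbb{R},\ \sigma > 0,$$

and the transformations $z \mapsto \mu + \sigma z$ form a group under composition.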

Consideration of a particular family of distributions as a group family can, in statistical theory, lead to the identification of an ancillary statistic. [2]
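For example, if $X_1, \ldots, X_n$ is a sample from a location family, so that $X_i = a + \varepsilon_i$ with the $\varepsilon_i$ independently drawn from a fixed distribution, then the vector of residuals

$$(X_1 - \bar{X}, \ldots, X_n - \bar{X}) = (\varepsilon_1 - \bar{\varepsilon}, \ldots, \varepsilon_n - \bar{\varepsilon})$$

has a distribution that does not depend on the location parameter $a$, and is therefore an ancillary statistic.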

Types of group families

A group family can be generated by subjecting a random variable with a fixed distribution to some suitable transformations. [1] Different types of group families are as follows:

Location Family

This family is obtained by adding a constant to a random variable. Let $X$ be a random variable and $a \in \mathbb{R}$ be a constant. Let $Y = X + a$. Then

$$F_Y(y) = P(X + a \le y) = P(X \le y - a) = F_X(y - a).$$

For a fixed distribution $F_X$, as $a$ varies from $-\infty$ to $\infty$, the distributions that we obtain constitute the location family.
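For instance, taking $X$ to be standard normal and shifting it by $a$ gives $Y = X + a \sim N(a, 1)$, so $\{N(a,1) : a \in \mathbb{R}\}$ is a location family. When $X$ has a density, the same calculation in terms of densities reads

$$f_Y(y) = f_X(y - a).$$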

Scale Family

This family is obtained by multiplying a random variable by a positive constant. Let $X$ be a random variable and $c > 0$ be a constant. Let $Y = cX$. Then

$$F_Y(y) = P(cX \le y) = P(X \le y/c) = F_X(y/c).$$

For a fixed distribution $F_X$, as $c$ varies over $(0, \infty)$, the distributions that we obtain constitute the scale family.
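For instance, if $X$ follows a standard exponential distribution with $F_X(x) = 1 - e^{-x}$ for $x \ge 0$, then $Y = cX$ has $F_Y(y) = 1 - e^{-y/c}$, so the exponential distributions with scale parameter $c > 0$ form a scale family. When $X$ has a density, the corresponding relation for densities is

$$f_Y(y) = \frac{1}{c}\, f_X\!\left(\frac{y}{c}\right).$$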

Location-Scale Family

This family is obtained by multiplying a random variable by a constant and then adding some other constant to it. Let $X$ be a random variable, and let $a \in \mathbb{R}$ and $b > 0$ be constants. Let $Y = a + bX$. Then

$$F_Y(y) = P(a + bX \le y) = P\!\left(X \le \frac{y - a}{b}\right) = F_X\!\left(\frac{y - a}{b}\right).$$

Note that it is important that $a \in \mathbb{R}$ and $b > 0$ in order to satisfy the properties mentioned in the following section.
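For instance, if $X$ is uniformly distributed on $(0,1)$, then $Y = a + bX$ is uniformly distributed on $(a, a+b)$, so the uniform distributions on bounded intervals form a location-scale family. When $X$ has a density, the relation for densities is

$$f_Y(y) = \frac{1}{b}\, f_X\!\left(\frac{y - a}{b}\right).$$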

Properties of the transformations

The transformations applied to the random variable must satisfy the following properties: [1] closure under composition (the composition of any two transformations in the set is again in the set) and closure under inversion (every transformation in the set has an inverse that also belongs to the set). In other words, the set of transformations must form a group under composition; this is what makes the resulting family of distributions a group family.
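For the location-scale transformations $g_{a,b}(x) = a + bx$ with $a \in \mathbb{R}$ and $b > 0$, both properties are easy to verify:

$$(g_{a,b} \circ g_{c,d})(x) = a + b(c + dx) = (a + bc) + (bd)\,x = g_{a+bc,\,bd}(x), \qquad g_{a,b}^{-1}(x) = \frac{x - a}{b} = g_{-a/b,\,1/b}(x).$$

Since $bd > 0$ and $1/b > 0$ whenever $b, d > 0$, the composition and the inverse again belong to the family, so the location-scale transformations form a group under composition.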

References

  1. Lehmann, E. L.; Casella, George (1998). Theory of Point Estimation (2nd ed.). Springer. ISBN 0-387-98502-6.
  2. Cox, D. R. (2006). Principles of Statistical Inference. Cambridge University Press. ISBN 0-521-68567-2. (Section 4.4.2)