Matrix variate Dirichlet distribution

Last updated April 23, 2019

In statistics, the matrix variate Dirichlet distribution is a generalization of the matrix variate beta distribution.

Suppose $U_{1},\ldots ,U_{r}$ are $p\times p$ positive definite matrices with $\sum _{i=1}^{r}U_{i}<I_{p}$, where $I_{p}$ is the $p\times p$ identity matrix. Then we say that the $U_{i}$ have a matrix variate Dirichlet distribution, $\left(U_{1},\ldots ,U_{r}\right)\sim D_{p}\left(a_{1},\ldots ,a_{r};a_{r+1}\right)$, if their joint probability density function is

$$\left\{\beta _{p}\left(a_{1},\ldots ,a_{r},a_{r+1}\right)\right\}^{-1}\prod _{i=1}^{r}\det \left(U_{i}\right)^{a_{i}-(p+1)/2}\det \left(I_{p}-\sum _{i=1}^{r}U_{i}\right)^{a_{r+1}-(p+1)/2}$$

where $a_{i}>(p-1)/2,i=1,\ldots ,r+1$ and $\beta _{p}\left(\cdots \right)$ is the multivariate beta function.
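The density above can be evaluated numerically. The following is a minimal sketch, not a reference implementation. It assumes the standard definitions $\beta _{p}\left(a_{1},\ldots ,a_{r+1}\right)=\prod _{i}\Gamma _{p}(a_{i})/\Gamma _{p}\left(\sum _{i}a_{i}\right)$ and $\Gamma _{p}(a)=\pi ^{p(p-1)/4}\prod _{j=1}^{p}\Gamma \left(a-(j-1)/2\right)$, which the article itself does not spell out. The function names are hypothetical, and matrices are passed as plain lists of rows.

```python
import math


def log_multigamma(a, p):
    """Log of the multivariate gamma function Gamma_p(a), for a > (p - 1) / 2."""
    if a <= (p - 1) / 2:
        raise ValueError("need a > (p - 1) / 2")
    return p * (p - 1) / 4 * math.log(math.pi) + sum(
        math.lgamma(a - j / 2) for j in range(p)
    )


def log_multibeta(a, p):
    """Log of beta_p(a_1, ..., a_{r+1}) under the assumed standard definition."""
    return sum(log_multigamma(ai, p) for ai in a) - log_multigamma(sum(a), p)


def log_det_pd(m):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    p = len(m)
    chol = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = m[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                chol[i][j] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(p))


def matrix_dirichlet_logpdf(us, a):
    """Log-density of (U_1, ..., U_r) ~ D_p(a_1, ..., a_r; a_{r+1})."""
    if len(a) != len(us) + 1:
        raise ValueError("need one more parameter than matrices")
    p = len(us[0])
    # Remainder matrix I_p - sum_i U_i, which carries the last parameter.
    rest = [
        [(1.0 if i == j else 0.0) - sum(u[i][j] for u in us) for j in range(p)]
        for i in range(p)
    ]
    blocks = list(us) + [rest]
    shift = (p + 1) / 2
    log_kernel = sum((ai - shift) * log_det_pd(u) for ai, u in zip(a, blocks))
    return log_kernel - log_multibeta(a, p)
```

For $p=1$ the matrices are scalars and the expression reduces to the ordinary Dirichlet density, which gives a convenient sanity check.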

If we write $U_{r+1}=I_{p}-\sum _{i=1}^{r}U_{i}$ then the PDF takes the simpler form

$$\left\{\beta _{p}\left(a_{1},\ldots ,a_{r+1}\right)\right\}^{-1}\prod _{i=1}^{r+1}\det \left(U_{i}\right)^{a_{i}-(p+1)/2},$$

on the understanding that $\sum _{i=1}^{r+1}U_{i}=I_{p}$.

Theorems

Generalization of the chi-square-Dirichlet result

Suppose $S_{i}\sim W_{p}\left(n_{i},\Sigma \right)$, $i=1,\ldots ,r+1$, are independently distributed Wishart $p\times p$ positive definite matrices. Then, defining $U_{i}=S^{-1/2}S_{i}\left(S^{-1/2}\right)^{T}$ (where $S=\sum _{i=1}^{r+1}S_{i}$ is the sum of the matrices and $S^{1/2}\left(S^{1/2}\right)^{T}$ is any reasonable factorization of $S$), we have

$$\left(U_{1},\ldots ,U_{r}\right)\sim D_{p}\left(n_{1}/2,\ldots ,n_{r+1}/2\right).$$
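This construction also suggests a way to simulate from the distribution. The sketch below assumes NumPy is available, integer degrees of freedom $n_{i}\geq p$ (so that each Wishart draw is positive definite almost surely), and takes the Cholesky factor as the "reasonable factorization" of $S$. The function name `sample_matrix_dirichlet` is hypothetical.

```python
import numpy as np


def sample_matrix_dirichlet(dofs, sigma, rng):
    """Return U_1, ..., U_{r+1} built from independent Wishart draws.

    dofs  : integer degrees of freedom n_1, ..., n_{r+1}, each >= p (assumed)
    sigma : p x p positive definite scale matrix
    rng   : a numpy.random.Generator
    """
    sigma = np.asarray(sigma, dtype=float)
    p = sigma.shape[0]
    chol_sigma = np.linalg.cholesky(sigma)
    wisharts = []
    for n in dofs:
        # Columns of x are i.i.d. N(0, sigma), so x @ x.T ~ W_p(n, sigma).
        x = chol_sigma @ rng.standard_normal((p, n))
        wisharts.append(x @ x.T)
    total = sum(wisharts)
    # Cholesky factor c satisfies total = c @ c.T; it plays the role of S^{1/2}.
    c_inv = np.linalg.inv(np.linalg.cholesky(total))
    return [c_inv @ s @ c_inv.T for s in wisharts]


rng = np.random.default_rng(0)
us = sample_matrix_dirichlet([5, 6, 7], np.eye(3), rng)
```

The first $r$ matrices in the returned list are the draw itself, and the last one is the remainder $I_{p}-\sum _{i=1}^{r}U_{i}$, so the full list sums to the identity up to rounding error.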

Marginal distribution

If $\left(U_{1},\ldots ,U_{r}\right)\sim D_{p}\left(a_{1},\ldots ,a_{r+1}\right)$ and $s\leq r$, then:

$$\left(U_{1},\ldots ,U_{s}\right)\sim D_{p}\left(a_{1},\ldots ,a_{s},\sum _{i=s+1}^{r+1}a_{i}\right)$$
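As an illustration, take $s=1$ and write $b=\sum _{i=2}^{r+1}a_{i}$ (a symbol introduced here only for brevity). The marginal result then says that a single matrix $U_{1}$ has the $r=1$ form of the density defined earlier, which is the matrix variate beta case:

```latex
U_{1}\sim D_{p}\left(a_{1};b\right),
\qquad
f(U_{1})=\left\{\beta _{p}\left(a_{1},b\right)\right\}^{-1}
\det \left(U_{1}\right)^{a_{1}-(p+1)/2}
\det \left(I_{p}-U_{1}\right)^{b-(p+1)/2}.
```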

Conditional distribution

Also, with the same notation as above, the density of $\left(U_{s+1},\ldots ,U_{r}\right)\mid \left(U_{1},\ldots ,U_{s}\right)$ is given by

$${\frac {\prod _{i=s+1}^{r+1}\det \left(U_{i}\right)^{a_{i}-(p+1)/2}}{\beta _{p}\left(a_{s+1},\ldots ,a_{r+1}\right)\det \left(I_{p}-\sum _{i=1}^{s}U_{i}\right)^{\sum _{i=s+1}^{r+1}a_{i}-(p+1)/2}}}$$

where we write $U_{r+1}=I_{p}-\sum _{i=1}^{r}U_{i}$.
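One way to see where this expression comes from is to divide the joint density by the marginal density of $\left(U_{1},\ldots ,U_{s}\right)$. The sketch below writes $A=\sum _{i=s+1}^{r+1}a_{i}$ (a shorthand introduced here) and assumes the standard definition $\beta _{p}\left(a_{1},\ldots ,a_{k}\right)=\prod _{i}\Gamma _{p}(a_{i})/\Gamma _{p}\left(\sum _{i}a_{i}\right)$, which is not stated in the article:

```latex
\frac{\left\{\beta _{p}\left(a_{1},\ldots ,a_{r+1}\right)\right\}^{-1}
      \prod _{i=1}^{r+1}\det \left(U_{i}\right)^{a_{i}-(p+1)/2}}
     {\left\{\beta _{p}\left(a_{1},\ldots ,a_{s},A\right)\right\}^{-1}
      \prod _{i=1}^{s}\det \left(U_{i}\right)^{a_{i}-(p+1)/2}
      \det \left(I_{p}-\sum _{i=1}^{s}U_{i}\right)^{A-(p+1)/2}}
=
\frac{\beta _{p}\left(a_{1},\ldots ,a_{s},A\right)}
     {\beta _{p}\left(a_{1},\ldots ,a_{r+1}\right)}
\cdot
\frac{\prod _{i=s+1}^{r+1}\det \left(U_{i}\right)^{a_{i}-(p+1)/2}}
     {\det \left(I_{p}-\sum _{i=1}^{s}U_{i}\right)^{A-(p+1)/2}},
\qquad
\frac{\beta _{p}\left(a_{1},\ldots ,a_{s},A\right)}
     {\beta _{p}\left(a_{1},\ldots ,a_{r+1}\right)}
=\frac{\Gamma _{p}(A)}{\prod _{i=s+1}^{r+1}\Gamma _{p}(a_{i})}
=\frac{1}{\beta _{p}\left(a_{s+1},\ldots ,a_{r+1}\right)}.
```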

Partitioned distribution

Suppose $\left(U_{1},\ldots ,U_{r}\right)\sim D_{p}\left(a_{1},\ldots ,a_{r+1}\right)$ and suppose that $S_{1},\ldots ,S_{t}$ is a partition of $\left[r+1\right]=\left\{1,\ldots ,r+1\right\}$ (that is, $\cup _{i=1}^{t}S_{i}=\left[r+1\right]$ and $S_{i}\cap S_{j}=\emptyset$ if $i\neq j$). Then, writing $U_{(j)}=\sum _{i\in S_{j}}U_{i}$ and $a_{(j)}=\sum _{i\in S_{j}}a_{i}$ (with $U_{r+1}=I_{p}-\sum _{i=1}^{r}U_{i}$), we have:

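$$\left(U_{(1)},\ldots ,U_{(t-1)}\right)\sim D_{p}\left(a_{(1)},\ldots ,a_{(t)}\right),$$

where the last block is determined by the others, since $U_{(t)}=I_{p}-\sum _{j=1}^{t-1}U_{(j)}$.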
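A small worked case may help. Take $r=3$ and the partition $S_{1}=\{1,2\}$, $S_{2}=\{3,4\}$ (chosen here purely for illustration). The two aggregated matrices sum to the identity, so the statement reduces to a single matrix with the $r=1$ form of the density:

```latex
U_{(1)}=U_{1}+U_{2},\qquad
U_{(2)}=U_{3}+U_{4}=I_{p}-U_{(1)},\qquad
a_{(1)}=a_{1}+a_{2},\qquad
a_{(2)}=a_{3}+a_{4},
\qquad\text{so}\qquad
U_{(1)}\sim D_{p}\left(a_{1}+a_{2};\,a_{3}+a_{4}\right).
```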

Partitions

Suppose $\left(U_{1},\ldots ,U_{r}\right)\sim D_{p}\left(a_{1},\ldots ,a_{r+1}\right)$. Define

$$U_{i}=\left({\begin{array}{cc}U_{11(i)}&U_{12(i)}\\U_{21(i)}&U_{22(i)}\end{array}}\right),\qquad i=1,\ldots ,r,$$

where $U_{11(i)}$ is $p_{1}\times p_{1}$ and $U_{22(i)}$ is $p_{2}\times p_{2}$, with $p_{1}+p_{2}=p$. Writing the Schur complement $U_{22\cdot 1(i)}=U_{22(i)}-U_{21(i)}U_{11(i)}^{-1}U_{12(i)}$, we have

$$\left(U_{11(1)},\ldots ,U_{11(r)}\right)\sim D_{p_{1}}\left(a_{1},\ldots ,a_{r+1}\right)$$

and

$$\left(U_{22\cdot 1(1)},\ldots ,U_{22\cdot 1(r)}\right)\sim D_{p_{2}}\left(a_{1}-p_{1}/2,\ldots ,a_{r}-p_{1}/2,a_{r+1}-p_{1}/2+p_{1}r/2\right).$$
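The block quantities in this result are straightforward to compute. The sketch below assumes NumPy and uses hypothetical helper names. It takes the Schur complement in its usual form, $U_{22}-U_{21}U_{11}^{-1}U_{12}$, and only illustrates the linear algebra, not the distributional claim.

```python
import numpy as np


def split_blocks(u, p1):
    """Split a p x p matrix into blocks U11 (p1 x p1), U12, U21, U22."""
    u = np.asarray(u, dtype=float)
    return u[:p1, :p1], u[:p1, p1:], u[p1:, :p1], u[p1:, p1:]


def schur_complement(u, p1):
    """Schur complement U22 - U21 U11^{-1} U12 of the leading p1 x p1 block."""
    u11, u12, u21, u22 = split_blocks(u, p1)
    # Solve a linear system rather than forming the inverse of U11 explicitly.
    return u22 - u21 @ np.linalg.solve(u11, u12)


u = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
u11 = split_blocks(u, 1)[0]
u22_1 = schur_complement(u, 1)
```

A useful check is the determinant identity $\det (U)=\det (U_{11})\det (U_{22\cdot 1})$, which holds for any block matrix with invertible $U_{11}$.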

References

A. K. Gupta and D. K. Nagar 1999. "Matrix variate distributions". Chapman and Hall.


This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.