# Multivariate analysis of variance

In statistics, multivariate analysis of variance (MANOVA) is a procedure for comparing multivariate sample means. As a multivariate procedure, it is used when there are two or more dependent variables, [1] and is often followed by significance tests involving individual dependent variables separately. [2]

## Relationship with ANOVA

MANOVA is a generalized form of univariate analysis of variance (ANOVA), [1] although, unlike univariate ANOVA, it uses the covariance between outcome variables in testing the statistical significance of the mean differences.

Where sums of squares appear in univariate analysis of variance, in multivariate analysis of variance certain positive-definite matrices appear. The diagonal entries are the same kinds of sums of squares that appear in univariate ANOVA. The off-diagonal entries are corresponding sums of products. Under normality assumptions about error distributions, the counterpart of the sum of squares due to error has a Wishart distribution.
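For a one-way layout, these matrices can be built directly from the group and grand means. The following is a minimal numerical sketch (data and group means are hypothetical) of the hypothesis (between-group) and error (within-group) sum-of-squares-and-products matrices; their diagonals are the usual univariate sums of squares, and their off-diagonals are the corresponding sums of cross-products:

```python
import numpy as np

# Hypothetical one-way design: 3 groups, 2 outcome variables, 20 cases each.
rng = np.random.default_rng(0)
groups = [rng.normal(loc=m, size=(20, 2)) for m in (0.0, 0.5, 1.0)]

grand_mean = np.vstack(groups).mean(axis=0)

# Between-group (hypothesis) SSCP matrix H.
H = sum(len(g) * np.outer(g.mean(axis=0) - grand_mean,
                          g.mean(axis=0) - grand_mean) for g in groups)

# Within-group (error) SSCP matrix E.
E = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)

# Both are symmetric p x p matrices; E is positive definite when the
# within-group observations span all p dimensions.
print(H)
print(E)
```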

MANOVA is based on the product of the model variance matrix, ${\displaystyle \Sigma _{\text{model}}}$, and the inverse of the error variance matrix, ${\displaystyle \Sigma _{\text{res}}^{-1}}$, that is, ${\displaystyle A=\Sigma _{\text{model}}\times \Sigma _{\text{res}}^{-1}}$. The hypothesis that ${\displaystyle \Sigma _{\text{model}}=\Sigma _{\text{res}}}$ implies that the product ${\displaystyle A\sim I}$. [3] Invariance considerations imply that the MANOVA statistic should be a measure of the magnitude of the singular value decomposition of this matrix product, but there is no unique choice owing to the multi-dimensional nature of the alternative hypothesis.

The most common [4] [5] statistics are summaries based on the roots (or eigenvalues) ${\displaystyle \lambda _{p}}$ of the ${\displaystyle A}$ matrix:

• Samuel Stanley Wilks' ${\displaystyle \Lambda _{\text{Wilks}}=\prod _{1,\ldots ,p}(1/(1+\lambda _{p}))=\det(I+A)^{-1}=\det(\Sigma _{\text{res}})/\det(\Sigma _{\text{res}}+\Sigma _{\text{model}})}$, distributed as the Wilks' lambda (Λ) distribution
• the K. C. Sreedharan Pillai–M. S. Bartlett trace, ${\displaystyle \Lambda _{\text{Pillai}}=\sum _{1,\ldots ,p}(\lambda _{p}/(1+\lambda _{p}))=\operatorname {tr} (A(I+A)^{-1})}$ [6]
• the Lawley–Hotelling trace, ${\displaystyle \Lambda _{\text{LH}}=\sum _{1,\ldots ,p}(\lambda _{p})=\operatorname {tr} (A)}$
• Roy's greatest root (also called Roy's largest root), ${\displaystyle \Lambda _{\text{Roy}}=\max _{p}(\lambda _{p})}$
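All four statistics are simple functions of the eigenvalues of ${\displaystyle A}$, so they can be computed together once those eigenvalues are available. A sketch in Python, using small hypothetical model and error SSCP matrices:

```python
import numpy as np

# Hypothetical model (hypothesis) and error SSCP matrices, both
# symmetric positive definite.
H = np.array([[8.0, 2.0], [2.0, 5.0]])   # model SSCP
E = np.array([[10.0, 1.0], [1.0, 6.0]])  # error SSCP

A = H @ np.linalg.inv(E)
lam = np.linalg.eigvals(A).real  # eigenvalues of A; real and >= 0 here

wilks = np.prod(1.0 / (1.0 + lam))   # also det(E) / det(E + H)
pillai = np.sum(lam / (1.0 + lam))   # tr(A (I + A)^-1)
lawley_hotelling = np.sum(lam)       # tr(A)
roy = lam.max()                      # largest eigenvalue

print(wilks, pillai, lawley_hotelling, roy)
```

Note that Wilks' lambda can equivalently be computed from determinants without extracting eigenvalues, since ${\displaystyle \det(I+A)=\det(\Sigma _{\text{res}}+\Sigma _{\text{model}})/\det(\Sigma _{\text{res}})}$.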

Discussion continues over the merits of each, [1] although the greatest root leads only to a bound on significance, which is not generally of practical interest. A further complication is that, except for Roy's greatest root, the distribution of these statistics under the null hypothesis is not straightforward and can only be approximated except in a few low-dimensional cases. [7] An algorithm for the distribution of Roy's largest root under the null hypothesis was derived in [8], while the distribution under the alternative is studied in [9].

The best-known approximation for Wilks' lambda was derived by C. R. Rao.

In the case of two groups, all the statistics are equivalent and the test reduces to Hotelling's T-squared test.
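The two-group equivalence can be seen numerically: with two groups the hypothesis matrix has rank one, so ${\displaystyle A}$ has a single nonzero eigenvalue ${\displaystyle \lambda _{1}}$, every statistic above is a monotone function of ${\displaystyle \lambda _{1}}$, and Hotelling's ${\displaystyle T^{2}}$ equals ${\displaystyle (n_{1}+n_{2}-2)\lambda _{1}}$. A sketch with hypothetical data:

```python
import numpy as np

# Two hypothetical groups with 3 outcome variables.
rng = np.random.default_rng(1)
x = rng.normal(size=(15, 3))
y = rng.normal(loc=0.8, size=(12, 3))
n1, n2 = len(x), len(y)

# Hypothesis (H) and error (E) SSCP matrices for the two-group design.
grand = np.vstack([x, y]).mean(axis=0)
H = sum(n * np.outer(m - grand, m - grand)
        for n, m in [(n1, x.mean(0)), (n2, y.mean(0))])
E = sum((g - g.mean(0)).T @ (g - g.mean(0)) for g in (x, y))

# A = H E^-1 has rank one: only the largest eigenvalue is nonzero.
lam = np.sort(np.linalg.eigvals(H @ np.linalg.inv(E)).real)[::-1]
lam1 = lam[0]

# Hotelling's T^2 computed directly from the pooled covariance matrix;
# it should equal (n1 + n2 - 2) * lam1.
d = x.mean(0) - y.mean(0)
S_pooled = E / (n1 + n2 - 2)
t2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.inv(S_pooled) @ d
print(t2, (n1 + n2 - 2) * lam1)
```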

## Correlation of dependent variables

MANOVA's power is affected by the correlations of the dependent variables and by the effect sizes associated with those variables. For example, when there are two groups and two dependent variables, MANOVA's power is lowest when the correlation equals the ratio of the smaller to the larger standardized effect size. [10]

## References

1. Warne, R. T. (2014). "A primer on multivariate analysis of variance (MANOVA) for behavioral scientists". Practical Assessment, Research & Evaluation. 19 (17): 1–10.
2. Stevens, J. P. (2002). Applied multivariate statistics for the social sciences. Mahwah, NJ: Lawrence Erlbaum.
3. Carey, Gregory. "Multivariate Analysis of Variance (MANOVA): I. Theory" (PDF). Retrieved 2011-03-22.
4. Garson, G. David. "Multivariate GLM, MANOVA, and MANCOVA". Retrieved 2011-03-22.
5. UCLA: Academic Technology Services, Statistical Consulting Group. "Stata Annotated Output – MANOVA". Retrieved 2011-03-22.
6. "MANOVA Basic Concepts – Real Statistics Using Excel". www.real-statistics.com. Retrieved 5 April 2018.
7. Chiani, M. (2016). "Distribution of the largest root of a matrix for Roy's test in multivariate analysis of variance". Journal of Multivariate Analysis. 143: 467–471. doi:10.1016/j.jmva.2015.10.007.
8. Johnstone, I. M.; Nadler, B. (2013). "Roy's largest root test under rank-one alternatives". arXiv:1310.6581.
9. Frane, Andrew (2015). "Power and Type I Error Control for Univariate Comparisons in Multivariate Two-Group Designs". Multivariate Behavioral Research. 50 (2): 233–247. doi:10.1080/00273171.2014.968836. PMID 26609880.