Varimax rotation


In statistics, a varimax rotation is used to simplify the expression of a particular sub-space in terms of just a few major items each. The actual coordinate system is unchanged; it is the orthogonal basis that is being rotated to align with those coordinates. The sub-space found with principal component analysis or factor analysis is expressed as a dense basis with many non-zero weights, which makes it hard to interpret. Varimax is so called because it maximizes the sum of the variances of the squared loadings (squared correlations between variables and factors). Preserving orthogonality requires that it is a rotation that leaves the sub-space invariant. Intuitively, this is achieved if (a) any given variable has a high loading on a single factor but near-zero loadings on the remaining factors, and (b) any given factor is constituted by only a few variables with very high loadings on this factor while the remaining variables have near-zero loadings on it. If these conditions hold, the factor loading matrix is said to have "simple structure," and varimax rotation brings the loading matrix closer to such simple structure (as much as the data allow). From the perspective of individuals measured on the variables, varimax seeks a basis that most economically represents each individual: that is, each individual can be well described by a linear combination of only a few basis functions.


One way of expressing the varimax criterion formally is this:
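With Λ denoting the p × k matrix of loadings (p variables, k factors) and R ranging over k × k orthogonal rotation matrices (notation introduced here for concreteness):

R_{\mathrm{VARIMAX}} = \arg\max_{R} \sum_{j=1}^{k} \left[ \frac{1}{p} \sum_{i=1}^{p} (\Lambda R)_{ij}^{4} - \left( \frac{1}{p} \sum_{i=1}^{p} (\Lambda R)_{ij}^{2} \right)^{2} \right]

Each bracketed term is the variance of the squared loadings in the j-th rotated column, consistent with the description above.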

Suggested by Henry Felix Kaiser in 1958,[1] it is a popular scheme for orthogonal rotation (where all factors remain uncorrelated with one another).

Rotation in factor analysis

A summary of the use of varimax rotation and of other types of factor rotation is presented in the article on factor analysis.

Implementations
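Varimax rotation is available in standard statistical software; for example, SAS offers it through the ROTATE=VARIMAX option of PROC FACTOR,[2] and R provides a varimax function in its base stats package. As a rough, illustrative sketch of the underlying computation (the function name, convergence tolerance, and iteration limit below are arbitrary choices, not taken from any particular package), the criterion can be maximized by the classical SVD-based iteration:

import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a p x k loading matrix toward the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix R.
    gamma=1.0 gives varimax; other values give other orthomax criteria.
    """
    p, k = loadings.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        d_old = d
        L = loadings @ R
        # Gradient (up to a constant factor) of the orthomax criterion with respect to R
        grad = loadings.T @ (L**3 - (gamma / p) * L @ np.diag((L**2).sum(axis=0)))
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt            # nearest orthogonal matrix to the gradient (its polar factor)
        d = s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
    return loadings @ R, R

Applied to the loading matrix from a principal component or factor analysis, the returned rotation leaves the spanned sub-space unchanged while concentrating each column's large loadings on a few variables.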

See also

Notes

  1. Henry F. Kaiser (September 1958). "The varimax criterion for analytic rotation in factor analysis". Psychometrika. 23 (3). doi:10.1007/BF02289233.
  2. "SAS/STAT(R) 9.22 User's Guide". support.sas.com.

This article incorporates public domain material from the National Institute of Standards and Technology website https://www.nist.gov.


Related Research Articles

Principal component analysis: Conversion of observations of possibly correlated variables into values of fewer, uncorrelated variables

The principal components of a collection of points in a real coordinate space are a sequence of p unit vectors, where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i − 1 vectors. Here, a best-fitting line is defined as one that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest.
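As a minimal illustration (assuming a data matrix X with observations in rows and variables in columns; the names here are arbitrary), the components can be obtained from the singular value decomposition of the centered data:

import numpy as np

def pca(X, n_components=2):
    """Return the first n_components principal directions and the projected data."""
    Xc = X - X.mean(axis=0)                    # center each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]             # orthonormal principal directions
    scores = Xc @ components.T                 # data expressed in the new basis
    return components, scores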

Symmetric matrix: Matrix equal to its transpose

In linear algebra, a symmetric matrix is a square matrix that is equal to its transpose. Formally, a matrix A is symmetric if A = A^T, that is, if a_{ij} = a_{ji} for all indices i and j.

Square matrix: Matrix with the same number of rows and columns

In mathematics, a square matrix is a matrix with the same number of rows and columns. An n-by-n matrix is known as a square matrix of order n. Any two square matrices of the same order can be added and multiplied.

In mathematics, particularly in linear algebra, a skew-symmetric matrix is a square matrix whose transpose equals its negative. That is, a matrix A is skew-symmetric if it satisfies the condition A^T = −A.

In mathematics, a Hermitian matrix is a complex square matrix that is equal to its own conjugate transpose—that is, the element in the i-th row and j-th column is equal to the complex conjugate of the element in the j-th row and i-th column, for all indices i and j: a_{ij} = \overline{a_{ji}}, or in matrix form, A = A^H.

In linear algebra, a square matrix A is called diagonalizable or non-defective if it is similar to a diagonal matrix, i.e., if there exists an invertible matrix P and a diagonal matrix D such that P^{-1}AP = D, or equivalently A = PDP^{-1}. For a finite-dimensional vector space V, a linear map T : V → V is called diagonalizable if there exists an ordered basis of V consisting of eigenvectors of T. These definitions are equivalent: if T has a matrix representation A = PDP^{-1} as above, then the column vectors of P form a basis consisting of eigenvectors of T, and the diagonal entries of D are the corresponding eigenvalues of T; with respect to this eigenvector basis, T is represented by D. Diagonalization is the process of finding the above P and D.
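A small numerical sketch of this (using NumPy; the symbol names A, P, D follow the description above and are otherwise arbitrary):

import numpy as np

# A diagonalizable (here symmetric) matrix
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, P = np.linalg.eig(A)    # columns of P are eigenvectors of A
D = np.diag(eigvals)             # diagonal matrix of the eigenvalues

# Verify A = P D P^{-1} up to floating-point error
assert np.allclose(A, P @ D @ np.linalg.inv(P))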

Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it is possible that variations in six observed variables mainly reflect the variations in two unobserved (underlying) variables. Factor analysis searches for such joint variations in response to unobserved latent variables. The observed variables are modelled as linear combinations of the potential factors plus "error" terms, hence factor analysis can be thought of as a special case of errors-in-variables models.
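In one common notation (chosen here for illustration), an observed vector x of p variables is modelled with k < p factors f as:

x = \mu + \Lambda f + \varepsilon

where Λ is the p × k matrix of loadings, μ the mean vector, and ε the error term.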

In mathematics, the Rayleigh quotient for a given complex Hermitian matrix M and nonzero vector x is defined as R(M, x) = (x* M x) / (x* x), where x* denotes the conjugate transpose of x.

In mathematics, spectral theory is an inclusive term for theories extending the eigenvector and eigenvalue theory of a single square matrix to a much broader theory of the structure of operators in a variety of mathematical spaces. It is a result of studies of linear algebra and the solutions of systems of linear equations and their generalizations. The theory is connected to that of analytic functions because the spectral properties of an operator are related to analytic functions of the spectral parameter.

The Gell-Mann matrices, developed by Murray Gell-Mann, are a set of eight linearly independent 3×3 traceless Hermitian matrices used in the study of the strong interaction in particle physics. They span the Lie algebra of the SU(3) group in the defining representation.

In mathematics, the polar decomposition of a square real or complex matrix A is a factorization of the form A = UP, where U is a unitary matrix and P is a positive semi-definite Hermitian matrix, both square and of the same size.
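One way to compute it (a sketch, assuming NumPy and a square matrix A) is via the singular value decomposition A = W Σ V^H, from which U = W V^H and P = V Σ V^H:

import numpy as np

def polar_decomposition(A):
    """Return (U, P) with A = U @ P, U unitary, P positive semi-definite Hermitian."""
    W, sigma, Vh = np.linalg.svd(A)
    U = W @ Vh
    P = Vh.conj().T @ np.diag(sigma) @ Vh
    return U, P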

In the mathematical field of graph theory, the Laplacian matrix, also called the graph Laplacian, admittance matrix, Kirchhoff matrix or discrete Laplacian, is a matrix representation of a graph. The Laplacian matrix can be used to find many useful properties of a graph. Together with Kirchhoff's theorem, it can be used to calculate the number of spanning trees for a given graph. The sparsest cut of a graph can be approximated through the second smallest eigenvalue of its Laplacian by Cheeger's inequality. It can also be used to construct low dimensional embeddings, which can be useful for a variety of machine learning applications.
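For a simple undirected graph with adjacency matrix A and degree matrix D (standard notation, not taken from the text above), the Laplacian is L = D − A; a minimal construction in NumPy:

import numpy as np

# Adjacency matrix of a path graph on three vertices: 0 -- 1 -- 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # graph Laplacian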

In linear algebra, an eigenvector or characteristic vector of a linear transformation is a nonzero vector that changes at most by a scalar factor when that linear transformation is applied to it. The corresponding eigenvalue, often denoted by λ, is the factor by which the eigenvector is scaled.
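In symbols (standard notation), for a transformation represented by a matrix A, an eigenvector v and its eigenvalue λ satisfy:

A v = \lambda v, \qquad v \neq 0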

Non-linear least squares is the form of least squares analysis used to fit a set of m observations with a model that is non-linear in n unknown parameters (m ≥ n). It is used in some forms of nonlinear regression. The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. There are many similarities to linear least squares, but also some significant differences. In economic theory, the non-linear least squares method is applied in (i) probit regression, (ii) threshold regression, (iii) smooth regression, (iv) logistic link regression, and (v) Box–Cox transformed regressors.
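A minimal sketch of the successive-linearization idea (Gauss-Newton, one common variant; the model, data, and starting values below are illustrative only):

import numpy as np

def gauss_newton(residual, jacobian, beta0, max_iter=50, tol=1e-10):
    """Minimize sum(residual(beta)**2) for a model that is non-linear in beta."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        r = residual(beta)                               # m residuals
        J = jacobian(beta)                               # m x n Jacobian of the residuals
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)    # linearized least-squares step
        beta = beta + step
        if np.linalg.norm(step) < tol:
            break
    return beta

# Example: fit y = a * exp(b * t) to data, with unknown parameters a and b
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([2.0, 2.7, 3.6, 4.9])
residual = lambda beta: beta[0] * np.exp(beta[1] * t) - y
jacobian = lambda beta: np.column_stack([np.exp(beta[1] * t),
                                         beta[0] * t * np.exp(beta[1] * t)])
a_hat, b_hat = gauss_newton(residual, jacobian, beta0=[1.0, 0.1])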

In multivariate statistics, exploratory factor analysis (EFA) is a statistical method used to uncover the underlying structure of a relatively large set of variables. EFA is a technique within factor analysis whose overarching goal is to identify the underlying relationships between measured variables. It is commonly used by researchers when developing a scale and serves to identify a set of latent constructs underlying a battery of measured variables. It should be used when the researcher has no a priori hypothesis about factors or patterns of measured variables. Measured variables are any one of several attributes of people that may be observed and measured. Examples of measured variables could be the physical height, weight, and pulse rate of a human being. Usually, researchers would have a large number of measured variables, which are assumed to be related to a smaller number of "unobserved" factors. Researchers must carefully consider the number of measured variables to include in the analysis. EFA procedures are more accurate when each factor is represented by multiple measured variables in the analysis.

In mathematics, the method of steepest descent or saddle-point method is an extension of Laplace's method for approximating an integral, where one deforms a contour integral in the complex plane to pass near a stationary point, in roughly the direction of steepest descent or stationary phase. The saddle-point approximation is used with integrals in the complex plane, whereas Laplace’s method is used with real integrals.

Rock mass plasticity

Plasticity theory for rocks is concerned with the response of rocks to loads beyond the elastic limit. Historically, conventional wisdom has it that rock is brittle and fails by fracture, while plasticity is identified with ductile materials. In field-scale rock masses, structural discontinuities exist in the rock, indicating that failure has taken place. Since the rock has not fallen apart, contrary to the expectation of brittle behavior, clearly elasticity theory is not the last word.

Functional principal component analysis (FPCA) is a statistical method for investigating the dominant modes of variation of functional data. Using this method, a random function is represented in the eigenbasis, which is an orthonormal basis of the Hilbert space L2 that consists of the eigenfunctions of the autocovariance operator. FPCA represents functional data in the most parsimonious way, in the sense that when using a fixed number of basis functions, the eigenfunction basis explains more variation than any other basis expansion. FPCA can be applied for representing random functions, or in functional regression and classification.
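In the usual notation (not fixed by the text above), a random function X(t) with mean μ(t) is expanded in the eigenfunctions φ_k of the autocovariance operator as:

X(t) = \mu(t) + \sum_{k=1}^{\infty} \xi_k \varphi_k(t)

where the scores ξ_k are uncorrelated random variables whose variances are the corresponding eigenvalues; truncating the sum at K terms gives the K-component FPCA representation.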

Regularized least squares (RLS) is a family of methods for solving the least-squares problem while using regularization to further constrain the resulting solution.
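A common special case is ridge (Tikhonov) regularization, which, for a design matrix X, response vector y, and penalty λ > 0, minimizes ||Xw − y||² + λ||w||² and has the closed-form solution (a standard result, stated here for illustration):

\hat{w} = (X^{\mathsf{T}} X + \lambda I)^{-1} X^{\mathsf{T}} y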

In statistics, the complex Wishart distribution is a complex version of the Wishart distribution. It is the distribution of n times the sample Hermitian covariance matrix of n zero-mean independent Gaussian random variables. It has support for Hermitian positive definite matrices.