In multilinear algebra, the tensor rank decomposition [1] or rank-$R$ decomposition is the decomposition of a tensor as a sum of $R$ rank-1 tensors, where $R$ is minimal. Computing this decomposition is hard in general; indeed, even determining the rank of a tensor is NP-hard (see below).
Canonical polyadic decomposition (CPD) is a variant of the tensor rank decomposition in which the tensor is approximated by a sum of K rank-1 tensors for a user-specified K. The CP decomposition has found applications in fields such as linguistics and chemometrics. It was introduced by Frank Lauren Hitchcock in 1927 [2] and later rediscovered several times, notably in psychometrics. [3] [4] For this reason, the CP decomposition is also referred to as CANDECOMP, [3] PARAFAC, [4] or CANDECOMP/PARAFAC (CP). Note that the PARAFAC2 decomposition is a variation of the CP decomposition. [5]
Another popular generalization of the matrix SVD known as the higher-order singular value decomposition computes orthonormal mode matrices and has found applications in econometrics, signal processing, computer vision, computer graphics, and psychometrics.
A scalar variable is denoted by a lower case italic letter, $a$, and an upper bound scalar by an upper case italic letter, $A$.
Indices are denoted by a combination of lower and upper case italic letters, $1 \le i \le I$. Multiple indices that one might encounter when referring to the multiple modes of a tensor are conveniently denoted by $i_m$, where $1 \le i_m \le I_m$ for $1 \le m \le M$.
A vector is denoted by a lower case bold letter, $\mathbf{a}$, and a matrix by an upper case bold letter, $\mathbf{A}$.
A higher-order tensor is denoted by a calligraphic letter, $\mathcal{A}$. An element of an $M$-th-order tensor $\mathcal{A} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_M}$ is denoted by $a_{i_1 i_2 \cdots i_M}$ or $\mathcal{A}_{i_1 i_2 \cdots i_M}$.
A data tensor $\mathcal{X} \in \mathbb{C}^{I_0 \times I_1 \times \cdots \times I_C}$ is a collection of multivariate observations organized into an $M$-way array, where $M = C + 1$. Every tensor may be represented, for a suitably large $R$, as a linear combination of $R$ rank-1 tensors:

$\mathcal{X} = \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r^{(1)} \otimes \mathbf{a}_r^{(2)} \otimes \cdots \otimes \mathbf{a}_r^{(M)},$

where $\lambda_r \in \mathbb{C}$ and $\mathbf{a}_r^{(m)} \in \mathbb{C}^{I_m}$ for $1 \le m \le M$. When the number of terms $R$ is minimal in the above expression, then $R$ is called the rank of the tensor, and the decomposition is often referred to as a (tensor) rank decomposition, minimal CP decomposition, or canonical polyadic decomposition (CPD). If the number of terms is not minimal, then the above decomposition is often referred to as a CANDECOMP/PARAFAC or polyadic decomposition.
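The polyadic model above is easy to realize numerically. The following numpy sketch (illustrative; the variable names are ours) assembles an order-3 tensor from $R$ rank-1 terms exactly as in the expression above:

```python
# Build an order-3 tensor as a sum of R rank-1 terms lambda_r * a_r ⊗ b_r ⊗ c_r,
# matching the polyadic decomposition above (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 2                      # mode sizes and number of terms

lam = rng.standard_normal(R)                 # weights lambda_r
A = rng.standard_normal((I, R))              # columns a_r^(1)
B = rng.standard_normal((J, R))              # columns a_r^(2)
C = rng.standard_normal((K, R))              # columns a_r^(3)

# X_{ijk} = sum_r lam_r * A_{ir} B_{jr} C_{kr}
X = np.einsum('r,ir,jr,kr->ijk', lam, A, B, C)
print(X.shape)                               # (4, 5, 6); a tensor of rank <= 2
```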
Contrary to the case of matrices, computing the rank of a tensor is NP-hard. [6] The only notable well-understood case consists of tensors in $F^{m \times n \times 2}$, whose rank can be obtained from the Kronecker–Weierstrass normal form of the linear matrix pencil that the tensor represents. [7] A simple polynomial-time algorithm exists for certifying that a tensor is of rank 1, namely the higher-order singular value decomposition.
The rank of the zero tensor is zero by convention. The rank of a tensor $\mathbf{a}^{(1)} \otimes \mathbf{a}^{(2)} \otimes \cdots \otimes \mathbf{a}^{(M)}$ is one, provided that $\mathbf{a}^{(m)} \neq \mathbf{0}$ for all $1 \le m \le M$.
The rank of a tensor depends on the field over which the tensor is decomposed. It is known that some real tensors may admit a complex decomposition whose rank is strictly less than the rank of any real decomposition of the same tensor. As an example, [8] consider the following real tensor

$\mathcal{A} = \mathbf{e}_1 \otimes \mathbf{e}_1 \otimes \mathbf{e}_1 + \mathbf{e}_1 \otimes \mathbf{e}_2 \otimes \mathbf{e}_2 + \mathbf{e}_2 \otimes \mathbf{e}_1 \otimes \mathbf{e}_2 - \mathbf{e}_2 \otimes \mathbf{e}_2 \otimes \mathbf{e}_1,$

where $\mathbf{e}_i$ denotes the $i$-th standard basis vector. The rank of this tensor over the reals is known to be 3, while its complex rank is only 2, because it is the sum of a complex rank-1 tensor with its complex conjugate, namely

$\mathcal{A} = \frac{1}{2}\left( \bar{\mathbf{z}} \otimes \bar{\mathbf{z}} \otimes \mathbf{z} + \mathbf{z} \otimes \mathbf{z} \otimes \bar{\mathbf{z}} \right),$

where $\mathbf{z} = \mathbf{e}_1 + i\,\mathbf{e}_2$.
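The claimed complex decomposition can be checked numerically. Here is a small numpy verification (ours, for illustration) that the two conjugate rank-1 terms reproduce the real tensor:

```python
# Check that (1/2)(conj(z)⊗conj(z)⊗z + z⊗z⊗conj(z)) with z = e1 + i*e2
# equals the real tensor A defined above (illustrative sketch).
import numpy as np

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def outer3(a, b, c):
    # rank-1 order-3 tensor a ⊗ b ⊗ c
    return np.einsum('i,j,k->ijk', a, b, c)

A = (outer3(e1, e1, e1) + outer3(e1, e2, e2)
     + outer3(e2, e1, e2) - outer3(e2, e2, e1))

z = e1 + 1j * e2
A2 = 0.5 * (outer3(z.conj(), z.conj(), z) + outer3(z, z, z.conj()))
print(np.allclose(A, A2))   # True: two complex rank-1 terms reproduce A
```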
In contrast, the rank of a real matrix never decreases under a field extension to $\mathbb{C}$: real matrix rank and complex matrix rank coincide for real matrices.
The generic rank is defined as the least rank $r$ such that the closure in the Zariski topology of the set of tensors of rank at most $r$ is the entire space $\mathbb{C}^{I_1 \times \cdots \times I_M}$. In the case of complex tensors, tensors of rank at most the generic rank form a dense set $S$: every tensor in the aforementioned space is either of rank less than the generic rank, or it is the limit in the Euclidean topology of a sequence of tensors from $S$. In the case of real tensors, the set of tensors of rank at most the generic rank only forms an open set of positive measure in the Euclidean topology. There may exist Euclidean-open sets of tensors of rank strictly higher than the generic rank. All ranks appearing on open sets in the Euclidean topology are called typical ranks. The smallest typical rank is called the generic rank; this definition applies to both complex and real tensors. The generic rank of tensor spaces was initially studied in 1983 by Volker Strassen. [9]
As an illustration of the above concepts, it is known that both 2 and 3 are typical ranks of $\mathbb{R}^{2 \times 2 \times 2}$, while the generic rank of $\mathbb{C}^{2 \times 2 \times 2}$ is 2. Practically, this means that a randomly sampled real tensor (from a continuous probability measure on the space of tensors) of size $2 \times 2 \times 2$ will be a rank-1 tensor with probability zero, a rank-2 tensor with positive probability, and a rank-3 tensor with positive probability. On the other hand, a randomly sampled complex tensor of the same size will be a rank-1 tensor with probability zero, a rank-2 tensor with probability one, and a rank-3 tensor with probability zero. It is even known that the generic rank-3 real tensor in $\mathbb{R}^{2 \times 2 \times 2}$ is of complex rank equal to 2.
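For real $2 \times 2 \times 2$ tensors, the real rank of a generic tensor is determined by the sign of Cayley's hyperdeterminant: a positive sign corresponds to real rank 2 and a negative sign to real rank 3. The Monte Carlo sketch below (our own, not from the cited sources) estimates the probability of drawing each typical rank under Gaussian sampling:

```python
# Estimate the probability of each typical rank of a random real 2x2x2 tensor
# via the sign of Cayley's hyperdeterminant (illustrative sketch).
import numpy as np

def hyperdet(t):
    # Cayley's 2x2x2 hyperdeterminant of t[i,j,k], i,j,k in {0,1}
    return (t[0,0,0]**2 * t[1,1,1]**2 + t[0,0,1]**2 * t[1,1,0]**2
            + t[0,1,0]**2 * t[1,0,1]**2 + t[1,0,0]**2 * t[0,1,1]**2
            - 2 * (t[0,0,0]*t[0,0,1]*t[1,1,0]*t[1,1,1]
                   + t[0,0,0]*t[0,1,0]*t[1,0,1]*t[1,1,1]
                   + t[0,0,0]*t[1,0,0]*t[0,1,1]*t[1,1,1]
                   + t[0,0,1]*t[0,1,0]*t[1,0,1]*t[1,1,0]
                   + t[0,0,1]*t[1,0,0]*t[0,1,1]*t[1,1,0]
                   + t[0,1,0]*t[1,0,0]*t[0,1,1]*t[1,0,1])
            + 4 * (t[0,0,0]*t[0,1,1]*t[1,0,1]*t[1,1,0]
                   + t[0,0,1]*t[0,1,0]*t[1,0,0]*t[1,1,1]))

rng = np.random.default_rng(1)
signs = [np.sign(hyperdet(rng.standard_normal((2, 2, 2)))) for _ in range(50_000)]
print("fraction of rank 2:", np.mean(np.array(signs) > 0))  # strictly between 0 and 1
```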
The generic rank of tensor spaces depends on the distinction between balanced and unbalanced tensor spaces. A tensor space $F^{I_1 \times \cdots \times I_M}$, where $I_1 \ge I_2 \ge \cdots \ge I_M$, is called unbalanced whenever

$I_1 > 1 + \prod_{m=2}^{M} I_m - \sum_{m=2}^{M} (I_m - 1),$
and it is called balanced otherwise.
When the first factor is very large with respect to the other factors in the tensor product, the tensor space essentially behaves as a matrix space. The generic rank of tensors living in an unbalanced tensor space is known to equal

$r = \min\left\{ I_1,\ \prod_{m=2}^{M} I_m \right\}$

almost everywhere. More precisely, the rank of every tensor in $F^{I_1 \times \cdots \times I_M} \setminus Z$, where $Z$ is a certain closed set in the Zariski topology, equals the above value. [10]
The expected generic rank of tensors living in a balanced tensor space is equal to

$r_E = \left\lceil \frac{\Pi}{\Sigma} \right\rceil$

almost everywhere for complex tensors and on a Euclidean-open set for real tensors, where

$\Pi = \prod_{m=1}^{M} I_m \quad \text{and} \quad \Sigma = 1 + \sum_{m=1}^{M} (I_m - 1).$
More precisely, the rank of every tensor in $F^{I_1 \times \cdots \times I_M} \setminus Z$, where $Z$ is a certain closed set in the Zariski topology, is expected to equal the above value. [11] For real tensors, $r_E$ is the least rank that is expected to occur on a set of positive Euclidean measure. The value $r_E$ is often referred to as the expected generic rank of the tensor space because it is only conjecturally the generic rank. It is known that the true generic rank $r$ always satisfies $r \ge r_E$.
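Since $r_E$ is an explicit formula in the mode sizes, it is simple to tabulate; here is a minimal helper (ours, illustrative):

```python
# Expected generic rank r_E = ceil(Pi / Sigma) of a tensor space
# F^{I_1 x ... x I_M}, with Pi and Sigma as defined above (illustrative sketch).
from math import ceil, prod

def expected_generic_rank(dims):
    Pi = prod(dims)                         # product of the mode sizes
    Sigma = 1 + sum(d - 1 for d in dims)    # parameter count of one rank-1 term
    return ceil(Pi / Sigma)

print(expected_generic_rank((2, 2, 2)))     # 2, the generic rank of C^{2x2x2}
print(expected_generic_rank((3, 3, 3)))     # 4; the true generic rank here is 5
print(expected_generic_rank((4, 4, 3)))     # 6, matching the text below
```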
The Abo–Ottaviani–Peterson (AOP) conjecture [11] states that equality is expected, i.e., $r = r_E$, apart from a short list of exceptional spaces; a classic instance is $F^{3 \times 3 \times 3}$, whose generic rank is 5 rather than the expected $r_E = 4$.
In each of these exceptional cases, the generic rank is known to be $r = r_E + 1$. Note that while the set of tensors of rank 3 in $F^{2 \times 2 \times 2 \times 2}$ is defective (of dimension 13 rather than the expected 14), the generic rank in that space is still the expected one, 4. Similarly, the set of tensors of rank 5 in $F^{4 \times 4 \times 3}$ is defective (of dimension 44 rather than the expected 45), but the generic rank in that space is still the expected 6.
The AOP conjecture has been proved completely in a number of special cases. Lickteig showed already in 1985 that $r = r_E$ for $F^{n \times n \times n}$, provided that $n \neq 3$. [12] In 2011, a major breakthrough was established by Catalisano, Geramita, and Gimigliano, who proved that the dimension of the set of rank-$s$ tensors of format $2 \times 2 \times \cdots \times 2$ is the expected one, except for rank-3 tensors in the 4-factor case, where the generic rank is nevertheless still 4. As a consequence, $r = r_E$ for all binary tensors. [13]
The maximum rank that can be admitted by any of the tensors in a tensor space is unknown in general; even a conjecture about this maximum rank is missing. Presently, the best general upper bound states that the maximum rank $r_{\max}$ of $F^{I_1 \times \cdots \times I_M}$, where $I_1 \ge I_2 \ge \cdots \ge I_M$, satisfies

$r_{\max} \le 2r,$

where $r$ is the (least) generic rank of $F^{I_1 \times \cdots \times I_M}$. [14] It is well known that the foregoing inequality may be strict. For instance, the generic rank of tensors in $F^{2 \times 2 \times 2}$ is two, so that the above bound yields $r_{\max} \le 4$, while it is known that the maximum rank equals 3. [8]
A rank-$s$ tensor $\mathcal{A}$ is called a border tensor if there exists a sequence of tensors of rank at most $r < s$ whose limit is $\mathcal{A}$. If $r$ is the least value for which such a convergent sequence exists, then $r$ is called the border rank of $\mathcal{A}$. For order-2 tensors, i.e., matrices, rank and border rank always coincide; however, for tensors of order $M \ge 3$ they may differ. Border tensors were first studied in the context of fast approximate matrix multiplication algorithms by Bini, Lotti, and Romani in 1980. [15]
A classic example of a border tensor is the rank-3 tensor

$\mathcal{W} = \mathbf{a} \otimes \mathbf{a} \otimes \mathbf{b} + \mathbf{a} \otimes \mathbf{b} \otimes \mathbf{a} + \mathbf{b} \otimes \mathbf{a} \otimes \mathbf{a},$

where $\mathbf{a}$ and $\mathbf{b}$ are linearly independent vectors. It can be approximated arbitrarily well by the following sequence of rank-2 tensors

$\mathcal{W}_n = n \left( \mathbf{a} + \tfrac{1}{n} \mathbf{b} \right) \otimes \left( \mathbf{a} + \tfrac{1}{n} \mathbf{b} \right) \otimes \left( \mathbf{a} + \tfrac{1}{n} \mathbf{b} \right) - n\, \mathbf{a} \otimes \mathbf{a} \otimes \mathbf{a}$

as $n \to \infty$. Therefore, its border rank is 2, which is strictly less than its rank. When the two vectors are orthogonal, this example is also known as a W state.
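A short numerical check (ours, illustrative) of this convergence, using orthogonal $\mathbf{a}$ and $\mathbf{b}$:

```python
# The rank-2 sequence W_n above converges to the rank-3 tensor
# W = a⊗a⊗b + a⊗b⊗a + b⊗a⊗a as n grows (illustrative sketch).
import numpy as np

a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # orthogonal: the W state

def outer3(x, y, z):
    return np.einsum('i,j,k->ijk', x, y, z)

W = outer3(a, a, b) + outer3(a, b, a) + outer3(b, a, a)

for n in (1, 10, 100, 1000):
    v = a + b / n
    W_n = n * outer3(v, v, v) - n * outer3(a, a, a)   # a rank-2 tensor
    print(n, np.linalg.norm(W_n - W))                 # error shrinks like 1/n
```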
It follows from the definition of a pure tensor that $\mathbf{a}_1 \otimes \mathbf{a}_2 \otimes \cdots \otimes \mathbf{a}_M = \mathbf{b}_1 \otimes \mathbf{b}_2 \otimes \cdots \otimes \mathbf{b}_M$ if and only if there exist nonzero scalars $\lambda_1, \ldots, \lambda_M$ satisfying $\prod_{m=1}^{M} \lambda_m = 1$ such that $\mathbf{b}_m = \lambda_m \mathbf{a}_m$ for all $m$. For this reason, the parameters of a rank-1 tensor are called identifiable or essentially unique. A rank-$r$ tensor $\mathcal{A} = \sum_{i=1}^{r} \mathcal{A}_i$ is called identifiable if each of its tensor rank decompositions is the sum of the same set of $r$ distinct rank-1 tensors $\{\mathcal{A}_1, \ldots, \mathcal{A}_r\}$. An identifiable rank-$r$ tensor thus has an essentially unique decomposition: all tensor rank decompositions of $\mathcal{A}$ can be obtained from one another by permuting the order of the summands. Observe that in a tensor rank decomposition all the $\mathcal{A}_i$'s are distinct, for otherwise the rank of $\mathcal{A}$ would be at most $r - 1$.
Order-2 tensors in $F^{I_1 \times I_2}$, i.e., matrices, are not identifiable for rank $r > 1$. This follows essentially from the observation

$\mathcal{A} = \sum_{i=1}^{r} \mathbf{a}_i \otimes \mathbf{b}_i = A B^T = (A X)(B X^{-T})^T,$

where $X$ is an invertible $r \times r$ matrix, $A = [\mathbf{a}_1 \ \cdots \ \mathbf{a}_r]$, and $B = [\mathbf{b}_1 \ \cdots \ \mathbf{b}_r]$. It can be shown [16] that for every $X \in \mathrm{GL}_r(F) \setminus Z$, where $Z$ is a closed set in the Zariski topology, the decomposition on the right-hand side is a sum of a different set of rank-1 tensors than the decomposition on the left-hand side, entailing that order-2 tensors of rank $r > 1$ are generically not identifiable.
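A quick numpy illustration (ours) of this non-identifiability: inserting $X X^{-1}$ between the factor matrices leaves the matrix unchanged but alters the rank-1 terms:

```python
# A rank-2 matrix admits many distinct rank-1 decompositions (illustrative sketch).
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 2))        # columns a_1, a_2
B = rng.standard_normal((6, 2))        # columns b_1, b_2
M = A @ B.T                            # sum of two rank-1 terms a_i ⊗ b_i

X = rng.standard_normal((2, 2))        # generically invertible
A2, B2 = A @ X, B @ np.linalg.inv(X).T

T1 = np.outer(A[:, 0], B[:, 0])        # first rank-1 term of the left-hand side
T2 = np.outer(A2[:, 0], B2[:, 0])      # first rank-1 term of the right-hand side
print(np.allclose(M, A2 @ B2.T))       # True: the same matrix ...
print(not np.allclose(T1, T2))         # True: ... from different rank-1 terms
```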
The situation changes completely for higher-order tensors in $F^{I_1 \times \cdots \times I_M}$ with $M \ge 3$ and all $I_m \ge 2$. For simplicity in notation, assume without loss of generality that the factors are ordered such that $I_1 \ge I_2 \ge \cdots \ge I_M$. Let $\sigma_r$ denote the set of tensors of rank bounded by $r$. Then, the following statement was proved to be correct using a computer-assisted proof for all spaces of small dimension, [17] and it is conjectured to be valid in general: [17] [18] [19]
There exists a closed set $Z$ in the Zariski topology such that every tensor $\mathcal{A} \in \sigma_r \setminus Z$ is identifiable ($\sigma_r$ is called generically identifiable in this case), unless $\sigma_r$ is one of a short list of exceptional cases, consisting of the identifiability-unbalanced spaces together with a handful of specific small formats.
In these exceptional cases, the generic (and also minimum) number of complex decompositions is known to be finite and greater than one.
In summary, the generic tensor of order $M \ge 3$ and rank $r$ that is not identifiability-unbalanced is expected to be identifiable (modulo the exceptional cases in small spaces).
The rank-$r$ approximation problem asks for the rank-$r$ decomposition closest (in the usual Euclidean topology) to some rank-$s$ tensor $\mathcal{A}$, where $r < s$. That is, one seeks to solve

$\min_{\mathbf{a}_i^{(m)}} \left\| \mathcal{A} - \sum_{i=1}^{r} \mathbf{a}_i^{(1)} \otimes \mathbf{a}_i^{(2)} \otimes \cdots \otimes \mathbf{a}_i^{(M)} \right\|_F,$

where $\| \cdot \|_F$ is the Frobenius norm.
It was shown in a 2008 paper by de Silva and Lim [8] that the above standard approximation problem may be ill-posed: a solution may fail to exist because the set over which one optimizes is not closed. As such, a minimizer may not exist, even though an infimum does. In particular, it is known that certain so-called border tensors may be approximated arbitrarily well by a sequence of tensors of rank at most $r$, even though the limit of the sequence is a tensor of rank strictly higher than $r$. The rank-3 tensor

$\mathcal{W} = \mathbf{a} \otimes \mathbf{a} \otimes \mathbf{b} + \mathbf{a} \otimes \mathbf{b} \otimes \mathbf{a} + \mathbf{b} \otimes \mathbf{a} \otimes \mathbf{a}$

can be approximated arbitrarily well by the following sequence of rank-2 tensors

$\mathcal{W}_n = n \left( \mathbf{a} + \tfrac{1}{n} \mathbf{b} \right) \otimes \left( \mathbf{a} + \tfrac{1}{n} \mathbf{b} \right) \otimes \left( \mathbf{a} + \tfrac{1}{n} \mathbf{b} \right) - n\, \mathbf{a} \otimes \mathbf{a} \otimes \mathbf{a}$
as $n \to \infty$. This example neatly illustrates the general principle that a sequence of rank-$r$ tensors that converges to a tensor of strictly higher rank must admit at least two individual rank-1 terms whose norms become unbounded. Stated formally, whenever a sequence

$\mathcal{A}_n = \sum_{i=1}^{r} \mathbf{a}_{i,n}^{(1)} \otimes \mathbf{a}_{i,n}^{(2)} \otimes \cdots \otimes \mathbf{a}_{i,n}^{(M)}$

has the property that $\mathcal{A}_n \to \mathcal{A}$ (in the Euclidean topology) as $n \to \infty$, where the rank of $\mathcal{A}$ is strictly greater than $r$, then there should exist at least two indices $1 \le i \neq j \le r$ such that

$\left\| \mathbf{a}_{i,n}^{(1)} \otimes \cdots \otimes \mathbf{a}_{i,n}^{(M)} \right\| \to \infty \quad \text{and} \quad \left\| \mathbf{a}_{j,n}^{(1)} \otimes \cdots \otimes \mathbf{a}_{j,n}^{(M)} \right\| \to \infty$
as $n \to \infty$. This phenomenon is often encountered when attempting to approximate a tensor using numerical optimization algorithms; it is sometimes called the problem of diverging components. It was, in addition, shown that a random low-rank tensor over the reals may fail to admit a best rank-2 approximation with positive probability, leading to the understanding that the ill-posedness problem is an important consideration when employing the tensor rank decomposition.
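The diverging-components phenomenon is visible numerically in the very sequence above: the approximation error decays while the norms of both rank-1 terms grow without bound. A sketch (ours, illustrative):

```python
# In the rank-2 sequence W_n above, both rank-1 terms blow up even though
# W_n itself converges to W (illustrative sketch).
import numpy as np

a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def outer3(x, y, z):
    return np.einsum('i,j,k->ijk', x, y, z)

W = outer3(a, a, b) + outer3(a, b, a) + outer3(b, a, a)

for n in (10, 100, 1000):
    v = a + b / n
    t1 = n * outer3(v, v, v)           # first rank-1 term
    t2 = -n * outer3(a, a, a)          # second rank-1 term
    # error -> 0 while both term norms grow like n
    print(n, np.linalg.norm(t1 + t2 - W), np.linalg.norm(t1), np.linalg.norm(t2))
```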
A common partial solution to the ill-posedness problem consists of imposing an additional inequality constraint that bounds the norm of the individual rank-1 terms by some constant. Other constraints that result in a closed set, and thus a well-posed optimization problem, include imposing positivity or a bounded inner product strictly less than unity between the rank-1 terms appearing in the sought decomposition.
Several classes of numerical algorithms exist for computing (approximate) CP decompositions; a minimal ALS sketch follows the list.

Alternating algorithms:
- alternating least squares (ALS)

Direct algorithms:
- pencil-based (generalized eigenvalue) algorithms for third-order tensors of low rank

General optimization algorithms:
- gradient-based, quasi-Newton, and nonlinear least squares methods applied to the approximation problem

General polynomial system solving algorithms:
- homotopy continuation methods applied to the polynomial system defining the decomposition
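As a concrete illustration of the alternating class, the following is a minimal sketch of alternating least squares for a third-order CP approximation. It is our own toy implementation under standard conventions (mode unfoldings and Khatri–Rao products), not a reference algorithm from the literature:

```python
# Minimal ALS sketch for a third-order CP approximation (illustrative): each
# factor matrix is updated by a least-squares solve with the others held fixed.
import numpy as np

def khatri_rao(U, V):
    # column-wise Kronecker product: (U ⊙ V)[:, r] = kron(U[:, r], V[:, r])
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def cp_als(X, R, iters=200, seed=0):
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
    # mode unfoldings consistent with C-order reshapes
    X1 = X.reshape(I, J * K)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    for _ in range(iters):
        A = np.linalg.lstsq(khatri_rao(B, C), X1.T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(A, C), X2.T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B), X3.T, rcond=None)[0].T
    return A, B, C

# usage: fit an exact rank-3 tensor (recovery is up to scaling and permutation)
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((n, 3)) for n in (4, 5, 6))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, R=3)
print(np.linalg.norm(np.einsum('ir,jr,kr->ijk', A, B, C) - X))  # typically tiny
```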
In machine learning, the CP decomposition is the central ingredient in learning probabilistic latent variable models via the technique of moment matching. For example, consider the multi-view model, [32] which is a probabilistic latent variable model. In this model, the generation of samples is posited as follows: there exists a hidden random variable that is not observed directly, given which there are several conditionally independent random variables known as the different "views" of the hidden variable. For example, assume there are three exchangeable views $x_1, x_2, x_3$ of a $k$-state categorical hidden variable $h$. Then the empirical third moment of this latent variable model is an order-3 tensor that can be decomposed as

$T_3 = \mathbb{E}[x_1 \otimes x_2 \otimes x_3] = \sum_{i=1}^{k} w_i\, \boldsymbol{\mu}_i \otimes \boldsymbol{\mu}_i \otimes \boldsymbol{\mu}_i,$

where $w_i = \Pr(h = i)$ and $\boldsymbol{\mu}_i = \mathbb{E}[x \mid h = i]$.
In applications such as topic modeling, this can be interpreted as modeling the co-occurrence of words in a document. The coefficients $w_i$ in the decomposition of this empirical moment tensor can then be interpreted as the probability of choosing a specific topic, and each column $\boldsymbol{\mu}_i$ of the factor matrix corresponds to the probabilities of words in the vocabulary in the corresponding topic.
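The moment identity underlying this approach is easy to validate by simulation. The sketch below (ours; the dimensions and noise model are arbitrary choices) draws three conditionally independent views around state-dependent means and compares the empirical third moment with $\sum_i w_i \boldsymbol{\mu}_i^{\otimes 3}$:

```python
# Simulate the multi-view moment identity: with three conditionally i.i.d.
# views, E[x1 ⊗ x2 ⊗ x3] = sum_i w_i mu_i^{⊗3} (illustrative sketch).
import numpy as np

rng = np.random.default_rng(3)
k, d, N = 3, 5, 200_000                  # hidden states, view dimension, samples
w = np.array([0.5, 0.3, 0.2])            # mixing weights P(h = i)
mu = rng.random((k, d))                  # conditional means E[x | h = i]

h = rng.choice(k, size=N, p=w)           # latent state per sample
# each view: the conditional mean plus independent zero-mean noise
views = [mu[h] + 0.1 * rng.standard_normal((N, d)) for _ in range(3)]

M3_emp = np.einsum('ni,nj,nk->ijk', *views) / N           # empirical third moment
M3_true = np.einsum('r,ri,rj,rk->ijk', w, mu, mu, mu)     # sum_i w_i mu_i^{⊗3}
print(np.linalg.norm(M3_emp - M3_true))                   # small for large N
```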