Tensor reshaping

Last updated March 25, 2024

In multilinear algebra, a reshaping of tensors is any bijection between the set of indices of an order- $M$ tensor and the set of indices of an order- $L$ tensor, where $L<M$ . The use of indices presupposes tensors in coordinate representation with respect to a basis. The coordinate representation of a tensor can be regarded as a multi-dimensional array, and a bijection from one set of indices to another therefore amounts to a rearrangement of the array elements into an array of a different shape. Such a rearrangement constitutes a particular kind of linear map between the vector space of order- $M$ tensors and the vector space of order- $L$ tensors.

Definition

Given a positive integer $M$ , the notation $[M]$ refers to the set $\{1,\dots ,M\}$ of the first $M$ positive integers.

For each integer $m$ where $1\leq m\leq M$ for a positive integer $M$ , let $V_{m}$ denote an $I_{m}$ -dimensional vector space over a field $F$ . Then there are vector space isomorphisms (linear maps)

{\begin{aligned}V_{1}\otimes \cdots \otimes V_{M}&\simeq F^{I_{1}}\otimes \cdots \otimes F^{I_{M}}\\&\simeq F^{I_{\pi _{1}}}\otimes \cdots \otimes F^{I_{\pi _{M}}}\\&\simeq F^{I_{\pi _{1}}I_{\pi _{2}}}\otimes F^{I_{\pi _{3}}}\otimes \cdots \otimes F^{I_{\pi _{M}}}\\&\simeq F^{I_{\pi _{1}}I_{\pi _{3}}}\otimes F^{I_{\pi _{2}}}\otimes F^{I_{\pi _{4}}}\otimes \cdots \otimes F^{I_{\pi _{M}}}\\&\,\,\,\vdots \\&\simeq F^{I_{1}I_{2}\ldots I_{M}},\end{aligned}}

where $\pi \in {\mathfrak {S}}_{M}$ is any permutation and ${\mathfrak {S}}_{M}$ is the symmetric group on $M$ elements. Via these (and other) vector space isomorphisms, a tensor can be interpreted in several ways as an order- $L$ tensor where $L\leq M$ .
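
As an illustrative worked instance (the dimensions here are chosen purely for concreteness and are an assumption, not taken from the source), take $M=3$ with $(I_{1},I_{2},I_{3})=(2,3,4)$ and let $\pi$ be the identity permutation. The chain of isomorphisms then reads

{\begin{aligned}V_{1}\otimes V_{2}\otimes V_{3}&\simeq F^{2}\otimes F^{3}\otimes F^{4}\\&\simeq F^{6}\otimes F^{4}\\&\simeq F^{24},\end{aligned}}

so the same tensor can be viewed as an order-3 array of shape $2\times 3\times 4$ , as an order-2 array (a matrix) of shape $6\times 4$ , or as an order-1 array (a vector) with 24 entries.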

Coordinate representation

The first vector space isomorphism on the list above, $V_{1}\otimes \cdots \otimes V_{M}\simeq F^{I_{1}}\otimes \cdots \otimes F^{I_{M}}$ , gives the coordinate representation of an abstract tensor. Assume that each of the $M$ vector spaces $V_{m}$ has a basis $\{v_{1}^{m},v_{2}^{m},\ldots ,v_{I_{m}}^{m}\}$ . The expression of a tensor with respect to this basis has the form

{\mathcal {A}}=\sum _{i_{1}=1}^{I_{1}}\ldots \sum _{i_{M}=1}^{I_{M}}a_{i_{1},i_{2},\ldots ,i_{M}}v_{i_{1}}^{1}\otimes v_{i_{2}}^{2}\otimes \cdots \otimes v_{i_{M}}^{M},

where the coefficients $a_{i_{1},i_{2},\ldots ,i_{M}}$ are elements of $F$ . The coordinate representation of ${\mathcal {A}}$ is

\sum _{i_{1}=1}^{I_{1}}\ldots \sum _{i_{M}=1}^{I_{M}}a_{i_{1},i_{2},\ldots ,i_{M}}\mathbf {e} _{i_{1}}^{1}\otimes \mathbf {e} _{i_{2}}^{2}\otimes \cdots \otimes \mathbf {e} _{i_{M}}^{M},

where $\mathbf {e} _{i}^{m}$ is the $i^{\text{th}}$ standard basis vector of $F^{I_{m}}$ . This can be regarded as an $M$ -way array whose elements are the coefficients $a_{i_{1},i_{2},\ldots ,i_{M}}$ .
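
A minimal sketch of this array view, in plain Python, is given below. The dimensions, the variable names `dims` and `coeffs`, and the coefficient values are all hypothetical and chosen only so that each entry can be recognized from its indices.

```python
from itertools import product

# Hypothetical order-3 tensor with dimensions (I_1, I_2, I_3) = (2, 3, 2).
dims = (2, 3, 2)

# Coefficients a_{i1,i2,i3}, keyed by 1-based index tuples as in the text.
# The value 100*i1 + 10*i2 + i3 is arbitrary; it just encodes the indices.
coeffs = {
    idx: 100 * idx[0] + 10 * idx[1] + idx[2]
    for idx in product(*(range(1, n + 1) for n in dims))
}

print(len(coeffs))        # 12 coefficients, one per index tuple
print(coeffs[(2, 3, 1)])  # the coefficient a_{2,3,1}, here 231
```

Storing the coefficients in a dictionary keyed by index tuples keeps the 1-based indexing of the text visible; a nested list or an array library would serve equally well.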

General flattenings

For any permutation $\pi \in {\mathfrak {S}}_{M}$ there is a canonical isomorphism between the two tensor products of vector spaces $V_{1}\otimes V_{2}\otimes \cdots \otimes V_{M}$ and $V_{\pi (1)}\otimes V_{\pi (2)}\otimes \cdots \otimes V_{\pi (M)}$ . Parentheses are usually omitted from such products due to the natural isomorphism between $V_{i}\otimes (V_{j}\otimes V_{k})$ and $(V_{i}\otimes V_{j})\otimes V_{k}$ , but may, of course, be reintroduced to emphasize a particular grouping of factors. In the grouping,

(V_{\pi (1)}\otimes \cdots \otimes V_{\pi (r_{1})})\otimes (V_{\pi (r_{1}+1)}\otimes \cdots \otimes V_{\pi (r_{2})})\otimes \cdots \otimes (V_{\pi (r_{L-1}+1)}\otimes \cdots \otimes V_{\pi (r_{L})}),

there are $L$ groups with $r_{l}-r_{l-1}$ factors in the $l^{\text{th}}$ group (where $r_{0}=0$ and $r_{L}=M$ ).

Letting $S_{l}=(\pi (r_{l-1}+1),\pi (r_{l-1}+2),\ldots ,\pi (r_{l}))$ for each $l$ satisfying $1\leq l\leq L$ , an $(S_{1},S_{2},\ldots ,S_{L})$ -flattening of a tensor ${\mathcal {A}}$ , denoted ${\mathcal {A}}_{(S_{1},S_{2},\ldots ,S_{L})}$ , is obtained by applying two processes, coordinate representation and vectorization, within each of the $L$ groups of factors. That is, the coordinate representation of the $l^{\text{th}}$ group of factors is obtained using the isomorphism $(V_{\pi (r_{l-1}+1)}\otimes V_{\pi (r_{l-1}+2)}\otimes \cdots \otimes V_{\pi (r_{l})})\simeq (F^{I_{\pi (r_{l-1}+1)}}\otimes F^{I_{\pi (r_{l-1}+2)}}\otimes \cdots \otimes F^{I_{\pi (r_{l})}})$ , which requires specifying bases for all of the vector spaces $V_{k}$ . The result is then vectorized using a bijection $\mu _{l}:[I_{\pi (r_{l-1}+1)}]\times [I_{\pi (r_{l-1}+2)}]\times \cdots \times [I_{\pi (r_{l})}]\to [I_{S_{l}}]$ to obtain an element of $F^{I_{S_{l}}}$ , where ${\textstyle I_{S_{l}}:=\prod _{i=r_{l-1}+1}^{r_{l}}I_{\pi (i)}}$ is the product of the dimensions of the vector spaces in the $l^{\text{th}}$ group of factors. The result of applying these isomorphisms within each group of factors is an element of $F^{I_{S_{1}}}\otimes \cdots \otimes F^{I_{S_{L}}}$ , which is a tensor of order $L$ .
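
The construction can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper names `mu` and `flatten` are hypothetical, each bijection $\mu _{l}$ is taken to be the column-major one (first index varying fastest), and the example tensor is made up.

```python
from itertools import product
from math import prod

def mu(index, dims):
    """Assumed column-major bijection: 1-based multi-index -> 1-based linear index."""
    linear, stride = 1, 1
    for i, n in zip(index, dims):
        linear += (i - 1) * stride
        stride *= n
    return linear

def flatten(coeffs, dims, groups):
    """(S_1, ..., S_L)-flattening; each group lists the 1-based modes in S_l."""
    new_dims = tuple(prod(dims[m - 1] for m in group) for group in groups)
    flat = {}
    for index, value in coeffs.items():
        # Vectorize the indices of each group separately with its own mu_l.
        new_index = tuple(
            mu([index[m - 1] for m in group], [dims[m - 1] for m in group])
            for group in groups
        )
        flat[new_index] = value
    return flat, new_dims

# Hypothetical order-3 tensor of shape (2, 3, 2); entry a_{i1,i2,i3} = 100*i1 + 10*i2 + i3.
dims = (2, 3, 2)
coeffs = {
    idx: 100 * idx[0] + 10 * idx[1] + idx[2]
    for idx in product(*(range(1, n + 1) for n in dims))
}

# S_1 = (1, 3), S_2 = (2): an order-2 tensor in F^4 (x) F^3.
flat, new_dims = flatten(coeffs, dims, [(1, 3), (2,)])
print(new_dims)      # (4, 3)
print(flat[(4, 3)])  # a_{2,3,2}, here 232
```

Choosing a different bijection for any $\mu _{l}$ would permute the entries within that group but would not change the shape of the result.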

Vectorization

By means of a bijective map $\mu :[I_{1}]\times \cdots \times [I_{M}]\to [I_{1}\cdots I_{M}]$ , a vector space isomorphism between $F^{I_{1}}\otimes \cdots \otimes F^{I_{M}}$ and $F^{I_{1}\cdots I_{M}}$ is constructed via the mapping $\mathbf {e} _{i_{1}}^{1}\otimes \cdots \otimes \mathbf {e} _{i_{m}}^{m}\otimes \cdots \otimes \mathbf {e} _{i_{M}}^{M}\mapsto \mathbf {e} _{\mu (i_{1},i_{2},\ldots ,i_{M})},$ where for every natural number $i$ such that $1\leq i\leq I_{1}\cdots I_{M}$ , the vector $\mathbf {e} _{i}$ denotes the $i^{\text{th}}$ standard basis vector of $F^{I_{1}\cdots I_{M}}$ . In such a reshaping, the tensor is simply interpreted as a vector in $F^{I_{1}\cdots I_{M}}$ . This is known as vectorization, and is analogous to vectorization of matrices. A standard choice of bijection $\mu$ is such that

\operatorname {vec} ({\mathcal {A}})={\begin{bmatrix}a_{1,1,\ldots ,1}&a_{2,1,\ldots ,1}&\cdots &a_{I_{1},1,\ldots ,1}&a_{1,2,1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{M}}\end{bmatrix}}^{T},

which is consistent with the way in which the colon operator in Matlab and GNU Octave reshapes a higher-order tensor into a vector. In general, the vectorization of ${\mathcal {A}}$ is the vector $[a_{\mu ^{-1}(i)}]_{i=1}^{I_{1}\cdots I_{M}}$ .
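
The standard ordering above, in which the first index varies fastest, can be sketched in plain Python. The function names `mu` and `vec` and the example tensor are hypothetical; the sketch assumes 1-based indices to match the text.

```python
from itertools import product

def mu(index, dims):
    """Column-major bijection: 1-based multi-index -> 1-based linear index."""
    linear, stride = 1, 1
    for i, n in zip(index, dims):
        linear += (i - 1) * stride
        stride *= n
    return linear

def vec(coeffs, dims):
    """Vectorize a tensor stored as a dict from 1-based index tuples to values."""
    out = [None] * len(coeffs)
    for index, value in coeffs.items():
        out[mu(index, dims) - 1] = value  # shift to Python's 0-based lists
    return out

# Hypothetical order-3 tensor of shape (2, 3, 2); entry a_{i1,i2,i3} = 100*i1 + 10*i2 + i3.
dims = (2, 3, 2)
coeffs = {
    idx: 100 * idx[0] + 10 * idx[1] + idx[2]
    for idx in product(*(range(1, n + 1) for n in dims))
}

v = vec(coeffs, dims)
print(v[:4])  # [111, 211, 121, 221]
print(v[-1])  # 232, i.e. a_{I_1,I_2,I_3}
```

The first entries follow the pattern $a_{1,1,1},a_{2,1,1},a_{1,2,1},\ldots$ of the displayed vector, and the last entry is $a_{I_{1},I_{2},I_{3}}$ .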

The vectorization of ${\mathcal {A}}$ , denoted by $\operatorname {vec} ({\mathcal {A}})$ or ${\mathcal {A}}_{[:]}$ , is an $[S_{1},S_{2}]$ -reshaping where $S_{1}=(1,2,\ldots ,M)$ and $S_{2}=\emptyset$ .

Mode-m Flattening / Mode-m Matrixization

Let ${\mathcal {A}}\in F^{I_{1}}\otimes F^{I_{2}}\otimes \cdots \otimes F^{I_{M}}$ be the coordinate representation of an abstract tensor with respect to a basis. Mode-m matrixizing (a.k.a. flattening) of ${\mathcal {A}}$ is an $[S_{1},S_{2}]$ -reshaping in which $S_{1}=(m)$ and $S_{2}=(1,2,\ldots ,m-1,m+1,\ldots ,M)$ . Usually, a standard matrixizing is denoted by

{\mathbf {A} }_{[m]}={\mathcal {A}}_{[S_{1},S_{2}]}

This reshaping is sometimes called matrixizing, matricizing, flattening or unfolding in the literature. A standard choice for the bijections $\mu _{1},\ \mu _{2}$ is the one that is consistent with the reshape function in Matlab and GNU Octave, namely

{\mathbf {A} }_{[m]}:={\begin{bmatrix}a_{1,1,\ldots ,1,1,1,\ldots ,1}&a_{2,1,\ldots ,1,1,1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{m-1},1,I_{m+1},\ldots ,I_{M}}\\a_{1,1,\ldots ,1,2,1,\ldots ,1}&a_{2,1,\ldots ,1,2,1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{m-1},2,I_{m+1},\ldots ,I_{M}}\\\vdots &\vdots &&\vdots \\a_{1,1,\ldots ,1,I_{m},1,\ldots ,1}&a_{2,1,\ldots ,1,I_{m},1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{m-1},I_{m},I_{m+1},\ldots ,I_{M}}\end{bmatrix}}

Definition Mode-m Matrixizing:^[1]

[{\mathbf {A} }_{[m]}]_{jk}=a_{i_{1}\dots i_{m}\dots i_{M}},\;\;{\text{ where }}j=i_{m}{\text{ and }}k=1+\sum _{n=1 \atop n\neq m}^{M}(i_{n}-1)\prod _{l=1 \atop l\neq m}^{n-1}I_{l}.

The mode-m matrixizing of a tensor ${\mathcal {A}}\in F^{I_{1}\times \cdots \times I_{M}}$ is defined as the matrix ${\mathbf {A} }_{[m]}\in F^{I_{m}\times (I_{1}\dots I_{m-1}I_{m+1}\dots I_{M})}$ . As the parenthetical ordering indicates, the mode-m column vectors are arranged by sweeping all the other mode indices through their ranges, with smaller mode indices varying more rapidly than larger ones.
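
The row and column index rule can be sketched in plain Python as follows. The helper names `column_index` and `unfold` and the example tensor are hypothetical; modes and indices are 1-based as in the text, and the column ordering assumes smaller mode indices vary fastest.

```python
from itertools import product
from math import prod

def column_index(index, dims, m):
    """k = 1 + sum over n != m of (i_n - 1) * (product of I_l for l < n, l != m)."""
    k, stride = 1, 1
    for n, (i, size) in enumerate(zip(index, dims), start=1):
        if n == m:
            continue  # mode m supplies the row index, not the column index
        k += (i - 1) * stride
        stride *= size
    return k

def unfold(coeffs, dims, m):
    """Mode-m matrixizing: rows indexed by i_m, columns by all remaining indices."""
    rows = dims[m - 1]
    cols = prod(dims) // rows
    matrix = [[None] * cols for _ in range(rows)]
    for index, value in coeffs.items():
        j = index[m - 1]
        k = column_index(index, dims, m)
        matrix[j - 1][k - 1] = value  # shift to Python's 0-based lists
    return matrix

# Hypothetical order-3 tensor of shape (2, 3, 2); entry a_{i1,i2,i3} = 100*i1 + 10*i2 + i3.
dims = (2, 3, 2)
coeffs = {
    idx: 100 * idx[0] + 10 * idx[1] + idx[2]
    for idx in product(*(range(1, n + 1) for n in dims))
}

A2 = unfold(coeffs, dims, 2)
for row in A2:
    print(row)
# [111, 211, 112, 212]
# [121, 221, 122, 222]
# [131, 231, 132, 232]
```

Each row fixes the mode-2 index, and within a row the first index changes before the third, which is the sweep order described above.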

References

[1] Vasilescu, M. Alex O. (2009), "Multilinear (Tensor) Algebraic Framework for Computer Graphics, Computer Vision and Machine Learning" (PDF), University of Toronto, p. 21

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
