Direct linear transformation

Direct linear transformation (DLT) is an algorithm which solves a set of variables from a set of similarity relations:

$\mathbf{x}_k \propto \mathbf{A} \, \mathbf{y}_k$ for $k = 1, \ldots, N$

where $\mathbf{x}_k$ and $\mathbf{y}_k$ are known vectors, $\propto$ denotes equality up to an unknown scalar multiplication, and $\mathbf{A}$ is a matrix (or linear transformation) which contains the unknowns to be solved.

This type of relation appears frequently in projective geometry. Practical examples include the relation between 3D points in a scene and their projection onto the image plane of a pinhole camera,[1] and homographies.

Introduction

An ordinary system of linear equations

$\mathbf{x}_k = \mathbf{A} \, \mathbf{y}_k$ for $k = 1, \ldots, N$

can be solved, for example, by rewriting it as a matrix equation $\mathbf{X} = \mathbf{A} \, \mathbf{Y}$, where the matrices $\mathbf{X}$ and $\mathbf{Y}$ contain the vectors $\mathbf{x}_k$ and $\mathbf{y}_k$ in their respective columns. Given that there exists a unique solution, it is given by

$\mathbf{A} = \mathbf{X} \, \mathbf{Y}^{T} \, (\mathbf{Y} \, \mathbf{Y}^{T})^{-1}.$

Solutions can also be described in the case that the equations are overdetermined or underdetermined.
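To make the contrast with the DLT case concrete, the closed-form solution above is a one-liner in NumPy. The following is a minimal sketch; the function name solve_ordinary and the random test data are illustrative, not part of the article:

```python
import numpy as np

def solve_ordinary(X, Y):
    """Solve X = A Y for A, assuming Y Y^T is invertible.

    X is p-by-N and Y is q-by-N, with the vectors x_k and y_k
    stored as columns; the result A is p-by-q.
    """
    return X @ Y.T @ np.linalg.inv(Y @ Y.T)

# Quick check: recover a known A from exact (equality, not similarity) data.
rng = np.random.default_rng(0)
A_true = rng.standard_normal((2, 3))
Y = rng.standard_normal((3, 10))
X = A_true @ Y
assert np.allclose(solve_ordinary(X, Y), A_true)
```

In numerical practice one would solve for $\mathbf{A}$ with np.linalg.lstsq rather than forming an explicit inverse; the explicit form is kept here only to mirror the formula.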

What makes the direct linear transformation problem distinct from the above standard case is the fact that the left and right sides of the defining equation can differ by an unknown multiplicative factor which is dependent on k. As a consequence, $\mathbf{A}$ cannot be computed as in the standard case. Instead, the similarity relations are rewritten as proper linear homogeneous equations which then can be solved by a standard method. The combination of rewriting the similarity equations as homogeneous linear equations and solving them by standard methods is referred to as a direct linear transformation algorithm or DLT algorithm. DLT is attributed to Ivan Sutherland.[2]

Example

Suppose that $k \in \{1, \ldots, N\}$. Let $\mathbf{x}_k \in \mathbb{R}^{2}$ and $\mathbf{y}_k \in \mathbb{R}^{3}$ be two known vectors, and we want to find the $2 \times 3$ matrix $\mathbf{A}$ such that

$\alpha_k \, \mathbf{x}_k = \mathbf{A} \, \mathbf{y}_k$

where $\alpha_k \neq 0$ is the unknown scalar factor related to equation k.

To get rid of the unknown scalars and obtain homogeneous equations, define the anti-symmetric matrix

$\mathbf{H} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$

and multiply both sides of the equation with $\mathbf{x}_k^{T} \, \mathbf{H}$ from the left:

$\alpha_k \, \mathbf{x}_k^{T} \, \mathbf{H} \, \mathbf{x}_k = \mathbf{x}_k^{T} \, \mathbf{H} \, \mathbf{A} \, \mathbf{y}_k$

Since $\mathbf{x}_k^{T} \, \mathbf{H} \, \mathbf{x}_k = 0$ (the quadratic form of any anti-symmetric matrix vanishes), the following homogeneous equations, which no longer contain the unknown scalars, are at hand:

$\mathbf{x}_k^{T} \, \mathbf{H} \, \mathbf{A} \, \mathbf{y}_k = 0$

In order to solve $\mathbf{A}$ from this set of equations, consider the elements of the vectors $\mathbf{x}_k$ and $\mathbf{y}_k$ and the matrix $\mathbf{A}$:

$\mathbf{x}_k = \begin{pmatrix} x_{1k} \\ x_{2k} \end{pmatrix}$,   $\mathbf{y}_k = \begin{pmatrix} y_{1k} \\ y_{2k} \\ y_{3k} \end{pmatrix}$,   and   $\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}$

and the above homogeneous equation becomes

$0 = x_{2k} \, y_{1k} \, a_{11} + x_{2k} \, y_{2k} \, a_{12} + x_{2k} \, y_{3k} \, a_{13} - x_{1k} \, y_{1k} \, a_{21} - x_{1k} \, y_{2k} \, a_{22} - x_{1k} \, y_{3k} \, a_{23}$ for $k = 1, \ldots, N$.

This can also be written in the matrix form:

$0 = \mathbf{b}_k^{T} \, \mathbf{a}$ for $k = 1, \ldots, N$

where $\mathbf{b}_k$ and $\mathbf{a}$ both are 6-dimensional vectors defined as

$\mathbf{b}_k = \begin{pmatrix} x_{2k} \, y_{1k} \\ x_{2k} \, y_{2k} \\ x_{2k} \, y_{3k} \\ -x_{1k} \, y_{1k} \\ -x_{1k} \, y_{2k} \\ -x_{1k} \, y_{3k} \end{pmatrix}$   and   $\mathbf{a} = \begin{pmatrix} a_{11} \\ a_{12} \\ a_{13} \\ a_{21} \\ a_{22} \\ a_{23} \end{pmatrix}.$

So far, we have 1 equation and 6 unknowns. A set of $N$ homogeneous equations can be written in the matrix form

$\mathbf{0} = \mathbf{B} \, \mathbf{a}$

where $\mathbf{B}$ is an $N \times 6$ matrix which holds the known vectors $\mathbf{b}_k$ in its rows. The unknown $\mathbf{a}$ can be determined, for example, by a singular value decomposition of $\mathbf{B}$; $\mathbf{a}$ is a right singular vector of $\mathbf{B}$ corresponding to a singular value that equals zero. Once $\mathbf{a}$ has been determined, the elements of the matrix $\mathbf{A}$ can be rearranged from the vector $\mathbf{a}$. Notice that the scaling of $\mathbf{a}$ or $\mathbf{A}$ is not important (except that it must be non-zero) since the defining equations already allow for unknown scaling.

In practice the vectors $\mathbf{x}_k$ and $\mathbf{y}_k$ may contain noise, which means that the similarity equations are only approximately valid. As a consequence, there may not be a vector $\mathbf{a}$ which solves the homogeneous equation $\mathbf{0} = \mathbf{B} \, \mathbf{a}$ exactly. In these cases, a total least squares solution can be used by choosing $\mathbf{a}$ as a right singular vector corresponding to the smallest singular value of $\mathbf{B}$.
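Both the noise-free and the noisy case thus reduce to the same computation: stack the vectors $\mathbf{b}_k$ as the rows of $\mathbf{B}$ and take a right singular vector for the smallest singular value. Below is a minimal NumPy sketch of this $2 \times 3$ case; the function name dlt_2x3 is illustrative:

```python
import numpy as np

def dlt_2x3(x, y):
    """Estimate the 2x3 matrix A in alpha_k x_k = A y_k, up to scale.

    x: N-by-2 array of the vectors x_k; y: N-by-3 array of the y_k.
    Each correspondence contributes one row b_k of B; a is taken as
    the right singular vector of B with the smallest singular value
    (the total least squares choice when the data are noisy).
    """
    N = x.shape[0]
    B = np.zeros((N, 6))
    for k in range(N):
        x1, x2 = x[k]
        B[k, :3] = x2 * y[k]   # ( x_2k y_1k,  x_2k y_2k,  x_2k y_3k)
        B[k, 3:] = -x1 * y[k]  # (-x_1k y_1k, -x_1k y_2k, -x_1k y_3k)
    _, _, Vt = np.linalg.svd(B)
    a = Vt[-1]                 # singular values are sorted descending
    return a.reshape(2, 3)     # rearrange a into the matrix A
```

Because $\mathbf{A}$ is recovered only up to scale, comparing the estimate against a reference matrix requires normalizing both, for example to unit Frobenius norm.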

More general cases

The above example has $\mathbf{x}_k \in \mathbb{R}^{2}$ and $\mathbf{y}_k \in \mathbb{R}^{3}$, but the general strategy for rewriting the similarity relations into homogeneous linear equations can be generalized to arbitrary dimensions for both $\mathbf{x}_k$ and $\mathbf{y}_k$.

If $\mathbf{x}_k \in \mathbb{R}^{2}$ and $\mathbf{y}_k \in \mathbb{R}^{q}$, the previous expressions can still lead to an equation

$0 = \mathbf{x}_k^{T} \, \mathbf{H} \, \mathbf{A} \, \mathbf{y}_k$ for $k = 1, \ldots, N$

where $\mathbf{A}$ now is $2 \times q$. Each k provides one equation in the $2q$ unknown elements of $\mathbf{A}$, and together these equations can be written $\mathbf{B} \, \mathbf{a} = \mathbf{0}$ for the known $N \times 2q$ matrix $\mathbf{B}$ and the unknown 2q-dimensional vector $\mathbf{a}$. This vector can be found in a similar way as before.
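The row contributed by each k is the same concatenation as in the example, $(x_{2k} \, \mathbf{y}_k^{T}, \; -x_{1k} \, \mathbf{y}_k^{T})$, now of length 2q. A one-line sketch of the row construction, with an illustrative helper name:

```python
import numpy as np

def dlt_row(xk, yk):
    """Row b_k^T of B for x_k in R^2, y_k in R^q:
    the concatenation (x_2k * y_k^T, -x_1k * y_k^T)."""
    return np.concatenate([xk[1] * yk, -xk[0] * yk])
```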

In the most general case $\mathbf{x}_k \in \mathbb{R}^{p}$ and $\mathbf{y}_k \in \mathbb{R}^{q}$. The main difference compared to previously is that the matrix $\mathbf{H}$ now is $p \times p$ and anti-symmetric. When $p > 2$ the space of such matrices is no longer one-dimensional; it is of dimension

$M = \frac{p \, (p - 1)}{2}.$

This means that each value of k provides M homogeneous equations of the type

$0 = \mathbf{x}_k^{T} \, \mathbf{H}_m \, \mathbf{A} \, \mathbf{y}_k$ for $m = 1, \ldots, M$ and for $k = 1, \ldots, N$

where $\{\mathbf{H}_m\}$ is an M-dimensional basis of the space of $p \times p$ anti-symmetric matrices.
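Such a basis is straightforward to enumerate: one matrix for each index pair (i, j) with i < j, which gives exactly M = p(p-1)/2 elements. A minimal sketch with an illustrative helper name; its elements may differ from any particular choice of $\mathbf{H}_m$ by sign and ordering, which is immaterial since any basis of the space works:

```python
import numpy as np

def antisymmetric_basis(p):
    """Return a list of M = p*(p-1)/2 matrices spanning the
    space of p-by-p anti-symmetric (skew-symmetric) matrices."""
    basis = []
    for i in range(p):
        for j in range(i + 1, p):
            H = np.zeros((p, p))
            H[i, j] = 1.0   # one +1 above the diagonal ...
            H[j, i] = -1.0  # ... mirrored by -1 below it
            basis.append(H)
    return basis

assert len(antisymmetric_basis(3)) == 3  # M = 3 * 2 / 2
```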

Example p = 3

In the case that p = 3, the following three matrices $\mathbf{H}_m$ can be chosen:

$\mathbf{H}_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}$,   $\mathbf{H}_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}$,   $\mathbf{H}_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$

In this particular case, the homogeneous linear equations can be written as

$\mathbf{0} = [\mathbf{x}_k]_{\times} \, \mathbf{A} \, \mathbf{y}_k$ for $k = 1, \ldots, N$

where $[\mathbf{x}_k]_{\times}$ is the matrix representation of the vector cross product with $\mathbf{x}_k$. Notice that this last equation is vector valued; the left hand side is the zero element in $\mathbb{R}^{3}$.

Each value of k provides three homogeneous linear equations in the unknown elements of $\mathbf{A}$. However, since $[\mathbf{x}_k]_{\times}$ has rank 2, at most two of the equations are linearly independent. In practice, therefore, it is common to use only two of the three matrices $\mathbf{H}_m$, for example, for m = 1, 2. However, the linear dependency between the equations depends on $\mathbf{x}_k$, which means that in unlucky cases it would have been better to choose, for example, m = 2, 3. As a consequence, if the number of equations is not a concern, it may be better to use all three equations when the matrix $\mathbf{B}$ is constructed.

The linear dependence between the resulting homogeneous linear equations is a general concern for the case p > 2 and has to be dealt with either by reducing the set of anti-symmetric matrices or by allowing N to become larger than necessary for determining $\mathbf{a}$.
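Putting the p = 3 case together: with a row-major vectorization, $[\mathbf{x}_k]_{\times} \, \mathbf{A} \, \mathbf{y}_k = ([\mathbf{x}_k]_{\times} \otimes \mathbf{y}_k^{T}) \, \operatorname{vec}(\mathbf{A})$, so each correspondence contributes a block of three rows to $\mathbf{B}$. The sketch below keeps all three equations per k, as suggested above; the names cross_matrix and dlt_p3 are illustrative. For q = 3 this is the familiar DLT estimation of a homography:

```python
import numpy as np

def cross_matrix(x):
    """[x]_x: the 3x3 anti-symmetric matrix with [x]_x v = x cross v."""
    return np.array([[0.0,  -x[2],  x[1]],
                     [x[2],  0.0,  -x[0]],
                     [-x[1], x[0],  0.0]])

def dlt_p3(x, y, q):
    """Estimate the 3-by-q matrix A in alpha_k x_k = A y_k, up to scale.

    x: N-by-3 array of the vectors x_k; y: N-by-q array of the y_k.
    Each k contributes the three (partly dependent) equations
    [x_k]_x A y_k = 0, i.e. a 3-row block of B acting on vec(A).
    """
    blocks = [np.kron(cross_matrix(xk), yk) for xk, yk in zip(x, y)]
    B = np.vstack(blocks)          # shape (3N, 3q)
    _, _, Vt = np.linalg.svd(B)
    return Vt[-1].reshape(3, q)    # row-major vec(A) back to A
```

The rows coming from the same k are linearly dependent (the rank of $[\mathbf{x}_k]_{\times}$ is 2), but the singular value decomposition handles the redundancy; the only cost is that $\mathbf{B}$ has 3N rows instead of 2N.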

References

  1. Abdel-Aziz, Y. I.; Karara, H. M. (2015-02-01). "Direct Linear Transformation from Comparator Coordinates into Object Space Coordinates in Close-Range Photogrammetry". Photogrammetric Engineering & Remote Sensing. 81 (2): 103–107. doi:10.14358/pers.81.2.103. ISSN 0099-1112.
  2. Sutherland, Ivan E. (April 1974). "Three-dimensional data input by tablet". Proceedings of the IEEE. 62 (4): 453–461. doi:10.1109/PROC.1974.9449.