Motion field

In computer vision, the motion field is an ideal representation of motion in three-dimensional (3D) space as it is projected onto a camera image. Given a simplified camera model, each point in the image is the projection of some point in the 3D scene, but the position of the projection of a fixed point in space can vary with time. The motion field can formally be defined as the time derivative of the image position of all image points, given that they correspond to fixed 3D points. This means that the motion field can be represented as a function which maps image coordinates to a two-dimensional vector. The motion field is an ideal description of the projected 3D motion in the sense that it can be formally defined, but in practice it is normally only possible to determine an approximation of the motion field from the image data.

Introduction

An illustration of some 3D points and their corresponding image points as described by the pinhole camera model. As the 3D points are moving in space, the corresponding image points are also moving. The motion field consists of the motion vectors in the image for all points in the image.

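A camera model maps each point $(x_1, x_2, x_3)$ in 3D space to a 2D image point $(y_1, y_2)$ according to some mapping functions $m_1, m_2$:

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} m_1(x_1, x_2, x_3) \\ m_2(x_1, x_2, x_3) \end{pmatrix}$$

Assume that the scene depicted by the camera is dynamic: it consists of objects moving relative to each other, objects which deform, and possibly a camera that moves relative to the scene. A fixed point in 3D space is then mapped to varying points in the image. Differentiating the previous expression with respect to time gives

$$\begin{pmatrix} \frac{dy_1}{dt} \\ \frac{dy_2}{dt} \end{pmatrix} = \begin{pmatrix} \frac{\partial m_1}{\partial x_1} & \frac{\partial m_1}{\partial x_2} & \frac{\partial m_1}{\partial x_3} \\ \frac{\partial m_2}{\partial x_1} & \frac{\partial m_2}{\partial x_2} & \frac{\partial m_2}{\partial x_3} \end{pmatrix} \begin{pmatrix} \frac{dx_1}{dt} \\ \frac{dx_2}{dt} \\ \frac{dx_3}{dt} \end{pmatrix}$$

Here

$$\mathbf{u} = \begin{pmatrix} \frac{dy_1}{dt} \\ \frac{dy_2}{dt} \end{pmatrix}$$

is the motion field, and the vector $\mathbf{u}$ depends both on the image position $(y_1, y_2)$ and on the time $t$. Similarly,

$$\mathbf{x}' = \begin{pmatrix} \frac{dx_1}{dt} \\ \frac{dx_2}{dt} \\ \frac{dx_3}{dt} \end{pmatrix}$$

is the motion of the corresponding 3D point, and its relation to the motion field is given by

$$\mathbf{u} = \mathbf{M} \, \mathbf{x}'$$

where $\mathbf{M}$ is the image position dependent $2 \times 3$ matrix

$$\mathbf{M} = \begin{pmatrix} \frac{\partial m_1}{\partial x_1} & \frac{\partial m_1}{\partial x_2} & \frac{\partial m_1}{\partial x_3} \\ \frac{\partial m_2}{\partial x_1} & \frac{\partial m_2}{\partial x_2} & \frac{\partial m_2}{\partial x_3} \end{pmatrix}$$

This relation implies that the motion field, at a specific image point, is invariant to 3D motions which lie in the null space of $\mathbf{M}$. For example, in the case of a pinhole camera, all 3D motion components which are directed toward or away from the camera focal point cannot be detected in the motion field.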
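The null-space statement can be checked numerically. The following is a minimal sketch, assuming an ideal pinhole camera with focal length f whose mapping is y1 = f·x1/x3 and y2 = f·x2/x3 (this specific mapping, and the function names `pinhole_jacobian` and `motion_field_vector`, are illustrative assumptions, not part of the text above). It builds the 2×3 matrix of partial derivatives of that mapping and applies it to two 3D velocities.

```python
import numpy as np

def pinhole_jacobian(x, f=1.0):
    """2x3 matrix of partial derivatives of the assumed pinhole mapping
    y1 = f*x1/x3, y2 = f*x2/x3, evaluated at the 3D point x."""
    x1, x2, x3 = x
    return (f / x3) * np.array([
        [1.0, 0.0, -x1 / x3],
        [0.0, 1.0, -x2 / x3],
    ])

def motion_field_vector(x, x_dot, f=1.0):
    """Image velocity of the projection of x when x moves with 3D velocity x_dot."""
    return pinhole_jacobian(x, f) @ np.asarray(x_dot, dtype=float)

point = np.array([1.0, 2.0, 4.0])

# Motion parallel to the image plane is visible in the image.
sideways = motion_field_vector(point, [1.0, 0.0, 0.0])

# Motion along the ray through the focal point (here: velocity parallel to
# the point's own position vector) lies in the null space of the matrix.
radial = motion_field_vector(point, point)

print("sideways:", sideways)
print("radial:  ", radial)
```

Under these assumptions the sideways motion yields a non-zero image velocity, while the radial motion yields the zero vector, which is the invariance described above.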

Special cases

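For a pinhole camera with focal length $f$ that moves rigidly relative to the scene, the motion field $\mathbf{v}$ is defined as:

$$\mathbf{v} = f \, \frac{Z \mathbf{V} - V_z \mathbf{P}}{Z^2}$$

where

$$\mathbf{V} = -\mathbf{T} - \boldsymbol{\omega} \times \mathbf{P}.$$

Here $\mathbf{P}$ is a point in the scene and $Z$ is the distance to that scene point (the third component of $\mathbf{P}$), $\mathbf{V}$ is the relative motion between the camera and the scene with third component $V_z$, $\mathbf{T}$ is the translational component of the motion, and $\boldsymbol{\omega}$ is the angular velocity of the motion.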
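As an illustration of this special case, the sketch below evaluates the commonly used rigid-motion expression v = f (Z V − V_z P) / Z², with V = −T − ω × P, for a single scene point. The formula is stated here as an assumption about the pinhole case, and the function name `rigid_motion_field` and the sample values are hypothetical.

```python
import numpy as np

def rigid_motion_field(P, T, omega, f=1.0):
    """Motion field vector at the projection of scene point P, assuming a
    pinhole camera (focal length f) with translational velocity T and
    angular velocity omega relative to the scene."""
    P = np.asarray(P, dtype=float)
    T = np.asarray(T, dtype=float)
    omega = np.asarray(omega, dtype=float)
    V = -T - np.cross(omega, P)   # velocity of P relative to the camera
    Z = P[2]                      # depth of the scene point
    return f * (Z * V - V[2] * P) / Z**2

P = [1.0, 2.0, 4.0]

# Pure sideways camera translation: the image point moves the opposite way.
v_side = rigid_motion_field(P, T=[1.0, 0.0, 0.0], omega=[0.0, 0.0, 0.0])

# Pure forward camera translation: the image point moves away from the
# image centre, parallel to its own image position (P / Z).
v_forward = rigid_motion_field(P, T=[0.0, 0.0, 1.0], omega=[0.0, 0.0, 0.0])

print("sideways:", v_side)
print("forward: ", v_forward)
```

The third component of the result is always zero, since the image point stays in the image plane; only the first two components carry the 2D motion.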

Relation to optical flow

The motion field is an ideal construction, based on the idea that it is possible to determine the motion of each image point, and the sections above describe how this 2D motion is related to 3D motion. In practice, however, the true motion field can only be approximated based on measurements on image data. The problem is that in most cases each image point has an individual motion, which therefore has to be measured locally by means of a neighborhood operation on the image data. As a consequence, the correct motion field cannot be determined for certain types of neighborhood, and instead an approximation, often referred to as the optical flow, has to be used. For example, a neighborhood which has a constant intensity may correspond to a non-zero motion field, but the optical flow is zero since no local image motion can be measured. Similarly, a neighborhood which is intrinsically one-dimensional (for example, an edge or line) can correspond to an arbitrary motion field, but the optical flow can only capture the normal component of the motion field. There are also other effects, such as image noise, 3D occlusion, and temporal aliasing, which are inherent to any method for measuring optical flow and cause the resulting optical flow to deviate from the true motion field.
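The two examples above (constant intensity, and an intrinsically one-dimensional neighborhood) can be sketched as a projection. The code below is a simplified illustration rather than an optical flow algorithm: it assumes that the local intensity gradient gives the direction normal to an edge, and the function name `measurable_component` is hypothetical.

```python
import numpy as np

def measurable_component(u, grad, eps=1e-12):
    """Part of the motion field vector u that can be observed locally,
    given the intensity gradient grad of the neighborhood.

    Assumption: for an edge-like neighborhood, only the component of u
    along the gradient (the edge normal) is observable; for a constant
    neighborhood (zero gradient), nothing is observable."""
    u = np.asarray(u, dtype=float)
    g = np.asarray(grad, dtype=float)
    norm = np.linalg.norm(g)
    if norm < eps:
        return np.zeros(2)      # constant intensity: no measurable motion
    n = g / norm                # unit normal of the edge
    return (u @ n) * n          # normal component of the motion field

true_motion = [1.0, 1.0]

# Vertical edge: intensity changes only along the first image axis.
on_edge = measurable_component(true_motion, grad=[1.0, 0.0])

# Constant-intensity neighborhood: the gradient vanishes.
on_flat = measurable_component(true_motion, grad=[0.0, 0.0])

print("edge:", on_edge)
print("flat:", on_flat)
```

In both cases the true motion is the same, but the locally measurable vector differs from it, which is why the optical flow is only an approximation of the motion field.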

In short, the motion field cannot be correctly measured for all image points, and the optical flow is an approximation of the motion field. There are several different ways to compute the optical flow, based on different criteria for how the flow should be estimated.
