Optical flow

The optic flow experienced by a rotating observer (in this case a fly). The direction and magnitude of optic flow at each location is represented by the direction and length of each arrow.

Optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and a scene. [1] [2] Optical flow can also be defined as the distribution of apparent velocities of movement of brightness patterns in an image. [3]

The concept of optical flow was introduced by the American psychologist James J. Gibson in the 1940s to describe the visual stimulus provided to animals moving through the world. [4] Gibson stressed the importance of optic flow for affordance perception, the ability to discern possibilities for action within the environment. Followers of Gibson and his ecological approach to psychology have further demonstrated the role of the optical flow stimulus for the perception of movement by the observer in the world; perception of the shape, distance and movement of objects in the world; and the control of locomotion. [5]

The term optical flow is also used by roboticists, encompassing related techniques from image processing and control of navigation including motion detection, object segmentation, time-to-contact information, focus of expansion calculations, luminance, motion compensated encoding, and stereo disparity measurement. [6] [7]

Estimation

Optical flow can be estimated in a number of ways. Broadly, optical flow estimation approaches can be divided into machine learning-based models (sometimes called data-driven models), classical models (sometimes called knowledge-driven models), which do not use machine learning, and hybrid models, which use aspects of both learning-based and classical models. [8]

Classical Models

Many classical models use the intuitive assumption of brightness constancy: even if a point moves between frames, its brightness stays constant. [9] To formalise this intuitive assumption, consider two consecutive frames from a video sequence, with intensity $I(x, y, t)$, where $(x, y)$ refer to pixel coordinates and $t$ refers to time. In this case, the brightness constancy constraint is

$$I(x, y, t) - I(x + u, y + v, t + 1) = 0,$$

where $\mathbf{w} := (u, v)$ is the displacement vector between a point in the first frame and the corresponding point in the second frame. By itself, the brightness constancy constraint cannot be solved for $u$ and $v$ at each pixel, since there is only one equation and two unknowns. This is known as the aperture problem. Therefore, additional constraints must be imposed to estimate the flow field. [10] [11]
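The aperture problem can be illustrated with a small sketch. It assumes a hypothetical synthetic image (the function `intensity` below, not taken from any source) whose brightness is a horizontal ramp that shifts one pixel to the right per frame, so the image has no structure along $y$. The residual of the brightness constancy constraint, $I(x, y, t) - I(x + u, y + v, t + 1)$, then vanishes for $u = 1$ regardless of $v$, so the constraint alone cannot fix the vertical component.

```python
def intensity(x, y, t):
    """Hypothetical synthetic image: a horizontal brightness ramp moving right.

    The brightness depends only on x - t, so nothing changes along y.
    """
    return 10.0 * (x - t)


def residual(x, y, t, u, v):
    """Brightness constancy residual I(x, y, t) - I(x + u, y + v, t + 1)."""
    return intensity(x, y, t) - intensity(x + u, y + v, t + 1)


# With u = 1 the constraint holds for every vertical displacement v tried here.
residuals = [residual(3, 4, 0, 1, v) for v in (-2, -1, 0, 1, 2)]
print(residuals)

# A wrong horizontal displacement gives a non-zero residual.
wrong = residual(3, 4, 0, 0, 0)
print(wrong)
```

In this toy case only the flow component along the brightness gradient is determined, which is why the models below add further constraints.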

Regularized Models

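Perhaps the most natural approach to addressing the aperture problem is to apply a smoothness constraint (also called a regularization constraint) to the flow field. One can combine this constraint with the brightness constancy constraint to formulate optical flow estimation as an optimization problem, where the goal is to minimize a cost function of the form

$$E = \iint_\Omega \Psi\big(I(x + u, y + v, t + 1) - I(x, y, t)\big) + \alpha \Psi(|\nabla u|) + \alpha \Psi(|\nabla v|) \, dx \, dy,$$

where $\Omega$ is the extent of the images $I(x, y)$, $\nabla$ is the gradient operator, $\alpha$ is a constant, and $\Psi()$ is a loss function. [9] [10]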

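This optimisation problem is difficult to solve owing to its non-linearity. To address this issue, one can use a variational approach and linearise the brightness constancy constraint using a first order Taylor series approximation. Specifically, the brightness constancy constraint is approximated as

$$\frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + \frac{\partial I}{\partial t} = 0.$$

For convenience, the derivatives of the image, $\frac{\partial I}{\partial x}$, $\frac{\partial I}{\partial y}$ and $\frac{\partial I}{\partial t}$, are often condensed to become $I_x$, $I_y$ and $I_t$. Doing so allows one to rewrite the linearised brightness constancy constraint as [11]

$$I_x u + I_y v + I_t = 0.$$

The optimization problem can now be rewritten as

$$E = \iint_\Omega \Psi(I_x u + I_y v + I_t) + \alpha \Psi(|\nabla u|) + \alpha \Psi(|\nabla v|) \, dx \, dy.$$

For the choice of $\Psi(x) = x^2$, this method is the same as the Horn-Schunck method. [12] Other choices of cost function have also been used, such as $\Psi(x) = \sqrt{x^2 + \epsilon^2}$, which is a differentiable variant of the $L^1$ norm. [9] [13]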
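The difference between the two penalty choices can be seen numerically. The sketch below assumes the quadratic penalty $\Psi(x) = x^2$ and the differentiable $L^1$ variant $\Psi(x) = \sqrt{x^2 + \epsilon^2}$ (often called the Charbonnier penalty); the function names and the value of `eps` are illustrative choices, not taken from the cited works.

```python
import math


def quadratic(x):
    """Quadratic penalty: grows with the square of the residual."""
    return x * x


def charbonnier(x, eps=1e-3):
    """Differentiable L1-like penalty: grows roughly like |x| for |x| >> eps."""
    return math.sqrt(x * x + eps * eps)


# Compare how strongly each penalty weights small and large residuals.
for r in (0.0, 0.1, 1.0, 10.0):
    print(r, quadratic(r), charbonnier(r))
```

Because the second penalty grows only linearly for large arguments, large residuals (for example at motion boundaries or occlusions) contribute less to the total cost than they would under the quadratic penalty.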

To solve the aforementioned optimization problem, one can use the Euler-Lagrange equations to provide a system of partial differential equations for each point in $I(x, y, t)$. In the simplest case of using $\Psi(x) = x^2$, these equations are

$$I_x (I_x u + I_y v + I_t) - \alpha \Delta u = 0,$$
$$I_y (I_x u + I_y v + I_t) - \alpha \Delta v = 0,$$

where $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ denotes the Laplace operator. Since the image data is made up of discrete pixels, these equations are discretised. Doing so yields a system of linear equations which can be solved for $(u, v)$ at each pixel, using an iterative scheme such as Gauss-Seidel. [12]
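A minimal sketch of such an iterative scheme is given below, under several assumptions that are not taken from the cited works: the quadratic penalty is used, image derivatives are estimated with central differences averaged over both frames, the Laplacian is approximated as $4(\bar{u} - u)$ with $\bar{u}$ the mean of the four neighbours, and border pixels reuse their own value for missing neighbours. With these choices, each pixel update solves the resulting 2x2 linear system for $(u, v)$ in closed form, and the flow arrays are updated in place in the manner of a Gauss-Seidel sweep. The helper names, the test pattern and the parameter values are hypothetical.

```python
import math


def clamp_get(grid, y, x):
    """Read grid[y][x], replicating the border for out-of-range indices."""
    h, w = len(grid), len(grid[0])
    return grid[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]


def horn_schunck(frame1, frame2, alpha=0.1, sweeps=300):
    """Estimate a dense flow field (u, v) with in-place Gauss-Seidel-style sweeps."""
    h, w = len(frame1), len(frame1[0])
    # Central-difference estimates of I_x and I_y, averaged over both frames.
    ix = [[0.25 * (clamp_get(frame1, y, x + 1) - clamp_get(frame1, y, x - 1)
                   + clamp_get(frame2, y, x + 1) - clamp_get(frame2, y, x - 1))
           for x in range(w)] for y in range(h)]
    iy = [[0.25 * (clamp_get(frame1, y + 1, x) - clamp_get(frame1, y - 1, x)
                   + clamp_get(frame2, y + 1, x) - clamp_get(frame2, y - 1, x))
           for x in range(w)] for y in range(h)]
    # Temporal derivative I_t as the frame difference.
    it = [[frame2[y][x] - frame1[y][x] for x in range(w)] for y in range(h)]

    u = [[0.0] * w for _ in range(h)]
    v = [[0.0] * w for _ in range(h)]
    a = 4.0 * alpha  # weight from approximating the Laplacian as 4 * (mean - centre)
    for _ in range(sweeps):
        for y in range(h):
            for x in range(w):
                # Neighbour means; already-updated neighbours are used immediately.
                u_bar = 0.25 * (clamp_get(u, y - 1, x) + clamp_get(u, y + 1, x)
                                + clamp_get(u, y, x - 1) + clamp_get(u, y, x + 1))
                v_bar = 0.25 * (clamp_get(v, y - 1, x) + clamp_get(v, y + 1, x)
                                + clamp_get(v, y, x - 1) + clamp_get(v, y, x + 1))
                gx, gy = ix[y][x], iy[y][x]
                # Closed-form solution of the per-pixel 2x2 system.
                r = (gx * u_bar + gy * v_bar + it[y][x]) / (a + gx * gx + gy * gy)
                u[y][x] = u_bar - gx * r
                v[y][x] = v_bar - gy * r
    return u, v


def pattern(x, y):
    """Hypothetical smooth test pattern."""
    return math.sin(0.4 * x) + math.sin(0.3 * y)


size = 20
frame1 = [[pattern(x, y) for x in range(size)] for y in range(size)]
# Second frame: the same pattern moved 0.5 pixels to the right.
frame2 = [[pattern(x - 0.5, y) for x in range(size)] for y in range(size)]

u, v = horn_schunck(frame1, frame2)
interior = [(y, x) for y in range(5, 15) for x in range(5, 15)]
mean_u = sum(u[y][x] for y, x in interior) / len(interior)
mean_v = sum(v[y][x] for y, x in interior) / len(interior)
print(round(mean_u, 2), round(mean_v, 2))
```

On this synthetic pair the interior flow should come out close to the true displacement of 0.5 pixels in $x$ and 0 in $y$; small deviations are expected from the finite-difference derivative estimates and the border handling.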

An alternative approach is to discretize the optimisation problem and then search over the possible values of the flow without linearising it. [14] This search is often performed using algorithms based on the max-flow min-cut theorem, linear programming or belief propagation methods.

Parametric Models

Instead of applying the regularization constraint on a point by point basis as per a regularized model, one can group pixels into regions and estimate the motion of these regions. This is known as a parametric model, since the motion of these regions is parameterized. In formulating optical flow estimation in this way, one makes the assumption that the motion field in each region can be fully characterised by a set of parameters. Therefore, the goal of a parametric model is to estimate the motion parameters that minimise a loss function which can be written as

$$\hat{\boldsymbol{\alpha}} = \arg\min_{\boldsymbol{\alpha}} \sum_{(x, y) \in \mathcal{R}} g(x, y) \, \rho(x, y, I_1, I_2, u_{\boldsymbol{\alpha}}, v_{\boldsymbol{\alpha}}),$$

where $\boldsymbol{\alpha}$ is the set of parameters determining the motion in the region $\mathcal{R}$, $\rho()$ is the data cost term, $g()$ is a weighting function that determines the influence of pixel $(x, y)$ on the total cost, and $I_1$ and $I_2$ are frames 1 and 2 from a pair of consecutive frames. [9]

The simplest parametric model is the Lucas-Kanade method. This uses rectangular regions and parameterises the motion as purely translational. The Lucas-Kanade method uses the original brightness constancy constraint as the data cost term and selects $g(x, y) = 1$. This yields the local loss function

$$\hat{\boldsymbol{\alpha}} = \arg\min_{\boldsymbol{\alpha}} \sum_{(x, y) \in \mathcal{R}} \big(I_2(x + u_{\boldsymbol{\alpha}}, y + v_{\boldsymbol{\alpha}}) - I_1(x, y)\big)^2.$$

Other possible local loss functions include the negative normalized cross-correlation between the two frames. [15]
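A purely translational model over one rectangular window can be sketched as follows. This is an illustration under stated assumptions rather than the method as published: the squared brightness constancy residual is linearised as $I_x u + I_y v + I_t$, all pixels in the window are weighted equally, derivatives are estimated with central differences averaged over both frames, and the resulting 2x2 least-squares normal equations are solved directly. The function name, the window placement and the test pattern are hypothetical.

```python
import math


def lucas_kanade_window(frame1, frame2, cx, cy, half=3):
    """Estimate one translation (u, v) for the window centred at (cx, cy).

    The window must lie at least one pixel inside the image border.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            # Central differences, averaged over both frames.
            gx = 0.25 * (frame1[y][x + 1] - frame1[y][x - 1]
                         + frame2[y][x + 1] - frame2[y][x - 1])
            gy = 0.25 * (frame1[y + 1][x] - frame1[y - 1][x]
                         + frame2[y + 1][x] - frame2[y - 1][x])
            gt = frame2[y][x] - frame1[y][x]
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
            sxt += gx * gt
            syt += gy * gt
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # Gradients in the window are (nearly) parallel: the aperture problem.
        raise ValueError("window has too little texture to determine the flow")
    # Solve [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt].
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def pattern(x, y):
    """Hypothetical smooth test pattern."""
    return math.sin(0.4 * x) + math.sin(0.3 * y)


size = 20
frame1 = [[pattern(x, y) for x in range(size)] for y in range(size)]
# Second frame: the same pattern moved 0.5 pixels to the right.
frame2 = [[pattern(x - 0.5, y) for x in range(size)] for y in range(size)]

u_est, v_est = lucas_kanade_window(frame1, frame2, cx=10, cy=10)
print(round(u_est, 2), round(v_est, 2))
```

The determinant check makes the link to the aperture problem explicit: if all gradients in the window point in the same direction, the system is singular and only one flow component can be recovered.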

Learning Based Models

Instead of seeking to model optical flow directly, one can train a machine learning system to estimate optical flow. Learning-based models have been applied to optical flow since FlowNet [16] was proposed in 2015, and have gained prominence. Initially, these approaches were based on convolutional neural networks arranged in a U-Net architecture. However, with the advent of the transformer architecture in 2017, transformer-based models have become prominent. [17]

Uses

Motion estimation and video compression have developed as a major aspect of optical flow research. While the optical flow field is superficially similar to a dense motion field derived from the techniques of motion estimation, optical flow research covers not only the determination of the optical flow field itself but also its use in estimating the three-dimensional nature and structure of the scene, as well as the 3D motion of objects and of the observer relative to the scene. Most of these estimation methods use the image Jacobian. [18]

Optical flow has been used by robotics researchers in many areas, such as object detection and tracking, image dominant plane extraction, movement detection, robot navigation and visual odometry. [6] Optical flow information has also been recognized as useful for controlling micro air vehicles. [19]

The application of optical flow includes the problem of inferring not only the motion of the observer and objects in the scene, but also the structure of objects and the environment. Since awareness of motion and the generation of mental maps of the structure of our environment are critical components of animal (and human) vision, the conversion of this innate ability to a computer capability is similarly crucial in the field of machine vision. [20]

The optical flow vector of a moving object in a video sequence.

Consider a five-frame clip of a ball moving from the bottom left of a field of vision to the top right. Motion estimation techniques can determine that on a two-dimensional plane the ball is moving up and to the right, and vectors describing this motion can be extracted from the sequence of frames. For the purposes of video compression (e.g., MPEG), the sequence is now described as well as it needs to be. However, in the field of machine vision, whether the ball is moving to the right or the observer is moving to the left is critical information that cannot be determined from these vectors alone. Even if a static, patterned background were present in the five frames, we could not confidently state that the ball was moving to the right, because the pattern might be infinitely far from the observer.

Optical flow sensor

Various configurations of optical flow sensors exist. One configuration is an image sensor chip connected to a processor programmed to run an optical flow algorithm. Another configuration uses a vision chip, which is an integrated circuit having both the image sensor and the processor on the same die, allowing for a compact implementation. [21] [22] An example of this is the generic sensor used in an optical mouse. In some cases the processing circuitry may be implemented using analog or mixed-signal circuits to enable fast optical flow computation with minimal current consumption.

One area of contemporary research is the use of neuromorphic engineering techniques to implement circuits that respond to optical flow, and thus may be appropriate for use in an optical flow sensor. [23] Such circuits may draw inspiration from biological neural circuitry that similarly responds to optical flow.

Optical flow sensors are used extensively in computer optical mice, as the main sensing component for measuring the motion of the mouse across a surface.

Optical flow sensors are also being used in robotics applications, primarily where there is a need to measure visual motion or relative motion between the robot and other objects in the vicinity of the robot. The use of optical flow sensors in unmanned aerial vehicles (UAVs), for stability and obstacle avoidance, is also an area of current research. [24]

References

  1. Burton, Andrew; Radford, John (1978). Thinking in Perspective: Critical Essays in the Study of Thought Processes. Routledge. ISBN 978-0-416-85840-2.
  2. Warren, David H.; Strelow, Edward R. (1985). Electronic Spatial Sensing for the Blind: Contributions from Perception. Springer. ISBN 978-90-247-2689-9.
  3. Horn, Berthold K.P.; Schunck, Brian G. (August 1981). "Determining optical flow" (PDF). Artificial Intelligence. 17 (1–3): 185–203. doi:10.1016/0004-3702(81)90024-2. hdl:1721.1/6337.
  4. Gibson, J.J. (1950). The Perception of the Visual World. Houghton Mifflin.
  5. Royden, C. S.; Moore, K. D. (2012). "Use of speed cues in the detection of moving objects by moving observers". Vision Research. 59: 17–24. doi:10.1016/j.visres.2012.02.006. PMID 22406544. S2CID 52847487.
  6. Aires, Kelson R. T.; Santana, Andre M.; Medeiros, Adelardo A. D. (2008). Optical Flow Using Color Information (PDF). ACM New York, NY, USA. ISBN 978-1-59593-753-7.
  7. Beauchemin, S. S.; Barron, J. L. (1995). "The computation of optical flow". ACM Computing Surveys. 27 (3). ACM New York, USA: 433–466. doi:10.1145/212094.212141. S2CID 1334552.
  8. Zhai, Mingliang; Xiang, Xuezhi; Lv, Ning; Kong, Xiangdong (2021). "Optical flow and scene flow estimation: A survey". Pattern Recognition. 114: 107861. doi:10.1016/j.patcog.2021.107861.
  9. Fortun, Denis; Bouthemy, Patrick; Kervrann, Charles (2015-05-01). "Optical flow modeling and computation: A survey". Computer Vision and Image Understanding. 134: 1–21. doi:10.1016/j.cviu.2015.02.008. Retrieved 2024-12-23.
  10. Brox, Thomas; Bruhn, Andrés; Papenberg, Nils; Weickert, Joachim (2004). "High Accuracy Optical Flow Estimation Based on a Theory for Warping". Computer Vision - ECCV 2004. ECCV 2004. Berlin, Heidelberg: Springer Berlin Heidelberg. pp. 25–36.
  11. Baker, Simon; Scharstein, Daniel; Lewis, J. P.; Roth, Stefan; Black, Michael J.; Szeliski, Richard (1 March 2011). "A Database and Evaluation Methodology for Optical Flow". International Journal of Computer Vision. 92 (1): 1–31. doi:10.1007/s11263-010-0390-2. ISSN 1573-1405. Retrieved 25 Dec 2024.
  12.
  13. Sun, Deqing; Roth, Stefan; Black, Michael J. (2010). "Secrets of optical flow estimation and their principles". 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE. pp. 2432–2439.
  14. Steinbrücker, Frank; Pock, Thomas; Cremers, Daniel; Weickert, Joachim (2009). "Large Displacement Optical Flow Computation without Warping". 2009 IEEE 12th International Conference on Computer Vision. IEEE. pp. 1609–1614.
  15. Lucas, Bruce D.; Kanade, Takeo (1981-08-24). An iterative image registration technique with an application to stereo vision. Proceedings of the 7th International Joint Conference on Artificial intelligence - Volume 2. IJCAI'81. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. pp. 674–679.
  16. Dosovitskiy, Alexey; Fischer, Philipp; Ilg, Eddy; Hausser, Philip; Hazirbas, Caner; Golkov, Vladimir; Smagt, Patrick van der; Cremers, Daniel; Brox, Thomas (2015). FlowNet: Learning Optical Flow with Convolutional Networks. 2015 IEEE International Conference on Computer Vision (ICCV). IEEE. pp. 2758–2766. doi:10.1109/ICCV.2015.316. ISBN 978-1-4673-8391-2.
  17. Alfarano, Andrea; Maiano, Luca; Papa, Lorenzo; Amerini, Irene (2024). "Estimating optical flow: A comprehensive review of the state of the art". Computer Vision and Image Understanding. 249: 104160. doi:10.1016/j.cviu.2024.104160.
  18. Corke, Peter (8 May 2017). "The Image Jacobian". QUT Robot Academy.
  19. Barrows, G. L.; Chahl, J. S.; Srinivasan, M. V. (2003). "Biologically inspired visual sensing and flight control". Aeronautical Journal. 107 (1069): 159–268. doi:10.1017/S0001924000011891. S2CID 108782688 via Cambridge University Press.
  20. Brown, Christopher M. (1987). Advances in Computer Vision. Lawrence Erlbaum Associates. ISBN 978-0-89859-648-9.
  21. Moini, Alireza (2000). Vision Chips. Boston, MA: Springer US. ISBN 9781461552673. OCLC 851803922.
  22. Mead, Carver (1989). Analog VLSI and neural systems. Reading, Mass.: Addison-Wesley. ISBN 0201059924. OCLC 17954003.
  23. Stocker, Alan A. (2006). Analog VLSI circuits for the perception of visual motion. Chichester, England: John Wiley & Sons. ISBN 0470034882. OCLC 71521689.
  24. Floreano, Dario; Zufferey, Jean-Christophe; Srinivasan, Mandyam V.; Ellington, Charlie, eds. (2009). Flying insects and robots. Heidelberg: Springer. ISBN 9783540893936. OCLC 495477442.