Optical flow

The optic flow experienced by a rotating observer (in this case a fly). The direction and magnitude of optic flow at each location is represented by the direction and length of each arrow.

Optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and a scene. [1] [2] Optical flow can also be defined as the distribution of apparent velocities of movement of brightness patterns in an image. [3]


The concept of optical flow was introduced by the American psychologist James J. Gibson in the 1940s to describe the visual stimulus provided to animals moving through the world. [4] Gibson stressed the importance of optic flow for affordance perception, the ability to discern possibilities for action within the environment. Followers of Gibson and his ecological approach to psychology have further demonstrated the role of the optical flow stimulus for the perception of movement by the observer in the world; perception of the shape, distance and movement of objects in the world; and the control of locomotion. [5]

The term optical flow is also used by roboticists, encompassing related techniques from image processing and control of navigation including motion detection, object segmentation, time-to-contact information, focus of expansion calculations, luminance, motion compensated encoding, and stereo disparity measurement. [6] [7]

Estimation

Sequences of ordered images allow the estimation of motion as either instantaneous image velocities or discrete image displacements. [7] Fleet and Weiss provide a tutorial introduction to gradient-based optical flow. [8] John L. Barron, David J. Fleet, and Steven Beauchemin provide a performance analysis of a number of optical flow techniques, emphasizing the accuracy and density of the measurements. [9]

Optical flow methods try to calculate the motion between two image frames which are taken at times $t$ and $t + \Delta t$ at every voxel position. These methods are called differential since they are based on local Taylor series approximations of the image signal; that is, they use partial derivatives with respect to the spatial and temporal coordinates.

For a (2D + t)-dimensional case (3D or n-D cases are similar) a voxel at location $(x, y, t)$ with intensity $I(x, y, t)$ will have moved by $\Delta x$, $\Delta y$ and $\Delta t$ between the two image frames, and the following brightness constancy constraint can be given:

$$I(x, y, t) = I(x + \Delta x,\, y + \Delta y,\, t + \Delta t)$$

Assuming the movement to be small, the image constraint at $I(x, y, t)$ can be developed with a Taylor series to give:

$$I(x + \Delta x,\, y + \Delta y,\, t + \Delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t + \text{higher-order terms}$$

By truncating the higher-order terms (which performs a linearization) it follows that:

$$\frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t = 0$$

or, dividing by $\Delta t$,

$$\frac{\partial I}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial I}{\partial y}\frac{\Delta y}{\Delta t} + \frac{\partial I}{\partial t}\frac{\Delta t}{\Delta t} = 0$$

which results in

$$\frac{\partial I}{\partial x}V_x + \frac{\partial I}{\partial y}V_y + \frac{\partial I}{\partial t} = 0$$

where $V_x$ and $V_y$ are the $x$ and $y$ components of the velocity or optical flow of $I(x, y, t)$, and $\frac{\partial I}{\partial x}$, $\frac{\partial I}{\partial y}$ and $\frac{\partial I}{\partial t}$ are the derivatives of the image at $(x, y, t)$ in the corresponding directions. $I_x$, $I_y$ and $I_t$ can be written for these derivatives in the following.

Thus:

$$I_x V_x + I_y V_y = -I_t$$

or

$$\nabla I \cdot \vec{V} = -I_t$$

This is an equation in two unknowns ($V_x$ and $V_y$) and cannot be solved as such. This is known as the aperture problem of the optical flow algorithms. To find the optical flow another set of equations is needed, given by some additional constraint. All optical flow methods introduce additional conditions for estimating the actual flow.
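
One widely used way to supply the missing constraint is to assume, as in the Lucas–Kanade method, that the flow is essentially constant within a small neighbourhood and to solve the stacked constraint equations by least squares. The following Python/NumPy sketch illustrates that idea; the function name, the finite-difference derivatives, and the window size are assumptions made for this example rather than details taken from the article.

    import numpy as np

    def flow_at_pixel(I0, I1, x, y, win=7):
        """Estimate (Vx, Vy) at pixel (x, y) from two grayscale frames I0, I1
        (float arrays) by solving Ix*Vx + Iy*Vy = -It over a small window with
        least squares. A minimal Lucas-Kanade-style sketch; the window must lie
        inside the image."""
        half = win // 2
        # Spatial derivatives of the first frame and the temporal derivative.
        Ix = np.gradient(I0, axis=1)
        Iy = np.gradient(I0, axis=0)
        It = I1 - I0
        # One constraint equation per pixel in the window: A @ [Vx, Vy] = b.
        window = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
        A = np.stack([Ix[window].ravel(), Iy[window].ravel()], axis=1)
        b = -It[window].ravel()
        v, *_ = np.linalg.lstsq(A, b, rcond=None)
        return v  # [Vx, Vy]

A single pixel yields one equation in two unknowns; the neighbourhood supplies the extra equations, which is exactly why purely local, single-pixel information cannot resolve the aperture problem.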

Methods for determination

Many optical flow methods, in addition to the current state-of-the-art algorithms, are evaluated on the Middlebury Benchmark Dataset. [13] [14] Other popular benchmark datasets are KITTI and Sintel.
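
As an illustration of how a dense flow field can be computed in practice, the short sketch below uses the Farnebäck algorithm as implemented in OpenCV; this is just one of many possible methods, and the file names and parameter values are placeholder assumptions for the example.

    import cv2

    # Two consecutive grayscale frames (placeholder file names).
    prev_frame = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
    next_frame = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

    # Positional arguments after the frames: output array (None), pyr_scale,
    # levels, winsize, iterations, poly_n, poly_sigma, flags.
    flow = cv2.calcOpticalFlowFarneback(prev_frame, next_frame, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    # flow has shape (height, width, 2); flow[y, x] is the (dx, dy)
    # displacement of that pixel between the two frames.
    print(flow.shape)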

Uses

Motion estimation and video compression have developed as a major aspect of optical flow research. While the optical flow field is superficially similar to a dense motion field derived from the techniques of motion estimation, optical flow is the study not only of the determination of the optical flow field itself, but also of its use in estimating the three-dimensional nature and structure of the scene, as well as the 3D motion of objects and the observer relative to the scene, often by means of the image Jacobian. [15]

Optical flow has been used by robotics researchers in many areas, such as object detection and tracking, image dominant plane extraction, movement detection, robot navigation and visual odometry. [6] Optical flow information has also been recognized as useful for controlling micro air vehicles. [16]

The application of optical flow includes the problem of inferring not only the motion of the observer and objects in the scene, but also the structure of objects and the environment. Since awareness of motion and the generation of mental maps of the structure of our environment are critical components of animal (and human) vision, the conversion of this innate ability to a computer capability is similarly crucial in the field of machine vision. [17]

The optical flow vector of a moving object in a video sequence.

Consider a five-frame clip of a ball moving from the bottom left of a field of vision to the top right. Motion estimation techniques can determine that on a two-dimensional plane the ball is moving up and to the right, and vectors describing this motion can be extracted from the sequence of frames. For the purposes of video compression (e.g., MPEG), the sequence is now described as well as it needs to be. However, in the field of machine vision, the question of whether the ball is moving to the right or the observer is moving to the left is unknowable yet critical information. Even if a static, patterned background were present in the five frames, we could not confidently state that the ball was moving to the right, because the pattern might be at an infinite distance from the observer.

Optical flow sensor

Various configurations of optical flow sensors exist. One configuration is an image sensor chip connected to a processor programmed to run an optical flow algorithm. Another configuration uses a vision chip, an integrated circuit having both the image sensor and the processor on the same die, allowing for a compact implementation. [18] [19] An example of this is a generic optical mouse sensor used in an optical mouse. In some cases the processing circuitry may be implemented using analog or mixed-signal circuits to enable fast optical flow computation with minimal current consumption.

One area of contemporary research is the use of neuromorphic engineering techniques to implement circuits that respond to optical flow, and thus may be appropriate for use in an optical flow sensor. [20] Such circuits may draw inspiration from biological neural circuitry that similarly responds to optical flow.

Optical flow sensors are used extensively in computer optical mice, as the main sensing component for measuring the motion of the mouse across a surface.

Optical flow sensors are also being used in robotics applications, primarily where there is a need to measure visual motion or relative motion between the robot and other objects in the vicinity of the robot. The use of optical flow sensors in unmanned aerial vehicles (UAVs), for stability and obstacle avoidance, is also an area of current research. [21]

See also

Least squares
Stream function
Fourier optics
Inverse kinematics
Image segmentation
Richardson–Lucy deconvolution
Machine olfaction
Lucas–Kanade method
Horn–Schunck method
Articulated body pose estimation
Digital image correlation and tracking
3D reconstruction
Visual odometry
Geometric feature learning
Biological motion perception
Harris corner detector
Single-particle trajectories
Retinomorphic sensor
Projection filters

References

  1. Burton, Andrew; Radford, John (1978). Thinking in Perspective: Critical Essays in the Study of Thought Processes. Routledge. ISBN   978-0-416-85840-2.
  2. Warren, David H.; Strelow, Edward R. (1985). Electronic Spatial Sensing for the Blind: Contributions from Perception. Springer. ISBN   978-90-247-2689-9.
  3. Horn, Berthold K.P.; Schunck, Brian G. (August 1981). "Determining optical flow" (PDF). Artificial Intelligence. 17 (1–3): 185–203. doi:10.1016/0004-3702(81)90024-2. hdl:1721.1/6337.
  4. Gibson, J.J. (1950). The Perception of the Visual World. Houghton Mifflin.
  5. Royden, C. S.; Moore, K. D. (2012). "Use of speed cues in the detection of moving objects by moving observers". Vision Research. 59: 17–24. doi: 10.1016/j.visres.2012.02.006 . PMID   22406544. S2CID   52847487.
  6. Aires, Kelson R. T.; Santana, Andre M.; Medeiros, Adelardo A. D. (2008). Optical Flow Using Color Information (PDF). ACM New York, NY, USA. ISBN   978-1-59593-753-7.
  7. Beauchemin, S. S.; Barron, J. L. (1995). "The computation of optical flow". ACM Computing Surveys. ACM New York, USA. 27 (3): 433–466. doi: 10.1145/212094.212141 . S2CID   1334552.
  8. Fleet, David J.; Weiss, Yair (2006). "Optical Flow Estimation" (PDF). In Paragios, Nikos; Chen, Yunmei; Faugeras, Olivier D. (eds.). Handbook of Mathematical Models in Computer Vision. Springer. pp. 237–257. ISBN   978-0-387-26371-7.
  9. Barron, John L.; Fleet, David J. & Beauchemin, Steven (1994). "Performance of optical flow techniques" (PDF). International Journal of Computer Vision. 12: 43–77. CiteSeerX   10.1.1.173.481 . doi:10.1007/bf01420984. S2CID   1290100.
  10. Zhang, G.; Chanson, H. (2018). "Application of Local Optical Flow Methods to High-Velocity Free-surface Flows: Validation and Application to Stepped Chutes" (PDF). Experimental Thermal and Fluid Science. 90: 186–199. doi:10.1016/j.expthermflusci.2017.09.010.
  11. Glyn W. Humphreys and Vicki Bruce (1989). Visual Cognition. Psychology Press. ISBN   978-0-86377-124-8.
  12. B. Glocker; N. Komodakis; G. Tziritas; N. Navab; N. Paragios (2008). Dense Image Registration through MRFs and Efficient Linear Programming (PDF). Medical Image Analysis Journal.
  13. Baker, Simon; Scharstein, Daniel; Lewis, J. P.; Roth, Stefan; Black, Michael J.; Szeliski, Richard (March 2011). "A Database and Evaluation Methodology for Optical Flow". International Journal of Computer Vision. 92 (1): 1–31. doi: 10.1007/s11263-010-0390-2 . ISSN   0920-5691. S2CID   316800.
  14. Baker, Simon; Scharstein, Daniel; Lewis, J. P.; Roth, Stefan; Black, Michael J.; Szeliski, Richard. "Optical Flow". vision.middlebury.edu. Retrieved 2019-10-18.
  15. Corke, Peter (8 May 2017). "The Image Jacobian". QUT Robot Academy.
  16. Barrows, G. L.; Chahl, J. S.; Srinivasan, M. V. (2003). "Biologically inspired visual sensing and flight control". Aeronautical Journal. 107 (1069): 159–268. doi:10.1017/S0001924000011891. S2CID   108782688 via Cambridge University Press.
  17. Brown, Christopher M. (1987). Advances in Computer Vision. Lawrence Erlbaum Associates. ISBN   978-0-89859-648-9.
  18. Moini, Alireza (2000). Vision Chips. Boston, MA: Springer US. ISBN   9781461552673. OCLC   851803922.
  19. Mead, Carver (1989). Analog VLSI and neural systems . Reading, Mass.: Addison-Wesley. ISBN   0201059924. OCLC   17954003.
  20. Stocker, Alan A. (2006). Analog VLSI circuits for the perception of visual motion. Chichester, England: John Wiley & Sons. ISBN   0470034882. OCLC   71521689.
  21. Floreano, Dario; Zufferey, Jean-Christophe; Srinivasan, Mandyam V.; Ellington, Charlie, eds. (2009). Flying insects and robots. Heidelberg: Springer. ISBN   9783540893936. OCLC   495477442.