Motion perception is the process of inferring the speed and direction of elements in a scene based on visual, vestibular and proprioceptive inputs. Although this process appears straightforward to most observers, it has proven to be a difficult problem from a computational perspective, and difficult to explain in terms of neural processing.
Motion perception is studied by many disciplines, including psychology (i.e. visual perception), neurology, neurophysiology, engineering, and computer science.
The inability to perceive motion is called akinetopsia and it may be caused by a lesion to cortical area V5 in the extrastriate cortex. Neuropsychological studies of a patient who could not see motion, seeing the world in a series of static "frames" instead, suggested that visual area V5 in humans [1] is homologous to motion processing area V5/MT in primates. [2] [3] [4]
When two or more stimuli are alternately switched on and off, they can produce two distinct motion percepts. The first, known as beta movement, is demonstrated in the yellow-ball figure and forms the basis for electronic news ticker displays. However, at faster alternation rates, and when the distance between the stimuli is optimal, an illusory "object" (matching the background color) appears to move between the stimuli, alternately occluding them. This phenomenon is called the phi phenomenon and is often described as an example of "pure" motion detection, uncontaminated by form cues, unlike beta movement. [5] Nevertheless, this description is somewhat paradoxical, since such motion cannot be created without figural percepts.
The phi phenomenon has been referred to as "first-order" motion perception. Bernhard Hassenstein and Werner Reichardt first modelled it in terms of relatively simple "motion sensors" in the visual system that have evolved to detect a change in luminance at one point on the retina and correlate it with a change in luminance at a neighbouring point on the retina after a short delay. Sensors proposed to work this way have been referred to as Hassenstein-Reichardt detectors, [6] motion-energy sensors, [7] or Elaborated Reichardt Detectors. [8] These sensors detect motion by spatio-temporal correlation and are considered by some to be plausible models for how the visual system may detect motion. (Although, again, the notion of a "pure motion" detector suffers from the problem that there is no "pure motion" stimulus, i.e. a stimulus lacking perceived figure/ground properties.) There is still considerable debate regarding the accuracy of the model and the exact nature of this proposed process. It is not clear how the model distinguishes between movements of the eyes and movements of objects in the visual field, both of which produce changes in luminance on points of the retina.
Second-order motion is when the moving contour is defined by contrast, texture, flicker or some other quality that does not result in an increase in luminance or motion energy in the Fourier spectrum of the stimulus. [9] [10] There is much evidence to suggest that early processing of first- and second-order motion is carried out by separate pathways. [11] Second-order mechanisms have poorer temporal resolution and are low-pass in terms of the range of spatial frequencies to which they respond. (The notion that neural responses are attuned to frequency components of stimulation suffers from the lack of a functional rationale and has been generally criticized by G. Westheimer (2001) in an article called "The Fourier Theory of Vision.") Second-order motion produces a weaker motion aftereffect unless tested with dynamically flickering stimuli. [12]
The motion direction of a contour is ambiguous, because the motion component parallel to the line cannot be inferred from the visual input. This means that a variety of contours of different orientations moving at different speeds can cause identical responses in a motion-sensitive neuron in the visual system.
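This ambiguity is commonly known as the aperture problem. It can be sketched with a short calculation (an illustrative example, not taken from any cited study; the function name is hypothetical): a sensor viewing a straight contour through a small aperture only measures the velocity component along the contour's normal, so different true velocities produce the same measurement.

```python
import numpy as np

def normal_component(velocity, contour_angle_deg):
    """Project a 2-D velocity onto the unit normal of a straight contour.

    This is the only component a local motion sensor can recover; the
    component parallel to the contour is invisible (the aperture problem).
    """
    theta = np.deg2rad(contour_angle_deg)
    normal = np.array([np.sin(theta), -np.cos(theta)])  # unit normal of the contour
    return velocity @ normal

# A vertical contour (90 deg) moving rightward at 1 unit/s ...
v_true = np.array([1.0, 0.0])
# ... and the same contour moving diagonally: the parallel component differs.
v_diag = np.array([1.0, 0.7])

print(normal_component(v_true, 90.0))  # 1.0
print(normal_component(v_diag, 90.0))  # also 1.0: the sensor cannot tell them apart
```

Because both velocities share the same normal component, a single motion-sensitive unit cannot distinguish them; disambiguation requires combining signals across differently oriented contours.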
Some have speculated that, having extracted the hypothesized motion signals (first- or second-order) from the retinal image, the visual system must integrate those individual local motion signals at various parts of the visual field into a 2-dimensional or global representation of moving objects and surfaces. (It is not clear how this 2D representation is then converted into the perceived 3D percept.) Further processing is required to detect coherent motion or "global motion" present in a scene. [13]
The ability of a subject to detect coherent motion is commonly tested using motion coherence discrimination tasks. For these tasks, dynamic random-dot patterns (also called random dot kinematograms) are used that consist of 'signal' dots moving in one direction and 'noise' dots moving in random directions. The sensitivity to motion coherence is assessed by measuring the ratio of 'signal' to 'noise' dots required to determine the coherent motion direction. The required ratio is called the motion coherence threshold.
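As a rough sketch of how such stimuli are constructed (illustrative only; the function name, step size, and dot counts are assumptions, not parameters from the cited studies), a single frame-to-frame update of a random-dot kinematogram might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def kinematogram_step(positions, coherence, signal_dir=0.0, step=1.0):
    """Move a fraction `coherence` of dots in signal_dir (radians);
    the remaining 'noise' dots move in uniformly random directions."""
    n = len(positions)
    n_signal = int(round(coherence * n))
    dirs = rng.uniform(0, 2 * np.pi, size=n)
    dirs[:n_signal] = signal_dir  # the 'signal' dots share one direction
    displacement = step * np.column_stack([np.cos(dirs), np.sin(dirs)])
    return positions + displacement

dots = rng.uniform(0, 100, size=(200, 2))
moved = kinematogram_step(dots, coherence=0.3)
mean_motion = (moved - dots).mean(axis=0)  # the net drift reflects the signal dots
```

Estimating the smallest `coherence` at which an observer (or model) still reports the signal direction reliably yields the motion coherence threshold.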
As in other aspects of vision, the observer's visual input is generally insufficient to determine the true nature of stimulus sources, in this case their velocity in the real world. In monocular vision for example, the visual input will be a 2D projection of a 3D scene. The motion cues present in the 2D projection will by default be insufficient to reconstruct the motion present in the 3D scene. Put differently, many 3D scenes will be compatible with a single 2D projection. The problem of motion estimation generalizes to binocular vision when we consider occlusion or motion perception at relatively large distances, where binocular disparity is a poor cue to depth. This fundamental difficulty is referred to as the inverse problem. [14]
Nonetheless, some humans do perceive motion in depth. There are indications that the brain uses various cues, in particular temporal changes in disparity as well as monocular velocity ratios, to produce a sensation of motion in depth. [15] Two different binocular cues for the perception of motion in depth have been hypothesized: inter-ocular velocity difference (IOVD) and changing disparity (CD) over time. Motion in depth based on inter-ocular velocity differences can be tested using dedicated binocularly uncorrelated random-dot kinematograms. [16] Study results indicate that the processing of these two binocular cues – IOVD and CD – may use fundamentally different low-level stimulus features, which may be processed jointly at later stages. [17] [18] Additionally, the changing size of retinal images contributes to the detection of motion in depth as a monocular cue.
Detection and discrimination of motion can be improved by training, with long-lasting results. Participants trained to detect the movements of dots on a screen in only one direction become particularly good at detecting small movements in directions close to the one in which they were trained. This improvement was still present 10 weeks later. However, perceptual learning is highly specific: for example, the participants show no improvement when tested on other motion directions, or with other sorts of stimuli. [19]
A cognitive map is a type of mental representation that an individual uses to acquire, code, store, recall, and decode information about the relative locations and attributes of phenomena in their spatial environment. [20] [21] Place cells work with other types of neurons in the hippocampus and surrounding regions of the brain to perform this kind of spatial processing, [22] but the ways in which they function within the hippocampus are still being researched. [23]
Many species of mammals can keep track of spatial location even in the absence of visual, auditory, olfactory, or tactile cues, by integrating their movements—the ability to do this is referred to in the literature as path integration. A number of theoretical models have explored mechanisms by which path integration could be performed by neural networks. In most models, such as those of Samsonovich and McNaughton (1997) [24] or Burak and Fiete (2009), [25] the principal ingredients are (1) an internal representation of position, (2) internal representations of the speed and direction of movement, and (3) a mechanism for shifting the encoded position by the right amount when the animal moves. Because cells in the Medial Entorhinal Cortex (MEC) encode information about position (grid cells [26] ) and movement (head direction cells and conjunctive position-by-direction cells [27] ), this area is currently viewed as the most promising candidate for the place in the brain where path integration occurs.
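The core operation in these models, shifting an encoded position by a displacement derived from speed and heading, amounts to simple vector accumulation. A minimal sketch follows (illustrative only, assuming noiseless self-motion signals; it does not implement any specific published network model):

```python
import numpy as np

def integrate_path(speeds, headings_rad, start=(0.0, 0.0)):
    """Dead reckoning: accumulate displacement vectors to track position
    without any external landmarks (ingredients 1-3 from the text)."""
    pos = np.array(start, dtype=float)
    for speed, heading in zip(speeds, headings_rad):
        # shift the encoded position by the displacement for this step
        pos += speed * np.array([np.cos(heading), np.sin(heading)])
    return pos

# Walk a unit square: the integrated position returns to the start.
speeds = [1, 1, 1, 1]
headings = [0, np.pi / 2, np.pi, 3 * np.pi / 2]
print(integrate_path(speeds, headings))  # approximately [0, 0]
```

Real path integration accumulates error with every step, which is one reason animals are thought to periodically reset the integrator using landmarks.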
Motion sensing using vision is crucial for detecting a potential mate, prey, or predator, and thus it is found in the vision of a wide variety of both vertebrate and invertebrate species, although it is not universal. In vertebrates, the process takes place in the retina, more specifically in retinal ganglion cells: neurons that receive visual input from bipolar cells and amacrine cells and send output to higher regions of the brain, including the thalamus, hypothalamus, and mesencephalon.
The study of directionally selective units began with the discovery of such cells in the cerebral cortex of cats by David Hubel and Torsten Wiesel in 1959. Following the initial report, an attempt to understand the mechanism of directionally selective cells was pursued by Horace B. Barlow and William R. Levick in 1965. [28] Their in-depth experiments in the rabbit retina expanded the anatomical and physiological understanding of the vertebrate visual system and ignited interest in the field. Numerous subsequent studies have largely unveiled the mechanism of motion sensing in vision. Alexander Borst and Thomas Euler's 2011 review paper, "Seeing Things in Motion: Models, Circuits and Mechanisms", [29] discusses important findings from the early discoveries through recent work on the subject, summarizing the current state of knowledge.
Direction selective (DS) cells in the retina are defined as neurons that respond differentially to the direction of a visual stimulus. According to Barlow and Levick (1965), the term describes a group of neurons that "gives a vigorous discharge of impulses when a stimulus object is moved through its receptive field in one direction." [28] The direction in which a set of neurons responds most strongly is their "preferred direction". In contrast, they do not respond at all to the opposite, "null" direction. The preferred direction does not depend on the stimulus: regardless of the stimulus' size, shape, or color, the neurons respond when it moves in their preferred direction and do not respond when it moves in the null direction. There are three known types of DS cells in the mouse retina: ON/OFF DS ganglion cells, ON DS ganglion cells, and OFF DS ganglion cells. Each has a distinctive physiology and anatomy. Analogous directionally selective cells are not thought to exist in the primate retina. [30]
ON/OFF DS ganglion cells act as local motion detectors. They fire at the onset and offset of a stimulus (a light source). If a stimulus is moving in the direction of the cell's preference, it will fire at the leading and the trailing edge. Their firing pattern is time-dependent and is supported by the Reichardt-Hassenstein model, which detects spatiotemporal correlation between two adjacent points. A detailed explanation of the Reichardt-Hassenstein model is provided later in the section. The anatomy of ON/OFF cells is such that the dendrites extend to two sublaminae of the inner plexiform layer and make synapses with bipolar and amacrine cells. They have four subtypes, each with its own preference for direction.
Unlike ON/OFF DS ganglion cells that respond both to the leading and the trailing edge of a stimulus, ON DS ganglion cells are responsive only to a leading edge. The dendrites of ON DS ganglion cells are monostratified and extend into the inner sublamina of the inner plexiform layer. They have three subtypes with different directional preferences.
OFF DS ganglion cells act as centripetal motion detectors, responding only to the trailing edge of a stimulus. They are tuned to upward motion of a stimulus. The dendrites are asymmetrical and arbor in the direction of their preference. [29]
The first DS cells in invertebrates were found in flies, in a brain structure called the lobula plate. The lobula plate is one of the three stacks of neuropils in the fly's optic lobe. The "tangential cells" of the lobula plate comprise roughly 50 neurons, which arborize extensively in the neuropil. The tangential cells are known to be directionally selective, each with a distinctive directional preference. One group is the Horizontally Sensitive (HS) cells, such as the H1 neuron, which depolarize most strongly in response to a stimulus moving in a horizontal direction (their preferred direction). Conversely, they hyperpolarize when the direction of motion is opposite (the null direction). Vertically Sensitive (VS) cells are another group, most sensitive to vertical motion: they depolarize when a stimulus is moving downward and hyperpolarize when it is moving upward. Both HS and VS cells respond with a fixed preferred direction and a null direction regardless of the color or contrast of the background or the stimulus.
It is now known that motion detection in vision is based on the Hassenstein-Reichardt detector model. [31] This model detects correlation between two adjacent points. It consists of two symmetrical subunits, each with a receptor that can be stimulated by an input (light, in the case of the visual system). When an input is received in one subunit, a signal is sent to the other subunit. At the same time, the signal is delayed in time within the subunit and, after this temporal filter, is multiplied by the signal received from the other subunit. Thus, within each subunit, two brightness values, one received from its own receptor with a time delay and the other received from the adjacent receptor, are multiplied. The multiplied values from the two subunits are then subtracted to produce an output. The direction of selectivity, or preferred direction, is determined by whether the difference is positive or negative: the direction that produces a positive outcome is the preferred direction.
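The correlator described above can be sketched in a few lines (a toy illustration, assuming discrete time steps and a one-sample delay; the function and variable names are hypothetical):

```python
import numpy as np

def reichardt_output(left, right, delay=1):
    """Hassenstein-Reichardt correlator over two luminance time series.

    Each subunit multiplies its own delayed receptor signal with the
    undelayed signal from the neighbouring receptor; the two products
    are subtracted, so the sign of the output encodes direction.
    """
    left_delayed = np.roll(left, delay)
    right_delayed = np.roll(right, delay)
    left_delayed[:delay] = 0   # discard wrapped-around samples
    right_delayed[:delay] = 0
    return np.sum(left_delayed * right) - np.sum(right_delayed * left)

# A brightness pulse moving left-to-right reaches `left` first, then `right`.
left = np.array([0, 1, 0, 0, 0], dtype=float)
right = np.array([0, 0, 1, 0, 0], dtype=float)
print(reichardt_output(left, right))   # positive: preferred direction
print(reichardt_output(right, left))   # negative: null direction
```

When the pulse arrives at the two receptors separated by exactly the internal delay, the delayed and undelayed signals coincide in one subunit only, so the subtraction yields a signed, direction-selective response.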
To confirm that the Reichardt-Hassenstein model accurately describes directional selectivity in the retina, a study was conducted using optical recordings of free cytosolic calcium levels after loading a fluorescent indicator dye into the fly tangential cells. The fly was presented with uniformly moving gratings while the calcium concentration in the dendritic tips of the tangential cells was measured. The tangential cells showed modulations that matched the temporal frequency of the gratings, and the velocity of the moving gratings at which the neurons responded most strongly showed a close dependency on the pattern wavelength. This confirmed the accuracy of the model at both the cellular and the behavioral level. [32]
Although the details of the Hassenstein-Reichardt model have not been confirmed at an anatomical and physiological level, the site of subtraction in the model is now being localized to the tangential cells. When depolarizing current is injected into the tangential cell while presenting a visual stimulus, the response to the preferred direction of motion decreased, and the response to the null direction increased. The opposite was observed with hyperpolarizing current. The T4 and T5 cells, which have been selected as a strong candidate for providing input to the tangential cells, have four subtypes that each project into one of the four strata of the lobula plate that differ in the preferred orientation. [29]
One of the early works on DS cells in vertebrates was done on the rabbit retina by H. Barlow and W. Levick in 1965. Their experimental methods included variations of slit experiments and recordings of action potentials in the rabbit retina. The basic set-up of the slit experiment was to present a moving black-and-white grating through a slit of various widths and record the action potentials in the retina. This early study had a large impact on the study of DS cells by laying the foundation for later work. The study showed that DS ganglion cells derive their property from the sequence-discriminating activity of subunits, and that this activity may result from an inhibitory mechanism responding to motion of the image in the null direction. It also showed that the DS property of retinal ganglion cells is distributed over the entire receptive field, not limited to specific zones. Direction selectivity is maintained for two adjacent points in the receptive field separated by as little as 1/4°, but selectivity decreases with larger separations. They used this to support their hypothesis that the discrimination of sequences gives rise to direction selectivity, because normal movement would activate adjacent points in succession. [28]
ON/OFF DS ganglion cells can be divided into four subtypes differing in their directional preference: ventral, dorsal, nasal, or temporal. The cells of the different subtypes also differ in their dendritic structure and synaptic targets in the brain. Neurons identified as preferring ventral motion were found to have dendritic projections in the ventral direction, and neurons preferring nasal motion had asymmetric dendritic extensions in the nasal direction. Thus, a strong association between structural and functional asymmetry was observed for the ventral and nasal directions. Given the distinct properties and preference of each subtype, there was an expectation that they could be selectively labeled by molecular markers. The neurons preferentially responsive to vertical motion were indeed shown to selectively express a specific molecular marker. However, molecular markers for the other three subtypes have not yet been found. [33]
Direction selective (DS) ganglion cells receive inputs from bipolar cells and starburst amacrine cells. DS ganglion cells respond to their preferred direction with a large excitatory postsynaptic potential followed by a small inhibitory response. In contrast, they respond to their null direction with a simultaneous small excitatory postsynaptic potential and a large inhibitory postsynaptic potential. Starburst amacrine cells have been viewed as a strong candidate source of direction selectivity in ganglion cells because they can release both GABA and ACh. Their dendrites branch out radially from the soma, with significant dendritic overlap. Optical measurements of Ca2+ concentration show that they respond strongly to centrifugal motion (outward motion from the soma toward the dendritic tips), while they respond weakly to centripetal motion (inward motion from the dendritic tips toward the soma). When starburst cells were ablated with toxins, direction selectivity was eliminated. Moreover, their dendritic calcium signals, and presumably their neurotransmitter release, themselves reflect direction selectivity, which may be attributed to the synaptic pattern: the branching pattern is organized such that certain presynaptic inputs have more influence on a given dendrite than others, creating a polarity in excitation and inhibition. Further evidence suggests that starburst cells release the inhibitory neurotransmitter GABA onto each other in a delayed and prolonged manner, which accounts for the temporal property of the inhibition. [29]
In addition to spatial offset due to GABAergic synapses, the important role of chloride transporters has started to be discussed. The popular hypothesis is that starburst amacrine cells differentially express chloride transporters along the dendrites. Given this assumption, some areas along the dendrite will have a positive chloride-ion equilibrium potential relative to the resting potential while others have a negative equilibrium potential. This means that GABA at one area will be depolarizing and at another area hyperpolarizing, accounting for the spatial offset present between excitation and inhibition. [34]
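The logic of this hypothesis can be illustrated with the Nernst equation: the local intracellular chloride concentration sets the chloride equilibrium potential, and hence whether opening GABA-gated chloride channels depolarizes or hyperpolarizes that stretch of dendrite. A back-of-the-envelope sketch follows (the concentration and resting-potential values are illustrative assumptions, not measurements from the cited work):

```python
import math

def nernst_cl(cl_in_mM, cl_out_mM=120.0, temp_K=310.0):
    """Chloride equilibrium potential in mV via the Nernst equation."""
    R, F, z = 8.314, 96485.0, -1  # gas constant, Faraday constant, Cl- valence
    return 1000 * (R * temp_K / (z * F)) * math.log(cl_out_mM / cl_in_mM)

rest = -60.0  # assumed resting potential, mV

# Where a transporter accumulates chloride, E_Cl sits above rest:
# opening GABA-gated chloride channels depolarizes the dendrite.
e_high_cl = nernst_cl(cl_in_mM=30.0)

# Where chloride is extruded, E_Cl sits below rest:
# the same GABA input hyperpolarizes instead.
e_low_cl = nernst_cl(cl_in_mM=5.0)

print(e_high_cl > rest > e_low_cl)  # True under these assumed concentrations
```

Under this account, the same transmitter can excite one region of a starburst dendrite and inhibit another, producing the spatial offset between excitation and inhibition described above.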
Recent research (published March 2011) relying on serial block-face electron microscopy (SBEM) has led to identification of the circuitry that influences directional selectivity. This technique provides detailed images of the calcium flow and anatomy of the dendrites of both starburst amacrine cells (SACs) and DS ganglion cells. By comparing the preferred directions of ganglion cells with their synapses onto SACs in a study of a single sampled retina, Briggman et al. provide evidence for a mechanism primarily based on inhibitory signals from SACs: [35] retinal ganglion cells may receive asymmetrical inhibitory inputs directly from starburst amacrine cells, so that the computation of directional selectivity also occurs postsynaptically. Such postsynaptic models are unparsimonious: if any given starburst amacrine cell conveys motion information to retinal ganglion cells, then any computation of 'local' direction selectivity postsynaptically by retinal ganglion cells is redundant and dysfunctional. An acetylcholine (ACh) transmission model of directionally selective starburst amacrine cells instead provides a robust topological underpinning of motion sensing in the retina. [36]
The retina is the innermost, light-sensitive layer of tissue of the eye of most vertebrates and some molluscs. The optics of the eye create a focused two-dimensional image of the visual world on the retina, which then processes that image within the retina and sends nerve impulses along the optic nerve to the visual cortex to create visual perception. The retina serves a function which is in many ways analogous to that of the film or image sensor in a camera.
The visual system is the physiological basis of visual perception. The system detects, transduces and interprets information concerning light within the visible range to construct an image and build a mental model of the surrounding environment. The visual system is associated with the eye and functionally divided into the optical system and the neural system.
A photoreceptor cell is a specialized type of neuroepithelial cell found in the retina that is capable of visual phototransduction. The great biological importance of photoreceptors is that they convert light into signals that can stimulate biological processes. To be more specific, photoreceptor proteins in the cell absorb photons, triggering a change in the cell's membrane potential.
A retinal ganglion cell (RGC) is a type of neuron located near the inner surface of the retina of the eye. It receives visual information from photoreceptors via two intermediate neuron types: bipolar cells and retinal amacrine cells. Retinal amacrine cells, particularly narrow-field cells, are important for creating functional subunits within the ganglion cell layer and enabling ganglion cells to detect a small dot moving a small distance. Retinal ganglion cells collectively transmit image-forming and non-image-forming visual information from the retina, in the form of action potentials, to several regions in the thalamus, hypothalamus, and mesencephalon, or midbrain.
The receptive field, or sensory space, is a delimited medium where some physiological stimuli can evoke a sensory neuronal response in specific organisms.
As a part of the retina, bipolar cells exist between photoreceptors and ganglion cells. They act, directly or indirectly, to transmit signals from the photoreceptors to the ganglion cells.
In the anatomy of the eye, amacrine cells are interneurons in the retina. They are named from the Greek a– 'non', makr– 'long', and in– 'fiber', because of their short neuronal processes. Amacrine cells are inhibitory neurons that project their dendritic arbors onto the inner plexiform layer (IPL), where they interact with retinal ganglion cells, bipolar cells, or both.
Horizontal cells are the laterally interconnecting neurons having cell bodies in the inner nuclear layer of the retina of vertebrate eyes. They help integrate and regulate the input from multiple photoreceptor cells. Among their functions, horizontal cells are believed to be responsible for increasing contrast via lateral inhibition and adapting both to bright and dim light conditions. Horizontal cells provide inhibitory feedback to rod and cone photoreceptors. They are thought to be important for the antagonistic center-surround property of the receptive fields of many types of retinal ganglion cells.
Intrinsically photosensitive retinal ganglion cells (ipRGCs), also called photosensitive retinal ganglion cells (pRGC), or melanopsin-containing retinal ganglion cells (mRGCs), are a type of neuron in the retina of the mammalian eye. The presence of an additional photoreceptor was first suspected in 1927 when mice lacking rods and cones still responded to changing light levels through pupil constriction; this suggested that rods and cones are not the only light-sensitive tissue. However, it was unclear whether this light sensitivity arose from an additional retinal photoreceptor or elsewhere in the body. Recent research has shown that these retinal ganglion cells, unlike other retinal ganglion cells, are intrinsically photosensitive due to the presence of melanopsin, a light-sensitive protein. Therefore, they constitute a third class of photoreceptors, in addition to rod and cone cells.
Retinotopy is the mapping of visual input from the retina to neurons, particularly those neurons within the visual stream. For clarity, 'retinotopy' can be replaced with 'retinal mapping', and 'retinotopic' with 'retinally mapped'.
Complex cells can be found in the primary visual cortex (V1), the secondary visual cortex (V2), and Brodmann area 19 (V3).
The optokinetic reflex (OKR), also referred to as the optokinetic response, or optokinetic nystagmus (OKN), is a compensatory reflex that supports visual image stabilization. The purpose of OKR is to prevent image blur on the retina that would otherwise occur when an animal moves its head or navigates through its environment. This is achieved by the reflexive movement of the eyes in the same direction as image motion, so as to minimize the relative motion of the visual scene on the eye. OKR is best evoked by slow, rotational motion, and operates in coordination with several complementary reflexes that also support image stabilization, including the vestibulo-ocular reflex (VOR).
A bipolar neuron, or bipolar cell, is a type of neuron characterized by having both an axon and a dendrite extending from the soma in opposite directions. These neurons are predominantly found in the retina and olfactory system. Bipolar neuron development begins during embryonic weeks seven and eight.
Feature detection is a process by which the nervous system sorts or filters complex natural stimuli in order to extract behaviorally relevant cues that have a high probability of being associated with important objects or organisms in their environment, as opposed to irrelevant background or noise.
A parasol cell, sometimes called an M cell or M ganglion cell, is one type of retinal ganglion cell (RGC) located in the ganglion cell layer of the retina. These cells project to magnocellular cells in the lateral geniculate nucleus (LGN) as part of the magnocellular pathway in the visual system. They have large cell bodies as well as extensive branching dendrite networks and as such have large receptive fields. Relative to other RGCs, they have fast conduction velocities. While they do show clear center-surround antagonism, they receive no information about color. Parasol ganglion cells contribute information about the motion and depth of objects to the visual system.
Non-spiking neurons are neurons that are located in the central and peripheral nervous systems and function as intermediary relays for sensory-motor neurons. They do not exhibit the characteristic spiking behavior of action potential generating neurons.
Retinal waves are spontaneous bursts of action potentials that propagate in a wave-like fashion across the developing retina. These waves occur before rod and cone maturation and before vision can occur. The signals from retinal waves drive the activity in the dorsal lateral geniculate nucleus (dLGN) and the primary visual cortex. The waves are thought to propagate across neighboring cells in random directions determined by periods of refractoriness that follow the initial depolarization. Retinal waves are thought to have properties that define early connectivity of circuits and synapses between cells in the retina. There is still much debate about the exact role of retinal waves. Some contend that the waves are instructional in the formation of retinogeniculate pathways, while others argue that the activity is necessary but not instructional in the formation of retinogeniculate pathways.
Frank Werblin is Professor of the Graduate School, Division of Neurobiology, at the University of California, Berkeley.
Michal Rivlin is a Senior Scientist and Sara Lee Schupf Family Chair in Neurobiology at the Weizmann Institute of Science. She was awarded the 2019 Blavatnik Awards for Young Scientists for her research on the neuronal circuitry of the retina.
Laura Busse is a German neuroscientist and professor of Systemic Neuroscience within the Division of Neurobiology at the Ludwig Maximilian University of Munich. Busse's lab studies context-dependent visual processing in mouse models by performing large scale in vivo electrophysiological recordings in the thalamic and cortical circuits of awake and behaving mice.