Object recognition (cognitive science)

Visual object recognition refers to the ability to identify the objects in view based on visual input. One important signature of visual object recognition is "object invariance", or the ability to identify objects across changes in the detailed context in which objects are viewed, including changes in illumination, object pose, and background context. [1]

Basic stages of object recognition

Neuropsychological evidence indicates that the process of object recognition can be divided into four specific stages. [2] [3] [4] These stages are:

Stage 1: Processing of basic object components, such as color, depth, and form.
Stage 2: These basic components are then grouped on the basis of similarity, providing information on distinct edges to the visual form. Subsequently, figure-ground segregation is able to take place.
Stage 3: The visual representation is matched with structural descriptions in memory.
Stage 4: Semantic attributes are applied to the visual representation, providing meaning, and thereby recognition.

Within these stages, more specific processes take place to complete the different processing components. In addition, other existing models have proposed integrative hierarchies (top-down and bottom-up), as well as parallel processing, as opposed to this general bottom-up hierarchy.
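
As a purely illustrative aid, the following Python sketch caricatures these four stages as a toy pipeline. The feature names, the stored "structural descriptions", and the matching rule are invented for the example and do not correspond to any of the cited models.

```python
# Illustrative sketch only: a toy pipeline mirroring the four stages above.
# All data structures and thresholds are hypothetical, not an implemented model.

def stage1_extract_features(image):
    """Stage 1: extract basic components (here just colour and labelled edges)."""
    return {"colours": image["colours"], "edges": image["edges"]}

def stage2_group_and_segregate(features):
    """Stage 2: group similar elements into a candidate figure, separate from ground."""
    figure_edges = [e for e in features["edges"] if e["contrast"] > 0.5]
    return {"outline": figure_edges, "colours": features["colours"]}

def stage3_match_structure(figure, memory):
    """Stage 3: match the visual representation to stored structural descriptions."""
    def overlap(stored):
        return len(set(stored["parts"]) & {e["part"] for e in figure["outline"]})
    return max(memory, key=overlap)

def stage4_attach_semantics(entry):
    """Stage 4: attach semantic attributes, yielding recognition."""
    return {"label": entry["name"], "meaning": entry["semantics"]}

memory_store = [
    {"name": "cup", "parts": ["cylinder", "handle"], "semantics": "container for drinking"},
    {"name": "hammer", "parts": ["cylinder", "wedge"], "semantics": "tool for driving nails"},
]
image = {"colours": ["white"], "edges": [{"part": "cylinder", "contrast": 0.9},
                                         {"part": "handle", "contrast": 0.7}]}
figure = stage2_group_and_segregate(stage1_extract_features(image))
print(stage4_attach_semantics(stage3_match_structure(figure, memory_store)))
```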

Hierarchical recognition processing

Visual recognition processing is typically viewed as a bottom-up hierarchy in which information is processed sequentially with increasing complexity. During this process, lower-level cortical processors, such as the primary visual cortex, sit at the bottom of the hierarchy, while higher-level cortical processors, such as the inferotemporal cortex (IT), sit at the top, where visual recognition is facilitated. [5] A widely recognized bottom-up hierarchical theory is James DiCarlo's "untangling" account, [6] whereby each stage of the hierarchically arranged ventral visual pathway performs operations that gradually transform object representations into an easily extractable format. In contrast, an increasingly popular account of recognition processing involves top-down processing. One model, proposed by Moshe Bar (2003), describes a "shortcut" in which early visual inputs are sent, partially analyzed, from the early visual cortex to the prefrontal cortex (PFC). Possible interpretations of the crude visual input are generated in the PFC and then sent to the inferotemporal cortex (IT), subsequently activating relevant object representations, which are then incorporated into the slower, bottom-up process. This "shortcut" is meant to minimize the number of object representations required for matching, thereby facilitating object recognition. [5] Lesion studies have supported this proposal with findings of slower response times in individuals with PFC lesions, suggesting reliance on bottom-up processing alone. [7]
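
To make the contrast concrete, the toy Python sketch below caricatures the difference between exhaustively matching every stored representation through a purely bottom-up hierarchy and first narrowing the candidate set with a coarse, PFC-style guess. The stage names, object store, and "gist" computation are invented placeholders, not part of Bar's or DiCarlo's actual models.

```python
# Illustrative sketch only: Bar's (2003) proposed top-down "shortcut", caricatured.
# The object store, the "gist" extraction, and the PFC guesses are invented placeholders.

OBJECT_STORE = {"dog", "wolf", "cat", "chair", "table", "car"}

def slow_bottom_up_match(image, candidates):
    """Stand-in for the detailed ventral-stream analysis: test each candidate in turn."""
    detailed = f"IT(V4(V2(V1({image}))))"          # increasingly complex representations
    return sorted(candidates)[0]                    # placeholder for the best detailed match

def pfc_guess(image):
    """Stand-in for PFC: a crude, partially analyzed input suggests likely interpretations."""
    gist = f"blurred({image})"                      # low-detail version sent ahead
    return {"dog", "wolf", "cat"}                   # e.g. "some four-legged animal"

def recognise(image, use_shortcut):
    candidates = pfc_guess(image) if use_shortcut else OBJECT_STORE
    print(f"{len(candidates)} representations to test")
    return slow_bottom_up_match(image, candidates)

recognise("retinal_image", use_shortcut=False)      # full search over the store
recognise("retinal_image", use_shortcut=True)       # shortcut narrows the search
```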

Object constancy and theories of object recognition

A significant aspect of object recognition is object constancy: the ability to recognize an object across varying viewing conditions. These varying conditions include object orientation, lighting, and object variability (size, color, and other within-category differences). For the visual system to achieve object constancy, it must be able to extract a common object description across different viewpoints and the retinal descriptions they produce.[9] In one study, participants performed categorization and recognition tasks while undergoing functional magnetic resonance imaging, which revealed increased blood flow indicating activation in specific regions of the brain. The categorization task required participants to classify objects, shown from canonical or unusual views, as either indoor or outdoor objects. In the recognition task, participants were presented with images that they had viewed previously. Half of these images were in the same orientation as previously shown, while the other half were presented from the opposing viewpoint. The brain regions implicated in mental rotation, such as the ventral and dorsal visual pathways and the prefrontal cortex, showed the greatest increase in blood flow during these tasks, suggesting that they are critical for the ability to recognize objects from multiple angles. [8] Several theories have been generated to provide insight into how object constancy may be achieved for the purpose of object recognition, including viewpoint-invariant, viewpoint-dependent, and multiple-views theories.

Viewpoint-invariant theories

Viewpoint-invariant theories suggest that object recognition is based on structural information, such as individual parts, allowing recognition to take place regardless of the object's viewpoint. Accordingly, recognition is possible from any viewpoint, as individual parts of an object can be rotated to fit any particular view.[10][citation needed] This form of analytical recognition requires little memory, as only structural parts need to be encoded; these can produce multiple object representations through the interrelations of the parts and mental rotation.[10][citation needed] In one study, participants were presented with one encoding view of each of 24 preselected objects, as well as five filler images. Objects were then presented again in the central visual field, at either the same orientation as the original image or a different orientation. Participants were then asked to name the objects when the same or different depth-orientation views were presented. [9] The same procedure was then executed when presenting the images to the left or right visual field. Viewpoint-dependent priming was observed when test views were presented directly to the right hemisphere, but not when test views were presented directly to the left hemisphere. The results support the model that objects are stored in a viewpoint-dependent manner, because the results did not depend on whether the same or a different set of parts could be recovered from the different-orientation views. [9]

3-D model representation

This model, proposed by Marr and Nishihara (1978), states that object recognition is achieved by matching 3-D model representations obtained from the visual object with 3-D model representations stored in memory as vertical shape percepts.[clarification needed] [10] Through the use of computer programs and algorithms, Yi Yunfeng (2009) was able to demonstrate the ability of the human brain to mentally construct 3D images using only the 2D images that appear on the retina. Their model also demonstrates a high degree of shape constancy conserved between 2D images, which allows the 3D image to be recognized. [10] The 3-D model representations obtained from the object are formed by first identifying the concavities of the object, which separate the stimulus into individual parts. Recent research suggests that an area of the brain known as the caudal intraparietal area (CIP) is responsible for storing the slant and tilt of a planar surface, which allows for concavity recognition. [11] Rosenberg et al. implanted monkeys with a scleral search coil for monitoring eye position while simultaneously recording single-neuron activity from neurons within the CIP. During the experiment, monkeys sat 30 cm away from an LCD screen that displayed the visual stimuli. Binocular disparity cues were displayed on the screen by rendering stimuli as green-red anaglyphs, and the surface tilts ranged from 0 to 330 degrees. A single trial consisted of a fixation point followed by the presentation of a stimulus for 1 second. Neuronal activity was then recorded using the surgically implanted microelectrodes. These single-neuron responses to specific object concavities led to the proposal that the axis of each concavity-defined part of an object is held in memory stores. [11] Identifying the principal axis of the object assists in the normalization process via mental rotation, which is required because only the canonical description of the object is stored in memory. Recognition is achieved when the observed object viewpoint is mentally rotated to match the stored canonical description.[citation needed]
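
A minimal sketch of the normalization idea described above: an observed shape is mentally "rotated" until it best matches a stored canonical description. The 2-D point encodings, the exhaustive angle search, and the sum-of-distances score are hypothetical simplifications for illustration, not Marr and Nishihara's actual part-based 3-D representation.

```python
# Illustrative sketch only: normalization to a stored canonical description via rotation.
# The object encodings, the angle search, and the matching score are all hypothetical.
import math

def rotate(points, angle):
    """Rotate 2-D points about the origin by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def mismatch(a, b):
    """Sum of point-to-point distances between two equally long point lists."""
    return sum(math.dist(p, q) for p, q in zip(a, b))

def recognise(observed, canonical_store, steps=72):
    """Mentally 'rotate' the observed shape and match it to each stored canonical view."""
    best_label, best_cost = None, float("inf")
    for label, canonical in canonical_store.items():
        for k in range(steps):                          # try candidate rotations
            cost = mismatch(rotate(observed, 2 * math.pi * k / steps), canonical)
            if cost < best_cost:
                best_label, best_cost = label, cost
    return best_label

canonical_store = {"T-shape": [(0, 0), (0, 2), (-1, 2), (1, 2)],
                   "L-shape": [(0, 0), (0, 2), (1, 0), (2, 0)]}
observed = rotate(canonical_store["T-shape"], math.radians(40))   # same object, new pose
print(recognise(observed, canonical_store))                       # -> "T-shape"
```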

Figure 1. This image, created based on Biederman's (1987) recognition-by-components theory, is an example of how objects can be broken down into geons.

Recognition by components

An extension of Marr and Nishihara's model, the recognition-by-components theory proposed by Biederman (1987) suggests that the visual information gained from an object is divided into simple geometric components, such as blocks and cylinders, also known as "geons" (geometric ions), which are then matched with the most similar object representation stored in memory to provide the object's identification (see Figure 1). [12]
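
As a loose illustration of this idea (not Biederman's actual geon inventory or matching procedure), the sketch below represents each stored object as a small set of geons plus a single spatial relation, and picks the stored structural description with the greatest overlap; all entries are invented.

```python
# Illustrative sketch only: matching a parsed object against stored geon-based
# structural descriptions, in the spirit of recognition-by-components. The geon
# labels, relations, and similarity score are simplified placeholders.

structural_memory = {
    "mug":  {"geons": {"cylinder", "curved handle"}, "relation": "handle side-attached to cylinder"},
    "pail": {"geons": {"cylinder", "curved handle"}, "relation": "handle top-attached to cylinder"},
    "lamp": {"geons": {"cone", "cylinder"},          "relation": "cone on top of cylinder"},
}

def similarity(parsed, stored):
    """Score = number of shared geons, plus a bonus if the part relation also matches."""
    score = len(parsed["geons"] & stored["geons"])
    if parsed["relation"] == stored["relation"]:
        score += 1
    return score

def recognise(parsed):
    return max(structural_memory, key=lambda name: similarity(parsed, structural_memory[name]))

view = {"geons": {"cylinder", "curved handle"}, "relation": "handle side-attached to cylinder"}
print(recognise(view))   # -> "mug": same geons as "pail", but the relation disambiguates
```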

Viewpoint-dependent theories

Viewpoint-dependent theories suggest that object recognition is affected by the viewpoint from which an object is seen, implying that objects seen from novel viewpoints are identified less accurately and more slowly. [13] This theory of recognition is based on a more holistic system rather than on parts, suggesting that objects are stored in memory with multiple viewpoints and angles. This form of recognition requires considerable memory, as each viewpoint must be stored. Accuracy of recognition also depends on how familiar the observed viewpoint of the object is. [14]
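
A toy view-based matcher in this spirit: each known object is stored as several familiar 2-D views (here, arbitrary feature vectors), and a new image is assigned to the object whose nearest stored view it most resembles, with the residual distance standing in for the extra cost of an unfamiliar viewpoint. The vectors and the distance measure are invented for illustration only.

```python
# Illustrative sketch only: a view-based scheme in which each known object is stored as
# several familiar 2-D views, and a new image is matched to the nearest stored view.
# The "view" encodings (plain feature vectors) are hypothetical.

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

stored_views = {
    "chair": [[1.0, 0.2, 0.1], [0.9, 0.4, 0.2], [0.7, 0.6, 0.3]],   # several familiar viewpoints
    "bike":  [[0.1, 1.0, 0.8], [0.2, 0.9, 0.7]],
}

def recognise(view_vector):
    """Return the best-matching object and the cost; higher cost ~ slower/less accurate."""
    best = min(((obj, distance(view_vector, v)) for obj, views in stored_views.items()
                for v in views), key=lambda pair: pair[1])
    return best

print(recognise([0.95, 0.3, 0.15]))   # close to a familiar chair view: low cost
print(recognise([0.5, 0.8, 0.5]))     # novel viewpoint: larger cost, i.e. harder recognition
```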

Multiple views theory

This theory proposes that object recognition mechanisms lie on a viewpoint continuum, with different mechanisms recruited for different types of recognition. At one extreme of this continuum, viewpoint-dependent mechanisms are used for within-category discriminations, while at the other extreme, viewpoint-invariant mechanisms are used for the categorization of objects. [13]

Neural substrates

The dorsal stream is shown in green and the ventral stream in purple.

The dorsal and ventral stream

The visual processing of objects in the brain can be divided into two processing pathways: the dorsal stream (how/where), which extends from the visual cortex to the parietal lobes, and the ventral stream (what), which extends from the visual cortex to the inferotemporal cortex (IT). The existence of these two separate visual processing pathways was first proposed by Ungerleider and Mishkin (1982) who, based on their lesion studies, suggested that the dorsal stream is involved in the processing of visual spatial information, such as object localization (where), and the ventral stream is involved in the processing of visual object identification information (what). [15] Since this initial proposal, it has been alternatively suggested that the dorsal pathway should be known as the 'How' pathway, as the visual spatial information processed there provides information about how to interact with objects. [16] For the purpose of object recognition, the neural focus is on the ventral stream.

Functional specialization in the ventral stream

Within the ventral stream, various regions of proposed functional specialization have been observed in functional imaging studies. The brain regions most consistently found to display functional specialization are the fusiform face area (FFA), which shows increased activation for faces when compared with objects, the parahippocampal place area (PPA) for scenes vs. objects, the extrastriate body area (EBA) for body parts vs. objects, MT+/V5 for moving stimuli vs. static stimuli, and the Lateral Occipital Complex (LOC) for discernible shapes vs. scrambled stimuli. [17] (See also: Neural processing for individual categories of objects)

Structural processing: the lateral occipital complex

The lateral occipital complex (LOC) has been found to be particularly important for object recognition at the perceptual structural level. In an event-related fMRI study that looked at the adaptation of neurons activated in the visual processing of objects, it was found that the similarity of an object's shape is necessary for subsequent adaptation in the LOC, but specific object features such as edges and contours are not. This suggests that activation in the LOC represents higher-level object shape information rather than simple object features. [18] In a related fMRI study, the LOC was activated regardless of the presented object's visual cues, such as motion, texture, or luminance contrasts, suggesting that the different low-level visual cues used to define an object converge in "object-related areas" to assist in the perception and recognition process. [19] This higher-level object shape information does not appear to provide any semantic information about the object, as the LOC shows a neuronal response to varying forms, including unfamiliar, abstract objects. [20]

Further experiments have proposed that the LOC consists of a hierarchical system for shape selectivity, indicating greater selective activation in the posterior regions for fragments of objects, whereas the anterior regions show greater activation for full or partial objects. [21] This is consistent with previous research suggesting a hierarchical representation in the ventral temporal cortex in which primary feature processing occurs in the posterior regions and the integration of these features into a whole and meaningful object occurs in the anterior regions. [22]

Semantic processing

Semantic associations allow for faster object recognition. When an object has previously been associated with some sort of semantic meaning, people are more likely to correctly identify it. Research has shown that semantic associations allow for much quicker recognition of an object, even when the object is being viewed at varying angles. When objects were viewed at increasingly deviated angles from the traditional plane of view, objects that held learned semantic associations produced lower response times than objects that did not. [23] Thus, when object recognition becomes increasingly difficult, semantic associations make recognition easier. Similarly, a subject can be primed to recognize an object by observing an action that is merely related to the target object. This shows that objects have a set of sensory, motor, and semantic associations that allow a person to correctly recognize an object. [24] This supports the claim that the brain utilizes multiple regions when trying to accurately identify an object.

Through information provided by neuropsychological patients, dissociations in recognition processing have been identified between structural and semantic processing, as structural, colour, and associative information can be selectively impaired. In one PET study, areas found to be involved in associative semantic processing included the left anterior superior/middle temporal gyrus and the left temporal pole when compared with structural and colour information, as well as the right temporal pole when compared with colour decision tasks only. [25] These results indicate that stored perceptual knowledge and semantic knowledge involve separate cortical regions in object recognition, and that there are hemispheric differences in the temporal regions.

Research has also provided evidence indicating that visual semantic information converges in the fusiform gyri of the inferotemporal lobes. In a study that compared semantic knowledge of category versus attributes, the two were found to play separate roles in how they contribute to recognition. For categorical comparisons, the lateral regions of the fusiform gyrus were activated by living objects, whereas nonliving objects activated the medial regions. For attribute comparisons, the right fusiform gyrus was activated by global form, whereas local details activated the left fusiform gyrus. These results suggest that the type of object category determines which region of the fusiform gyrus is activated for processing semantic recognition, whereas the attributes of an object determine whether the left or right fusiform gyrus is activated, depending on whether global form or local detail is processed. [26]

In addition, it has been proposed that activation in anterior regions of the fusiform gyri indicates successful recognition. [27] However, levels of activation have been found to depend on the semantic relevance of the object. The term semantic relevance here refers to "a measure of the contribution of semantic features to the core meaning of a concept." [28] Results showed that objects with high semantic relevance, such as artefacts, produced an increase in activation compared with objects with low semantic relevance, such as natural objects. [28] This is attributed to the proposed greater difficulty of distinguishing between natural objects, which have very similar structural properties and are therefore harder to identify than artefacts. [27] Therefore, the easier an object is to identify, the more likely it is to be successfully recognized.

Another condition that affects successful object recognition performance is contextual facilitation. It is thought that during object recognition tasks, an object is accompanied by a "context frame", which offers semantic information about the object's typical context. [29] It has been found that when an object is out of context, object recognition performance is hindered, with slower response times and greater inaccuracies compared with recognition tasks in which an object appears in an appropriate context. [29] Based on results from an fMRI study, it has been proposed that there is a "context network" in the brain for contextually associated objects, with activity largely found in the parahippocampal cortex (PHC) and the retrosplenial complex (RSC). [30] Within the PHC, activity in the parahippocampal place area (PPA) has been found to be preferential to scenes rather than objects; however, it has been suggested that activity in the PHC for solitary objects in tasks of contextual facilitation may be due to subsequent thought of the spatial scene in which the object is contextually represented. Further experiments found activation in the PHC for both non-spatial and spatial contexts, although activation for non-spatial contexts was limited to the anterior PHC, and to the posterior PHC for spatial contexts. [30]
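
One way to picture a "context frame" is as a set of expectations that re-weights otherwise ambiguous perceptual evidence. The short sketch below uses invented candidate scores and context weights purely to illustrate that idea; it is not a model taken from the cited studies.

```python
# Illustrative sketch only: contextual facilitation treated as a "context frame" that
# re-weights candidate identities for an ambiguous object. All numbers are invented.

perceptual_evidence = {"toaster": 0.34, "mailbox": 0.33, "drum": 0.33}   # ambiguous on its own

context_frames = {
    "kitchen": {"toaster": 3.0, "mailbox": 0.2, "drum": 0.5},
    "street":  {"toaster": 0.2, "mailbox": 3.0, "drum": 0.5},
}

def recognise(evidence, context=None):
    """Combine bottom-up evidence with the context frame's expectations, if any."""
    weights = context_frames.get(context, {})
    scored = {obj: p * weights.get(obj, 1.0) for obj, p in evidence.items()}
    return max(scored, key=scored.get)

print(recognise(perceptual_evidence))              # no context: evidence alone, nearly a guess
print(recognise(perceptual_evidence, "kitchen"))   # appropriate context -> "toaster"
```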

Recognition memory

When someone sees an object, they know what it is because they have seen it on a past occasion; this is recognition memory. Our ability to recognize an object is affected not only by abnormalities in the ventral (what) stream of the visual pathway but also by the way in which the object is presented to us. One notable characteristic of visual recognition memory is its remarkable capacity: even after seeing thousands of images on single trials, humans perform at high accuracy in subsequent memory tests and remember considerable detail about the images that they have seen. [31]

Context

Context allows for much greater accuracy in object recognition. When an identifiable object is blurred, the accuracy of recognition is much greater when the object is placed in a familiar context. In addition, even an unfamiliar context allows for more accurate object recognition than showing the object in isolation. [32] This can be attributed to the fact that objects are typically seen in some setting rather than in no setting at all. When the setting is familiar to the viewer, it becomes much easier to determine what the object is. Though context is not required to correctly recognize an object, it is part of the association that one makes with that object.

Context becomes especially important when recognizing faces or emotions. When facial emotions are presented without any context, the accuracy with which someone can describe the emotion being shown is significantly lower than when context is given. This phenomenon holds across all age groups and cultures, signifying that context is essential in accurately identifying facial emotion for all individuals. [33]

Familiarity

Familiarity is a context-free mechanism in the sense that what one recognizes simply feels familiar, without any need to work out in what context one knows the object. [34] The ventrolateral region of the frontal lobe is involved in memory encoding during incidental learning and in later maintaining and retrieving semantic memories. [34] Familiarity can induce perceptual processes different from those for unfamiliar objects, which means that our perception of a finite number of familiar objects is unique. [35] Deviations from typical viewpoints and contexts can reduce the efficiency with which an object is recognized. [35] It was found that not only are familiar objects recognized more efficiently when viewed from a familiar viewpoint as opposed to an unfamiliar one, but this principle also applies to novel objects. This suggests that representations of objects in the brain are organized according to the familiar ways in which objects are observed in the environment. [35] Recognition is driven not only by object shape and/or views but also by dynamic information. [36] Familiarity can benefit the perception of dynamic point-light displays, moving objects, the sex of faces, and face recognition. [35]

Recollection

Recollection shares many similarities with familiarity; however, it is context-dependent, requiring specific information about the incident in question. [34]

Impairments

Loss of object recognition is called visual object agnosia. There are two broad categories of visual object agnosia: apperceptive and associative. When object agnosia occurs from a lesion in the dominant hemisphere, there is often a profound associated language disturbance, including loss of word meaning.

Effects of lesions in the ventral stream

Object recognition is a complex task that involves several different areas of the brain, not just one. If one area is damaged, object recognition can be impaired. The main processing for object recognition takes place in the temporal lobe. For example, lesions to the perirhinal cortex in rats have been found to cause impairments in object recognition, especially as feature ambiguity increases. [37] Neonatal aspiration lesions of the amygdaloid complex in monkeys appear to have resulted in greater object memory loss than early hippocampal lesions. However, in adult monkeys, the object memory impairment is better accounted for by damage to the perirhinal and entorhinal cortex than by damage to the amygdaloid nuclei. [38] Combined amygdalohippocampal (A + H) lesions in rats impaired performance on an object recognition task when the retention intervals were increased beyond 0 s and when test stimuli were repeated within a session. Damage to the amygdala or hippocampus alone does not affect object recognition, whereas A + H damage produces clear deficits. [39] In an object recognition task, discrimination was significantly lower in rats with electrolytic lesions of the globus pallidus (part of the basal ganglia) than in rats with lesions of the substantia innominata/ventral pallidum, which in turn performed worse than the control and medial septum/vertical diagonal band of Broca groups; however, only the globus pallidus group did not discriminate between new and familiar objects. [40] These lesions damage the ventral (what) pathway of the visual processing of objects in the brain.

Visual agnosias

Agnosia is a rare occurrence and can be the result of a stroke, dementia, head injury, or brain infection, or it can be hereditary. [41] Apperceptive agnosia is a deficit in object perception, creating an inability to understand the significance of objects. [34] Similarly, associative visual agnosia is an inability to understand the significance of objects; here, however, the deficit is in semantic memory. [34] Both of these agnosias can disrupt the pathway to object recognition described by models such as Marr's theory of vision. More specifically, unlike those with apperceptive agnosia, patients with associative agnosia are more successful at drawing, copying, and matching tasks; however, these patients demonstrate that they can perceive but not recognize. [41] Integrative agnosia (a subtype of associative agnosia) is the inability to integrate separate parts to form a whole image. [34] With these types of agnosia there is damage to the ventral (what) stream of the visual processing pathway. Object orientation agnosia is the inability to extract the orientation of an object despite adequate object recognition. [34] With this type of agnosia there is damage to the dorsal (where) stream of the visual processing pathway. This can affect object recognition in terms of familiarity, and even more so for unfamiliar objects and viewpoints. A difficulty in recognizing faces can be explained by prosopagnosia. Someone with prosopagnosia cannot identify the face but is still able to perceive age, gender, and emotional expression. [41] The brain region that specializes in facial recognition is the fusiform face area. Prosopagnosia can also be divided into apperceptive and associative subtypes. Recognition of individual chairs, cars, and animals can also be impaired, suggesting that these objects share perceptual features with faces that are processed in the fusiform face area. [41]

Alzheimer's disease

The distinction between category and attribute in semantic representation may inform our ability to assess semantic function in aging and in disease states affecting semantic memory, such as Alzheimer's disease (AD). [42] Because of semantic memory deficits, persons with Alzheimer's disease have difficulty recognizing objects, as semantic memory is used to retrieve information for naming and categorizing objects. [43] In fact, it is highly debated whether the semantic memory deficit in AD reflects the loss of semantic knowledge for particular categories and concepts or the loss of knowledge of perceptual features and attributes. [42]

See also


References

  1. Ullman, S. (1996). High-Level Vision. MIT Press.
  2. Humphreys, G., Price, C., Riddoch, J. (1999). "From objects to names: A cognitive neuroscience approach". Psychological Research. 62 (2–3): 118–130. doi:10.1007/s004260050046. PMID 10472198.
  3. Riddoch, M., & Humphreys, G. (2001). Object Recognition. In B. Rapp (Ed.), Handbook of Cognitive Neuropsychology. Hove: Psychology Press.
  4. Ward, J. (2006). The Student's Guide to Cognitive Neuroscience. New York: Psychology Press.
  5. Bar, M. (2003). "A cortical mechanism for triggering top-down facilitation in visual object recognition". Journal of Cognitive Neuroscience. 15 (4): 600–609. doi:10.1162/089892903321662976. PMID 12803970.
  6. DiCarlo, J. J., Cox, D. D. (2007). "Untangling invariant object recognition". Trends in Cognitive Sciences. 11 (8): 333–341. doi:10.1016/j.tics.2007.06.010. PMID 17631409.
  7. Richer, F., Boulet, C. (1999). "Frontal lesions and fluctuations in response preparation". Brain and Cognition. 40 (1): 234–238. doi:10.1006/brcg.1998.1067. PMID 10373286.
  8. Schendan, Haline (2008). "Where vision meets memory: Prefrontal-posterior networks for visual object constancy during categorization and recognition". Cerebral Cortex. 18 (7): 1695–1711.
  9. Burgund, E. Darcy; Marsolek, Chad J. (2000). "Viewpoint-invariant and viewpoint-dependent object recognition in dissociable neural subsystems". Psychonomic Bulletin & Review. 7 (3): 480–489. doi:10.3758/BF03214360. PMID 11082854.
  10. Yunfeng, Yi (2009). "A computational model that recovers the 3D shape of an object from a single 2D retinal representation". Vision Research. 49 (9): 979–991. doi:10.1016/j.visres.2008.05.013. PMID 18621410.
  11. Rosenberg, Ari (2013). "The visual representation of 3D object orientation in parietal cortex". The Journal of Neuroscience. 33 (49): 19352–19361. doi:10.1523/jneurosci.3174-13.2013. PMC 3850047. PMID 24305830.
  12. Biederman, I. (1987). "Recognition by components: A theory of human image understanding". Psychological Review. 94 (2): 115–147. doi:10.1037/0033-295x.94.2.115. PMID 3575582.
  13. Tarr, M., Bulthoff, H. (1995). "Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993)". Journal of Experimental Psychology: Human Perception and Performance. 21 (6): 1494–1505. doi:10.1037/0096-1523.21.6.1494. PMID 7490590.
  14. Peterson, M. A., & Rhodes, G. (Eds.). (2003). Perception of Faces, Objects and Scenes: Analytic and Holistic Processes. New York: Oxford University Press.
  15. Ungerleider, L. G., Mishkin, M. (1982). Two cortical visual systems. In: Ingle, D. J., Goodale, M. A., Mansfield, R. J. W. (Eds.), Analysis of Visual Behavior. MIT Press, Cambridge, pp. 549–586.
  16. Goodale, M., Milner, A. (1992). "Separate visual pathways for perception and action". Trends in Neurosciences. 15 (1): 20–25. doi:10.1016/0166-2236(92)90344-8. PMID 1374953.
  17. Spiridon, M., Fischl, B., Kanwisher, N. (2006). "Location and spatial profile of category-specific regions in human extrastriate cortex". Human Brain Mapping. 27 (1): 77–89. doi:10.1002/hbm.20169. PMC 3264054. PMID 15966002.
  18. Kourtzi, Z., Kanwisher, N. (2001). "Representation of perceived object shape by the human lateral occipital complex". Science. 293 (5534): 1506–1509. doi:10.1126/science.1061133. PMID 11520991.
  19. Grill-Spector, K.; Kushnir, T.; Edelman, S.; Itzchak, Y.; Malach, R. (1998). "Cue-invariant activation in object-related areas of the human occipital lobe". Neuron. 21 (1): 191–202. doi:10.1016/s0896-6273(00)80526-7. PMID 9697863.
  20. Malach, R.; Reppas, J.; Benson, R.; Kwong, K.; Jiang, H.; Kennedy, W.; et al. (1995). "Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex". Proceedings of the National Academy of Sciences of the USA. 92 (18): 8135–8139. doi:10.1073/pnas.92.18.8135. PMC 41110. PMID 7667258.
  21. Grill-Spector, K., Kourtzi, Z., Kanwisher, N. (2001). "The lateral occipital complex and its role in object recognition". Vision Research. 42 (10–11): 1409–1422. doi:10.1016/s0042-6989(01)00073-6. PMID 11322983.
  22. Ungerleider, L. G., Mishkin, M. (1982). Two cortical visual systems. In: Ingle, D. J., Goodale, M. A., Mansfield, R. J. W. (Eds.), Analysis of Visual Behavior. MIT Press, Cambridge, pp. 549–586.
  23. Collins and Curby (2013). "Conceptual knowledge attenuates viewpoint dependency in visual object recognition". Visual Cognition. 21 (8): 945–960. doi:10.1080/13506285.2013.836138.
  24. Helbig, et al. (2009). "Action observation can prime visual object recognition". Experimental Brain Research. 200 (3–4): 251–258. doi:10.1007/s00221-009-1953-8. PMC 2820217. PMID 19669130.
  25. Kellenbach, M., Hovius, M., Patterson, K. (2005). "A PET study of visual and semantic knowledge about objects". Cortex. 41 (2): 121–132. doi:10.1016/s0010-9452(08)70887-6. PMID 15714895.
  26. Wierenga, C., Perlstein, W., Benjamin, M., Leonard, C., Rothi, L., Conway, T.; et al. (2009). "Neural substrates of object identification: Functional magnetic resonance imaging evidence that category and visual attribute contribute to semantic knowledge". Journal of the International Neuropsychological Society. 15 (2): 169–181. doi:10.1017/s1355617709090468. PMID 19232155.
  27. Gerlach, C. (2009). "Category-specificity in visual object recognition". Cognition. 111 (3): 281–301. doi:10.1016/j.cognition.2009.02.005. PMID 19324331.
  28. Mechelli, A., Sartori, G., Orlandi, P., Price, C. (2006). "Semantic relevance explains category effects in medial fusiform gyri". NeuroImage. 30 (3): 992–1002. doi:10.1016/j.neuroimage.2005.10.017. PMID 16343950.
  29. Bar, M., Ullman, S. (1996). "Spatial context in recognition". Perception. 25 (3): 343–352. doi:10.1068/p250343. PMID 8804097.
  30. Bar, M., Aminoff, E. (2003). "Cortical analysis of visual context". Neuron. 38 (2): 347–358. doi:10.1016/s0896-6273(03)00167-3. PMID 12718867.
  31. Brady, T. F., Konkle, T., Alvarez, G. A., Oliva, A. (2008). "Visual long-term memory has a massive storage capacity for object details". Proceedings of the National Academy of Sciences of the USA. 105 (38): 14325–14329. doi:10.1073/pnas.0803390105. PMC 2533687. PMID 18787113.
  32. Barenholtz, et al. (2014). "Quantifying the role of context in visual object recognition". Visual Cognition. 22: 30–56. doi:10.1080/13506285.2013.865694.
  33. Theurel, et al. (2016). "The integration of visual context information in facial emotion recognition in 5- to 15-year-olds". Journal of Experimental Child Psychology. 150: 252–271. doi:10.1016/j.jecp.2016.06.004. PMID 27367301.
  34. Ward, J. (2006). The Student's Guide to Cognitive Neuroscience. New York: Psychology Press.
  35. Bulthoff, I., Newell, F. (2006). "The role of familiarity in the recognition of static and dynamic objects". Visual Perception - Fundamentals of Vision: Low and Mid-Level Processes in Perception. Progress in Brain Research. Vol. 154. pp. 315–325. doi:10.1016/S0079-6123(06)54017-8. ISBN 9780444529664. PMID 17010720.
  36. Vuong, Q., & Tarr, M. (2004). Rotation direction affects object recognition.
  37. Norman, G., Eacott, M. (2004). "Impaired object recognition with increasing levels of feature ambiguity in rats with perirhinal cortex lesions". Behavioural Brain Research. 148 (1–2): 79–91. doi:10.1016/s0166-4328(03)00176-1. PMID 14684250.
  38. Bachevalier, J., Beauregard, M., & Alvarado, M. C. (1999). Long-term effects of neonatal damage to the hippocampal formation and amygdaloid complex on object discrimination and object recognition in rhesus monkeys. Behavioral Neuroscience, 113.
  39. Aggleton, J. P., Blindt, H. S., Rawlins, J. N. P. (1989). "Effects of amygdaloid and amygdaloid–hippocampal lesions on object recognition and spatial working memory in rats". Behavioral Neuroscience. 103 (5): 962–974. doi:10.1037/0735-7044.103.5.962. PMID 2803563.
  40. Ennaceur, A. (1998). "Effects of lesions of the substantia innominata/ventral pallidum, globus pallidus and medial septum on rat's performance in object-recognition and radial-maze tasks: Physostigmine and amphetamine treatments". Pharmacological Research. 38 (4): 251–263. doi:10.1006/phrs.1998.0361. PMID 9774488.
  41. Bauer, R. M. (2006). The Agnosias. Washington, DC: American Psychological Association.
  42. Hajilou, B. B., Done, D. J. (2007). "Evidence for a dissociation of structural and semantic knowledge in dementia of the Alzheimer type (DAT)". Neuropsychologia. 45 (4): 810–816. doi:10.1016/j.neuropsychologia.2006.08.008. PMID 17034821.
  43. Laatu, S., Jaykka, H., Portin, R., Rinne, J. (2003). "Visual object recognition in early Alzheimer's disease: deficits in semantic processing". Acta Neurologica Scandinavica. 108 (2): 82–89. doi:10.1034/j.1600-0404.2003.00097.x. PMID 12859283.