Visual spatial attention is a form of visual attention that involves directing attention to a location in space. Like its temporal counterpart, visual temporal attention, spatial attention modules have been widely implemented in video analytics in computer vision to provide enhanced performance and human-interpretable explanations [1] [2] [3] of deep learning models.
Spatial attention allows humans to selectively process visual information through prioritization of an area within the visual field. A region of space within the visual field is selected for attention and the information within this region then receives further processing. Research shows that when spatial attention is evoked, an observer is typically faster and more accurate at detecting a target that appears in an expected location compared to an unexpected location. [4] Attention is guided even more quickly to unexpected locations when these locations are made salient by external visual inputs (such as a sudden flash). According to the V1 Saliency Hypothesis, the human primary visual cortex plays a critical role in such exogenous attentional guidance. [5]
Spatial attention is distinct from other forms of visual attention such as object-based attention and feature-based attention. [6] These other forms of visual attention select an entire object or a specific feature of an object regardless of its location, whereas spatial attention selects a specific region of space and the objects and features within that region are processed.
A key property of visual attention is that attention can be selected based on spatial location and spatial cueing experiments have been used to assess this type of selection. In Posner's cueing paradigm, [4] the task was to detect a target that could be presented in one of two locations and respond as quickly as possible. At the start of each trial, a cue is presented that either indicates the location of the target (valid cue) or indicates the incorrect location thus misdirecting the observer (invalid cue). In addition, on some trials there is no information given about the location of the target, as no cue is presented (neutral trials). Two distinct cues were used; the cue was either a peripheral 'flicker' around the target's location (peripheral cue) or the cue was centrally displayed as a symbol, such as an arrow pointing to the location of the target (central cue). Observers are faster and more accurate at detecting and recognising a target if the location of the target is known in advance. [4] [7] Furthermore, misinforming subjects about the location of the target leads to slower reaction times and poorer accuracy relative to performance when no information about the location of the target is given. [4] [7]
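The benefit of a valid cue and the cost of an invalid cue described above are usually expressed relative to the neutral condition. The following is a minimal sketch of that arithmetic, assuming mean reaction time as the summary measure; the function name `cueing_effects` and the reaction times are hypothetical and invented purely for illustration, not taken from any study.

```python
from statistics import mean


def cueing_effects(rts_ms):
    """Return (benefit, cost) in milliseconds.

    rts_ms maps the cue conditions 'valid', 'neutral' and 'invalid'
    to lists of reaction times in milliseconds.
    benefit = neutral mean - valid mean   (speed-up from a valid cue)
    cost    = invalid mean - neutral mean (slow-down from an invalid cue)
    """
    valid = mean(rts_ms["valid"])
    neutral = mean(rts_ms["neutral"])
    invalid = mean(rts_ms["invalid"])
    return neutral - valid, invalid - neutral


# Hypothetical reaction times (ms), made up for this example only.
example = {
    "valid": [240, 250, 260],
    "neutral": [270, 280, 290],
    "invalid": [300, 310, 320],
}
benefit, cost = cueing_effects(example)
print(f"benefit: {benefit} ms, cost: {cost} ms")
```

A positive benefit and a positive cost together correspond to the pattern reported for cueing experiments: faster responses after valid cues and slower responses after invalid cues, both compared with neutral trials.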
Spatial cueing tasks typically assess covert spatial attention, which refers to attention that can change spatially without any accompanying eye movements. To investigate covert attention, it is necessary to ensure that the observer's eyes remain fixated at one location throughout the task. In spatial cueing tasks, subjects are instructed to fixate on a central fixation point. Typically it takes 200 ms to make a saccadic eye movement to a location. [8] Therefore, the cue and target are typically presented within a combined duration of less than 200 ms. This ensures that covert spatial attention is being measured and the effects are not due to overt eye movements. Some studies specifically monitor eye movements to ensure that the observer's eyes are continually fixated on the central fixation point. [9]
The central and peripheral cues in spatial cueing experiments can assess the orienting of covert spatial attention. These two cues appear to use different mechanisms for orienting spatial attention. The peripheral cues tend to attract attention automatically, recruiting bottom-up attentional control processes. Conversely, central cues are thought to be under voluntary control and therefore use top-down processes. [10] Studies have shown that peripheral cues are difficult to ignore, as attention is oriented towards the peripheral cue even when the observer knows the cue does not predict the location of the target. [7] Peripheral cues also cause an allocation of attention much faster than central cues, as central cues require greater processing time to interpret the cue. [10]
In spatial cueing tasks, the spatial probe (cue) causes an allocation of attention to a particular location. Spatial probes have also often been used in other types of tasks to assess how spatial attention is allocated.
Spatial probes have been used to assess spatial attention in visual searches. Visual search tasks involve the detection of a target among a set of distractors. Attention to the location of items in the search can be used to guide visual searches. This was demonstrated by valid cues improving the identification of targets relative to the invalid and neutral conditions. [11] A visual search display can also influence how fast an observer responds to a spatial probe. In a visual search task, a small dot appeared after a visual display and it was found that observers were faster at detecting the dot when it was located at the same location as the target. [12] This demonstrated that spatial attention had been allocated to the target location.
The use of multiple tasks simultaneously in an experiment can also demonstrate the generality of spatial attention, as allocation of attention to one task can influence performance in other tasks. [13] [14] For example, it was found that when attention was allocated to detecting a flickering dot (spatial probe), this increased the likelihood of identifying nearby letters. [14]
The distribution of spatial attention has been subject to considerable research. Consequently, this has led to the development of different metaphors and models that represent the proposed spatial distribution of attention.
According to the 'spotlight' metaphor, the focus of attention is analogous to the beam of a spotlight. [15] The moveable spotlight is directed at one location and everything within its beam is attended and processed preferentially, while information outside the beam is unattended. This suggests that the focus of visual attention is limited in spatial size and moves to process other areas in the visual field.
Research has suggested that the attentional focus is variable in size. [16] Eriksen and St James [17] proposed the 'zoom-lens' metaphor, which is an alternative to the spotlight metaphor and takes into account the variable nature of attention. This account likens the distribution of attention to a zoom-lens that can narrow or widen the focus of attention. This supports findings that show attention can be distributed both over a large area of the visual field and also function in a focused mode. [18] In support of this analogy, research has shown that there is an inverse relationship between the size of the attentional focus and the efficiency of processing within the boundaries of a zoom-lens. [19]
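The inverse relationship between the size of the attentional focus and processing efficiency can be illustrated with a toy calculation. This sketch assumes, purely for illustration, that a fixed amount of attentional resource is spread uniformly over a circular focus; the function name `processing_density` and its parameters are hypothetical and are not part of the zoom-lens account itself.

```python
import math


def processing_density(focus_radius, total_resources=1.0):
    """Resource available per unit area inside a circular attentional focus.

    Assumption for illustration: a fixed total resource is spread evenly
    over the focus, so widening the focus dilutes processing at each point.
    """
    if focus_radius <= 0:
        raise ValueError("focus_radius must be positive")
    return total_resources / (math.pi * focus_radius ** 2)


narrow = processing_density(1.0)
wide = processing_density(2.0)
print(narrow > wide)  # a narrower focus concentrates more resource per unit area
```

Under this assumption, doubling the radius of the focus divides the resource available at any one point by four, which is one simple way to picture the trade-off between the breadth of attention and the efficiency of processing.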
The Gradient Model is an alternative theory on the distribution of spatial attention. This model proposes that attentional resources are allocated in a gradient pattern, with resources concentrated at the centre of focus and decreasing continuously away from the centre. [20] Downing [9] conducted research using an adaptation of Posner's cueing paradigm that supported this model. The target could appear in 12 potential locations, marked by boxes. Results showed that attentional facilitation was strongest at the cued location and gradually decreased with distance away from the cued location. However, not all research has supported the gradient model. For example, Hughes and Zimba [21] conducted a similar experiment using a highly distributed visual array without boxes marking the potential locations of the target. There was no evidence of a gradient effect: responses were faster when the cue and target were in the same hemifield and slower when they were in different hemifields. The boxes appear to have played an important role in attention, as a later experiment that used the boxes found a gradient pattern. [22] Therefore, it is considered that the size of the gradient can adjust according to the circumstances. A broader gradient may be adopted when there is an empty display, as attention can spread and is only restricted by hemifield borders.
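The difference between an all-or-none spotlight and a graded distribution can be sketched numerically. The example below assumes a Gaussian fall-off for the gradient, which is only one convenient choice of a continuously decreasing function; the source does not specify a functional form, and the function names and parameters here are hypothetical.

```python
import math


def spotlight_weight(distance, radius=1.0):
    """All-or-none allocation: full weight inside the beam, none outside."""
    return 1.0 if abs(distance) <= radius else 0.0


def gradient_weight(distance, width=1.0):
    """Graded allocation: weight peaks at the centre and falls off smoothly.

    A Gaussian is assumed here only for illustration.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return math.exp(-(distance ** 2) / (2 * width ** 2))


# Compare the two accounts at increasing distances from the cued location.
for d in (0.0, 0.5, 1.0, 2.0):
    print(d, spotlight_weight(d), round(gradient_weight(d), 3))
```

Increasing `width` yields a broader gradient, which loosely corresponds to the suggestion that the gradient can widen when the display is empty.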
It is debated in research on visual spatial attention whether it is possible to split attention across different areas in the visual field. The 'spotlight' and 'zoom-lens' accounts postulate that attention uses a single unitary focus. Therefore, spatial attention can only be allocated to adjacent areas in the visual field and consequently cannot be split. This was supported by an experiment that altered the spatial cueing paradigm by using two cues, a primary and a secondary cue. It was found that the secondary cue was only effective in focusing attention when its location was adjacent to the primary cue. [15] In addition, it has been demonstrated that observers are unable to ignore stimuli presented in areas situated between two cued locations. [23] These findings suggest that attention cannot be split across two non-contiguous regions. However, other studies have demonstrated that spatial attention can be split across two locations. For example, observers were able to attend simultaneously to two different targets located in opposite hemifields. [19] Research has even suggested that humans are able to focus attention across two to four locations in the visual field. [24] Another perspective is that spatial attention can be split only under certain conditions. This perspective suggests that the splitting of spatial attention is flexible. Research demonstrated that whether spatial attention is unitary or divided depends on the goals of the task. [25] Therefore, if dividing attention is beneficial to the observer then a divided focus of attention will be utilised.
One of the main difficulties in establishing whether spatial attention can be divided is that a unitary focus model of attention can also explain a number of the findings. For example, when two non-contiguous locations are attended to, it may not be that attention has been split between these two locations but instead it may be that the unitary focus of attention has expanded. [24] Alternatively, the two locations may not be attended to simultaneously and instead the area of focus is moving quickly from one location to another. [26] Consequently, it appears very difficult to prove conclusively that spatial attention can be split.
Hemineglect, also known as unilateral visual neglect, attentional neglect, hemispatial neglect or spatial neglect, is a disorder incorporating a significant deficit in visuospatial attention. Hemineglect refers to the inability of patients with unilateral brain damage to detect objects in the side of space contralateral to the lesion (contralesional); for example, damage to the right cerebral hemisphere results in neglect of objects on the left side of space. [27] The disorder is characterized by hemispheric asymmetry. Performance is generally preserved in the side ipsilateral to the lesion (ipsilesional). [27] Hemineglect is more frequent and arguably more severe following damage to the right cerebral hemisphere of right-handed subjects. [27] It has been proposed that the right parietal lobes are comparatively more responsible for the allocation of spatial attention, and therefore damage to this hemisphere often produces more severe effects. [28] Additionally, it is difficult to map with accuracy the visual sensory deficits in the neglected hemifield.
Neglect is diagnosed using a variety of paper-and-pencil tasks. A common method is the Complex Figure Test (CFT). The CFT requires patients to copy a complicated line drawing, and then reproduce it from memory. Often patients will neglect features present on the contralesional side of space and objects. Patients with neglect will perform similarly when reproducing mental images of familiar places and objects. A common error is the failure to include numbers on the left side of a picture when drawing an analogue clock from memory; for example, all of the numbers may be positioned on the right side of the clock face. [10]
Another paper-and-pencil task is the line bisection task. In this exercise, patients are required to divide a horizontal line halfway along. Patients with neglect will often bisect the line to the right of the true centre, leaving the left portion of the line unattended to. [27]
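The rightward error in line bisection can be quantified as a signed deviation from the true centre. The sketch below shows one possible way to express it, as a percentage of the half-length of the line; the function name, the choice of millimetres, and the convention of measuring from the left end are assumptions made for illustration, not a clinical scoring standard, and the example values are hypothetical.

```python
def bisection_deviation_percent(line_length_mm, mark_position_mm):
    """Signed deviation of the patient's mark from the true centre.

    mark_position_mm is measured from the left end of the line.
    The result is a percentage of the half-length: positive values mean
    the mark lies to the right of centre, negative values to the left.
    """
    if line_length_mm <= 0:
        raise ValueError("line_length_mm must be positive")
    if not 0 <= mark_position_mm <= line_length_mm:
        raise ValueError("mark must lie on the line")
    centre = line_length_mm / 2
    return 100 * (mark_position_mm - centre) / centre


# Hypothetical example: a 200 mm line marked 130 mm from its left end.
print(bisection_deviation_percent(200, 130))  # 30.0, i.e. a rightward error
```

A mark placed exactly at the midpoint gives 0, and a consistently positive value across trials would correspond to the rightward bisection pattern described above.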
Object cancellation tasks are also used to determine the extent of potential deficit. During this task, patients are required to cancel out (cross out) all of the objects in a cluttered display (e.g. lines, geometric shapes, letters, etc.). [10] Patients with damage primarily to the right parietal area fail in the detection of objects in the left visuospatial field, and these are often not crossed out by the patient. In addition, those patients who may be severely affected tend to fail in detecting their errors on visual inspection.
Extinction is a phenomenon observable during double simultaneous stimulation of both left and right visual fields. Patients with extinction will fail to perceive the stimulus in the contralesional visual field when presented in conjunction with a stimulus in the ipsilesional field. [10] However, when presented on its own, patients can correctly perceive the contralesional stimulus. Thus, patients with neglect fail to report stimuli present in the aberrant field, whereas patients with extinction fail to report stimuli in the aberrant field only when double simultaneous presentations occur in both hemifields. [10] Analogous to neglect, extinction affects the contralesional visuospatial field in the majority of patients with unilateral damage. [27] Anatomical correlates of visuospatial neglect and extinction do not overlap absolutely, with extinction proposed to be associated with subcortical lesions. [27]
A common method for the quick detection of visuospatial extinction is the Finger Confrontation Model. Utilized as a standard bedside evaluation, the task requires the patient to indicate (either verbally or by pointing) in which visual field the doctor's hand or finger is moving, while the doctor makes a wiggling motion with an index finger. [10] This enables the doctor to distinguish between deficits resembling neglect and those which may indicate extinction, by presenting either a single stimulus in the contralesional field or two simultaneous stimuli in both the contralesional and ipsilesional visual fields. This quick test can be used immediately in a hospital setting for quick diagnosis, and can be particularly useful following strokes and seizures.
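The logic that separates a neglect-like pattern from an extinction-like pattern in single versus double stimulation can be written as a small decision rule. This is only an illustrative sketch of the distinction drawn in the text, not a diagnostic tool; the function name, the boolean inputs and the output labels are hypothetical.

```python
def classify_confrontation(detects_contra_alone, detects_contra_with_ipsi):
    """Label the response pattern from a confrontation-style test.

    detects_contra_alone: the contralesional stimulus is reported when
        it is presented on its own.
    detects_contra_with_ipsi: the contralesional stimulus is reported
        when an ipsilesional stimulus is presented at the same time.
    """
    if not detects_contra_alone:
        # Missed even without competition from the other side.
        return "neglect-like"
    if not detects_contra_with_ipsi:
        # Missed only under double simultaneous stimulation.
        return "extinction-like"
    return "no deficit shown"


print(classify_confrontation(True, False))   # extinction-like
print(classify_confrontation(False, False))  # neglect-like
```

The rule checks the single-stimulus condition first because, in the description above, failing to report a lone contralesional stimulus is what distinguishes neglect from extinction.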
The posterior parietal region is arguably the most extensively studied in relation to visuospatial attention. Patients with parietal lobe damage most often fail to attend to stimuli located in the contralesional hemifield, as seen in patients with hemineglect/unilateral visual neglect. [10] As such, they may fail to acknowledge a person sitting to their left, they may neglect to eat food positioned on their left, or they may fail to make head or eye movements to the left. [10] Computed tomography (CT) studies have demonstrated that the inferior parietal lobule in the right hemisphere is the most frequently damaged in patients with severe neglect. [29]
Parietal damage may decrease the ability to reduce decision noise. [10] Spatial cues appear to reduce the uncertainty of a visuospatial decision. Disruption to spatial orienting, as seen in hemineglect, suggests that patients with damage to the parietal region may experience an increased difficulty in decision-making regarding targets located in the contralesional field. [10]
Damage to the parietal region may also increase illusory conjunctions of features. Illusory conjunctions occur when people report combinations of features which did not occur. [28] For example, when presented with an orange square and a purple circle, the participant may report a purple square or an orange circle. Although it would typically require special circumstances for a non-impaired person to produce an illusory conjunction, it appears that some patients with damage to the parietal cortex may demonstrate a vulnerability to such visuospatial impairments. [27] Results from parietal patients suggest that the parietal cortex, and therefore spatial attention, may be implicated in solving this problem of binding features. [10]
Lesions to the frontal cortices have long been known to precede spatial neglect and other visuospatial deficits. Specifically, frontal lobe damage has been associated with a deficit in the control of overt attention (the production of eye movements). Lesions to the superior frontal lobe areas that include the frontal eye fields seem to disrupt some forms of overt eye movements. [10] It has been demonstrated by Guitton, Buchtel, & Douglas [30] that an eye movement directed away from an abruptly appearing visual target ("antisaccade") is remarkably impaired in patients with damage to the frontal eye fields, who frequently made reflexive eye movements to the target. When frontal eye field patients did make antisaccades, they had increased latency of their eye movements compared to controls. This suggests that the frontal lobes, specifically the dorsolateral region containing the frontal eye fields, play an inhibitory role in preventing reflexive eye movements in overt attention control. [30] Further, the frontal eye fields or surrounding areas may be critically associated with neglect following dorsolateral frontal lesions. [29]
Frontal lobe lesions also appear to produce deficits in visuospatial attention related to covert attention (the orienting of attention without requiring eye movements). Using Posner's Spatial Cueing Task, Alivisatos and Milner (1989; see [10] ) found that participants with frontal lobe damage demonstrated a comparatively smaller attentional benefit from the valid cues than control participants or participants with temporal lobe damage. Voluntary orienting in frontal lobe patients appears to be impaired.
The right lateral frontal lobe region was also found to be associated with left-sided visual neglect in an investigation carried out by Husain & Kennard. [29] A region of overlap was found in the location of lesions in four of five patients with left-sided visual neglect, specifically the dorsal aspect of the inferior frontal gyrus and the underlying white matter. Additionally, overlap of lesion areas was also detected in the dorsal region of Brodmann area 44 (anterior to the premotor cortex). These results further implicate the frontal lobe in directing attention in visual space.
The thalamic nuclei have been speculated to be involved in directing attention to locations in visual space. [31] Specifically, the pulvinar nucleus appears to be implicated in the subcortical control of spatial attention, and lesions in this area can cause neglect. [10] Evidence [31] suggests that the pulvinar nucleus of the thalamus might be responsible for engaging spatial attention at a previously cued location. A study by Rafal and Posner [31] found that patients who had acute pulvinar lesions were slower to detect a target which appeared in the contralesional visuospatial field compared to the appearance of a target in the ipsilesional field during a spatial cueing task. This suggests a deficit in the ability to use attention to improve performance in detection and processing of visual targets in the contralesional region. [31]
Camouflage relies on deceiving the cognition of the observer, such as a predator. Some camouflage mechanisms such as distractive markings likely function by competing for visual attention with stimuli that would give away the presence of the camouflaged object (such as a prey animal). Such markings have to be conspicuous, and positioned away from the outline so as to avoid drawing attention to it, in contrast to disruptive markings which work best when in contact with the outline. [32]
Attention, or focus, is the concentration of awareness on some phenomenon to the exclusion of other stimuli. It is the selective concentration on discrete information, either subjectively or objectively. William James (1890) wrote that "Attention is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration, of consciousness are of its essence." Attention has also been described as the allocation of limited cognitive processing resources. Attention is manifested by an attentional bottleneck, in terms of the amount of data the brain can process each second; for example, in human vision, less than 1% of the visual input data stream of 1 MByte/sec can enter the bottleneck, leading to inattentional blindness.
Agraphia is an acquired neurological disorder causing a loss in the ability to communicate through writing, either due to some form of motor dysfunction or an inability to spell. The loss of writing ability may present with other language or neurological disorders; disorders appearing commonly with agraphia are alexia, aphasia, dysarthria, agnosia, acalculia and apraxia. The study of individuals with agraphia may provide more information about the pathways involved in writing, both language related and motoric. Agraphia cannot be directly treated, but individuals can learn techniques to help regain and rehabilitate some of their previous writing abilities. These techniques differ depending on the type of agraphia.
The parietal lobe is one of the four major lobes of the cerebral cortex in the brain of mammals. The parietal lobe is positioned above the temporal lobe and behind the frontal lobe and central sulcus.
Hemispatial neglect is a neuropsychological condition in which, after damage to one hemisphere of the brain, a deficit in attention and awareness towards the side of space opposite brain damage is observed. It is defined by the inability of a person to process and perceive stimuli towards the contralesional side of the body or environment. Hemispatial neglect is very commonly contralateral to the damaged hemisphere, but instances of ipsilesional neglect have been reported.
Bálint's syndrome is an uncommon and incompletely understood triad of severe neuropsychological impairments: inability to perceive the visual field as a whole (simultanagnosia), difficulty in fixating the eyes, and inability to move the hand to a specific object by using vision. It was named in 1909 for the Austro-Hungarian neurologist and psychiatrist Rezső Bálint who first identified it.
Visual extinction is a neurological disorder which occurs following damage to the parietal lobe of the brain. It is similar to, but distinct from, hemispatial neglect. Visual extinction has the characteristic symptom of difficulty to perceive contralesional stimuli when presented simultaneously with an ipsilesional stimulus, but the ability to correctly identify them when not presented simultaneously. Under simultaneous presentation, the contralesional stimulus is apparently ignored by the patient, or extinguished. This deficiency may lead to difficulty on behalf of the patient with processing the stimuli's 3D position.
Inhibition of return (IOR) refers to an orientation mechanism that briefly enhances the speed and accuracy with which an object is detected after the object is attended, but then impairs detection speed and accuracy. IOR is usually measured with a cue-response paradigm, in which a person presses a button when they detect a target stimulus following the presentation of a cue that indicates the location in which the target will appear. The cue can be exogenous, or endogenous. Inhibition of return results from oculomotor activation, regardless of whether it was produced by exogenous signals or endogenously. Although IOR occurs for both visual and auditory stimuli, IOR is greater for visual stimuli, and is studied more often than auditory stimuli.
The two-streams hypothesis is a model of the neural processing of vision as well as hearing. The hypothesis, given its initial characterisation in a paper by David Milner and Melvyn A. Goodale in 1992, argues that humans possess two distinct visual systems. Recently there seems to be evidence of two distinct auditory systems as well. As visual information exits the occipital lobe, and as sound leaves the phonological network, it follows two main pathways, or "streams". The ventral stream leads to the temporal lobe, which is involved with object and visual identification and recognition. The dorsal stream leads to the parietal lobe, which is involved with processing the object's spatial location relative to the viewer and with speech repetition.
Visual search is a type of perceptual task requiring attention that typically involves an active scan of the visual environment for a particular object or feature among other objects or features. Visual search can take place with or without eye movements. The ability to consciously locate an object or target amongst a complex array of stimuli has been extensively studied over the past 40 years. Practical examples of using visual search can be seen in everyday life, such as when one is picking out a product on a supermarket shelf, when animals are searching for food among piles of leaves, when trying to find a friend in a large crowd of people, or simply when playing visual search games such as Where's Wally?
Attentional shift occurs when directing attention to a point increases the efficiency of processing of that point and includes inhibition to decrease attentional resources to unwanted or irrelevant inputs. Shifting of attention is needed to allocate attentional resources to more efficiently process information from a stimulus. Research has shown that when an object or area is attended, processing operates more efficiently. Task switching costs occur when performance on a task suffers due to the increased effort added in shifting attention. There are competing theories that attempt to explain why and how attention is shifted as well as how attention is moved through space in attentional control.
The superior longitudinal fasciculus (SLF) is an association tract in the brain that is composed of three separate components. It is present in both hemispheres and can be found lateral to the centrum semiovale and connects the frontal, occipital, parietal, and temporal lobes. This bundle of tracts (fasciculus) passes from the frontal lobe through the operculum to the posterior end of the lateral sulcus where they either radiate to and synapse on neurons in the occipital lobe, or turn downward and forward around the putamen and then radiate to and synapse on neurons in anterior portions of the temporal lobe.
The posterior parietal cortex plays an important role in planned movements, spatial reasoning, and attention.
The neuroanatomy of memory encompasses a wide variety of anatomical structures in the brain.
Extinction is a neurological disorder that impairs the ability to perceive multiple stimuli of the same type simultaneously. Extinction is usually caused by damage resulting in lesions on one side of the brain. Those who are affected by extinction have a lack of awareness in the contralesional side of space and a loss of exploratory search and other actions normally directed toward that side.
Amorphosynthesis, also called a hemi-sensory deficit, is a neuropsychological condition in which a patient experiences unilateral inattention to sensory input. This phenomenon is frequently associated with damage to the right cerebral hemisphere resulting in severe sensory deficits that are observed on the contralesional (left) side of the body. A right-sided deficit is less commonly observed and the effects are reported to be temporary and minor. Evidence suggests that the right cerebral hemisphere has a dominant role in attention and awareness to somatic sensations through ipsilateral and contralateral stimulation. In contrast, the left cerebral hemisphere is activated only by contralateral stimuli. Thus, the left and right cerebral hemispheres exhibit redundant processing to the right-side of the body and a lesion to the left cerebral hemisphere can be compensated by the ipsiversive processes of the right cerebral hemisphere. For this reason, right-sided amorphosynthesis is less often observed and is generally associated with bilateral lesions.
Auditory spatial attention is a specific form of attention, involving the focusing of auditory perception to a location in space.
Constructional apraxia is a neurological disorder in which people are unable to perform tasks or movements even though they understand the task, are willing to complete it, and have the physical ability to perform the movements. It is characterized by an inability or difficulty to build, assemble, or draw objects. Constructional apraxia may be caused by lesions in the parietal lobe following stroke or it may serve as an indicator for Alzheimer's disease.
Object-based attention refers to the relationship between an ‘object’ representation and a person’s visually stimulated, selective attention, as opposed to a relationship involving either a spatial or a feature representation; although these types of selective attention are not necessarily mutually exclusive. Research into object-based attention suggests that attention improves the quality of the sensory representation of a selected object, and results in the enhanced processing of that object’s features.
The Posner cueing task, also known as the Posner paradigm, is a neuropsychological test often used to assess attention. Formulated by Michael Posner, it assesses a person's ability to perform an attentional shift. It has been used and modified to assess disorders, focal brain injury, and the effects of both on spatial attention.
Dyschiria, also known as dyschiric syndrome, is a neurological disorder where one-half of an individual's body or space cannot be recognized or respond to sensations. The term dyschiria is rarely used in modern scientific research and literature. Dyschiria has been often referred to as unilateral neglect, visuo-spatial neglect, or hemispatial neglect from the 20th century onwards. Psychologists formerly characterized dyschiric patients to be unable to discriminate or report external stimuli. This left the patients incapable of orienting sensory responses in their extrapersonal and personal space. Patients with dyschiria are unable to distinguish one side of their body in general, or specific segments of the body. There are three stages to dyschiria: achiria, allochiria, and synchiria, in which manifestations of dyschiria evolve in varying degrees.