Prediction in language comprehension

Linguistic prediction is a phenomenon in psycholinguistics occurring whenever information about a word or other linguistic unit is activated before that unit is actually encountered. Evidence from eyetracking, event-related potentials, and other experimental methods indicates that in addition to integrating each subsequent word into the context formed by previously encountered words, language users may, under certain conditions, try to predict upcoming words. In particular, prediction seems to occur regularly when the context of a sentence greatly limits the possible words that have not yet been revealed. For instance, a person listening to a sentence like, "In the summer it is hot, and in the winter it is..." would be highly likely to predict the sentence completion "cold" in advance of actually hearing it. A form of prediction is also thought to occur in some types of lexical priming, a phenomenon whereby a word becomes easier to process if it is preceded by a related word. [1] Linguistic prediction is an active area of research in psycholinguistics and cognitive neuroscience.

Evidence from eyetracking

Visual world paradigms

In the eyetracking visual world paradigm, experimental subjects listen to a sentence while staring at an array of pictures on a computer monitor. Their eye movements are recorded, allowing the experimenter to understand how language influences eye movements toward pictures related to the content of the sentence. Experiments of this type have shown that while listening to the verb in a sentence, comprehenders anticipatorily move their eyes to the picture of the verb's likely direct object (e.g. "cake" rather than "ball" while hearing, "The boy will eat..."). [2] Subsequent investigations using the same experimental setup showed that the verb's subject can also determine which object comprehenders anticipate (e.g., comprehenders look at the merry-go-round rather than the motorcycle while hearing, "The little girl will ride..."). [3] In short, comprehenders use the information in the sentence context to predict the meanings of upcoming words. In these experiments, comprehenders used the verb and its subject to activate information about the verb's direct object before hearing that word. However, another experiment has shown that in a language with more flexible word order (German), comprehenders can also use context to predict the sentence's subject. [4]

Natural reading

Eyetracking technology has also been used to monitor readers' eye movements while they read text on a computer screen. Data from this kind of experiment have supported the hypothesis that readers use contextual information to predict upcoming words during natural reading. Specifically, readers fixate a word for a shorter time when it occurs in a moderately or highly constraining context than when the same word occurs in an unconstrained context. This is true regardless of the word's frequency or length. Readers are also more likely to skip a word, but only when it occurs in a highly constraining context. [5] Subsequent investigations of reading in the Chinese logographic script have shown that despite the large differences between the Chinese and English orthographies, readers exploit contextual information for prediction in similar ways, with the exception that Chinese readers were more likely to skip words in moderately constraining contexts. [6]

Computational models of eye movements during reading, which model data related to word predictability, include Reichle and colleagues' E-Z Reader model [7] and Engbert and colleagues' SWIFT model. [8]

M100

The M100 discussed here is the magnetic equivalent of the visual N1 potential—an event-related potential linked to visual processing and attention. The M100 was also linked to prediction in language comprehension in a series of event-related magnetoencephalography (MEG) experiments. In these experiments, participants read words whose visual forms were either predictable or unpredictable based on prior linguistic context [9] [10] or based on a recently seen picture. [11] The predictability of the word's visual form (but not the predictability of its meaning) affected the amplitude of the M100. There is ongoing controversy about whether this M100 effect is related to the early left anterior negativity (eLAN), an event-related potential response to words that is theorized to reflect the brain's assignment of local phrase structure. [12]

P2

The P2 component is generally thought to reflect higher-order perceptual processing and its modulation by attention. However, it has also been linked to prediction of visual word forms. The P2 response to words in highly constraining contexts is often larger than the P2 response to words in less constraining contexts. When experimental participants read words that are presented to the left or right of their visual fixation (stimulating the opposite hemisphere of the brain first), the larger P2 for words in highly constraining contexts is observed only for right visual field presentation (targeting left hemisphere). [13] This is consistent with the PARLO hypothesis that linguistic prediction is mainly a function of the left hemisphere, discussed below.

N400

The N400 is part of the normal ERP response to potentially meaningful stimuli, and its amplitude is inversely correlated with the predictability of a stimulus in a particular context. [14] In sentence processing, the predictability of a word is established by two related factors: 'cloze probability' and 'sentential constraint'. Cloze probability reflects the expectancy of a target word given the context of the sentence, and is measured as the percentage of individuals who supply that word when completing a sentence whose final word is missing. Kutas and colleagues found that the N400 to sentence-final words with a cloze probability of 90% was smaller (i.e., more positive) than the N400 to words with a cloze probability of 70%, which was in turn smaller than the N400 to words with a cloze probability of 30%. Closely related, sentential constraint reflects the degree to which the context of the sentence constrains the number of acceptable continuations. Whereas cloze probability is the percentage of individuals who choose a particular word, constraint is the number of different words chosen by a representative sample of individuals. Although words that are not predicted elicit a larger N400, unpredicted words that are semantically related to the predicted word elicit a smaller N400 than unpredicted words that are semantically unrelated. When the sentence context is highly constraining, semantically related words receive further facilitation: the N400 to semantically related words is smaller in high-constraint sentences than in low-constraint sentences. [15] [16] [17] Evidence for the prediction of specific words comes from a study by DeLong et al. [18] DeLong and colleagues took advantage of the two forms of the English indefinite article, 'A' and 'AN', which are used before words that begin with a consonant or a vowel, respectively. They found that when the most probable sentence completion began with a consonant, the N400 was larger for 'AN' than for 'A', and vice versa, suggesting that prediction occurs at both a semantic and a lexical level during language processing. However, this finding has not been replicated: in the most recent multi-lab attempt (335 participants), no evidence for word-form prediction was found (Nieuwland et al., 2018).
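
The two norming measures defined above can be illustrated with a short sketch. It assumes a list of sentence completions collected from a norming sample; the function name `cloze_stats` and the response data are invented for illustration and do not come from any of the cited studies. Constraint is computed here exactly as the text describes it, as the number of distinct completions, although published studies may operationalize it differently.

```python
from collections import Counter


def cloze_stats(responses):
    """Summarise sentence-completion responses from a norming sample.

    Returns a dict mapping each completion to its cloze probability,
    plus the number of distinct completions (the constraint measure
    as described in the text above).
    """
    # Normalise case and whitespace so "Cold" and "cold " count together.
    counts = Counter(word.strip().lower() for word in responses)
    total = sum(counts.values())
    # Cloze probability: proportion of respondents who supplied each word.
    cloze = {word: n / total for word, n in counts.items()}
    # Fewer distinct completions means a more constraining context.
    n_distinct = len(counts)
    return cloze, n_distinct


# Hypothetical completions for "In the summer it is hot, and in the winter it is..."
responses = ["cold"] * 18 + ["freezing", "snowy"]
cloze, n_distinct = cloze_stats(responses)
print(cloze["cold"], n_distinct)
```

With these made-up responses, 18 of 20 respondents supply "cold", so it is a high-cloze completion, and only three distinct completions are offered, which marks the context as strongly constraining.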

Late positivity

P300

The P300, specifically the P3b, is an ERP response to improbable stimuli and is sensitive to the subjective probability that a particular stimulus will occur. The P300 has been closely tied to context updating, which can be initiated by unexpected stimuli. [19]

P600

The P600 is an ERP response to syntactic violations, as well as to complex but error-free language. [20] [21] A P600-like response is also observed for thematically implausible sentences, for example, "For breakfast, the eggs would only EAT toast and jam". [22] Both P600 responses are generally attributed to the process of revising or continuing the analysis of the sentence. [23] The syntactic P600 has been compared to the P300 in that both responses are sensitive to similar manipulations, most importantly the probability of the stimulus. [24] The similarity between the two responses may suggest that the P300 significantly contributes to the syntactic P600 response.

Post-N400 positivity

A late positivity is often observed subsequent to the N400. A recent meta-analysis of the ERP literature on language processing has identified two different post-N400 positivities. [25] When the post-N400 positivity (PNP) is compared for congruent and incongruent sentence-final words, a parietal PNP is observed for incongruent words. This parietal PNP is similar to the typical P600 response, suggesting continued or revised analysis. Within the congruent condition, when high- and low-cloze probability sentence-final words are compared, a PNP response (if it is observed) is generally distributed across the front of the scalp. A recent study has shown that the frontal PNP may reflect processing an unexpected lexical item instead of an unexpected concept, suggesting that the frontal PNP reflects disconfirmed lexical predictions. [25]

Evidence from functional imaging

Functional magnetic resonance imaging (fMRI) is a neuroimaging technology that uses nuclear magnetic resonance to measure blood oxygenation levels in the brain and spinal cord. Because neural activity affects blood flow, the pattern of the hemodynamic response is thought to correspond closely to the pattern of neural activity. The fine spatial resolution afforded by fMRI allows cognitive neuroscientists to see in detail which areas of the brain are activated in relation to an experimental task. However, the hemodynamic response is much slower than the neural activity measured by EEG and MEG. This poor sensitivity to timing information makes fMRI a less useful technique than EEG or eyetracking for studying linguistic prediction. One exception is an fMRI test of the differences in neural activation between strategic and automatic semantic priming. When the time between the prime and the target word is short (around 150 milliseconds), priming is theorized to rely on automatic neural processes. However, at longer time intervals (approaching 1 second), it is thought that experimental subjects strategically predict related upcoming words and suppress unrelated words, leading to a processing penalty in the event that an unrelated word actually occurs. [1] An fMRI test of this hypothesis showed that at longer intervals, the processing penalty for an incorrect prediction is related to heightened activity in the anterior cingulate gyrus and Broca's area. [26]

Theories of prediction

PARLO theory

The PARLO ("Production Affects Reception in Left Only") framework is a theory of the neural domains supporting language prediction. It is based on evidence that the left and right hemispheres contribute differently to language comprehension. [17] In most individuals, the neural structures that support language production are predominantly in the left hemisphere, creating a hemispheric asymmetry that results in different language processing abilities in the two hemispheres. Because of its spatially close ties and integration with language production, left hemisphere language comprehension seems to be driven by expectancy and context in a top-down manner, whereas the right hemisphere seems to integrate information in a bottom-up manner. [17] The PARLO framework suggests that both prediction and integration occur during language processing but rely on the distinct contributions of the two hemispheres of the brain.

Surprisal theory

Surprisal theory is a theory of sentence processing based on information theory. [27] In surprisal theory, the cost of processing a word is determined by its self-information, that is, by how unpredictable the word is given its context. A highly probable word carries a small amount of self-information and would therefore be processed easily, as measured by reduced reaction time, a smaller N400 response, or reduced fixation times in an eyetracking reading study. Empirical tests of this theory have shown a high degree of match between processing cost measures and the self-information values assigned to words. [28] [29]
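
As a minimal sketch of the quantity involved, the self-information (surprisal) of a word is the negative logarithm of its probability given the preceding context. The probabilities below are invented for illustration; in practice they would have to be estimated, for example from cloze norms or a language model. The function name `surprisal` and the choice of base 2 (which gives values in bits) are assumptions of this example.

```python
import math


def surprisal(probability):
    """Self-information, in bits, of a word with the given
    conditional probability P(word | context)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


# Hypothetical probabilities of continuations for
# "In the summer it is hot, and in the winter it is..."
context_probs = {"cold": 0.9, "freezing": 0.05, "purple": 0.001}
surprisals = {word: surprisal(p) for word, p in context_probs.items()}

for word, bits in surprisals.items():
    print(f"{word}: {bits:.2f} bits")
```

Under these assumed probabilities the expected completion "cold" carries very little self-information, while the improbable "purple" carries the most, which is the ordering of processing cost that the theory predicts.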

References

  1. Neely, J. (1991). Semantic priming in visual word recognition: a selective review of current theories and findings. In: Basic processes in reading: visual word recognition (Besner D, Humphreys GW, eds), pp. 264–336. Hillsdale, NJ: Erlbaum Associates.
  2. Altmann, G. & Kamide, Y. (1999). Incremental interpretation at verbs: restricting the domain of subsequent reference. Cognition, 73, 247–264.
  3. Kamide, Y., Altmann, G. & Haywood, S. (2003). The time-course of prediction in incremental sentence processing: evidence from anticipatory eye movements. Journal of Memory and Language, 49, 133–159.
  4. Kamide, Y., Scheepers, C. & Altmann, G. (2003). Integration of Syntactic and Semantic Information in Predictive Processing: Cross-Linguistic Evidence from German and English. Journal of Psycholinguistic Research, 32 (1), 37-55.
  5. Rayner, K. & Well, A. (1996). Effects of contextual constraint on eye movements in reading: A further examination. Psychonomic Bulletin and Review, 3 (4), 504-509.
  6. Rayner, K., Li, X., Juhasz, B. & Yan, G. (2005). The effect of word predictability on the eye movements of Chinese readers. Psychonomic Bulletin and Review, 12 (6), 1089-1093.
  7. Reichle, E., Rayner, K. & Pollatsek, A. (2003). The E-Z Reader model of eye movement control in reading: Comparisons to other models. Behavioral and Brain Sciences, 26, 445-526.
  8. Engbert, R., Nuthmann, A., Richter, E. & Kliegl, R. (2005). SWIFT: A dynamical model of saccade generation during reading. Psychological Review, 112 (4), 777-813.
  9. Dikker, S., Rabagliati, H., Farmer, T. & Pylkkänen, L. (2010). Early occipital sensitivity to syntactic category is based on form typicality. Psychological Science, 21 (5), 629-634. doi:10.1177/0956797610367751
  10. Dikker, S., Rabagliati, H. & Pylkkänen, L. (2009). Sensitivity to syntax in visual cortex. Cognition, 110 (3), 293-321. doi:10.1016/J.COGNITION.2008.09.008
  11. Dikker, S. & Pylkkänen, L. (2011). Before the N400: Effects of lexical–semantic violations in visual cortex. Brain and Language, 118, 23-28. doi:10.1016/J.BANDL.2011.02.006
  12. Friederici, A. & Weissenborn, J. (2007). Mapping sentence form onto meaning: The syntax-semantic interface. Brain Research, 1146, 50-58.
  13. Wlotko, E. & Federmeier, K. (2007). Finding the right word: Hemispheric asymmetries in the use of sentence context information. Neuropsychologia, 45, 3001-3014.
  14. Kutas, M., & Hillyard, S. A. (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature, 307, 161-163.
  15. Federmeier, K. D., & Kutas, M. (1999). Right words and left words: electrophysiological evidence for hemispheric differences in meaning processing. Cognitive Brain Research, 8, 373-392.
  16. Federmeier, K. D., McLennan, D. B., De Ochoa, E., Kutas, M. (2002). The impact of semantic memory organization and sentence context information on spoken language processing by younger and older adults: an ERP study. Psychophysiology, 39, 133-146.
  17. Federmeier, K. D. (2007). Thinking ahead: the role and roots of prediction in language comprehension. Psychophysiology, 44, 491-505.
  18. DeLong, K. A., Urbach, T. P., & Kutas, M. (2005). Probabilistic word pre-activation during language comprehension inferred from electrical brain activity. Nature Neuroscience, 8, 1117-1121.
  19. Donchin, E., & Coles, M. G. H. (1988). Is the P300 component a manifestation of context updating? Behavioral and Brain Sciences, 11, 357–374.
  20. Osterhout, L., & Holcomb, P. J. (1992). Event-related brain potentials elicited by syntactic anomaly. Journal of Memory and Language, 31(6), 785-806.
  21. Friederici, A. D., Hahne, A., & Mecklinger, A. (1996). Temporal structure of syntactic parsing: early and late event-related brain potential effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1219-1248.
  22. Kuperberg, G. R., Sitnikova, T., Caplan, D., Holcomb, P. (2003). Electrophysiological distinctions in processing conceptual relationships within simple sentences. Cognitive Brain Research, 17, 117-129.
  23. Kuperberg, G. R. (2007). Neural mechanisms of language comprehension: challenges to syntax. Brain Research, 1146, 23-49.
  24. Coulson, S., King, J. W., & Kutas, M. (1998). Expect the unexpected: event-related brain response to morphosyntactic violations. Language and Cognitive Processes, 13, 21-58.
  25. Van Petten, C., & Luka, B. J. (2011). Prediction during language comprehension: Benefits, costs, and ERP components. International Journal of Psychophysiology. doi:10.1016/j.jpsycho.2011.09.015
  26. Gold, B., Balota, D., Jones, S., Powell, D., Smith, C., & Andersen, A. (2006). Dissociation of Automatic and Strategic Lexical-Semantics: Functional Magnetic Resonance Imaging Evidence for Differing Roles of Multiple Frontotemporal Regions. Journal of Neuroscience, 26 (24), 6523-6532.
  27. Levy, R. (2008). Expectation-based syntactic comprehension. Cognition, 106(3), 1126-1177.
  28. Levy, R., Fedorenko, E., Breen, M. and Gibson, T. (2011). The processing of extraposed structures in English. Cognition, 122(1), 12-36.
  29. Levy, R. (2011). Integrating surprisal and uncertain-input models in online sentence comprehension: formal techniques and empirical results. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics.