Statistical learning in language acquisition

Statistical learning is the ability of humans and other animals to extract statistical regularities from the world around them to learn about the environment. Although statistical learning is now thought to be a generalized learning mechanism, the phenomenon was first identified in human infant language acquisition.

The earliest evidence for these statistical learning abilities comes from a study by Jenny Saffran, Richard Aslin, and Elissa Newport, in which 8-month-old infants were presented with nonsense streams of monotone speech. Each stream was composed of four three-syllable "pseudowords" that were repeated randomly. After exposure to the speech streams for two minutes, infants reacted differently to hearing "pseudowords" as opposed to "nonwords" from the speech stream, where nonwords were composed of the same syllables that the infants had been exposed to, but in a different order. This suggests that infants are able to learn statistical relationships between syllables even with very limited exposure to a language. That is, infants learn which syllables are always paired together and which ones only occur together relatively rarely, suggesting that they are parts of two different units. This method of learning is thought to be one way that children learn which groups of syllables form individual words. [ citation needed ]

Since the initial discovery of the role of statistical learning in lexical acquisition, the same mechanism has been proposed for elements of phonological acquisition and syntactical acquisition, as well as for non-linguistic domains. Further research has also indicated that statistical learning is likely a domain-general and even species-general learning mechanism, occurring for visual as well as auditory information, and in both primates and non-primates.

Lexical acquisition

The role of statistical learning in language acquisition has been particularly well documented in the area of lexical acquisition. [1] One important contribution to infants' understanding of segmenting words from a continuous stream of speech is their ability to recognize statistical regularities of the speech heard in their environments. [1] Although many factors play an important role, this specific mechanism is powerful and can operate over a short time scale. [1]

Original findings

A spectrogram of a male speaker saying the phrase "nineteenth century". There is no clear demarcation where one word ends and the next begins.

It is a well-established finding that, unlike written language, spoken language does not have any clear boundaries between words; spoken language is a continuous stream of sound rather than individual words with silences between them. [2] This lack of segmentation between linguistic units presents a problem for young children learning language, who must be able to pick out individual units from the continuous speech streams that they hear. [3] One proposed explanation of how children solve this problem is that they are attentive to the statistical regularities of the world around them. [2] [3] For example, in the phrase "pretty baby", children are more likely to hear the sounds pre and ty together across the entirety of the lexical input around them than they are to hear the sounds ty and ba together. [3] In an artificial grammar learning study with adult participants, Saffran, Newport, and Aslin found that participants were able to locate word boundaries based only on transitional probabilities, suggesting that adults are capable of using statistical regularities in a language-learning task. [4] This is a robust finding that has been widely replicated. [1]

To determine if young children have these same abilities, Saffran, Aslin, and Newport exposed 8-month-old infants to an artificial grammar. [3] The grammar was composed of four words, each composed of three nonsense syllables. During the experiment, infants heard a continuous speech stream of these words. The speech was presented in a monotone with no cues (such as pauses or intonation) to word boundaries other than the statistical probabilities. Within a word, the transitional probability of a syllable pair was 1.0: in the word bidaku, for example, the probability of hearing the syllable da immediately after the syllable bi was 100%. Between words, however, the transitional probability of a syllable pair was much lower: after any given word (e.g., bidaku) was presented, one of three words could follow (in this case, padoti, golabu, or tupiro), so the likelihood of hearing any given syllable after ku was only 33%.
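The transitional probabilities described above can be illustrated with a short simulation. The following is a minimal sketch in Python, not the procedure used in the study: the stream length, the random seed, and the helper names (syllables, make_stream, transitional_probabilities) are assumptions made for illustration. Only the four words and the rule that a word is followed by one of the other three come from the description above.

```python
import random
from collections import Counter

# The four words named in the text; everything else here is illustrative.
WORDS = ["bidaku", "padoti", "golabu", "tupiro"]

def syllables(word):
    # Split a word into its three two-letter syllables, e.g. "bidaku" -> bi, da, ku.
    return [word[i:i + 2] for i in range(0, len(word), 2)]

def make_stream(n_words, seed=0):
    # Concatenate randomly chosen words with no pauses; a word is never
    # followed by itself, so each word has three possible successors.
    rng = random.Random(seed)
    stream, prev = [], None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != prev])
        stream.extend(syllables(word))
        prev = word
    return stream

def transitional_probabilities(stream):
    # TP(x -> y) = count(x immediately followed by y) / count(x followed by anything)
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {(x, y): c / first_counts[x] for (x, y), c in pair_counts.items()}

stream = make_stream(3000)
tp = transitional_probabilities(stream)
print(tp[("bi", "da")])            # within a word: 1.0
print(round(tp[("ku", "pa")], 2))  # across a word boundary: close to 1/3
```

In this sketch the only information available to the learner is the order of syllables, which mirrors the absence of pauses and intonation cues in the monotone stream.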

To determine if infants were picking up on the statistical information, each infant was presented with multiple presentations of either a word from the artificial grammar or a nonword made up of the same syllables but presented in a random order. Infants who were presented with nonwords during the test phase listened significantly longer to these words than infants who were presented with words from the artificial grammar, showing a novelty preference for these new nonwords. However, the results of this test could also be due to infants learning serial-order information rather than actually learning transitional probabilities between syllables. That is, at test, infants heard strings such as dapiku and tilado that were never presented during learning; they could simply have learned that the syllable ku never followed the syllable pi. [3]

To look more closely at this issue, Saffran, Aslin, and Newport conducted another study in which infants underwent the same training with the artificial grammar but then were presented with either words or part-words rather than words or nonwords. [3] The part-words were syllable sequences composed of the last syllable from one word and the first two syllables from another (such as kupado). Because the part-words had been heard during the time when children were listening to the artificial grammar, preferential listening to these part-words would indicate that children were learning not only serial-order information, but also the statistical likelihood of hearing particular syllable sequences. Again, infants showed greater listening times to the novel (part-) words, indicating that 8-month-old infants were able to extract these statistical regularities from a continuous speech stream.
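The difference between nonwords and part-words can be made concrete with a small simulation. This is an illustrative sketch only: the stream length, seed, and helper names (make_stream, describe) are assumptions, and the lowest transitional probability inside an item is just one convenient way to summarize its statistical coherence. The items bidaku, kupado, and dapiku are taken from the text.

```python
import random
from collections import Counter

WORDS = ["bidaku", "padoti", "golabu", "tupiro"]  # words named in the text

def syllables(item):
    # Split an item into two-letter syllables.
    return [item[i:i + 2] for i in range(0, len(item), 2)]

def make_stream(n_words, seed=1):
    # Random word order with no immediate repetition (illustrative assumption
    # consistent with "one of three words could follow").
    rng = random.Random(seed)
    stream, prev = [], None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != prev])
        stream.extend(syllables(word))
        prev = word
    return stream

stream = make_stream(3000)
pair_counts = Counter(zip(stream, stream[1:]))
first_counts = Counter(stream[:-1])
triple_counts = Counter(zip(stream, stream[1:], stream[2:]))

def describe(item):
    # Report how often the full item occurred in the stream and the lowest
    # syllable-to-syllable transitional probability inside it.
    s = syllables(item)
    tps = [pair_counts[(a, b)] / first_counts[a] for a, b in zip(s, s[1:])]
    return {"heard": triple_counts[tuple(s)], "min_tp": min(tps)}

for item in ["bidaku", "kupado", "dapiku"]:  # word, part-word, nonword
    print(item, describe(item))
```

The nonword never occurs in the stream, so rejecting it requires only serial-order knowledge. The part-word does occur, but it spans a word boundary and therefore contains a low-probability transition, which is the property the second experiment was designed to probe.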

Further research

This result has been the impetus for much more research on the role of statistical learning in lexical acquisition and other areas (see [1] ). In a follow-up to the original report, [3] Aslin, Saffran, and Newport found that even when words and part words occurred equally often in the speech stream, but with different transitional probabilities between syllables of words and part words, infants were still able to detect the statistical regularities and still preferred to listen to the novel part-words over the familiarized words. [5] This finding provides stronger evidence that infants are able to pick up transitional probabilities from the speech they hear, rather than just being aware of frequencies of individual syllable sequences. [1]

Another follow-up study examined the extent to which the statistical information learned during this type of artificial grammar learning feeds into knowledge that infants may already have about their native language. [6] When test items were embedded in English sentence frames, infants preferred to listen to words over part-words, whereas there was no significant difference in the nonsense frame condition. This finding suggests that even pre-linguistic infants are able to integrate the statistical cues they learn in a laboratory into their previously acquired knowledge of a language. [1] [6] In other words, once infants have acquired some linguistic knowledge, they incorporate newly acquired information into that previously acquired learning.

A related finding indicates that slightly older infants can acquire both lexical and grammatical regularities from a single set of input, [7] suggesting that they are able to use outputs of one type of statistical learning (cues that lead to the discovery of word boundaries) as input to a second type (cues that lead to the discovery of syntactical regularities). [1] [7] At test, 12-month-olds preferred to listen to sentences that had the same grammatical structure as the artificial language they had been trained on rather than sentences that had a different (ungrammatical) structure. Because learning grammatical regularities requires infants to be able to determine boundaries between individual words, this indicates that infants who are still quite young are able to acquire multiple levels of language knowledge (both lexical and syntactical) simultaneously, and that statistical learning is a powerful mechanism at play in language learning. [1] [7]

Despite the large role that statistical learning appears to play in lexical acquisition, it is likely not the only mechanism by which infants learn to segment words. Statistical learning studies are generally conducted with artificial grammars that have no cues to word boundary information other than transitional probabilities between syllables. Real speech, though, has many different types of cues to word boundaries, including prosodic and phonotactic information. [8]

Together, the findings from these studies of statistical learning in language acquisition indicate that statistical properties of the language are a strong cue in helping infants learn their first language. [1]

Phonological acquisition

There is much evidence that statistical learning is an important component of discovering both which phonemes are important for a given language and which contrasts within phonemes are important. [9] [10] [11] Having this knowledge is important for aspects of both speech perception and speech production.

Distributional learning

Since the discovery of infants' statistical learning abilities in word learning, the same general mechanism has also been studied in other facets of language learning. For example, it is well-established that infants can discriminate between phonemes of many different languages but eventually become unable to discriminate between phonemes that do not appear in their native language; [12] however, it was not clear how this decrease in discriminatory ability came about. Maye et al. suggested that the mechanism responsible might be a statistical learning mechanism in which infants track the distributional regularities of the sounds in their native language. [12] To test this idea, Maye et al. exposed 6- and 8-month-old infants to a continuum of speech sounds that varied in the degree to which they were voiced. The distribution that the infants heard was either bimodal, with sounds from both ends of the voicing continuum heard most often, or unimodal, with sounds from the middle of the distribution heard most often. The results indicated that infants from both age groups were sensitive to the distribution of phonemes. At test, infants heard either non-alternating (repeated exemplars of tokens 3 or 6 from an 8-token continuum) or alternating (exemplars of tokens 1 and 8) exposures to specific phonemes on the continuum. Infants exposed to the bimodal distribution listened longer to the alternating trials than the non-alternating trials, while there was no difference in listening times for infants exposed to the unimodal distribution. This finding indicates that infants exposed to the bimodal distribution were better able to discriminate sounds from the two ends of the distribution than were infants in the unimodal condition, regardless of age. This type of statistical learning differs from that used in lexical acquisition, as it requires infants to track frequencies rather than transitional probabilities, and has been named "distributional learning". [10]
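The contrast between bimodal and unimodal exposure can be sketched with a toy frequency-tracking learner. Everything in the example below is an assumption made for illustration: the exposure counts are hypothetical and are not the values used by Maye et al., and counting peaks in a frequency histogram is only one very simple stand-in for whatever computation infants actually perform.

```python
# Hypothetical exposure counts for tokens 1..8 of the voicing continuum.
# These numbers are illustrative only and are not taken from the study.
BIMODAL = [1, 4, 2, 1, 1, 2, 4, 1]   # tokens near both ends heard most often
UNIMODAL = [1, 1, 2, 4, 4, 2, 1, 1]  # tokens in the middle heard most often

def count_modes(freqs):
    # Collapse runs of equal values so a flat-topped peak is counted once,
    # then count values that are higher than both of their neighbours.
    collapsed = [f for i, f in enumerate(freqs) if i == 0 or f != freqs[i - 1]]
    padded = [0] + collapsed + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i - 1] < padded[i] > padded[i + 1]
    )

print(count_modes(BIMODAL))   # 2: consistent with two sound categories
print(count_modes(UNIMODAL))  # 1: consistent with a single category
```

Under this toy model, a learner exposed to the bimodal counts would place tokens 1 and 8 in different categories, while a learner exposed to the unimodal counts would treat the whole continuum as one category. Note that the learner uses only token frequencies, not the order in which tokens were heard.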

Distributional learning has also been found to help infants contrast two phonemes that they initially have difficulty discriminating. Maye, Weiss, and Aslin found that infants who were exposed to a bimodal distribution of a non-native contrast that was initially difficult to discriminate were better able to discriminate the contrast than infants exposed to a unimodal distribution of the same contrast. [13] Maye et al. also found that infants were able to abstract features of a contrast (i.e., voice onset time) and generalize that feature to the same type of contrast at a different place of articulation, a result that has not been found in adults.

In a review of the role of distributional learning on phonological acquisition, Werker et al. note that distributional learning cannot be the only mechanism by which phonetic categories are acquired. [10] However, it does seem clear that this type of statistical learning mechanism can play a role in this skill, although research is ongoing. [10]

Perceptual magnet effect

A related finding regarding statistical cues to phonological acquisition is a phenomenon known as the perceptual magnet effect. [14] [15] [16] In this effect, a prototypical phoneme of a person's native language acts as a "magnet" for similar phonemes, which are perceived as belonging to the same category as the prototypical phoneme. In the original test of this effect, adult participants were asked to indicate if a given exemplar of a particular phoneme differed from a referent phoneme. [14] If the referent phoneme is a non-prototypical phoneme for that language, both adults and 6-month-old infants show less generalization to other sounds than they do for prototypical phonemes, even if the subjective distance between the sounds is the same. [14] [16] That is, adults and infants are both more likely to notice that a particular phoneme differs from the referent phoneme if that referent phoneme is a non-prototypical exemplar than if it is a prototypical exemplar. The prototypes themselves are apparently discovered through a distributional learning process, in which infants are sensitive to the frequencies with which certain sounds occur and treat those that occur most often as the prototypical phonemes of their language. [11]

Syntactical acquisition

A statistical learning device has also been proposed as a component of syntactical acquisition for young children. [1] [9] [17] Early evidence for this mechanism came largely from studies of computer modeling or analyses of natural language corpora. [18] [19] These early studies focused largely on distributional information specifically rather than statistical learning mechanisms generally. Specifically, in these early papers it was proposed that children created templates of possible sentence structures involving unnamed categories of word types (i.e., nouns or verbs, although children would not put these labels on their categories). Children were thought to learn which words belonged to the same categories by tracking the similar contexts in which words of the same category appeared.

Later studies expanded these results by looking at the actual behavior of children or adults who had been exposed to artificial grammars. [9] These later studies also considered the role of statistical learning more broadly than the earlier studies, placing their results in the context of the statistical learning mechanisms thought to be involved with other aspects of language learning, such as lexical acquisition.

Experimental results

Evidence from a series of four experiments conducted by Gomez and Gerken suggests that children are able to generalize grammatical structures with less than two minutes of exposure to an artificial grammar. [9] [20] In the first experiment, 11- to 12-month-old infants were trained on an artificial grammar composed of nonsense words with a set grammatical structure. At test, infants heard both novel grammatical and ungrammatical sentences. Infants oriented longer toward the grammatical sentences, in line with previous research suggesting that infants generally orient for longer to natural instances of language than to altered instances (e.g., [21]). (This familiarity preference differs from the novelty preference generally found in word-learning studies, due to the differences between lexical acquisition and syntactical acquisition.) This finding indicates that young children are sensitive to the grammatical structure of language even after minimal exposure. Gomez and Gerken also found that this sensitivity is evident when ungrammatical transitions are located in the middle of the sentence (unlike in the first experiment, in which all the errors occurred at the beginning and end of the sentences), that the results could not be due to an innate preference for the grammatical sentences caused by something other than grammar, and that children are able to generalize the grammatical rules to new vocabulary.

Together these studies suggest that infants are able to extract a substantial amount of syntactic knowledge even from limited exposure to a language. [9] [20] Children apparently detected grammatical anomalies whether the grammatical violation in the test sentences occurred at the end or in the middle of the sentence. Additionally, even when the individual words of the grammar were changed, infants were still able to discriminate between grammatical and ungrammatical strings during the test phase. This generalization indicates that infants were not learning vocabulary-specific grammatical structures, but abstracting the general rules of that grammar and applying those rules to novel vocabulary. Furthermore, in all four experiments, the test of grammatical structures occurred five minutes after the initial exposure to the artificial grammar had ended, suggesting that the infants were able to maintain the grammatical abstractions they had learned even after a short delay.

In a similar study, Saffran found that adults and older children (first- and second-grade children) were also sensitive to syntactical information after exposure to an artificial language which had no cues to phrase structure other than the statistical regularities that were present. [22] Both adults and children were able to pick out sentences that were ungrammatical at a rate greater than chance, even under an "incidental" exposure condition in which participants' primary goal was to complete a different task while hearing the language.

Although the number of studies dealing with statistical learning of syntactical information is limited, the available evidence does indicate that the statistical learning mechanisms are likely a contributing factor to children's ability to learn their language. [9] [17]

Statistical learning in bilingualism

Much of the early work using statistical learning paradigms focused on the ability of children or adults to learn a single language, [1] consistent with the process of language acquisition for monolingual speakers or learners. However, it is estimated that approximately 60-75% of people in the world are bilingual. [23] More recently, researchers have begun looking at the role of statistical learning for those who speak more than one language. Although there are no reviews on this topic yet, Weiss, Gerfen, and Mitchel examined how hearing input from multiple artificial languages simultaneously can affect the ability to learn either or both languages. [24] Over four experiments, Weiss et al. found that, after exposure to two artificial languages, adult learners are capable of determining word boundaries in both languages when each language is spoken by a different speaker. However, when the two languages were spoken by the same speaker, participants were able to learn both languages only when they were "congruent", that is, when the word boundaries of one language matched the word boundaries of the other. When the languages were incongruent (a syllable that appeared in the middle of a word in one language appeared at the end of the word in the other language) and spoken by a single speaker, participants were able to learn, at best, one of the two languages. A final experiment showed that the inability to learn incongruent languages spoken in the same voice was not due to syllable overlap between the languages but due to differing word boundaries.

Similar work replicates the finding that learners are able to learn two sets of statistical representations when an additional cue is present (two different male voices in this case). [25] In their paradigm, the two languages were presented consecutively, rather than interleaved as in Weiss et al.'s paradigm, [24] and participants did learn the first artificial language to which they had been exposed better than the second, although participants' performance was above chance for both languages.

While statistical learning improves and strengthens multilingualism, it appears that the inverse is not true. In a study by Yim and Rudoy [26] it was found that both monolingual and bilingual children perform statistical learning tasks equally well.

Antovich and Graf Estes [27] found that 14-month-old bilingual children are better than monolinguals at segmenting two different artificial languages using transitional probability cues. They suggest that a bilingual environment in early childhood trains children to rely on statistical regularities to segment the speech flow and access two lexical systems.

Limitations on statistical learning

Word-referent mapping

A statistical learning mechanism has also been proposed for learning the meaning of words. Specifically, Yu and Smith conducted a pair of studies in which adults were exposed to pictures of objects and heard nonsense words. [28] Each nonsense word was paired with a particular object. There were 18 total word-referent pairs, and each participant was presented with either 2, 3, or 4 objects at a time, depending on the condition, and heard the nonsense word associated with one of those objects. Each word-referent pair was presented 6 times over the course of the training trials; after the completion of the training trials, participants completed a forced-alternative test in which they were asked to choose the correct referent that matched a nonsense word they were given. Participants were able to choose the correct item more often than would happen by chance, indicating, according to the authors, that they were using statistical learning mechanisms to track co-occurrence probabilities across training trials.

An alternative hypothesis is that learners in this type of task may be using a "propose-but-verify" mechanism rather than a statistical learning mechanism. [29] [30] Medina et al. and Trueswell et al. argue that, because Yu and Smith only tracked knowledge at the end of the training, rather than tracking knowledge on a trial-by-trial basis, it is impossible to know if participants were truly updating statistical probabilities of co-occurrence (and therefore maintaining multiple hypotheses simultaneously), or if, instead, they were forming a single hypothesis and checking it on the next trial. [28] [29] [30] For example, if a participant is presented with a picture of a dog and a picture of a shoe, and hears the nonsense word vash, she might hypothesize that vash refers to the dog. On a future trial, she may see a picture of a shoe and a picture of a door and again hear the word vash. If statistical learning is the mechanism by which word-referent mappings are learned, then the participant would be more likely to select the picture of the shoe than the door, as the shoe would have appeared in conjunction with the word vash 100% of the time. However, if participants are simply forming a single hypothesis, they may fail to remember the context of the previous presentation of vash (especially if, as in the experimental conditions, there are multiple trials with other words in between the two presentations of vash) and therefore be at chance in this second trial. According to this proposed mechanism of word learning, if the participant had correctly guessed that vash referred to the shoe in the first trial, her hypothesis would be confirmed in the subsequent trial.
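The two accounts make different predictions for the vash example, which can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not either research group's model: the function names, the single-word setup, and the rule that a rejected hypothesis is replaced by a random guess among the objects currently shown are all hypothetical simplifications.

```python
import random
from collections import Counter

# The two vash trials from the example above: (word heard, objects shown).
TRIALS = [("vash", ["dog", "shoe"]), ("vash", ["shoe", "door"])]

def cross_situational_choice(trials):
    # Statistical learner: tally every word-object co-occurrence on earlier
    # trials, then choose the object on the final trial with the highest tally.
    counts = Counter()
    for word, objects in trials[:-1]:
        for obj in objects:
            counts[(word, obj)] += 1
    word, objects = trials[-1]
    return max(objects, key=lambda obj: counts[(word, obj)])

def propose_but_verify_choice(trials, first_guess, rng):
    # Single-hypothesis learner: keep one guess and remember nothing else.
    # If the guessed object is absent on a later trial, guess again at random.
    hypothesis = first_guess
    for _, objects in trials[1:]:
        if hypothesis not in objects:
            hypothesis = rng.choice(objects)
    return hypothesis

print(cross_situational_choice(TRIALS))  # shoe

rng = random.Random(0)
runs = [propose_but_verify_choice(TRIALS, "dog", rng) for _ in range(2000)]
print(runs.count("shoe") / len(runs))  # near 0.5: chance after a wrong first guess
print(propose_but_verify_choice(TRIALS, "shoe", rng))  # shoe: hypothesis confirmed
```

The statistical learner picks the shoe regardless of its first guess, because the shoe is the only object that co-occurred with vash on both trials. The single-hypothesis learner picks the shoe reliably only if that was its initial guess.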

To distinguish between these two possibilities, Trueswell et al. conducted a series of experiments similar to those conducted by Yu and Smith except that participants were asked to indicate their choice of the word-referent mapping on each trial, and only a single object name was presented on each trial (with varying numbers of objects). [28] [30] Participants would therefore have been at chance when they were forced to make a choice in their first trial. The results from the subsequent trials indicate that participants were not using a statistical learning mechanism in these experiments, but instead were using a propose-but-verify mechanism, holding only one potential hypothesis in mind at a time. Specifically, if participants had chosen an incorrect word-referent mapping in an initial presentation of a nonsense word (from a display of five possible choices), their likelihood of choosing the correct word-referent mapping in the next trial of that word was still at chance, or 20%. If, though, the participant had chosen the correct word-referent mapping on an initial presentation of a nonsense word, the likelihood of choosing the correct word-referent mapping on the subsequent presentation of that word was approximately 50%. These results were also replicated in a condition where participants were choosing between only two alternatives. These results suggest that participants did not remember the surrounding context of individual presentations and were therefore not using statistical cues to determine the word-referent mappings. Instead, participants form a hypothesis regarding a word-referent mapping and, on the next presentation of that word, either confirm or reject the hypothesis accordingly.

Overall, these results, along with similar results from Medina et al., indicate that word meanings may not be learned through a statistical learning mechanism in these experiments, which ask participants to hypothesize a mapping even on the first occurrence (i.e., not cross-situationally). [29] However, when the propose-but-verify mechanism has been compared to a statistical learning mechanism, the former neither reproduced individual learning trajectories nor fit the data as well as the latter. [31]

Need for social interaction

Additionally, statistical learning by itself cannot account even for those aspects of language acquisition for which it has been shown to play a large role. For example, Kuhl, Tsao, and Liu found that young English-learning infants who spent time in a laboratory session with a native Mandarin speaker were able to distinguish between phonemes that occur in Mandarin but not in English, unlike infants who were in a control condition. [32] Infants in this control condition came to the lab as often as infants in the experimental condition, but were exposed only to English; when tested at a later date, they were unable to distinguish the Mandarin phonemes. In a second experiment, the authors presented infants with audio or audiovisual recordings of Mandarin speakers and tested the infants' ability to distinguish between the Mandarin phonemes. In this condition, infants failed to distinguish the foreign language phonemes. This finding indicates that social interaction is a necessary component of language learning and that, even if infants are presented with the raw data of hearing a language, they are unable to take advantage of the statistical cues present in that data if they are not also experiencing the social interaction. [11]

Domain generality

Although the phenomenon of statistical learning was first discovered in the context of language acquisition and there is much evidence of its role in that process, work since the original discovery has suggested that statistical learning may be a domain-general skill and is likely not unique to humans. [3] [33] For example, Saffran, Johnson, Aslin, and Newport found that both adults and infants were able to learn statistical probabilities of "words" created by playing different musical tones (i.e., participants heard the musical notes D, E, and F presented together during training and were able to recognize those notes as a unit at test as compared to three notes that had not been presented together). [34] In non-auditory domains, there is evidence that humans are able to learn statistical visual information whether that information is presented across space (e.g., [35]) or time (e.g., [36]). Evidence of statistical learning has also been found in other primates (e.g., [37]), and some limited statistical learning abilities have been found even in non-primates like rats. [38] Together these findings suggest that statistical learning may be a generalized learning mechanism that happens to be utilized in language acquisition, rather than a mechanism that is unique to the human infant's ability to learn his or her language(s).

Further evidence for domain-general statistical learning comes from a study of visual statistical learning in infancy run through the Cornell University Department of Psychology. The researchers asked whether the domain generality of statistical learning in infancy would extend to visual information. After first viewing images in statistically predictable patterns, infants were shown the same familiar patterns along with novel sequences of the same stimulus components. Interest in the visuals was measured by the amount of time the child looked at the stimuli, which the researchers termed "looking time". Infant participants of all ages showed more interest in the novel sequences than in the familiar sequences. In demonstrating a preference for the novel sequences (which violated the transitional probability that defined the grouping of the original stimuli), the results of the study support the likelihood of domain-general statistical learning in infancy. [39]

Related Research Articles

Language acquisition is the process by which humans acquire the capacity to perceive and comprehend language, as well as to produce and use words and sentences to communicate.

Phonotactics is a branch of phonology that deals with restrictions in a language on the permissible combinations of phonemes. Phonotactics defines permissible syllable structure, consonant clusters and vowel sequences by means of phonotactic constraints.

Baby talk is a type of speech associated with an older person speaking to a child or infant. It is also called caretaker speech, infant-directed speech (IDS), child-directed speech (CDS), child-directed language (CDL), caregiver register, parentese, or motherese.

In cognitive psychology, fast mapping is the term used for the hypothesized mental process whereby a new concept is learned based only on minimal exposure to a given unit of information. Fast mapping is thought by some researchers to be particularly important during language acquisition in young children, and may serve to explain the prodigious rate at which children gain vocabulary. In order to successfully use the fast mapping process, a child must possess the ability to use "referent selection" and "referent retention" of a novel word. There is evidence that this can be done by children as young as two years old, even with the constraints of minimal time and several distractors. Previous research in fast mapping has also shown that children are able to retain a newly learned word for a substantial amount of time after they are subjected to the word for the first time. Further research by Markson and Bloom (1997), showed that children can remember a novel word a week after it was presented to them even with only one exposure to the novel word. While children have also displayed the ability to have equal recall for other types of information, such as novel facts, their ability to extend the information seems to be unique to novel words. This suggests that fast mapping is a specified mechanism for word learning. The process was first formally articulated and the term 'fast mapping' coined Susan Carey and Elsa Bartlett in 1978.

Vocabulary development is a process by which people acquire words. Babbling shifts towards meaningful speech as infants grow and produce their first words around the age of one year. In early word learning, infants build their vocabulary slowly. By the age of 18 months, infants can typically produce about 50 words and begin to make word combinations.

Language development in humans is a process which starts early in life. Infants start without knowing a language, yet by 10 months, babies can distinguish speech sounds and engage in babbling. Some research has shown that the earliest learning begins in utero when the fetus starts to recognize the sounds and speech patterns of its mother's voice and differentiate them from other sounds after birth.

Poverty of the stimulus (POS) is the controversial argument from linguistics that children are not exposed to rich enough data within their linguistic environments to acquire every feature of their language. This is considered evidence contrary to the empiricist idea that language is learned solely through experience. The claim is that the sentences children hear while learning a language do not contain the information needed to develop a thorough understanding of the grammar of the language.

Speech segmentation is the process of identifying the boundaries between words, syllables, or phonemes in spoken natural languages. The term applies both to the mental processes used by humans, and to artificial processes of natural language processing.
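One statistical cue to word boundaries is the transitional probability between adjacent syllables: syllables inside a word tend to predict each other strongly, while syllables that straddle a word boundary do not. The following Python sketch illustrates that idea on a toy syllable stream. The pseudowords, the 0.75 threshold, and the function names are assumptions chosen for illustration, not a model of how infants or any particular system actually segment speech.

```python
from collections import Counter
from itertools import permutations

def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) for each adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the final syllable has no successor
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(syllables, tps, threshold):
    """Insert a boundary wherever the transitional probability falls below threshold."""
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tps[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words

# Illustrative three-syllable pseudowords (an assumption for this sketch).
PSEUDOWORDS = [("bi", "da", "ku"), ("pa", "do", "ti"),
               ("go", "la", "bu"), ("tu", "pi", "ro")]

# Build a continuous stream by concatenating every ordering of the four words,
# so that each word is followed by several different words.
stream = [syll for order in permutations(PSEUDOWORDS)
          for word in order for syll in word]

tps = transitional_probabilities(stream)
recovered = set(segment(stream, tps, threshold=0.75))
print(sorted(recovered))
```

In this toy stream every within-word transition has probability 1.0, while transitions across word boundaries are spread over several possible following words, so a simple threshold separates the two. Natural speech is far noisier, and a fixed threshold is only a convenience here.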

Speech perception is the process by which the sounds of language are heard, interpreted, and understood. The study of speech perception is closely linked to the fields of phonology and phonetics in linguistics and cognitive psychology and perception in psychology. Research in speech perception seeks to understand how human listeners recognize speech sounds and use this information to understand spoken language. Speech perception research has applications in building computer systems that can recognize speech, in improving speech recognition for hearing- and language-impaired listeners, and in foreign-language teaching.

Artificial grammar learning (AGL) is a paradigm of study within cognitive psychology and linguistics. Its goal is to investigate the processes that underlie human language learning by testing subjects' ability to learn a made-up grammar in a laboratory setting. It was developed to evaluate the processes of human language learning but has also been utilized to study implicit learning in a more general sense. The area of interest is typically the subjects' ability to detect patterns and statistical regularities during a training phase and then use their new knowledge of those patterns in a testing phase. The testing phase can either use the symbols or sounds used in the training phase or transfer the patterns to another set of symbols or sounds as surface structure.
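The training and testing phases described above can be sketched in Python. The finite-state grammar below is hypothetical, and the "learner" simply remembers which adjacent symbol pairs occurred during training. That bigram strategy is only one possible account of what participants might pick up, chosen here as an assumption because it is easy to state.

```python
import random

# Hypothetical finite-state grammar: state -> list of (symbol, next_state).
# A next_state of None marks the end of a string.
GRAMMAR = {
    0: [("T", 1), ("V", 2)],
    1: [("P", 1), ("T", 2)],
    2: [("X", 3), ("S", None)],
    3: [("V", 2), ("S", None)],
}

def generate(rng):
    """Produce one grammatical string by a random walk through the grammar."""
    state, symbols = 0, []
    while state is not None:
        symbol, state = rng.choice(GRAMMAR[state])
        symbols.append(symbol)
    return "".join(symbols)

def is_grammatical(string):
    """Check whether the grammar can produce the string exactly."""
    state = 0
    for symbol in string:
        if state is None:
            return False
        transitions = dict(GRAMMAR[state])
        if symbol not in transitions:
            return False
        state = transitions[symbol]
    return state is None

def bigrams(string):
    """Adjacent symbol pairs, with ^ and $ marking string start and end."""
    padded = "^" + string + "$"
    return set(zip(padded, padded[1:]))

# Training phase: the learner is exposed to grammatical strings only.
rng = random.Random(1)
training = [generate(rng) for _ in range(50)]
attested = set().union(*(bigrams(s) for s in training))

def endorses(string):
    """Testing phase: accept a string if all of its bigrams were seen in training."""
    return bigrams(string) <= attested

for test_string in ["VS", "TTS", "SXT"]:
    print(test_string, is_grammatical(test_string), endorses(test_string))
```

A learner of this kind can accept some strings the grammar does not generate, which is one reason test items in this paradigm are typically designed to tease apart surface statistics from the underlying rules.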

Bootstrapping is a term used in language acquisition in the field of linguistics. It refers to the idea that humans are born innately equipped with a mental faculty that forms the basis of language. It is this language faculty that allows children to effortlessly acquire language. As a process, bootstrapping can be divided into different domains, according to whether it involves semantic bootstrapping, syntactic bootstrapping, prosodic bootstrapping, or pragmatic bootstrapping.

Speech production is the process by which thoughts are translated into speech. This includes the selection of words, the organization of relevant grammatical forms, and then the articulation of the resulting sounds by the motor system using the vocal apparatus. Speech production can be spontaneous such as when a person creates the words of a conversation, reactive such as when they name a picture or read aloud a written word, or imitative, such as in speech repetition. Speech production is not the same as language production since language can also be produced manually by signs.

Phonological development refers to how children learn to organize sounds into meaning or language (phonology) during their stages of growth.

In linguistics, the innateness hypothesis, also known as the nativist hypothesis, holds that humans are born with at least some knowledge of linguistic structure. On this hypothesis, language acquisition involves filling in the details of an innate blueprint rather than being an entirely inductive process. The hypothesis is one of the cornerstones of generative grammar and related approaches in linguistics. Arguments in favour include the poverty of the stimulus, the universality of language acquisition, as well as experimental studies on learning and learnability. However, these arguments have been criticized, and the hypothesis is widely rejected in other traditions such as usage-based linguistics. The term was coined by Hilary Putnam in reference to the views of Noam Chomsky.

Elissa Lee Newport is a professor of neurology and director of the Center for Brain Plasticity and Recovery at Georgetown University. She specializes in language acquisition and developmental psycholinguistics, focusing on the relationship between language development and language structure, and most recently on the effects of pediatric stroke on the organization and recovery of language.

Richard N. Aslin is an American psychologist. He is currently a Senior Scientist at Haskins Laboratories and professor at Yale University. Until December 2016, Dr. Aslin was William R. Kenan Professor of Brain & Cognitive Sciences and Center for Visual Sciences at the University of Rochester. During his time in Rochester, he was also Director of the Rochester Center for Brain Imaging and the Rochester Baby Lab. He worked at the university for over thirty years before resigning in protest of the university's handling of a sexual harassment complaint about a junior member of his department.

In linguistics, functional morphemes, also sometimes referred to as functors, are building blocks for language acquisition. A functional morpheme is a morpheme which simply modifies the meaning of a word, rather than supplying the root meaning. Functional morphemes are generally considered a closed class, which means that new functional morphemes cannot normally be created.

Statistical language acquisition, a branch of developmental psycholinguistics, studies the process by which humans develop the ability to perceive, produce, comprehend, and communicate with natural language in all of its aspects through the use of general learning mechanisms operating on statistical patterns in the linguistic input. Statistical language acquisition claims that infants' language learning is based on pattern perception rather than an innate biological grammar. Several statistical elements, such as the frequency of words, frequent frames, phonotactic patterns and other regularities, provide information on language structure and meaning that facilitates language acquisition.

In the framework of Noam Chomsky's Minimalist Program, items of the lexicon are of two types: with or without substantive content. Items of the former category are called lexical items, whereas items of the latter category are functional items. Functional items carry the grammatical content of a sentence, which means that if they were taken out of the sentence one would still understand the meaning, although the sentence would not be grammatical. In other words, they are the 'glue' that holds the sentence together. Functional items can also be classified as closed class, that is, belonging to parts of speech that do not easily allow new members. If functional items are removed from a sentence, the words that would be left are the lexical items. The lexical items of a sentence are those that are used in telegraphic speech; functional items are the grammatical units that hold the sentence together and make it more fluid. Functional items are feature sets. They include two types of morphemes: free morphemes, such as modals, auxiliaries, determiners, and complementizers, and bound morphemes, such as nominal and verbal affixes. Though functional items have feature structure, they do not enter into θ-marking.

Prosodic bootstrapping in linguistics refers to the hypothesis that learners of a primary language (L1) use prosodic features such as pitch, tempo, rhythm, amplitude, and other auditory aspects from the speech signal as a cue to identify other properties of grammar, such as syntactic structure. Acoustically signaled prosodic units in the stream of speech may provide critical perceptual cues by which infants initially discover syntactic phrases in their language. Although these features by themselves are not enough to help infants learn the entire syntax of their native language, they provide various cues about different grammatical properties of the language, such as identifying the ordering of heads and complements in the language using stress prominence, indicating the location of phrase boundaries, and word boundaries. It is argued that prosody of a language plays an initial role in the acquisition of the first language helping children to uncover the syntax of the language, mainly due to the fact that children are sensitive to prosodic cues at a very young age.

References

  1. Saffran, Jenny R. (2003). "Statistical language learning: mechanisms and constraints". Current Directions in Psychological Science. 12 (4): 110–114. doi:10.1111/1467-8721.01243. S2CID 146485087.
  2. Brent, Michael R.; Cartwright, Timothy A. (1996). "Distributional regularity and phonotactic constraints are useful for segmentation". Cognition. 61 (1–2): 93–125. doi:10.1016/S0010-0277(96)00719-6. PMID 8990969. S2CID 16185790.
  3. Saffran, J. R.; Aslin, R. N.; Newport, E. L. (1996). "Statistical Learning by 8-Month-Old Infants". Science. 274 (5294): 1926–1928. Bibcode:1996Sci...274.1926S. doi:10.1126/science.274.5294.1926. PMID 8943209. S2CID 13321604.
  4. Saffran, Jenny R.; Newport, Elissa L.; Aslin, Richard N. (1996). "Word Segmentation: The Role of Distributional Cues". Journal of Memory and Language. 35 (4): 606–621. doi:10.1006/jmla.1996.0032.
  5. Aslin, R. N.; Saffran, J. R.; Newport, E. L. (1998). "Computation of Conditional Probability Statistics by 8-Month-Old Infants". Psychological Science. 9 (4): 321–324. doi:10.1111/1467-9280.00063. S2CID 14144332.
  6. Saffran, Jenny R (2001a). "Words in a sea of sounds: the output of infant statistical learning". Cognition. 81 (2): 149–169. doi:10.1016/S0010-0277(01)00132-9. PMID 11376640. S2CID 14487666.
  7. Saffran, Jenny R.; Wilson, Diana P. (2003). "From Syllables to Syntax: Multilevel Statistical Learning by 12-Month-Old Infants". Infancy. 4 (2): 273–284. doi:10.1207/S15327078IN0402_07.
  8. Mattys, Sven L.; Jusczyk, Peter W.; Luce, Paul A.; Morgan, James L. (1999). "Phonotactic and Prosodic Effects on Word Segmentation in Infants". Cognitive Psychology. 38 (4): 465–494. doi:10.1006/cogp.1999.0721. PMID 10334878. S2CID 17445419.
  9. Gómez, Rebecca L.; Gerken, LouAnn (2000). "Infant artificial language learning and language acquisition". Trends in Cognitive Sciences. 4 (5): 178–186. doi:10.1016/S1364-6613(00)01467-4. PMID 10782103. S2CID 15933380.
  10. Werker, J. F.; Yeung, H. H.; Yoshida, K. A. (2012). "How Do Infants Become Experts at Native-Speech Perception?". Current Directions in Psychological Science. 21 (4): 221–226. doi:10.1177/0963721412449459. S2CID 6132520.
  11. Kuhl, Patricia K. (2004). "Early language acquisition: cracking the speech code". Nature Reviews Neuroscience. 5 (11): 831–843. doi:10.1038/nrn1533. PMID 15496861. S2CID 205500033.
  12. Maye, Jessica; Werker, Janet F; Gerken, LouAnn (2002). "Infant sensitivity to distributional information can affect phonetic discrimination". Cognition. 82 (3): B101–B111. doi:10.1016/S0010-0277(01)00157-3. PMID 11747867. S2CID 319422.
  13. Maye, Jessica; Weiss, Daniel J.; Aslin, Richard N. (2008). "Statistical phonetic learning in infants: facilitation and feature generalization". Developmental Science. 11 (1): 122–134. doi:10.1111/j.1467-7687.2007.00653.x. PMID 18171374.
  14. Kuhl, Patricia K. (1991). "Human adults and human infants show a "perceptual magnet effect" for the prototypes of speech categories, monkeys do not". Perception & Psychophysics. 50 (2): 93–107. doi:10.3758/BF03212211. PMID 1945741.
  15. Kuhl, P. K. (2000). "A new view of language acquisition". Proceedings of the National Academy of Sciences. 97 (22): 11850–11857. Bibcode:2000PNAS...9711850K. doi:10.1073/pnas.97.22.11850. PMC 34178. PMID 11050219.
  16. Kuhl, P.; Williams, K.; Lacerda, F; Stevens, K.; Lindblom, B (1992). "Linguistic experience alters phonetic perception in infants by 6 months of age". Science. 255 (5044): 606–608. Bibcode:1992Sci...255..606K. doi:10.1126/science.1736364. PMID 1736364. S2CID 45721427.
  17. Seidenberg, M. S. (1997). "Language Acquisition and Use: Learning and Applying Probabilistic Constraints". Science. 275 (5306): 1599–1603. doi:10.1126/science.275.5306.1599. PMID 9054348. S2CID 18564256.
  18. Cartwright, Timothy A.; Brent, Michael R. (1997). "Syntactic categorization in early language acquisition: formalizing the role of distributional analysis". Cognition. 63 (2): 121–170. doi:10.1016/S0010-0277(96)00793-7. PMID 9233082. S2CID 43304896.
  19. Redington, M (1998). "Distributional information: A powerful cue for acquiring syntactic categories". Cognitive Science. 22 (4): 425–469. doi:10.1016/S0364-0213(99)80046-9.
  20. Gomez, Rebecca L; Gerken, LouAnn (1999). "Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge". Cognition. 70 (2): 109–135. doi:10.1016/S0010-0277(99)00003-7. PMID 10349760. S2CID 7447597.
  21. Hirsh-Pasek, Kathy; Kemler Nelson, Deborah G.; Jusczyk, Peter W.; Cassidy, Kimberly Wright; Druss, Benjamin; Kennedy, Lori (1987). "Clauses are perceptual units for young infants". Cognition. 26 (3): 269–286. doi:10.1016/S0010-0277(87)80002-1. ISSN 0010-0277. PMID 3677573. S2CID 31433274.
  22. Saffran, Jenny R. (2001b). "The Use of Predictive Dependencies in Language Learning". Journal of Memory and Language. 44 (4): 493–515. doi:10.1006/jmla.2000.2759.
  23. Schneider, Harry D.; Hopp, Jenna P. (2011). "The use of the Bilingual Aphasia Test for assessment and transcranial direct current stimulation to modulate language acquisition in minimally verbal children with autism". Clinical Linguistics & Phonetics. 25 (6–7): 640–654. doi:10.3109/02699206.2011.570852. PMID 21631313. S2CID 35308934.
  24. Weiss, Daniel J.; Gerfen, Chip; Mitchel, Aaron D. (2009). "Speech Segmentation in a Simulated Bilingual Environment: A Challenge for Statistical Learning?". Language Learning and Development. 5 (1): 30–49. doi:10.1080/15475440802340101. PMC 3981102. PMID 24729760.
  25. Franco, Ana; Cleeremans, Axel; Destrebecqz, Arnaud (2011). "Statistical Learning of Two Artificial Languages Presented Successively: How Conscious?". Frontiers in Psychology. 2: 229. doi:10.3389/fpsyg.2011.00229. PMC 3177082. PMID 21960981.
  26. Yim, Dongsum; Rudoy, John (August 2012). "Implicit Statistical Learning and Language Skills in Bilingual Children". Journal of Speech, Language, and Hearing Research. 56 (1): 310–322. doi:10.1044/1092-4388(2012/11-0243). PMID 22896046.
  27. Antovich, Dylan M.; Graf Estes, Katharine (January 2017). "Learning across languages: bilingual experience supports dual language statistical word segmentation". Developmental Science. 21 (2): e12548. doi:10.1111/desc.12548. PMC 6594691. PMID 28156032.
  28. Yu, C.; Smith, L. B. (2007). "Rapid Word Learning Under Uncertainty via Cross-Situational Statistics". Psychological Science. 18 (5): 414–420. CiteSeerX 10.1.1.385.7473. doi:10.1111/j.1467-9280.2007.01915.x. PMID 17576281. S2CID 729528.
  29. Medina, T. N.; Snedeker, J.; Trueswell, J. C.; Gleitman, L. R. (2011). "How words can and cannot be learned by observation". Proceedings of the National Academy of Sciences. 108 (22): 9014–9019. Bibcode:2011PNAS..108.9014M. doi:10.1073/pnas.1105040108. PMC 3107260. PMID 21576483.
  30. Trueswell, John C.; Medina, Tamara Nicol; Hafri, Alon; Gleitman, Lila R. (2013). "Propose but verify: Fast mapping meets cross-situational word learning". Cognitive Psychology. 66 (1): 126–156. doi:10.1016/j.cogpsych.2012.10.001. PMC 3529979. PMID 23142693.
  31. Kachergis, G. N.; Yu, C.; Shiffrin, R. M. (2012). "Cross-situational word learning is better modeled by associations than hypotheses". 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL). pp. 1–6. doi:10.1109/DevLrn.2012.6400861. ISBN 978-1-4673-4965-9. S2CID 9317229.
  32. Kuhl, P. K. (2003). "Foreign-language experience in infancy: Effects of short-term exposure and social interaction on phonetic learning". Proceedings of the National Academy of Sciences. 100 (15): 9096–9101. Bibcode:2003PNAS..100.9096K. doi:10.1073/pnas.1532872100. PMC 166444. PMID 12861072.
  33. Turk-Browne, Nicholas B.; Jungé, Justin A.; Scholl, Brian J. (1 January 2005). "The Automaticity of Visual Statistical Learning". Journal of Experimental Psychology: General. 134 (4): 552–564. doi:10.1037/0096-3445.134.4.552. PMID 16316291.
  34. Saffran, Jenny R; Johnson, Elizabeth K; Aslin, Richard N; Newport, Elissa L (1999). "Statistical learning of tone sequences by human infants and adults". Cognition. 70 (1): 27–52. doi:10.1016/S0010-0277(98)00075-4. PMID 10193055. S2CID 9931459.
  35. Fiser, J.; Aslin, R. N. (2001). "Unsupervised Statistical Learning of Higher-Order Spatial Structures from Visual Scenes". Psychological Science. 12 (6): 499–504. doi:10.1111/1467-9280.00392. PMID 11760138. S2CID 2647150.
  36. Fiser, József; Aslin, Richard N. (2002). "Statistical learning of higher-order temporal structure from visual shape sequences". Journal of Experimental Psychology: Learning, Memory, and Cognition. 28 (3): 458–467. doi:10.1037/0278-7393.28.3.458. PMID 12018498.
  37. Newport, Elissa L.; Hauser, Marc D.; Spaepen, Geertrui; Aslin, Richard N. (2004). "Learning at a distance II. Statistical learning of non-adjacent dependencies in a non-human primate". Cognitive Psychology. 49 (2): 85–117. doi:10.1016/j.cogpsych.2003.12.002. PMID 15304368. S2CID 903539.
  38. Toro, Juan M.; Trobalón, Josep B. (2005). "Statistical computations over a speech stream in a rodent". Perception & Psychophysics. 67 (5): 867–875. doi:10.3758/BF03193539. PMID 16334058.
  39. Kirkham, Natasha Z; Slemmer, Jonathan A; Johnson, Scott P (2002). "Visual statistical learning in infancy: evidence for a domain general learning mechanism". Cognition. 83 (2): B35–B42. doi:10.1016/S0010-0277(02)00004-5. ISSN 0010-0277. PMID 11869728. S2CID 6203719.