How infants learn language

         This is not a chemical post, but I think most readers will be interested in the way that infants learn language and just how sophisticated their cognitive processes are, even before they begin to speak.  Some of it will be incomprehensible to anyone without a strong neuroscience and neurology background, but most of it should be understandable by all.   For why this post looks the way it does, see the end.
       [ Proc. Natl. Acad. Sci. vol. 97 pp. 11850 – 11857 ’00 ] Skinner (1957) thought that language developed in children because of external reinforcement and shaping (e. g. operant conditioning).   Infants were thought to learn language as rats learn to press a bar for a reward.  Chomsky (who else?) said there was a ‘language faculty’ in the brain which included innately specified constraints on the forms language could take.  These constraints included specification of a universal grammar and universal phonetics.  Fodor calls this a module — domain specific, informationally encapsulated and innate.    


      The current view is that infants engage in a type of learning in which language input is mapped in detail by the infant brain (so vague as to be useless — but the article is great).

       Infants can ‘parse’ speech (a difficult problem for computer language recognition) correctly at the phonetic level, and their abilities are universal across languages.  This type of parsing isn’t limited to man or limited to speech.   Infants discriminate only between stimuli from different phonetic categories.  Unlike adults, infants show the effect for the phonetic units of all languages (some 600 consonants and 200 vowels).  This implies that infants are biologically endowed with neural mechanisms responding to the phonetic contrasts used by the world’s languages (how is this different from what Chomsky said?).   The same is true for monkeys: in tests of discrimination, monkeys show peaks in sensitivity coinciding with the phonetic boundaries used by languages.  

       This means that parsing of phonetic units isn’t specific for language (as animals without language show it).  This is sort of chicken and egg: do we say what we can hear or do we first hear what is able to be said?   Categorical perception has also been shown, in both adults and infants, with nonspeech stimuli that mimic speech features without being perceived as speech. 

       Eimas’ early model of speech perception was selectionist: the brain was wired to pick up certain sounds and, depending on the sounds it heard, detectors were maintained or lost.  By 12 months of age, infants no longer discriminate non-native phonetic contrasts, even though they do at 6 months.  However, not all such phonetic abilities are completely lost.   Moreover, there is more to language acquisition than losing preformed discriminative abilities; learning is also required (obvious, but they have shown it, in work to be described, at a primitive phonetic level). 

      [ Science vol. 306 p. 1127 ’04 ] As infants pick up the ability to recognize phonemes of their native language, they lose the ability to distinguish phonemes in other languages.  Infants who are better at 7 months at picking up native phonemes are better on all language measures at age 30 months (number of words produced, duration of speech, sentence complexity).   < i.e. their brain is being tuned ! >

          Rats raised in environments containing only white noise with no pitch or rhythm are unable to recognize everyday sounds and are greatly impaired even in their ability to discriminate different pitches [ Science vol. 300 pp. 498 – 502 ’03 ]

      1. Infants abstract patterns (p. 11852):  6 month old infants, trained to produce a head-turn response when a sound from one category (the vowel in pop) is presented and to inhibit that response when an instance from another vowel category (e.g. in peep) is presented, show the ability to perceptually sort novel instances into categories.  They can sort vowels which vary across talkers and intonation contours, as well as syllables varying in the initial consonant in front of the vowel.  

      At birth, infants have been shown to prefer the language spoken by their mothers during pregnancy as opposed to another language (clearly this represents pattern discrimination).    Infants also prefer their mother’s voice over another female at birth.  The review article is great, but none of the methodology by which these statements are reached is given (but 114 references are given).   

      By 9 months infants show a preference for the stress pattern of words in the language they have heard (English typically puts the stress on the first syllable, while other languages put it on the second).  This preference is not visible at 6 months.   Some sound  patterns seen in Dutch aren’t seen in English.  By 9 months American infants listen longer to English words while Dutch infants show a listening preference for Dutch words.   The infants don’t know what the words mean. 

     All of the above shows that infants can abstract patterns.

    2. Infants exploit statistical properties of language input.  Infants can maintain the discrimination of two isolated syllables when they are later embedded in 3 syllable strings.  A complicated experiment (J. Child Lang. vol. 29 pp. 229 – 252 ’93 — reprint requested) using nonsense syllables and transition probabilities between them showed that 7 month infants could use pairs of syllables with high transition probabilities to discriminate other syllables from a stream of sounds (breaking a sound stream into words and syllables is hard for machines to do).  

       < [ Proc. Natl. Acad. Sci. vol. 98 pp. 12874 – 12875 ’01 ] This process is called statistical learning.   This means that infants may find word boundaries by detecting syllable pairs with low transitional probabilities.  Infants as young as 8 months begin to perform these computations with as little as 2 minutes of exposure.  By soaking up the statistical regularities of seemingly meaningless acoustic events, infants are able to rapidly structure linguistic input into relevant and ultimately meaningful units.  Infants can also do this for tone sequences.   >
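       To make the idea of finding word boundaries from transitional probabilities concrete, here is a minimal sketch in Python.  It is not the procedure used in any of the papers cited above.  The three nonsense words, their order in the stream, and the 0.9 threshold are all made up for illustration.  The transitional probability of a pair is estimated as (count of the pair) / (count of its first syllable), and a boundary is placed wherever that estimate dips below the threshold.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate P(next | current) for every adjacent syllable pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the last syllable has no successor
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(syllables, threshold):
    """Place a word boundary wherever the transitional probability drops below threshold."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tp[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words

# Hypothetical nonsense words (assumption: invented here, not taken from the papers).
lexicon = {"A": ["bi", "da", "ku"], "B": ["pa", "do", "ti"], "C": ["go", "la", "bu"]}
order = "ABCACBABCBACCAB"  # arbitrary word order, concatenated with no pauses
stream = [syllable for word in order for syllable in lexicon[word]]

segmented = segment(stream, threshold=0.9)  # threshold is an arbitrary choice
print(segmented[:3])
```

       In this toy stream every within-word pair has a transitional probability of 1, while every pair spanning a word boundary is well below it, so the dips recover the words.  Real speech is far noisier, which is part of what makes the infant result impressive.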

       This shows that an old principle of Gestalt psychology known as ‘common fate’ plays a role in speech perception.  Phonemes typically linked in a language, and thus sharing a common fate, are perceived as units by infants.  The same principle may underlie object perception: physical entities whose properties cohere in space and move together are perceived as individuated objects.  It isn’t clear if the mechanisms for detecting common fate in vision and speech are the same. 

       3. Language experience warps perception:   The perceptual magnet effect is seen when tokens perceived as exceptionally good representatives of a phonetic category (prototypes) are used in tests of speech perception.  Native language phonetic prototypes evoke special responses when compared with nonprototypes.  When tested with a phonetic prototype as opposed to a nonprototype from the same category, infants show greater ability to generalize to other category members (what in the world does this mean?). 
      Thus the prototype appears to function as a magnet for other stimuli in the category, in a way similar to that shown for prototypes of other cognitive categories.   The effect is only seen in the language (English) to which a 6 month infant has been exposed (vs. Swedish) even using the same stimuli.   This is neat, because it is independent of the phonemes used (but one must be native to the language to which the infant has been exposed).   Interestingly, animals don’t show the perceptual magnet effect.  

       Thus there is a developmental sequence from universal perception of phonemes to language specific perception.  This is called the Native Language Magnet model.  It proposes that infants’ mapping of ambient language warps the acoustic dimensions underlying speech, producing a complex network or filter, through which language is perceived.  The language specific filter alters the dimensions of speech attended to, stretching and shrinking acoustic space to highlight the difference between language categories.    Once formed, language-specific filters make learning a second language harder, because the mapping appropriate for a primary language is completely different from that required by other languages.  Studies of adult bilinguals exposed to their second language after age 6 show perceptual magnet effects only for the first language.  

      The net effect of the above is the idea that infants, simply by listening to language, acquire sophisticated information about its properties.   Computers have a hard time recognizing similarities in language input (by different speakers).  By 6 months of age, infants can sort unfamiliar instances of known vowel sounds into categories.  They can do this across different speakers.    They can sort syllables by initial type of consonants (nasal vs. stop).  At 4 months infants listen equally long to Polish and English speech samples that have pauses inserted at clause boundaries as opposed to within clauses, but by 6 months, infants listen preferentially to pauses inserted at the clause boundaries appropriate only to their native language.  

      “The important point regarding development is that the initial perceptual biases shown by infants in tests of categorical perception as well as asymmetries in perception seen in infancy produce a contouring of the perceptual space that is universal.  The universal contouring soon gives way to a language specific mapping distorting perception, completely revising the perceptual space underlying speech processing” (i.e. their brain is being tuned to what it’s getting).   The example is the inability of the Japanese infant to ‘hear’ ra vs. la after 6 – 9 months exposure to Japanese (the authoress doesn’t say so, but apparently they can hear the distinction at birth).   They talk about a language specific filter for sounds which highlights the differences between (native) language categories.   The theory is called the Native Language Magnet theory.   The point is that the perceptual apparatus is altered by experience, not by any sort of reinforcement or conditioning.  

       Research on cognitive development confirms the fact that categorization, statistical learning, and prototype effects aren’t unique to speech (in man).    Monkeys don’t show the perceptual magnet effect.
       Early theories of speech perception held that speech was perceived with reference to production.  The new developmental data suggest that early in life, perceptual representations of speech are stored in memory.  Subsequently these representations guide the development of motor speech.  

       At 20 weeks infants look longer at a face pronouncing a vowel that matches one they hear as opposed to a mismatched face (I’d think just the opposite was the case).  This implies that what an infant hears and perceptual distortion are being matched to what they see as well as what they hear. 

       [ Proc. Natl. Acad. Sci. vol. 100 pp. 9096 – 9101 ’03 ] Between 6 and 12 months of age, the ability to discriminate foreign language phonetic units sharply declines.  In one experiment, 9 month old American infants were exposed to native Mandarin Chinese speakers in 12 laboratory sessions.  A control group also participated in 12 language sessions but heard only English.   Subsequent tests of Mandarin speech perception showed that exposure to Mandarin reversed the decline seen in the English control group.   The second experiment used only recordings of Mandarin, and there was no prevention of the decline. 

      Motherese is the universal speaking style used by caretakers around the world when they address infants.  It has exaggerated stress and increased pitch.  Looking back it is obvious that this helps infants in discriminating phonetic units.   A frequency analysis of motherese shows that the phonetic units of infant directed speech are acoustically exaggerated.    When introducing new words, parents repeat the word often in stereotyped frames, putting it at the end of a sentence.  

         Mismatch negativity — an event related potential elicited by a change in a repetitive sound pattern has been used to study language acquisition.  In 6 month olds it is seen with changes in both native and nonnative language contrasts, but by 12 months, mismatch negativity exists only for native language contrasts.   

        Functional MRI has shown that adult bilinguals who acquire both languages early in life activate overlapping regions of the brain when processing the two languages, while those who learn the second language later in life activate two distinct regions of the brain for the two languages < Nature vol. 388 pp. 172 – 174 ’97 >. 

       [ Science vol. 294 p. 1823 ’01 ] A videotape of 6 infants ages 6 – 12 months done in Montreal studied 3 infants learning English and 3 learning French.  Independent coders then scored the mouth movements on silenced video clips including babbles such as ‘da’ and extraneous sounds such as ‘ssss’.  The babies used the center of their mouths to produce nonbabbles, but they almost always babbled with the right side.  [ ibid. vol. 297 p. 1515 ’02 ] More from Montreal.  They are up to 10 babies between 5 and 12 months, acquiring either English or French.  They looked at smiles, babbles and nonbabbles, and had two blind coders scoring 150 randomly selected segments according to whether the mouth moved equally on both sides, more on the right or more on the left. 

       How do you tell a babble from a nonbabble?   Babbles are defined as (1) vocalizations containing a reduced subset of possible sounds found in spoken language, (2) having repeated syllabic organization (consonant vowel alternations), (3) production without apparent meaning or reference.  All vocalizations lacking ANY of these 3 criteria were coded as nonbabbles.   

      Babbling babies opened the right side of their mouth more often, while smiling babies opened the left side of their mouths.  There were no significant differences seen between English and French babies.   This reflects the natural human left hemisphere dominance for speech.   They conclude that babbling represents the onset of productive language capacity in man, rather than an exclusively oral-motor development (like chewing etc) which should be symmetric.   Again, emotional expression might be controlled by the right hemisphere even at the early age of 5 months. 

       [ Science vol. 298 pp. 553 – 554, 604 – 607 ’02 ] An interesting article about where statistical learning leaves off and grammatical learning begins.  They note that the statistics of natural language involves correlations over multiple types of information (type of syllables, order of syllables, order of words, types of words) simultaneously.  This may be too complex for other species to learn.  Alternatively, other species may lack innate grammatical capacities that make language learning possible. 

      Studies centered on understanding rule learning raise two major unresolved issues.  The distinction between a rule and a statistical generalization remains unclear.   The evidence for rule learning is mostly negative: cases where learning occurs but there is no obvious statistical explanation.   Another possibility is that languages may show only those structures that learners are able to track.  Thus the structure of language may have resulted in part from constraints imposed by the limits of human learning.  It is quite likely that someone more intelligent may have a language that we could never understand, or that they could talk to us in our language and put the added structure of their language on top, and communicate to each other invisibly to us.   

       Adults and 8 month old infants confronted with unfamiliar concatenated artificial speech tend to infer word boundaries at loci where the transitional probability between two adjacent syllables drops.   That is, word boundaries are inferred between two syllables that rarely appear in sequence and not between two syllables that always appear together.  

      In one study < Cognition vol. 70 pp. 27 –> ’99 > 7 month infants behaved as if they had inferred a rule after having been familiarized with a large number of trisyllabic items consistent with it.  After familiarization, infants were presented with previously unheard items.  They behaved differently according to whether or not the items conformed to the rule.    Infants tend to extract rule-like regularities, at least when they process a corpus of clearly delimited items.  

        [ Science vol. 300 pp. 53 – 54 ’03 ]  Comments and reply on above.  One comment complained that ‘nowhere do they spell out what exactly statistical learning consists of’.   Broadening the notion of statistics from things like transitional probabilities between particular elements to relationships between any kind of information (concrete or abstract) trivializes the term, rendering it broad enough to encompass any lawful relationship.   They want to know how you would prove something is NOT statistical learning, otherwise the theory is unfalsifiable.   They propose that the ability to recognize the ABA form regardless of what A, B are couldn’t be statistical transition probabilities (if the subject had never heard a particular A or B before !). 
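        The ABA argument above can be sketched in a few lines of Python.  This is only an illustration of the logic of the comment, not anyone’s experimental procedure, and the syllables are hypothetical.  A table of syllable-to-syllable transitions built from the familiarization items has nothing to say about test items made of entirely new syllables, while an abstract check on the form of the item (first equals last, middle differs) still separates them.

```python
def follows_aba(item):
    """True if a three-syllable item has the abstract form A-B-A."""
    first, middle, last = item
    return first == last and first != middle

def seen_transitions(items):
    """Every adjacent syllable pair heard during familiarization."""
    return {pair for item in items for pair in zip(item, item[1:])}

def familiar_pairs(item, known):
    """Adjacent pairs in a test item that were heard during familiarization."""
    return [pair for pair in zip(item, item[1:]) if pair in known]

# Hypothetical familiarization and test items (assumption: invented for illustration).
familiarization = [("ga", "ti", "ga"), ("li", "na", "li"), ("ni", "gi", "ni")]
novel_consistent = ("wo", "fe", "wo")    # new syllables, same ABA form
novel_inconsistent = ("wo", "fe", "fe")  # new syllables, ABB form

known = seen_transitions(familiarization)

# Pair statistics: neither test item contains a single familiar transition.
print(familiar_pairs(novel_consistent, known), familiar_pairs(novel_inconsistent, known))

# Abstract form: only the first test item conforms.
print(follows_aba(novel_consistent), follows_aba(novel_inconsistent))
```

        Whether one wants to call the second check ‘statistical’ is exactly the point in dispute, so the sketch takes no side; it only shows why transitional probabilities over particular syllables can’t be the whole story.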

        [ Proc. Natl. Acad. Sci. vol. 99 pp. 15250 – 15251 ’02 ] Statistical learning might also be important in the perception of grammatical categories (e.g. noun vs. verb).  Consider a system sensitive only to the statistical distribution of words within sentences.  The internal representations of words that tend to occur in similar distributional contexts cluster together, and because nouns tend to occur in particular sentential contexts and verbs in others, the clustering of words into these two classes is in some sense a statistical inevitability.   
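        A toy version of this clustering can be written in Python.  It is a simple count-based stand-in, not the system described in the paper, and the six-sentence corpus is invented.  Each word is represented only by counts of its immediate left and right neighbours, and words are compared by the cosine of those count vectors.

```python
from collections import Counter, defaultdict
from math import sqrt

def context_vectors(sentences):
    """Represent each word by counts of its immediate left and right neighbours."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        padded = ["<s>"] + sentence.split() + ["</s>"]  # sentence edge markers
        for i in range(1, len(padded) - 1):
            vectors[padded[i]][("L", padded[i - 1])] += 1
            vectors[padded[i]][("R", padded[i + 1])] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Hypothetical miniature corpus (assumption: invented for illustration).
corpus = [
    "the dog eats", "the cat eats", "the dog sleeps",
    "the cat sleeps", "a dog runs", "a cat runs",
]
vecs = context_vectors(corpus)

noun_noun = cosine(vecs["dog"], vecs["cat"])
verb_verb = cosine(vecs["eats"], vecs["sleeps"])
noun_verb = cosine(vecs["dog"], vecs["eats"])
print(noun_noun, verb_verb, noun_verb)
```

        Nothing in the code knows what a noun or a verb is.  In this corpus ‘dog’ and ‘cat’ end up with matching contexts, as do ‘eats’ and ‘sleeps’, while ‘dog’ and ‘eats’ share none, which is the sense in which the two classes fall out of the distribution alone.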

       This work showed a similar ability of human infants to pick up the statistical properties of visual input.  

       Language and vision are tightly linked.  Watching what people look at when listening to a sentence is informative.  Hearing a sequence such as “the man will ride” while viewing a scene portraying a man, a child (a girl), a motorbike and a carousel causes the eyes to be directed toward the motorbike during the verb ride, but when the sentence is “the girl will ride”, the eyes are directed toward the carousel (how old are the subjects?).    The eye movements reflect conditional probabilities with respect to the ways in which entities in the real world interact with one another.

        [ Science vol. 298 pp. 2013 – 2015 ’02 ] Functional MRIs were done on 20 healthy nonsedated infants ages 2 – 3 months of age while they listened to 20 seconds of speech stimuli alternating with 20 seconds of silence (in Paris — this is Dehaene’s work).    Some of the speech was played backwards.  All was in the native language of the infants (French).  Backward speech violates several segmental and suprasegmental phonological properties that are universally seen in human speech.   4 day old neonates and 2 month old infants discriminate sentences in their native language from sentences in a foreign language, but this performance vanishes when the stimuli are played backward.   They expected that forward speech would elicit stronger activation than backward speech in brain areas engaged in the recognition of segmental and suprasegmental properties of the native language.  

      They also determined the hemodynamic response function (HRF) in infants (which hadn’t been done before in awake infants).   Most activated voxels showed a delay of about 5 seconds to a sinusoidal stimulus (about the same as a normal adult).
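      For readers wondering how a delay relates to the phase of a periodic response, here is a small worked example in Python.  It assumes (my assumption, not stated in the paper notes above) that the response is treated as a sinusoid at the stimulus period, and that the period is 40 seconds, i.e. the 20 seconds of speech plus 20 seconds of silence described earlier.

```python
import math

def delay_to_phase(delay_s, period_s):
    """Phase lag in radians of a response delayed by delay_s on a cycle of period_s."""
    return 2 * math.pi * delay_s / period_s

def phase_to_delay(phase_rad, period_s):
    """Delay in seconds corresponding to a phase lag of phase_rad."""
    return phase_rad / (2 * math.pi) * period_s

PERIOD_S = 40.0  # assumed: 20 s of speech + 20 s of silence per cycle
DELAY_S = 5.0    # the approximate delay quoted in the text

phase = delay_to_phase(DELAY_S, PERIOD_S)
print(phase, phase_to_delay(phase, PERIOD_S))
```

      Under these assumptions a 5 second delay is one eighth of a cycle, or π/4 radians.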

      There was stimulus induced activation in a large extent of the left temporal lobe (the same for forward and backward speech).  This may mean that the left temporal lobe is selectively activated in processing rapidly changing auditory stimuli, whether language or nonsense. Activation ranged from the superior temporal gyrus (including Heschl’s gyrus) to surrounding areas of the superior temporal sulcus and the temporal pole.  Activation was significantly greater in the left than in the right temporal lobe at the level of the planum temporale.  This is an example of finding what you expect — we’ll see what others find when they try to replicate this.  

       The left angular gyrus and left mesial (not medial?) parietal lobe (precuneus) were more activated by forward speech than by backward speech.  No region showed greater activation by backward speech than by forward speech.  

       This work implies that the left temporal cortex in infants is already specialized in listening to sound (forward and backward speech the same). 

       [ Proc. Natl. Acad. Sci. vol. 103 pp. 14240 – 14245 ’06 ] More Dehaene: in a study in adults, they showed the possibility of parsing brain activations based on the phase of the BOLD response to a single sentence.  The phase estimates the delay in activation relative to sentence onset.  It varies systematically across perisylvian areas.  Slower responses toward the temporal poles and inferior frontal areas imply that successive regions integrate speech information on different time scales, possibly because they are sensitive to speech units of different granularity (size??).

       The present work presented much shorter sentences (2 seconds) to infants every 15 seconds.  This allowed them to monitor the speed of the rise and fall of the infant’s BOLD response in different brain regions.  

       An adultlike structure of fMRI response delays was seen along the superior temporal regions, suggesting a hierarchical processing scheme.  The fastest responses were recorded in the vicinity of Heschl’s gyrus.  Responses became increasingly slower toward the posterior part of the superior temporal gyrus and toward the temporal poles and inferior frontal regions (Broca’s area).  Activation in Broca’s area increased when the sentence was repeated after a 14 second delay, suggesting the early involvement of Broca’s area in verbal memory.  Amazing, Broca’s area is active in infants before the babbling stage.  They need to repeat the work using nonspeech stimuli (which I think they will do) to be sure that these regions are specific for language processing.

       Another point is that the default network areas of adults also deactivate when infants listen to sentences.  The deactivated network comprises mesial occipital and superior frontal cortices and caudate nuclei.  There is no evidence for the large bilateral temporo-parietal and ventromesial frontal deactivations which are easily seen in adults. 

       [ Neuron vol. 37 pp. 159 – 170 ’03 ] An interesting study looked at German/Italian bilinguals and divided them according to proficiency (high and low: HP and LP) and age of acquisition (early and late: EA and LA).   They had 11 EAHP, 12 LAHP and 9 LALP (no EALP !) subjects.  They then used functional MRI to see which areas were activated during semantic and grammatical judgements in the 3 groups.  

     The pattern of brain activity for semantic judgements was largely dependent on proficiency (regardless of age of acquisition), while the pattern of brain activity for grammatical judgements was dependent on the age of acquisition of the language.   The second acquired language (L2) was compared to the first (L1).  

      During grammatical processing in L2 more extensive activation was found in Broca’s area as well as other areas (in the LAHP group but NOT in the EAHP).  Thus those who learn a second language late use different areas of the brain to process its grammar than they use for the first language.

     The results can’t easily be accommodated within the idea that more extensive activation indicates worse performance.   Several monolingual studies have suggested that a higher level of complexity increases the extent of cortical activation.  

       [ Proc. Natl. Acad. Sci. vol. 100 pp. 11702 – 11705 ’03 ] 12 full term neonates were studied using optical topography.  They played the infants human speech, human speech played backwards, and silence, and measured the concentration of total hemoglobin in response to auditory stimulation in 12 areas of the right hemisphere and 12 areas of the left.  The left hemisphere temporal areas showed significantly more activation (as measured by increased hemoglobin concentration) when the infants were exposed to normal speech than to backward speech or silence.  The brain is thus ‘wired’ to receive whatever characterizes normal human speech, and not whatever characterizes it played backwards.   The great advantage of optical topography is that it is silent (while fMRI is not, as anyone who’s ever had an MRI knows).  
        [ Proc. Natl. Acad. Sci. vol. 105 pp. 14222 – 14227 ’08 ] More work using optical topography (near-infrared spectroscopy, NIRS).  Newborns listened to (nonsense) syllable sequences containing immediate repetitions (mubaba, penana) intermixed with sequences with no repetitions.   There were increased responses to the repetition sequences in the temporal and left frontal areas, implying that the newborn brain can differentiate the two patterns.

       [ Proc. Natl. Acad. Sci. vol. 106 pp. 1245 – 1248 ’09 ] The somatosensory system is also involved in the perception of speech.  A robotic device creates patterns of facial skin deformation which would normally accompany speech production.  When the facial skin is stretched while people listen to words it alters the sounds they hear.    The perceptual change depends on the specific pattern of deformation. 

       [ Proc. Natl. Acad. Sci. vol. 106 pp. 18667 – 18872 ’09 ] Five month old infants can match speech sounds to human faces and monkey sounds to monkey faces.  They looked longer at a human face than at a monkey face when human speech was played.  They looked longer at a monkey face when it was presented with monkey vocalization.   They could also match speech to a human face in a language that they’d never heard (Japanese).

        Why this post looks the way it does.  It looks like a bunch of notes I took on papers that I liked.  Well it is. One of the reasons I’m so slow going through Clayden etc. etc. is that, in retirement,  I do try and keep up with molecular biology and the basic science of the brain (which underlies what I did for years as a neurologist).  When in practice I kept up with the literature on clinical neurology (and medicine) and had very little time for math or chemistry.  So the post is made of notes taken over the years on articles concerning infant language acquisition. 

        Most of the references are from PNAS, Science and Neuron which I read regularly along with Cell and Nature.  They’re quite informal.  I’ve found that taking such notes forces you to read the paper for the gist of it, and more importantly, move on.  Piles and piles of reprints that you’ll really get to some day are great for guilt but not much else.

       I was about to start an article in the current Neuron on the subject “Brain Mechanisms in Early Language Acquisition” Vol 67 pp. 713 – 727 ’10.  So I started reviewing what I had, found it fascinating, and more importantly, largely accessible to anyone.  Even better the author is Patricia Kuhl who wrote the first PNAS paper in this post 10 years ago. 

      Next up, September has passed and along with it the 40 year anniversary of L-DOPA’s introduction to the USA for Parkinson’s disease. What I saw along with every neurologist of my generation was remarkable — people get out of wheelchairs etc. etc.  Details to follow.

      After that, responses to Galaxy Rising’s criticisms (see the comments).


  • Cindy  On October 21, 2010 at 5:13 pm

    I found your topic unexpectedly while i was looking for techniques to implement an elearning area for
    my students who want to take french lessons…thanks for that post, i saved your site address and will be back to check it out some more later
