Summary: Researchers report that the brain re-evaluates its interpretation of a speech sound as later sounds arrive, updating what the listener perceives when the new context calls for it.
Source: NYU
Researchers have identified an “auto-correct” mechanism in the brain that revises how ambiguous speech sounds are perceived as additional context arrives. Published in the Journal of Neuroscience, the study reveals how the auditory system combines immediate sensory input with later information to improve speech comprehension.
“What a listener thinks they heard does not always match the raw signals reaching the ear,” says Laura Gwilliams, lead author and doctoral candidate in NYU’s Department of Psychology and a researcher at the Neuroscience of Language Lab at NYU Abu Dhabi. “Our results show that the brain re-evaluates an initial speech sound each time a new sound arrives, allowing interpretations to be updated in real time.”
Gwilliams adds, “Remarkably, context that appears up to a second later can change perception without the listener being aware that their original interpretation has been altered.”
Alec Marantz, principal investigator and professor in NYU’s Departments of Linguistics and Psychology, explains with an example: “An ambiguous initial sound like the difference between ‘b’ and ‘p’ can be perceived differently depending on later context — for instance, whether the full word is ‘parakeet’ or ‘barricade.’ Listeners remain unaware of the ambiguity even when the disambiguating cue does not appear until the middle of the third syllable.”
The study, the first to map how later-occurring information reshapes early auditory perceptions, also involved David Poeppel and Tal Linzen. It builds on the established idea that speech perception depends heavily on surrounding context — words, sentences, and other acoustic cues — and shows how the brain integrates context that arrives after the initial sound.
Everyday conversation produces many ambiguous sounds: a consonant or vowel may be unclear because of background noise, coarticulation, or speaker variability. Listeners rarely notice these ambiguities because the brain automatically resolves them, choosing a single interpretation and perceiving that as the intended sound. The researchers set out to discover the neural dynamics that enable this rapid resolution when disambiguating evidence comes later in the word.
In a series of experiments, participants listened to isolated syllables and words that began with deliberately ambiguous phonemes (for example, words like “barricade” and “parakeet” where the initial consonant could be heard either way). The team recorded brain activity using magnetoencephalography (MEG), which captures magnetic fields produced by neural electrical currents and maps the timing of auditory processing with millisecond precision.

The experiments revealed three key findings:
- The primary auditory cortex registers how ambiguous a speech sound is within roughly 50 milliseconds of the sound’s onset.
- As the word continues, the brain “re-activates” the neural representation of prior sounds while processing new ones, enabling re-evaluation of the earlier input in light of later context.
- Within about 450–500 milliseconds, the brain tends to commit to a phonological interpretation, balancing the retention of sensory detail against the need to settle on a likely category.
Gwilliams notes the surprising flexibility of this system: “Context that occurs after a sound can still reshape how that sound is perceived. For instance, the same ambiguous onset may be heard as ‘k’ in ‘kiss’ and as ‘g’ in ‘gift’ because downstream sounds provide the cues needed to resolve the initial uncertainty.”
Their analysis suggests that the auditory system actively preserves subphonemic detail (acoustic features finer than phoneme categories) in auditory cortex for extended periods. At the same time, the system makes fast, probabilistic guesses about word identity so that listeners can access meaning quickly. When later context supports an alternative interpretation, the brain can revisit and revise its earlier decisions to reduce misunderstandings.
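One way to picture this "guess now, revise later" strategy is as a simple Bayesian update. The sketch below is only an illustration, not the researchers' model: the function name `update_phoneme_belief` and all probability values are hypothetical, chosen to mirror the "parakeet"/"barricade" example. It assumes the listener holds a graded belief about the ambiguous first sound and then reweights that belief once the rest of the word is heard.

```python
def update_phoneme_belief(acoustic_prior, lexical_likelihood):
    """Combine an early acoustic belief with later lexical evidence (Bayes' rule).

    acoustic_prior: probability of each candidate phoneme from the sound alone.
    lexical_likelihood: how well the later context fits each candidate.
    Both are hypothetical inputs used for illustration only.
    """
    unnormalized = {
        phoneme: acoustic_prior[phoneme] * lexical_likelihood[phoneme]
        for phoneme in acoustic_prior
    }
    total = sum(unnormalized.values())
    return {phoneme: value / total for phoneme, value in unnormalized.items()}


# Ambiguous onset: the sound alone favors neither "b" nor "p" (assumed values).
acoustic_prior = {"b": 0.5, "p": 0.5}

# Later context "...arakeet" fits a "p" onset far better than a "b" onset,
# because "parakeet" is a word and "barakeet" is not (assumed values).
lexical_likelihood = {"b": 0.05, "p": 0.95}

posterior = update_phoneme_belief(acoustic_prior, lexical_likelihood)
print(posterior)  # the belief shifts strongly toward "p"
```

Keeping the graded prior around, rather than discarding it after an early decision, is what lets the later context change the outcome. This loosely parallels the finding that acoustic detail is preserved until it can be combined with information about the word.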
Funding: The research was supported by the NYU Abu Dhabi Research Institute, the European Research Council, the French National Research Agency, and the U.S. National Institutes of Health.
Source: James Devitt, NYU. Publisher: Organized by NeuroscienceNews.com. Image credit: Kate Lord/New York University. Original research: Gwilliams, Linzen, Poeppel, and Marantz, Journal of Neuroscience, published July 16, 2018. DOI: 10.1523/JNEUROSCI.0065-18.2018.
Abstract
In spoken word recognition, the future predicts the past
Speech is naturally noisy and ambiguous, and listeners rely on contextual information to determine meaning. While prior context is known to influence perception, the neural mechanisms that use subsequent context have been unclear. To test how the brain integrates later-arriving cues, the study recorded MEG responses in auditory cortex to words whose ambiguously pronounced initial phoneme is disambiguated later, at the lexical uniqueness point. Across experiments with fifty participants, results indicate that the primary auditory cortex is sensitive to phonological ambiguity as early as 50 ms after onset. Subphonemic acoustic detail is maintained over long timescales and is re-evoked at later phoneme positions, while categorical commitments develop in parallel and resolve on a shorter timescale of approximately 450 ms. These results demonstrate that later input can determine the perception of earlier speech sounds by preserving sensory features until they can be integrated with top-down lexical information.
Significance statement
The perception of speech sounds depends on surrounding context, which often arrives after the initial sensory input. This study is the first to show how the brain uses subsequent context: the auditory system stores acoustic detail in auditory cortex while making rapid probabilistic guesses about word identity. This strategy enables fast access to message content while still allowing re-analysis of the acoustic signal to reduce comprehension errors.