How the Brain Merges Sight and Sound to Speed Decision Making

Summary: New research reveals how the brain combines visual and auditory information to produce faster, more accurate decisions. Using electroencephalography (EEG), scientists show that auditory and visual decision signals begin on separate tracks but converge in motor-preparation areas, allowing quicker responses. Computational modeling indicates that this integration—rather than a simple race between senses—best explains behavior, particularly when one sensory input is slightly delayed. These results provide a concrete model of multisensory decision-making with potential relevance for clinical approaches to sensory and cognitive disorders.

The study demonstrates that multisensory advantages arise not merely from whichever sense finishes first, but from a coordinated merging of modality-specific decision signals into a single motor command. That merged signal improves reaction speed and accuracy when both senses are informative or when timing differences occur between them.

Key Facts:

  • Parallel then merge: Auditory and visual decision processes develop independently before combining in motor-preparation regions.
  • Integration outperforms race models: A combined-accumulator model explained behavior and EEG signals better than a pure race model, especially under small timing offsets between senses.
  • Clinical relevance: Understanding how the brain integrates multisensory evidence could guide diagnostics and treatments for disorders of sensory processing and decision-making.

Source: University of Rochester

It has long been known that receiving input from two senses at once, such as seeing and hearing the same event, can enhance response speed and accuracy compared with relying on a single sense. The biological advantage is intuitive: an animal that both sees and hears an approaching threat is more likely to react in time. Yet how the brain combines those separate sensory inputs into one decision has remained an open question.

An international team led by researchers at the University of Rochester and collaborators in Dublin, Ireland, has provided new insight into this process. Their findings, published in Nature Human Behaviour, trace how modality-specific evidence accumulates in the brain and how those signals are routed toward action.

“Just like sensory integration, sometimes you need human integration,” said John Foxe, PhD, director of the Del Monte Institute for Neuroscience at the University of Rochester and a co-author on the study. “This work builds on decades of research and collaboration. Ideas need time to mature, and this project is a clear example of that gradual scientific progress.”

Simon Kelly, PhD, a professor at University College Dublin who led the research, explained that his group was well positioned to pursue this question because of its prior work on measuring decision-related signals with EEG. In 2012, Kelly’s lab developed a way to track information accumulation over time in the brain using a centro-parietal EEG signature, establishing tools that were essential for the current study.

Participants in the experiments watched a simple dot display while listening to repeated tones and were instructed to press a button when they detected a change in the dots, the tones, or both. EEG recordings allowed researchers to follow the evolving decision signal. When both visual and auditory changes occurred, the data showed distinct accumulation patterns for each modality that ultimately co-activated motor-preparation activity—enabling faster responses.

“The EEG accumulation signal reached different amplitudes for auditory versus visual targets, indicating distinct modality-specific accumulators,” Kelly said. To test how these accumulators combined to drive behavior, the team compared two computational models: a race model in which the faster modality triggers action, and an integration model in which the outputs of auditory and visual accumulators are combined and then sent to a common motor threshold.

Both models could account for many aspects of the data, but when the researchers introduced slight timing offsets between the audio and visual signals, the integration model provided a much better fit. This suggests that while evidence begins accumulating within sensory-specific channels, the decisive step for multisensory detection is the convergence of those channels onto a unified motor process.
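The two candidate models described above can be sketched as a small simulation. This is a minimal illustration, not the study's fitted model: the drift rates, noise level, threshold, time step, and the `soa` parameter (visual onset delay in seconds) are all hypothetical values chosen for readability, and the integration rule here is a plain sum of the two accumulators, whereas the paper's abstract describes a sub-additive combination.

```python
import random


def simulate_trial(model, soa=0.0, drift_a=2.0, drift_v=1.5, noise=0.5,
                   threshold=1.0, dt=0.01, max_t=3.0, rng=None):
    """Simulate one audiovisual trial and return the response time in seconds.

    model: "race" (either accumulator alone can trigger the response) or
           "integration" (summed accumulator output meets one motor threshold).
    All numeric defaults are illustrative assumptions, not values from the study.
    """
    if model not in ("race", "integration"):
        raise ValueError("model must be 'race' or 'integration'")
    rng = rng or random.Random()
    step_sd = noise * dt ** 0.5          # noise scaling for a diffusion step
    x_a = x_v = 0.0                      # auditory and visual accumulators
    n_steps = int(round(max_t / dt))
    for step in range(1, n_steps + 1):
        t = step * dt
        x_a += drift_a * dt + rng.gauss(0.0, step_sd)
        if t > soa:                      # visual evidence starts after the delay
            x_v += drift_v * dt + rng.gauss(0.0, step_sd)
        if model == "race":
            crossed = x_a >= threshold or x_v >= threshold
        else:
            crossed = (x_a + x_v) >= threshold
        if crossed:
            return t
    return max_t                         # no response within the time limit


def mean_rt(model, soa=0.0, n_trials=1000, seed=0):
    """Average response time over n_trials simulated trials."""
    rng = random.Random(seed)
    total = sum(simulate_trial(model, soa, rng=rng) for _ in range(n_trials))
    return total / n_trials


if __name__ == "__main__":
    for soa in (0.0, 0.1):
        print(f"SOA {soa:.1f} s:",
              f"race {mean_rt('race', soa):.3f} s,",
              f"integration {mean_rt('integration', soa):.3f} s")
```

Because both models share one threshold in this sketch, integration is faster by construction. In an actual model comparison, each model's parameters would be fitted to the behavioral and EEG data, and the models would be judged on how well they reproduce the response-time distributions across timing offsets.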

“Our results provide a concrete neural architecture for multisensory decision-making,” Kelly noted. “Distinct decision processes gather information from different modalities, and their outputs converge on a single motor process where they combine to reach a single action threshold.”

Team Science Takes a Village

The study grew from long-term collaborations and mentorships. In the 2000s, Foxe’s Cognitive Neurophysiology Lab—then at City College of New York—hosted numerous young scientists, including Simon Kelly and Manuel Gomez-Ramirez, PhD, now an assistant professor of Brain and Cognitive Sciences at the University of Rochester and a co-author of the current paper. Those early interactions introduced methods for studying audiovisual detection and inspired experiments examining how auditory, visual, and tactile inputs are integrated.

“We come from different backgrounds but share the same drive to understand basic brain function,” Foxe said. “Our ongoing conversations and collaborations over years illustrate how scientific ideas evolve with time and collective effort.”

Other contributors include first author John Egan (University College Dublin) and Redmond O’Connell, PhD (Trinity College Dublin).

Funding: This work was supported by Science Foundation Ireland, the Wellcome Trust, the European Research Council (Consolidator Grant), the Eunice Kennedy Shriver National Institute of Child Health and Human Development (UR-IDDRC), and the National Institute of Mental Health.

About this visual and auditory neuroscience research news

Author: Kelsie Smith Hayduk
Contact: Kelsie Smith Hayduk – University of Rochester

Original Research: Closed access. “Distinct audio and visual accumulators co-activate motor preparation for multisensory detection” by John Egan et al., Nature Human Behaviour.


Abstract

Distinct audio and visual accumulators co-activate motor preparation for multisensory detection

Detecting targets in multisensory environments is a fundamental brain function. It has been unclear whether evidence from different sensory modalities accumulates via separate neural processes and whether those processes follow separate decision criteria. To address this, the authors performed two experiments (n = 22 and n = 21) using a task that permits tracing neural evidence accumulation through a centro-parietal positivity measured with EEG and modeling response-time distributions.

Three lines of evidence showed that auditory and visual evidence are accumulated by distinct processes during multisensory detection: analysis of redundant (respond-to-either-modality) and conjunctive (respond-only-to-both) audiovisual detection, combined neural–behavioral modeling, and a follow-up experiment manipulating stimulus onset asynchrony. The cumulative evidence from both modalities sub-additively co-activates a single, thresholded motor process during redundant detection. These findings resolve longstanding questions about how information is integrated and accumulated across modalities in multisensory conditions.