How Distinct Brain Circuits Track Decision Outcomes

To avoid repeating mistakes and to learn to make better choices, the brain must accurately evaluate the consequences of our decisions.

Exactly how the brain carries out this learning during value-based decision making has remained unclear—until now.

A team of neuroscientists from the Institute of Neuroscience and Psychology at the University of Glasgow has provided new insight into the neural mechanisms that support reward-guided learning.

Picture picking wild berries in a forest when suddenly a swarm of bees bursts from a bush. Instinctively, your motor system initiates a rapid escape before you consciously process the danger. That immediate, automatic response is a vital survival mechanism that protects you from imminent harm.

Separately, a slower, more deliberate learning process evaluates the outcome and updates future choices: after the bee encounter, berry-picking may no longer seem worth the risk. This reflective process helps reassign value to options and shapes how rewarding similar decisions will feel later.

“To date the biological validity and neural underpinnings of these separate value systems remain unclear,” said Dr. Marios Philiastides, who led the study published in Nature Communications.

To examine how the brain implements these systems, Philiastides’ team developed a cutting-edge neuroimaging approach that combines two complementary methods. Participants completed a reward-learning task while their brain activity was recorded simultaneously with electroencephalography (EEG) and functional magnetic resonance imaging (fMRI).

EEG offers millisecond-level timing information—answering “when” neural events occur—while fMRI provides high spatial resolution to locate brain activity—answering “where” events happen. Historically, these questions were addressed separately, but combining EEG and fMRI allows the researchers to map the spatiotemporal dynamics of learning processes in the human brain.

Recording EEG inside an MRI scanner poses technical challenges because the scanner produces large electromagnetic noise. The team overcame this by applying advanced signal-processing methods to remove scanner artifacts and recover the tiny electrical signals on the scalp.
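The article does not say which signal-processing methods the team used. As a rough illustration of the general idea only, the sketch below applies average artifact subtraction, a widely used way to remove repetitive scanner noise: because the scanner artifact repeats almost identically with every acquisition, averaging across acquisitions gives an estimate of the artifact that can be subtracted from each one. The signal sizes, the sine-shaped artifact and the function name are all assumptions made for this example, not details from the study.

```python
import math
import random


def average_artifact_subtraction(recording, epoch_len):
    """Subtract the across-epoch mean waveform from every scanner epoch."""
    n_epochs = len(recording) // epoch_len
    epochs = [recording[i * epoch_len:(i + 1) * epoch_len] for i in range(n_epochs)]
    # The repeating artifact survives averaging; unrelated brain activity mostly cancels.
    template = [sum(e[t] for e in epochs) / n_epochs for t in range(epoch_len)]
    return [e[t] - template[t] for e in epochs for t in range(epoch_len)]


def rms(values):
    """Root-mean-square amplitude of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Synthetic data (hypothetical values): weak "brain" signal plus a large repeating artifact.
random.seed(0)
epoch_len, n_epochs = 200, 50
neural = [random.gauss(0.0, 1.0) for _ in range(epoch_len * n_epochs)]
artifact = [100.0 * math.sin(2 * math.pi * t / 25) for t in range(epoch_len)]
recording = [neural[i] + artifact[i % epoch_len] for i in range(len(neural))]

cleaned = average_artifact_subtraction(recording, epoch_len)

# How far each version is from the true underlying signal.
error_before = rms([r - n for r, n in zip(recording, neural)])
error_after = rms([c - n for c, n in zip(cleaned, neural)])
```

In this toy case the artifact is about 70 times larger than the signal before cleaning and only a small residual remains afterwards. Real recordings are harder, since the artifact drifts over time and the participant moves.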

During the experiment, volunteers viewed pairs of abstract symbols and chose the one they believed would be more profitable, that is, the one that would earn more points. They learned through trial and error, using the outcome of each choice to guide subsequent decisions. Correct choices earned points and increased the money participants received; incorrect choices yielded no reward. To keep the task engaging and realistic, even the correct symbol carried a 30% chance of producing a penalty, adding uncertainty to the learning process.
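To make the task rules concrete, here is a minimal simulation of a single trial's outcome. Only the 30% penalty rate and the "no reward for an incorrect choice" rule come from the description above; the point values of +1 and -1 and the function name are assumptions chosen for illustration.

```python
import random

PENALTY_RATE = 0.30  # from the article: the correct symbol is penalized 30% of the time


def trial_outcome(chose_correct, rng):
    """Return hypothetical points for one trial: +1 reward, -1 penalty, 0 nothing."""
    if not chose_correct:
        return 0  # article: incorrect choices yielded no reward
    # Even the correct symbol sometimes produces a penalty.
    return -1 if rng.random() < PENALTY_RATE else 1


rng = random.Random(1)  # fixed seed so the run is repeatable
results = [trial_outcome(True, rng) for _ in range(10000)]
penalty_fraction = results.count(-1) / len(results)
```

Over many simulated trials, `penalty_fraction` settles close to 0.30, which is why a single bad outcome is not enough to tell a participant that a symbol is the wrong one.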

The findings reveal two separate but interacting value systems, distinct in both time and brain location, that support reward-guided learning in humans. An early system responds primarily to negative outcomes and recruits arousal-related and motor-preparatory circuits. This fast signal appears tuned to detect adverse events and trigger immediate alertness and action.

A later system emerges after the initial response and drives the value updating that underlies approach and avoidance learning. This slower system differentially modulates areas of the reward network: it suppresses activity following negative outcomes and activates reward regions after positive outcomes, aligning with its role in reinforcing adaptive choices over time.
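Value updating of this kind is often described with a simple prediction-error rule from reinforcement learning, in which the stored value of an option moves a fraction of the way toward each new outcome. The sketch below shows that textbook rule; it is not taken from the article, and the learning rate, the outcome values and the function name are assumptions for illustration.

```python
def update_value(value, outcome, learning_rate=0.2):
    """Shift the stored value toward the outcome by a fraction of the surprise."""
    prediction_error = outcome - value  # positive if better than expected
    return value + learning_rate * prediction_error


value = 0.0
history = []
# Three rewards (+1) followed by one penalty (-1); hypothetical outcome values.
for outcome in [1, 1, 1, -1]:
    value = update_value(value, outcome)
    history.append(value)
```

Here the value climbs with each reward and drops sharply after the penalty, because the penalty is far from what the learner had come to expect.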

Importantly, when a negative outcome occurs, the early system down-regulates the later system, allowing the brain to prioritize immediate avoidance and to adjust how rewarding similar options will seem in the future. The researchers observed that this interaction involves a thalamic influence on the ventral striatum (a thalamostriatal pathway), and that the strength of this coupling predicted how likely participants were to switch choices and engage in avoidance learning.

“Our research opens up new avenues for investigating the neural systems underlying both normal and maladaptive decision making,” Dr. Philiastides said. He added that these findings could improve understanding of how everyday responses to rewarding or stressful events shape our ability to make optimal decisions. The work also has potential implications for studying psychiatric conditions marked by altered responses to aversive outcomes—such as chronic stress, obsessive-compulsive disorder, post-traumatic stress disorder, and depression—by revealing how those conditions may disrupt learning and strategic planning.

The ability to record EEG signals inside an MRI scanner depends on removing scanner-generated noise. Image is for illustrative purposes only. Credit: Chris Hope.
About this neuroscience research

Source: Stuart Forsyth – University of Glasgow
Image Credit: The image is credited to Chris Hope and is licensed CC BY 2.0
Original Research: “Two spatiotemporally distinct value systems shape reward-based learning in the human brain” by Elsa Fouragnan, Chris Retzler, Karen Mullinger and Marios G. Philiastides in Nature Communications. Published online September 8, 2015. doi:10.1038/ncomms9107


Abstract

Two spatiotemporally distinct value systems shape reward-based learning in the human brain

Avoiding repeated mistakes and reinforcing rewarding decisions are essential for adaptive behavior. However, the neural basis of distinct value systems that encode different decision outcomes has been elusive. By combining single-trial EEG with simultaneously acquired fMRI, the study uncovers two separate but interacting value systems that encode outcomes across space and time. An early system, activated only by negative outcomes, engages arousal and motor-preparatory structures consistent with a role in alertness and behavioral switching. A later system, associated with reward-based learning, suppresses or activates regions of the reward network after negative and positive outcomes, respectively. Following negative outcomes, the early system interacts with and downregulates the late system through thalamic input to the ventral striatum. The strength of this thalamostriatal coupling predicts participants’ switching behavior and avoidance learning, implicating this pathway in reward-driven learning.

