Summary: A landmark study overturns a long-held assumption in neuroscience: the speed of learning is determined not only by repetition and experience but also, to a large degree, by the size of the reward.
Researchers show that larger rewards produce higher-amplitude, longer-lasting dopamine signals in the brain. This prolonged dopamine response increases engagement and amplifies learning from each trial, dramatically shrinking training times. In some cases, a few high-value rewards taught a complex task far faster than thousands of small, incremental rewards.
Key Facts
- Repetition Is Not Everything: The prevailing belief—that learning requires many uniform, small rewards to gradually shape behavior regardless of reward value—has been challenged.
- Cookie vs. M&M Effect: In experiments, thirsty mice given a few large water rewards mastered a task within a single day after fewer than ten rewards. Mice receiving many small sips needed weeks of practice to reach comparable skill.
- Reduced Individual Variability: Standard small-reward approaches produced wide variations in learning speed between animals. Large rewards drastically narrowed that gap, sending most subjects to expert performance within days.
- Extended Dopamine Response: Bigger rewards not only create larger dopamine spikes but also prolong the dopamine signal, changing the time course of neural reinforcement.
- Engagement Drives Speed: Large rewards increase three key components of efficient learning—greater learning per repetition, improved carryover of progress between days, and stronger task engagement—with engagement emerging as the dominant factor affecting individual learning rates.
- Mice Can Learn More Complex Tasks: Shorter training times and heightened engagement enable rodents to be trained on tasks previously considered too complex, expanding the range of questions neuroscientists can study in mice.
Source: HHMI
Background: Scientists long believed that the pace of learning depended mainly on experience—the number of attempts and rewards—rather than on the actual size of each reward. The assumption was that whether the prize is $100 or $100 million, repeated practice—wins and losses—is what shapes skill.
New work from the Dudman Lab at HHMI’s Janelia Research Campus shows the size of the reward matters more than previously recognized and that larger rewards can accelerate learning.
How Reward Size Changes Learning Speed
Traditionally, neuroscientists assumed animals require many repetitions, each paired with a small reward, to learn tasks slowly over time. Many labs simply adopted small, uniform reward magnitudes without testing whether reward size itself could influence learning efficiency.
“The whole field has been using this approach for decades and, quite literally, nobody had checked whether reward size matters,” says Janelia Senior Group Leader Josh Dudman.
When Dudman’s team tested this, the results were dramatic. Thirsty mice receiving a few substantial water rewards learned tasks far faster than mice given many small sips. In the simplest illustration, giving a mouse a few larger rewards (the equivalent of offering a cookie rather than a single M&M) led to mastery in one day with fewer than ten reinforcements, while standard small-reward protocols required many days and thousands of trials.
Unexpectedly, larger rewards also reduced variability between animals. Under typical small-reward training, one mouse might learn within a week and another might take a month. With larger rewards, most animals reached expert-level performance within a few days.
“We resigned ourselves to training animals for weeks before seeing results,” says Luke Coddington, senior scientist in the Dudman Lab and lead author of the new study. “Now, within a day, mice are already performing at a high level.”
How Dopamine Influences Learning Speed
The team identified three contributors to faster learning when reward size is increased:
- greater learning per repetition
- better retention of gains across days
- higher engagement during training sessions
Compared with small rewards, larger rewards generated bigger dopamine responses and, crucially, those dopamine signals lasted longer. When researchers artificially lengthened the dopamine signals tied to small rewards, learning accelerated as well.
Longer dopamine responses led animals to learn more from each trial and to remain engaged for longer periods, which together produced faster acquisition. Among the three components, sustained engagement accounted for the largest share of individual differences in learning speed.
“By boosting dopamine responses in these experiments, it’s as if all the ‘students’ in our ‘classroom’ became fully engaged,” Coddington explains.
What This Means for Neuroscience Research
These findings have practical and conceptual implications. Practically, larger rewards cut training times and reduce variability between subjects, simplifying the study of skill acquisition. The Dudman Lab has already adopted larger rewards across many projects, reporting substantial changes in experimental workflow.
Conceptually, the results suggest mice can be trained more efficiently for complex navigation, motor, and decision-making tasks than previously thought, opening opportunities to investigate aspects of cognition once considered beyond reach.
“Beyond saving time, this could let us study cognitive processes in mice that we didn’t realize were possible,” Coddington says. “If we can truly engage them, the possibilities expand.”
Key Questions Answered:
A: Larger rewards alter dopamine signaling—the brain’s primary system for motivation and reinforcement. Small rewards produce brief dopamine flashes, while large rewards trigger dopamine responses that persist longer. That extended signal helps the brain consolidate the memory of the successful action more effectively and more quickly.
A: Sustained dopamine activity associated with a large reward keeps subjects focused. In conventional setups, animals often lose focus and show wide variability in learning speed. The prolonged dopamine wave produced by a big reward effectively increases attention and motivation across individuals, turning distracted learners into consistently engaged ones.
A: It significantly reduces training time and costs. Tasks that once required weeks or months of routine training to establish behavioral baselines can reach mastery in under 48 hours with appropriately sized rewards, freeing resources to explore more complex cognitive experiments.
Editorial Notes:
- This article was edited by a Neuroscience News editor.
- The journal paper was reviewed in full.
- Additional context was provided by editorial staff.
About this learning and neuroscience research news
Author: Halea Kerr-Layton
Contact: Halea Kerr-Layton – HHMI
Original Research: Closed access.
“Reward magnitude determines reinforcement learning efficiency” by Sheng Gong, Alyssa Martell, Joshua T. Dudman, and Luke T. Coddington. Science
DOI: 10.1126/science.aeb0813
Abstract
Reward magnitude determines reinforcement learning efficiency
INTRODUCTION
Across fields that study learning—from artificial intelligence to experimental psychology—the learning rate is often treated as a parameter that controls individual differences in how quickly skill is acquired and is typically viewed as independent of reward size. This implies learning efficiency depends mainly on the amount of experience (number of rewards) an individual receives.
Recent theoretical work linking dopamine function to reinforcement learning, together with classic findings that dopamine encodes reward magnitude, raises a different possibility: learning rates may depend on reward size. If so, standard laboratory reward magnitudes could be suboptimal and may have slowed training times and led researchers to underestimate animals’ true learning capacity.
RATIONALE
Previous influential observations suggested dopamine neuron activity implements the reward prediction error in reinforcement learning. Newer proposals indicate dopamine may map onto the learning rate during acquisition—the parameter that determines how quickly learning converges. Because dopamine responses scale with reward magnitude, reward size could therefore control learning efficiency.
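The contrast between the two views can be illustrated with a toy delta-rule learner. This is a minimal sketch under stated assumptions, not the model used in the study: the function names, the 90% criterion, the base learning rate of 0.01, and the linear scaling of learning rate with reward size are all hypothetical choices made for illustration. With a fixed learning rate, the number of trials needed to reach a given fraction of the asymptotic value does not depend on reward size. If the learning rate instead grows with reward magnitude, a larger reward reaches the same criterion in fewer trials.

```python
def trials_to_criterion(reward, alpha, criterion=0.9, max_trials=100_000):
    """Count delta-rule updates until the learned value reaches criterion * reward."""
    value = 0.0
    for trial in range(1, max_trials + 1):
        prediction_error = reward - value   # reward prediction error
        value += alpha * prediction_error   # alpha is the learning rate
        if value >= criterion * reward:
            return trial
    return max_trials


def magnitude_scaled_alpha(reward, base_alpha=0.01, base_reward=1.0, cap=0.9):
    """Hypothetical rule: learning rate grows linearly with reward size, up to a cap."""
    return min(cap, base_alpha * reward / base_reward)


if __name__ == "__main__":
    for reward in (1.0, 10.0):
        fixed = trials_to_criterion(reward, alpha=0.01)
        scaled = trials_to_criterion(reward, alpha=magnitude_scaled_alpha(reward))
        print(f"reward={reward}: fixed-rate trials={fixed}, scaled-rate trials={scaled}")
```

In this sketch, the fixed-rate learner needs the same number of trials for both reward sizes, while the scaled-rate learner needs roughly an order of magnitude fewer trials for the tenfold larger reward.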
Yet there are few data on optimal reward magnitudes for learning in laboratory animals, particularly for the complex navigation, motor, and decision tasks common in modern systems neuroscience. The field typically uses rewards that are very small relative to a mouse’s daily needs (<1%). This study set out to test whether increasing reward magnitude would enhance learning efficiency and, if so, why.
RESULTS
Raising reward magnitude by one to two orders of magnitude above commonly used sizes substantially improved learning efficiency across several tasks. Mice required at least an order of magnitude fewer trials to learn a hidden-target navigation task, an effort-based reach-to-pull motor skill, and a sensorimotor decision-making task, while final performance quality remained comparable.
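To put these magnitudes in concrete terms, the arithmetic can be sketched as follows. The only inputs are the figures stated in the abstract: standard rewards of less than 1% of daily needs, and a scale-up of one to two orders of magnitude. Treating 1% as the upper bound for a standard reward is an assumption, and the function name is hypothetical.

```python
from fractions import Fraction
from math import ceil

# Assumption: 1% of daily needs is the upper bound for a standard reward,
# so the counts below are lower bounds on the number of rewards required.
STANDARD_REWARD_FRACTION = Fraction(1, 100)


def rewards_to_cover_daily_need(scale):
    """Number of rewards that add up to one day's needs at a given scale-up factor."""
    return ceil(1 / (STANDARD_REWARD_FRACTION * scale))


if __name__ == "__main__":
    for scale in (1, 10, 100):
        print(f"{scale}x reward: {rewards_to_cover_daily_need(scale)} rewards per day's needs")
```

Under this assumption, a mouse needs at least 100 standard rewards to meet its daily needs, whereas rewards scaled up 10 to 100 times meet them in about 10 rewards or in a single reward.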
In extreme cases, some mice learned a hidden-target navigation task after only a few reinforced experiences—whereas the same task typically requires hundreds or thousands of reinforcements under standard reward sizes.
These effects were explained by three critical components that determine learning efficiency: (i) the learning rate, (ii) the ability to retain improvements between sessions, and (iii) sustained engagement during a task. Larger rewards improved all three and produced longer-lasting activity in dopamine neurons during reward consumption.
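How the three components could combine can be sketched with a small deterministic simulation. This is a toy illustration and not the analysis from the paper: the `Learner` fields, the parameter values, and the 100-trial session length are hypothetical. Each engaged trial closes a fraction of the remaining gap to full skill (learning rate), only part of each session's trials are attempted (engagement), and some skill is lost between sessions (retention).

```python
from dataclasses import dataclass


@dataclass
class Learner:
    learning_rate: float  # fraction of the remaining error corrected per engaged trial
    retention: float      # fraction of skill carried over to the next session
    engagement: float     # fraction of a session's trials the animal attempts


def sessions_to_criterion(learner, trials_per_session=100, criterion=0.9,
                          max_sessions=1000):
    """Return the session in which skill first reaches criterion, or None."""
    skill = 0.0
    engaged_trials = round(learner.engagement * trials_per_session)
    for session in range(1, max_sessions + 1):
        for _ in range(engaged_trials):
            skill += learner.learning_rate * (1.0 - skill)
            if skill >= criterion:
                return session
        skill *= learner.retention  # partial loss of gains between sessions
    return None


if __name__ == "__main__":
    # Hypothetical parameter values chosen only to contrast the two regimes.
    small_reward = Learner(learning_rate=0.02, retention=0.9, engagement=0.5)
    large_reward = Learner(learning_rate=0.2, retention=0.95, engagement=0.9)
    print("small reward:", sessions_to_criterion(small_reward), "sessions")
    print("large reward:", sessions_to_criterion(large_reward), "sessions")
```

With these illustrative values, the large-reward learner reaches criterion within its first session, while the small-reward learner needs several sessions because it attempts fewer trials, learns less from each one, and loses more between sessions.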
Using optogenetic tools to prolong dopamine activity during standard small rewards increased learning efficiency in both the hidden-target navigation and the effort-based motor skill tasks. Dopamine stimulation raised learning rates and reduced disengagement but did not improve carryover of prior learning. The study also found that while larger rewards generally enhanced dopamine measures of learning, they did not always produce clear improvements in every behavioral paradigm—for instance, large rewards can disrupt anticipatory responses in classical conditioning.
CONCLUSION
Larger rewards than those conventionally used in the field can enhance learning efficiency in mice across navigation, motor, and decision-making tasks. A major source of individual variance was the ability to remain engaged, whereas variance in intrinsic learning rate appeared smaller than expected. Consequently, larger rewards can substantially reduce differences between individuals in learning efficiency. Mesolimbic dopamine neuron activity can influence learning in multiple ways depending on both the magnitude and the temporal profile of activation.