High Rewards Speed Up Learning by Prolonging Brain Signals

Summary: A landmark study overturns a long-standing assumption in neuroscience, namely that learning speed depends mainly on repetition and experience rather than on reward size. The research shows that larger rewards produce stronger, longer-lasting dopamine signals in the brain, driving higher engagement and substantially faster acquisition of complex skills. In practice, a few high-value rewards can teach a task far more efficiently than thousands of small, repetitive reinforcements.

Key Facts

  • Challenging the Repetition Model: Traditional thinking held that learning arises mainly from the number of repetitions with small rewards, independent of reward size. This study demonstrates that reward magnitude itself strongly shapes learning efficiency.
  • The Cookie vs. M&M Effect: In experiments, thirsty mice given a small number of large water rewards learned a task in a single day with fewer than 10 rewards. Mice receiving many tiny sips needed weeks to reach the same skill level.
  • Reduced Individual Variability: Standard small-reward protocols produce wide differences across subjects. Large rewards dramatically compressed that variability, bringing nearly all animals to expert performance within days.
  • Extended Dopamine Response: Larger rewards not only increase dopamine peak levels but also extend the duration of the dopamine signal, which appears critical for rapid learning.
  • Engagement as a Learning Driver: The study identified three reward-driven components that accelerate learning: greater learning per trial, improved day-to-day retention, and increased task engagement. Of these, task engagement most strongly predicted individual learning speed.
  • New Opportunities for Complex Behavioral Studies: By shortening training time and reducing variability, higher rewards enable researchers to train mice on more complex cognitive and motor tasks previously considered impractical for rodents.

Source: HHMI

Background: For decades, researchers assumed that learning speed depended mainly on how many times an animal practiced a behavior, not how large the reward was. The common analogy was that experience — playing many hands of poker — improves skill regardless of whether the stakes are modest or immense. New experiments from the Dudman Lab at HHMI’s Janelia Research Campus show that this view misses a critical factor: jackpot size matters.

Researchers in the Dudman Lab tested whether increasing reward magnitude would change learning dynamics. Their results were striking: mice given a few large water rewards learned tasks far more rapidly than mice given many small rewards. A behavior that ordinarily required thousands of tiny reinforcements could often be learned in a single day when rewards were scaled up.

How Reward Size Affects Learning Speed

The field has long used small reward magnitudes as a default in rodent training. The Dudman Lab questioned that convention and systematically increased reward sizes by an order of magnitude or more. Across multiple paradigms — including hidden-target navigation, an effort-based reach-to-pull motor skill, and sensorimotor decision-making — larger rewards led to dramatically faster learning without degrading final performance quality.

In practical terms, mice with larger rewards learned from far fewer trials. In some cases, animals reached high performance after only a handful of successful trials, whereas mice under standard reward regimes required hundreds or thousands of reinforcements. Importantly, the change also reduced inter-animal variability: instead of some mice learning quickly and others slowly over weeks, most animals reached proficiency in just a few days.

How Dopamine Controls Learning Speed

The researchers linked these effects to dopamine, a key neuromodulator of learning and motivation. Larger rewards produced both bigger dopamine responses and a longer-lasting dopamine signal during reward consumption. These extended dopamine dynamics improved three aspects of learning efficiency:

  • greater learning per trial (higher effective learning rate),
  • better retention of gains across days, and
  • stronger, more sustained engagement during training sessions.

To test causality, the team used optogenetic stimulation to artificially prolong dopamine signals associated with small rewards. Sustained dopamine activation reproduced several benefits of large rewards, increasing learning rate and engagement, although it did not fully replicate the carry-over improvements between sessions. These findings indicate that both the magnitude and the temporal profile of dopamine activity shape how quickly and consistently animals learn.

The experiments also revealed that engagement is a dominant source of individual differences: mice that remained focused and active during training learned much faster, and large rewards helped convert otherwise distracted individuals into reliably engaged learners.

Implications for Neuroscience Research

This work has practical and conceptual implications. Practically, using larger rewards can sharply reduce training time and experimental variability, freeing resources and accelerating research timelines. Conceptually, the findings require a re-evaluation of common assumptions about how reward parameters should be chosen in behavioral neuroscience and reinforcement learning studies.

The Dudman Lab already applies higher reward magnitudes across current projects, reshaping experimental design and enabling studies of cognitive functions in mice that were previously impractical due to prohibitive training demands. By improving engagement and accelerating acquisition, this approach opens new avenues to probe learning, decision-making, and neural mechanisms at a complexity closer to primate-level tasks.

Key Questions Answered:

Q: Why does the size of a prize change how fast the brain builds a new skill?

A: Larger rewards alter dopamine dynamics. Small rewards trigger brief dopamine flashes; large rewards evoke bigger and longer-lasting dopamine signals. That sustained neuromodulatory state enhances encoding of successful actions, accelerating the consolidation of new skills.

Q: If engagement is the secret to learning, how does a big reward act like a great teacher?

A: Sustained dopamine responses produced by large rewards maintain focus and motivation. In practice, this reduces distraction and variability across subjects, effectively turning less engaged animals into attentive, high-performing learners.

Q: How does this discovery change daily laboratory operations?

A: Increasing reward magnitude can dramatically reduce training overhead and shorten timelines. Tasks that previously required weeks of training can be mastered within 48 hours, letting researchers allocate time and resources to more advanced behavioral questions.

Editorial Notes:

  • This article was edited by a Neuroscience News editor.
  • The journal paper was reviewed in full.
  • Additional context was provided by editorial staff.

About this learning and neuroscience research news

Author: Halea Kerr-Layton
Source: HHMI
Contact: Halea Kerr-Layton – HHMI

Original Research: Closed access. “Reward magnitude determines reinforcement learning efficiency” by Sheng Gong, Alyssa Martell, Joshua T. Dudman, and Luke T. Coddington. Science. DOI: 10.1126/science.aeb0813


Abstract

Reward magnitude determines reinforcement learning efficiency

INTRODUCTION

Across fields that study learning — from artificial intelligence to experimental psychology — a common assumption is that a parameter called the learning rate governs individual differences in acquisition and is relatively independent of reward size. This implies that learning primarily reflects the amount of experience (the number of rewards). New theoretical and experimental work, however, suggests reward magnitude may influence the learning rate itself, raising the possibility that typical reward distributions used in the laboratory are suboptimal and may underestimate animals’ true learning efficiency.
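
To make the common assumption concrete, here is a minimal sketch of a textbook delta-rule learner with a fixed learning rate. It is an illustration only, not the model used in the paper: the function name `trials_to_criterion`, the reward sizes (arbitrary units), and the value of `alpha` are all assumptions chosen for the example. With a constant learning rate, the number of rewarded trials needed to reach a given fraction of the asymptote comes out the same for a small and a large reward, which is why this view ties learning to the amount of experience rather than to reward size.

```python
def trials_to_criterion(reward, alpha, criterion=0.9, max_trials=10_000):
    """Count rewarded trials until the value estimate reaches
    criterion * reward under a fixed-learning-rate delta rule.

    All parameters are hypothetical and chosen for illustration.
    """
    value = 0.0
    for trial in range(1, max_trials + 1):
        prediction_error = reward - value  # reward prediction error
        value += alpha * prediction_error  # delta-rule update
        if value >= criterion * reward:
            return trial
    return None  # criterion not reached within max_trials


# Assumed reward sizes in arbitrary units; alpha is an assumed constant.
small = trials_to_criterion(reward=1.0, alpha=0.01)
large = trials_to_criterion(reward=20.0, alpha=0.01)
print(small, large)  # the two counts match when alpha is fixed

# If the learning rate itself were larger, fewer trials would be needed.
faster = trials_to_criterion(reward=1.0, alpha=0.2)
print(faster)
```

In this sketch, the only way a larger reward can shorten learning is by changing `alpha` itself, which is the possibility the study set out to test.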

RATIONALE

Dopamine neuron activity has been linked to reward prediction errors in reinforcement learning models, and classic experiments show dopamine correlates with reward magnitude. Together, these observations motivate the hypothesis that reward size could determine learning efficiency. Yet there are limited data testing what reward magnitudes best support learning across the range of tasks commonly used in systems neuroscience with mice. The field has tended to use small reward magnitudes relative to a mouse’s daily needs. This study examined whether increasing reward magnitude enhances learning efficiency and why that might occur.

RESULTS

Increasing reward magnitude by one to two orders of magnitude relative to standard field practices substantially improved learning efficiency across diverse tasks. Mice learned with an order of magnitude fewer trials in hidden-target navigation, an effort-based motor skill, and a sensorimotor decision task, without notable loss in final performance quality. At the upper limits, some mice acquired navigation skills after only a few reinforcements, where standard rewards would require hundreds or thousands of trials.

The improvement in efficiency could be explained by enhancements in three components: the learning rate, retention of improvements between sessions, and sustained engagement. Large rewards increased sustained dopamine activity during consumption. Optogenetic prolongation of dopamine responses during standard rewards was sufficient to boost learning rate and reduce disengagement but did not fully replicate improvements in between-session retention. Not all behaviors improved uniformly; for example, large rewards sometimes disrupted anticipatory responses in classical conditioning.
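
The way these three components could combine can be sketched with a toy simulation. This is a hedged illustration rather than the authors' analysis: the function `trials_to_expert`, the update rule, and every parameter value below are hypothetical and were picked only to show the direction of the effects, not to reproduce any reported numbers.

```python
def trials_to_expert(learning_rate, retention, engagement,
                     trials_per_session=200, criterion=0.9,
                     max_sessions=200):
    """Toy model of skill acquisition across training sessions.

    learning_rate: fraction of the remaining gap closed on an engaged trial
    retention:     fraction of skill carried over to the next session
    engagement:    fraction of trials on which the animal is engaged
    Returns (sessions, total_trials) at criterion, or None if never reached.
    """
    skill = 0.0
    total_trials = 0
    for session in range(1, max_sessions + 1):
        for _ in range(trials_per_session):
            total_trials += 1
            # Expected-value update: the per-trial gain is scaled by
            # the fraction of trials on which the animal is engaged.
            skill += engagement * learning_rate * (1.0 - skill)
            if skill >= criterion:
                return session, total_trials
        # Between sessions, only part of the acquired skill is retained.
        skill *= retention
    return None


# Hypothetical parameter sets standing in for the two reward regimes.
small_reward = trials_to_expert(learning_rate=0.01, retention=0.9,
                                engagement=0.5)
large_reward = trials_to_expert(learning_rate=0.1, retention=0.98,
                                engagement=0.9)
print("small reward (sessions, trials):", small_reward)
print("large reward (sessions, trials):", large_reward)
```

Under these assumed values, the large-reward setting reaches criterion in fewer sessions and fewer trials. Changing one parameter at a time shows how each component contributes on its own in this toy model.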

CONCLUSION

Larger reward magnitudes than typically used in the field can substantially enhance learning efficiency across navigation, motor, and decision-making tasks in mice. One major source of variability across animals was sustained engagement, which large rewards helped normalize. Individual differences in intrinsic learning rate appeared smaller than previously thought. Mesolimbic dopamine activity can influence multiple aspects of learning depending on its magnitude and temporal profile, suggesting practical and theoretical adjustments to how reward parameters are chosen in behavioral neuroscience.