Summary: For more than a century, associative learning has been framed by Pavlov’s model: repetition strengthens the link between a cue and a reward. A recent UCSF study challenges that long-standing idea, showing that the interval between rewards—not sheer repetition—determines how effectively the brain learns.
The research demonstrates that the brain’s dopamine system is tuned to prioritize timing. When rewards are rare and spaced apart, learning is more efficient: fewer experiences are needed before the brain begins to anticipate an outcome. This timing-based mechanism offers a biological explanation for why spaced practice outperforms cramming and suggests new directions for both education and artificial intelligence.
Key Facts
- The Timing Rule: Learning strength is governed by the elapsed time between cue–reward pairings rather than the absolute count of repetitions within a fixed period.
- Dopamine Acceleration: Greater spacing between rewards reduces the number of pairings required before dopamine is released at the cue, signaling anticipation.
- Sparse Learning Efficiency: In experiments, mice that received rewards only 10% of the time learned as quickly as, or faster than, mice that experienced rewards far more frequently.
- The “Cramming” Effect: When events occur in rapid succession, the brain downregulates learning from each instance, producing diminishing returns from repetitive exposure.
- Implications for AI: Models inspired by this “sparse learning” principle could potentially learn faster from fewer examples, addressing a major limitation of current data-hungry AI systems.
Source: UCSF
Background: More than a century ago, Ivan Pavlov showed that a dog could learn to associate a bell with food, and that observation has shaped theories of associative learning ever since: more pairings should yield stronger associations. The UCSF team reexamined this idea to test whether repetition alone drives learning or whether the timing between events matters more.

“The interval between cue–reward pairings helps the brain decide how much to learn from each experience,” said Vijay Mohan K. Namboodiri, PhD, associate professor of Neurology and senior author of the study published in Nature Neuroscience. The team’s experiments indicate that when events are clustered closely together, the brain learns less from each occurrence, which sheds light on why cramming often produces weaker retention than distributed practice.
How the experiments worked
The researchers trained mice to associate a brief auditory cue with a sweetened water reward while varying the interval between trials. Some mice experienced trials every 30–60 seconds, while others had trials spaced five to ten minutes or longer apart. Over the same time window, mice with closely spaced trials received many more rewards than those with long intervals.
If associative learning depended only on repetition, mice with more frequent rewards should have learned faster. Instead, the mice exposed to infrequent rewards learned at equal—or sometimes greater—rates despite far fewer cue–reward pairings. “This shifts the perspective from ‘practice makes perfect’ to ‘timing matters,’” said Dennis Burke, PhD, the study’s first author.
The team monitored dopamine signals in the animals’ brains and found that spaced rewards produced earlier and stronger dopaminergic responses to the cue. In a further test, cues were presented at short intervals (about 60 seconds apart) but only followed by reward 10% of the time. Even with this low reward probability, the animals developed dopamine responses to the cue after relatively few rewarded trials—demonstrating that rarity and spacing accelerate anticipatory dopamine signaling.
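The arithmetic behind these conditions can be sketched in a few lines. The snippet below is only an illustration: the `Schedule` class, the one-hour session length, and the specific values of 60 and 600 seconds are assumptions chosen from within the ranges described above, not parameters reported by the study. It shows that rewarding 10% of cues presented about 60 seconds apart yields roughly the same average time between rewards as rewarding every cue presented ten minutes apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """A hypothetical training schedule (names and values are illustrative)."""
    name: str
    trial_interval_s: float    # seconds between successive cue presentations
    reward_probability: float  # fraction of cues followed by reward

    def mean_reward_interval_s(self) -> float:
        # On average, 1 / p cues pass between one reward and the next.
        return self.trial_interval_s / self.reward_probability

    def expected_rewards(self, session_s: float) -> float:
        # Expected number of rewards delivered in a session of this length.
        return session_s / self.mean_reward_interval_s()


SESSION_S = 3600.0  # assumed one-hour window, not taken from the study

schedules = [
    Schedule("closely spaced, always rewarded", 60.0, 1.0),
    Schedule("widely spaced, always rewarded", 600.0, 1.0),
    Schedule("closely spaced, 10% rewarded", 60.0, 0.1),
]

for s in schedules:
    print(f"{s.name}: ~{s.mean_reward_interval_s():.0f} s between rewards, "
          f"~{s.expected_rewards(SESSION_S):.0f} rewards per session")
```

Under these assumed values, the second and third schedules deliver rewards at the same average spacing even though the cues themselves arrive ten times more often in the third.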
Broader implications
These findings reshape how we understand learning and habit formation. They provide a biological account for why intermittent rewards—such as gambling wins or unpredictable social media notifications—are especially compelling and habit-forming: unpredictability maintains sensitivity in the dopamine system and strengthens learning.
The results also offer practical guidance for education and skill acquisition. Distributing study or practice sessions with appropriate spacing is likely to produce stronger learning than concentrated, repetitive sessions. For addiction treatment, the study suggests that constant delivery of a drug (e.g., via a nicotine patch) may reduce cue–reward associations by removing the intermittent timing that keeps the dopamine system responsive.
Looking ahead, Namboodiri and colleagues are exploring how this timing principle could inform new AI learning algorithms. Current machine learning models typically refine predictions after every interaction and require enormous datasets. A learning framework that prioritizes informative, spaced experiences could improve learning efficiency and reduce data requirements.
Authors: Additional contributors include Annie Taylor, Huijeong Jeong, SeulAh Lee, Leo Zsembik, Brenda Wu, Joseph Floeder, Gautam Naik, and Ritchie Chen, all from UCSF.
Funding: Supported by the National Institutes of Health (grants R00MH118422, R01MH129582, F32DA060044), the National Science Foundation, the Klingenstein-Simons Fellowship, the David and Lucile Packard Foundation, and the Shurl and Kay Curci Foundation.
Key Questions Answered:
A: Not at all. The findings emphasize spacing over continuous grinding. For many skills—languages, music, or complex concepts—multiple shorter sessions separated by breaks typically produce better retention than one long, uninterrupted session.
A: From an information-processing perspective, rare events provide stronger signals. If an outcome happens constantly, it blends into the background; when it’s rare, the brain treats it as more informative and gives extra weight to its timing.
A: Intermittent and unpredictable rewards keep the dopamine system highly responsive, which can deepen habit formation. That unpredictability is a key reason activities like gambling and some forms of social media use are so addictive.
Editorial Notes:
- This article was edited by a Neuroscience News editor.
- Journal paper reviewed in full.
- Additional context added by staff.
About this learning and neuroscience research
Author: Laura Kurtzman
Contact: Laura Kurtzman – UCSF
Original Research: Open access. “Duration between rewards controls the rate of behavioral and dopaminergic learning” by Dennis A. Burke, Annie Taylor, Huijeong Jeong, SeulAh Lee, Leo Zsembik, Brenda Wu, Joseph R. Floeder, Gautam A. Naik, Ritchie Chen & Vijay Mohan K Namboodiri. Nature Neuroscience
DOI:10.1038/s41593-026-02206-2
Abstract
Duration between rewards controls the rate of behavioral and dopaminergic learning
Understanding how animals learn which cues predict valuable outcomes is vital for survival. Mesolimbic dopamine is central to cue–reward associative learning and is commonly described as signaling a reward prediction error. Typical dopamine-based learning models are trial-based, assuming learning accumulates with the number of cue–outcome pairings experienced within a fixed time.
This study identifies a biological principle that challenges that assumption. Across multiple experimental conditions in mice, the rates of both behavioral and dopaminergic learning scale with the duration between rewards or punishments. Consequently, total learning accumulated over a fixed time can be independent of the sheer number of cue–outcome events. A dopamine-based retrospective learning model accounts for these observations and offers a unified explanation for the underlying biological mechanisms of learning.
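The scaling described in the abstract can be illustrated with a toy calculation. The sketch below is not the retrospective learning model from the paper. It is a simple delta-rule update in which the per-pairing learning rate is assumed to grow in proportion to the time between rewards. The function name `pairings_to_criterion`, the gain of 0.0005 per second, and the criterion of 0.5 are all hypothetical choices made for illustration.

```python
def pairings_to_criterion(reward_interval_s: float,
                          gain_per_s: float = 0.0005,
                          criterion: float = 0.5) -> int:
    """Count cue-reward pairings until associative strength reaches criterion.

    Toy assumption: the learning rate per pairing is proportional to the
    time elapsed between rewards (capped at 1.0).
    """
    if reward_interval_s <= 0 or gain_per_s <= 0:
        raise ValueError("interval and gain must be positive")
    if not 0.0 < criterion < 1.0:
        raise ValueError("criterion must lie strictly between 0 and 1")

    learning_rate = min(1.0, gain_per_s * reward_interval_s)
    strength = 0.0  # associative strength, bounded above by 1.0
    pairings = 0
    while strength < criterion:
        # Standard delta-rule step toward the maximum strength of 1.0.
        strength += learning_rate * (1.0 - strength)
        pairings += 1
    return pairings


for interval_s in (60.0, 600.0):
    n = pairings_to_criterion(interval_s)
    print(f"{interval_s:.0f} s between rewards: {n} pairings, "
          f"{n * interval_s:.0f} s of training")
```

With these assumed values, the 60-second schedule needs 23 pairings (1,380 seconds) and the 600-second schedule needs 2 pairings (1,200 seconds). The number of pairings differs by roughly a factor of ten while the elapsed training time stays about the same, which is the pattern the abstract describes.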