Researchers have developed the first biologically realistic mathematical model describing how the brain plans and learns during complex decision-making.
Scientists at the University of Cambridge have created a comprehensive, biologically grounded model that explains how networks of neurons plan ahead, evaluate alternative courses of action, and adapt after errors. Unlike prior accounts that focused mainly on simple habitual behaviour, this model reproduces both observed choices and neural activity in a wide range of tasks, from straightforward binary choices to multi-step sequential decisions.
The new spiking-neuron model—reported in the Journal of Neuroscience—captures behavioural choice probabilities and successfully predicts hallmark signatures of complex planning such as choice reversal. By matching experimental data across different task types, it offers a unified account of how the brain implements goal-directed decision making and how synapses change to support learning.
Decision making spans a spectrum. At one end are habit-based choices, such as a familiar daily commute: once learned, these actions are retrieved quickly and automatically, like cached data in a computer. At the other end are goal-based choices, which require planning across possible future outcomes. For example, if a road is closed you must evaluate alternate routes at each intersection, considering how each choice affects subsequent options and outcomes.
“Goal-based decisions are computationally demanding because they involve exploring a branching set of possible futures,” said Dr Johannes Friedrich, first author of the paper, who conducted the work while a postdoctoral researcher at Cambridge. “A detour on your commute forces separate decisions at multiple junctions.”
Whereas the neural mechanisms behind habitual choices are relatively well understood, the circuit-level basis for prospective, goal-directed planning has remained unclear. To address this, Friedrich and Dr Máté Lengyel built a biologically plausible neural circuit that performs online value estimation—computing expected cumulative rewards for sequences of actions—using spiking neurons and local synaptic plasticity rules.
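For readers who want the quantity being estimated in symbols, the standard textbook form from reinforcement learning is the Bellman optimality equation. It is given here only as a reference point: the symbols (Q for action value, r for immediate reward, P for transition probability, γ for a discount factor) are conventional assumptions and need not match the notation used in the paper.

```latex
% Expected cumulative reward of taking action a in situation s,
% then acting optimally afterwards (standard Bellman optimality equation).
Q^{*}(s, a) = r(s, a) + \gamma \sum_{s'} P(s' \mid s, a) \, \max_{a'} Q^{*}(s', a')
```

The value of each action depends on the values of the actions available afterwards, which is why planning requires looking several steps ahead rather than retrieving a single stored number.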
The model demonstrates how synaptic weights can embed knowledge of the task structure: which situations follow which actions and what immediate rewards each transition yields. Crucially, the same local learning rules allow those synapses to adapt over time when outcomes differ from expectations. This mirrors learning observed in human and animal studies, where connections strengthen or weaken depending on prior success or failure.
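The idea that stored task structure (which situation follows which action, and what immediate reward each transition yields) is enough to compute a plan can be illustrated with a small sketch. It uses conventional tabular value iteration as a stand-in, not the authors' spiking network, and the commute, state names, and reward numbers are all hypothetical.

```python
# Hypothetical toy commute; every name and number here is illustrative only.
# transitions[state][action] = (next_state, immediate_reward)
transitions = {
    "home": {"main_road": ("junction", -1.0), "back_street": ("park", -2.0)},
    "junction": {"bridge": ("work", -1.0), "tunnel": ("work", -4.0)},
    "park": {"lane": ("work", -2.0)},
    "work": {},  # terminal state: no further actions
}


def plan(transitions, gamma=1.0, sweeps=50):
    """Estimate action values by repeatedly applying the Bellman backup."""
    q = {s: {a: 0.0 for a in acts} for s, acts in transitions.items()}
    for _ in range(sweeps):
        for s, acts in transitions.items():
            for a, (s_next, reward) in acts.items():
                # Best value available after the transition (0 at a terminal state).
                future = max(q[s_next].values(), default=0.0)
                q[s][a] = reward + gamma * future
    return q


def best_action(q, state):
    """Return the action with the highest estimated value in a state."""
    return max(q[state], key=q[state].get)


# Plan with the usual task structure.
q = plan(transitions)

# Assumed road closure: crossing the bridge becomes very costly.
# The table is edited by hand here; in the model, learning rules adjust synapses.
closed = {s: dict(acts) for s, acts in transitions.items()}
closed["junction"]["bridge"] = ("work", -10.0)
q_closed = plan(closed)

print(best_action(q, "home"), best_action(q_closed, "home"))
```

Changing a single stored transition changes the preferred first step at home, even though nothing about the home state itself was altered. That is the sense in which goal-directed choices depend on knowledge of the whole task rather than on cached values.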

By integrating planning and learning within a single framework, the researchers produced what they describe as the most complete model to date of complex, goal-directed decision making at the computational, algorithmic and implementational levels. Beyond improving our understanding of healthy cognition, the model has implications for disorders in which goal-directed control is impaired.
For instance, obsessive-compulsive disorder (OCD) has been associated with a selective deficit in goal-directed control, causing patients to rely excessively on habits. Impaired decision making is also implicated in addiction, suicide attempts, and Parkinson’s disease; a clearer circuit-level picture of planning and learning could inform new approaches to understanding and treating these conditions.
Source: Sarah Collins – University of Cambridge
Image credit: Seung Lab. Image licensed CC BY-NC-SA 3.0 and adapted from a University of Cambridge press release.
Original research: “Goal-Directed Decision Making with Spiking Neurons” by Johannes Friedrich and Máté Lengyel, Journal of Neuroscience. Published online February 3, 2016. doi:10.1523/JNEUROSCI.2854-15.2016
Abstract
Goal-Directed Decision Making with Spiking Neurons
Behavioral and neuroscientific evidence distinguishes habitual from goal-directed action selection. Habit formation—updating cached values—has been extensively studied and is well explained by reward prediction error theories of dopamine function. In contrast, the circuit mechanisms that support goal-directed choice, which require iterative online value estimation, have not been established. Here we present a spiking neural network that provably solves the online value estimation problem underlying goal-directed decisions in a near-optimal manner and reproduces both behavioural and neurophysiological data across tasks ranging from simple binary choice to sequential decision making. Using local plasticity rules, the network learns synaptic weights that enable optimal performance and resolves one-step decision problems commonly studied in neuroeconomics, as well as more demanding sequential tasks within about one second. The resulting decision times, their dependence on task parameters, and final choice probabilities match behavioural observations, while the evolution of neural activity in the model mirrors neural responses recorded in frontal cortical areas during these tasks. This framework provides a principled account of the neural basis of goal-directed decision making and makes testable predictions for sequential tasks involving multiple rewards.
Significance statement: Goal-directed actions require prospective planning, yet their circuit-level mechanisms have been elusive. We show how a biologically realistic spiking-neuron circuit can perform this computationally challenging task. With synaptic weights learned through local plasticity, the dynamics of the network produce near-optimal plans. A systematic comparison with empirical data shows that the model reproduces behavioural decision times, choice probabilities, and neural response patterns across a diverse set of tasks, offering the first biologically grounded account of complex goal-directed decision making at multiple explanatory levels.