Most scientific accounts of psychological mechanisms begin by observing behavior and then proposing theories to explain it.
Researchers at the University of Pennsylvania and the University of Texas at Austin took the opposite approach. Working like engineers or physicists, they modeled every step that leads from a moving object to a speed estimate: light scattering from the object, passage through the eye’s optics, sampling by the retina, and transmission through early visual pathways. They combined these components to build an optimal, principled model of speed estimation.
This kind of comprehensive model is called an “ideal observer”: it uses all available information in the most effective way possible. The team then compared the ideal observer’s performance with that of human observers in controlled speed-discrimination experiments using small patches of natural image movies. The close match between the two indicates that the brain’s speed-estimation computations are near optimal and can be precisely characterized. The findings also suggest practical benefits: engineers developing vision systems for cameras, robotics, or autonomous vehicles may improve performance by emulating the computations used by biological visual systems.
Unlike many previous studies that relied on simplified, artificial stimuli (for example, drifting bars on a uniform background), this study trained and tested the model on brief movies of natural image patches. Using realistic visual input makes the conclusions more directly relevant to how speed estimation operates in everyday natural environments.
The study was led by Johannes Burge, assistant professor in the Department of Psychology at the University of Pennsylvania, and Wilson Geisler, professor and director of the Center for Perceptual Systems at the University of Texas at Austin. Their work was published in Nature Communications.
“There are many descriptions of what the visual system does when estimating motion, but fewer predictions about how it should do it,” Burge explained. “By starting from a best-case, or optimal, scenario, we can ask whether humans make use of available visual information in the most efficient way. If human behavior closely matches the ideal observer, that tells us a lot about the neural computations underlying speed perception.”
The researchers focused on how the visual system estimates the speed of moving images on the retina, a central aspect of motion perception that is essential for navigation and survival. Given the evolutionary importance of accurate motion estimates, it is plausible that visual systems have been tuned to perform these computations efficiently.
To build the ideal observer, the team first modeled the physical and biological stages that shape visual information: the optical blur and filtering produced by the eye’s lens, photon capture and sampling by photoreceptors on the retina, and the initial transformations performed by early visual cortex. Next they asked which stimulus features are most informative for estimating motion speed. Different sensory neurons are sensitive to different features—each has a receptive field with specific space-time selectivity. For example, one neuron may respond strongly to a bright edge moving leftward, while another responds to the same edge moving rightward. The challenge was to identify a compact set of such receptive fields that together give the most accurate speed estimates from natural inputs.
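The idea of a receptive field with space-time selectivity can be illustrated with a textbook construction. The sketch below is not the set of receptive fields the researchers derived from natural movies; it assumes a hypothetical quadrature pair of space-time Gabor filters (a standard "motion energy" unit) and a drifting sinusoid as a stand-in stimulus. All function names and parameter values are illustrative choices.

```python
import math

def gabor_pair(x, t, fx, ft, sigma_x, sigma_t):
    """Even and odd space-time Gabor weights at position x (deg), time t (s)."""
    envelope = math.exp(-x**2 / (2 * sigma_x**2) - t**2 / (2 * sigma_t**2))
    phase = 2 * math.pi * (fx * x - ft * t)
    return envelope * math.cos(phase), envelope * math.sin(phase)

def motion_energy(stimulus_velocity, preferred_velocity,
                  fx=1.0, sigma_x=0.5, sigma_t=0.1, n=41):
    """Phase-invariant response of a unit tuned to preferred_velocity (deg/s).

    The stimulus is a sinusoid of spatial frequency fx (cycles/deg) drifting
    at stimulus_velocity; positive velocities mean rightward motion.
    """
    ft = fx * preferred_velocity  # temporal frequency that matches the tuning
    even_sum = 0.0
    odd_sum = 0.0
    for i in range(n):
        x = -1.0 + 2.0 * i / (n - 1)          # space samples from -1 to 1 deg
        for j in range(n):
            t = -0.25 + 0.5 * j / (n - 1)     # time samples from -0.25 to 0.25 s
            stimulus = math.cos(2 * math.pi * fx * (x - stimulus_velocity * t))
            even, odd = gabor_pair(x, t, fx, ft, sigma_x, sigma_t)
            even_sum += even * stimulus
            odd_sum += odd * stimulus
    # Squaring and summing the quadrature pair removes dependence on phase.
    return even_sum**2 + odd_sum**2

rightward = motion_energy(4.0, preferred_velocity=4.0)
leftward = motion_energy(-4.0, preferred_velocity=4.0)
```

Under these assumptions, the unit tuned to rightward motion at 4 deg/s responds far more strongly to the rightward stimulus than to the same pattern moving leftward, which is the kind of selectivity the paragraph above describes.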
“We identified the small population of receptive fields that best supports precise motion estimation,” Burge said. “If an organism wanted to maximize the accuracy of local speed estimates, these are the kinds of receptive fields it should have.”
By combining those optimal receptive fields with a physical model of how photons reach and excite them, the model predicts how local motion signals should be encoded and decoded into speed estimates. Training and testing on natural scene patches (visual inputs akin to looking through a narrow aperture while moving) allowed the researchers to evaluate performance under realistic stimulus statistics. Importantly, image motion on the retina depends on depth: as an observer moves, features of distant objects sweep across the retina more slowly than features of nearby objects. Determining how to combine these local speed estimates into accurate judgments about self-motion and object motion remains an important question for future work.
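The dependence of retinal speed on depth follows from simple geometry. As a minimal sketch, assume an observer translating in a straight line at constant speed and a stationary point in the scene: the point's angular speed is the observer's speed times the sine of the angle between heading and the point, divided by the point's distance. The function name and the example values are illustrative and do not come from the study.

```python
import math

def retinal_speed_deg_per_s(observer_speed_m_s, distance_m,
                            angle_from_heading_deg=90.0):
    """Angular speed of a stationary point seen by a translating observer.

    Uses omega = v * sin(theta) / d, where theta is the angle between the
    heading direction and the line of sight to the point.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    theta = math.radians(angle_from_heading_deg)
    omega_rad_per_s = observer_speed_m_s * math.sin(theta) / distance_m
    return math.degrees(omega_rad_per_s)

# Walking at 1 m/s and looking sideways (90 degrees from heading):
near_speed = retinal_speed_deg_per_s(1.0, distance_m=2.0)
far_speed = retinal_speed_deg_per_s(1.0, distance_m=20.0)
```

With these values the point 2 m away moves across the retina ten times faster than the point 20 m away, so identical self-motion yields very different local speeds at different depths.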

To compare the model to perception, human participants viewed thousands of pairs of short natural-image movies. Each pair contained two movies that moved at slightly different speeds, and participants reported which movie appeared faster. Across a wide range of speed differences, human discriminations closely matched the predictions of the ideal observer.
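A common way to relate an ideal observer to human choices in this kind of two-movie task is sketched below. It assumes each speed estimate carries independent Gaussian noise and uses the conventional definition of efficiency as the squared ratio of human to ideal sensitivity, so a human observer behaves like the ideal observer with noise scaled up by one over the square root of efficiency. The article does not spell out the exact formulation the researchers used, so the function names and the noise model here are assumptions.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_comparison_faster(standard_speed, comparison_speed,
                           ideal_estimate_sd, efficiency=1.0):
    """Probability of reporting the comparison movie as faster.

    ideal_estimate_sd is the ideal observer's noise on a single speed
    estimate; efficiency in (0, 1] scales it for a less sensitive observer.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    observer_sd = ideal_estimate_sd / math.sqrt(efficiency)
    # The decision compares two noisy estimates, so their noise variances add.
    decision_sd = math.sqrt(2.0) * observer_sd
    return normal_cdf((comparison_speed - standard_speed) / decision_sd)

# Psychometric function: standard at 4 deg/s, comparison speeds around it.
comparison_speeds = [3.0, 3.5, 4.0, 4.5, 5.0]
ideal_curve = [prob_comparison_faster(4.0, s, 0.5) for s in comparison_speeds]
human_curve = [prob_comparison_faster(4.0, s, 0.5, efficiency=0.5)
               for s in comparison_speeds]
```

In this sketch, lowering the efficiency flattens the psychometric curve without shifting it, so one number rescales the ideal observer's predictions across all speed differences.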
“It is uncommon to see such clean, consistent data in perceptual psychology,” Burge noted. “The tight agreement between human performance and the ideal observer indicates that we have captured the essential computations behind human speed estimation. That understanding can be translated into better machine vision algorithms.”
Beyond immediate applications to biological and artificial vision, the researchers emphasize that this theory-driven, integrative approach to psychological science is broadly valuable. Their effort combined modeling of optics, neural sampling, receptive-field selection, and behavioral measurement—each a distinct discipline—into a single cohesive framework.
“Each component—how light reaches the retina, how neurons capture that light, which stimulus features matter, and how behavior reflects underlying computations—could be a standalone project,” Burge said. “We assembled them to improve our mechanistic understanding of visual speed estimation.”
Funding: The research was supported by the National Science Foundation through grant IIS-1111328 and by the National Institutes of Health through grants EY011747 and EY021462.
Source: Evan Lerner – University of Pennsylvania
Image Credit: The image is credited to the researchers/Nature Communications
Original Research: Full open access research: “Optimal speed estimation in natural image movies predicts human performance” by Johannes Burge and Wilson S. Geisler in Nature Communications. Published online August 4, 2015; doi:10.1038/ncomms8900
Abstract
Optimal speed estimation in natural image movies predicts human performance
Accurate motion perception depends on precise estimates of retinal image speed. We analyzed natural image movies to identify the optimal space-time receptive fields for encoding local motion speed in a particular direction, given early visual constraints. From the receptive-field responses to natural stimuli, we derived the computations that optimally combine and decode those responses into speed estimates. These computations suggest how selective, invariant speed-tuned units could be constructed by the nervous system. In psychophysical experiments using matched natural stimuli, human performance was nearly optimal: a single efficiency parameter accurately predicted the detailed shapes of many psychometric functions. We conclude that key properties of speed-selective neurons and human speed discrimination are predicted by the optimal computations, and that natural stimulus variation affects optimal and human observers in similar ways.