New Brain Mechanism Discovered That Links Objects in Memory

Summary: Using machine learning together with brain imaging, researchers have quantified how objects that frequently appear together in the world are linked in the mind and located a specific brain region that represents those contextual associations.

Source: Johns Hopkins University

When people see a single object — a toothbrush, a car, a tree — their brain automatically brings to mind other items and settings that typically accompany that object. These automatic links help us form context, anticipate what to expect in an environment, and recognize scenes more efficiently.

Researchers combined large-scale image databases, machine-learning models, and functional MRI to measure how often objects co-occur in real-world scenes and to determine where in the brain these statistical associations are represented.

The study, published in Nature Communications, provides the first large-scale quantification of object co-occurrence in visual contexts and pinpoints a cortical region that appears to encode these associations.

“When we see a refrigerator, we aren’t just processing the appliance itself — we also activate a web of related items and settings, like cabinets, stoves, or sinks that typically appear in a kitchen,” said corresponding author Mick Bonner, a cognitive scientist at Johns Hopkins University. “Our work is the first to quantify that effect and to identify a brain region where this contextual information is represented.”

In the first part of a two-part approach, Bonner and co-author Russell Epstein of the University of Pennsylvania analyzed a labeled image database containing thousands of photographs of indoor and outdoor scenes. Each photograph included detailed annotations for individual objects (mugs, cars, trees, utensils and more), enabling the team to measure which objects tend to appear together across real-world scenes.

Using statistical modeling and machine-learning algorithms, the researchers derived quantitative measures of object co-occurrence: for example, the probability of encountering a pen when a keyboard is visible, or of seeing a boat when a dishwasher is present. These models produced low-dimensional representations that capture the latent statistical structure of object contexts.
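To make the idea of a co-occurrence measure concrete, here is a minimal sketch that estimates the probability of one object given another from annotated scenes. The scene labels and the function names are hypothetical, and simple counting is a stand-in for illustration only; the article describes the researchers' models only as statistical and machine-learning methods that yield low-dimensional representations, and this sketch does not reproduce them.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy annotations: each scene is the set of object labels it contains.
scenes = [
    {"keyboard", "pen", "mug"},
    {"keyboard", "pen", "monitor"},
    {"keyboard", "monitor"},
    {"dishwasher", "sink", "mug"},
    {"boat", "tree"},
]


def cooccurrence_counts(scenes):
    """Count the scenes containing each object and each unordered object pair."""
    single = Counter()
    pair = Counter()
    for labels in scenes:
        single.update(labels)
        pair.update(frozenset(p) for p in combinations(sorted(labels), 2))
    return single, pair


def conditional_probability(target, given, single, pair):
    """Estimate P(target in scene | given in scene) from scene counts."""
    if single[given] == 0:
        return 0.0
    return pair[frozenset((target, given))] / single[given]


single, pair = cooccurrence_counts(scenes)
p_pen_given_keyboard = conditional_probability("pen", "keyboard", single, pair)
p_boat_given_dishwasher = conditional_probability("boat", "dishwasher", single, pair)
print(p_pen_given_keyboard, p_boat_given_dishwasher)
```

In this toy data a pen appears in two of the three keyboard scenes, while a boat never appears alongside a dishwasher, so the two estimates differ sharply. A real analysis would work from thousands of annotated photographs rather than five hand-written sets.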

Image: Groups of words in different colors show objects that frequently occur together, displayed in a heat-map style. Credit: Johns Hopkins University

With object co-occurrence quantified, the team then searched for neural representations of these statistics. While participants viewed isolated object images during functional MRI scanning, the researchers tested whether any brain region’s responses covaried with the statistical co-occurrence profiles produced by their models.
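As a rough illustration of what it means for a region's responses to covary with a model, the sketch below correlates a single model-derived context feature with the mean response of two regions across the same set of objects. All numbers, the region names, and the `pearson` helper are hypothetical. The study's abstract describes mapping the model representations onto voxel-wise fMRI responses, which is more involved than this single-feature correlation.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical values: one model-derived context feature per object ...
model_feature = [0.9, 0.8, 0.1, 0.2, 0.5]

# ... and a made-up mean response per region to the same five objects.
region_responses = {
    "region_a": [1.8, 1.6, 0.2, 0.4, 1.0],  # tracks the model feature closely
    "region_b": [0.5, 0.1, 0.4, 0.3, 0.2],  # does not track it
}

# Score each region by how well its responses covary with the model feature.
scores = {name: pearson(model_feature, resp) for name, resp in region_responses.items()}
best_region = max(scores, key=scores.get)
print(best_region, scores)
```

The region whose responses rise and fall with the model feature gets the highest score, which is the basic logic behind asking where in the brain the co-occurrence statistics are reflected.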

They identified a region in the visual cortex known for processing spatial layouts and scenes — the parahippocampal cortex, overlapping the anterior portion of the scene-selective parahippocampal place area — as the site most strongly encoding visual co-occurrence statistics. In other words, responses in this region to single objects reflected the ensembles of other items those objects typically appear with.

“When you look at an airplane, this region not only signals the plane’s shape and features but also activates contextual information such as sky, clouds and runways,” Bonner explained. “A brain region long associated with spatial scene processing also carries information about which objects go together in the world.”

By contrast, a language-based statistical model that captured how object names co-occur in written text showed stronger relationships with nearby object-selective visual cortex regions, suggesting a dissociation between visual experience–based and language-based contextual representations in neighboring cortical areas.

Previous work has shown people are slower to recognize objects presented out of their usual context. This study extends that knowledge by providing a fine-grained, data-driven account of the statistical relationships among objects in visual environments and identifying how those relationships are represented in the human brain.

“We demonstrate that the brain’s sensory coding of objects reflects rich statistical information about the contexts in which objects are typically encountered,” Bonner said. “These findings help explain how context and expectation shape visual perception.”

About this neuroscience research news

Contact: Jill Rosen – Johns Hopkins University
Image: The image is credited to Johns Hopkins University

Original Research: Open access.
“Object representations in the human brain reflect the co-occurrence statistics of vision and language” by Michael F. Bonner & Russell A. Epstein. Nature Communications

Abstract

A central regularity of visual perception is that certain objects frequently co-occur in natural environments. Here we used machine learning and fMRI to test whether object co-occurrence statistics are encoded in the human visual system and whether these statistics are triggered by viewing individual objects.

We identified low-dimensional embeddings that capture latent statistical structure of object co-occurrence in real-world scenes, and we mapped these representations onto voxel-wise fMRI responses recorded while participants viewed single objects.

Our findings show that cortical responses to isolated objects are predicted by the statistical ensembles in which those objects typically appear, and that this link between objects and their visual contexts is strongest in the parahippocampal cortex, overlapping the anterior portion of the scene-selective parahippocampal place area. In contrast, a language-derived statistical model of object name co-occurrence in text predicted responses in neighboring object-selective visual cortex regions.

Together, these results indicate that sensory coding of objects in the human brain reflects latent statistical information about object context derived from both visual and linguistic experience.