Summary: Researchers have produced one of the most detailed, data-driven maps of the mouse brain to date. Using an artificial intelligence model named CellTransformer, the team identified about 1,300 distinct regions and subregions, including previously uncharted areas, offering a new, high-resolution view of brain organization.
Unlike traditional atlases that rely on human interpretation, this map is defined entirely by molecular and cellular measurements, drawing boundaries from similarities in cell types and local cellular neighborhoods. The transformer-based approach behind CellTransformer is tissue-agnostic and may be applied beyond neuroscience to map other organs or to analyze tumor tissue at similar scale and resolution.
Key Facts:
- AI brain mapping: CellTransformer analyzed large spatial transcriptomics datasets to reveal roughly 1,300 mouse brain regions and subregions.
- Data-driven discovery: Region boundaries are defined by cellular and molecular data rather than manual annotation.
- Broad potential: The same AI framework can be adapted to other organs and to studying complex tissues such as cancers when suitable spatial data are available.
Source: Allen Institute
In a collaboration between the University of California, San Francisco (UCSF) and the Allen Institute, researchers developed CellTransformer, an AI model that produces a fine-grained parcellation of the mouse brain into approximately 1,300 regions and subregions.
This new, high-resolution map reveals previously uncataloged subregions and provides a framework for linking cellular composition to brain function, behavior, and disease. The results were published in Nature Communications.
By enabling researchers to associate specific functions and disease states with smaller, well-defined cellular regions, the map creates new experimental opportunities and testable hypotheses about the roles these regions play in the brain.
“It’s like moving from a map that shows only continents and countries to one that shows states and cities,” said Bosiljka Tasic, Ph.D., director of molecular genetics at the Allen Institute and coauthor of the study. “This parcellation is based purely on data rather than expert annotation, and the new subregions we see likely correspond to specialized functions waiting to be characterized.”
CellTransformer applies a transformer-style encoder-decoder architecture to spatial transcriptomics data. Spatial transcriptomics provides a readout of where particular cell types and molecular signatures are found across tissue, but it does not by itself define region boundaries. CellTransformer learns to recognize shared cellular neighborhoods and uses those patterns to outline regions—much like defining a city by the types of buildings and neighborhoods it contains.
“Transformers are widely used in language models because they excel at understanding context,” said Reza Abbasi-Asl, Ph.D., associate professor of neurology and bioengineering at UCSF and senior author of the study. “We use that same capacity to model the relationships between nearby cells in space. The model predicts a cell’s molecular features from its neighborhood and, from those predictions, infers the larger tissue organization.”
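The neighborhood idea described above can be illustrated with a small sketch. It is not the CellTransformer model: the toy coordinates, the cell-type labels, the radius, and the helper names `neighborhood`, `composition`, and `predict_type` are all assumptions made for illustration, and a simple majority vote stands in for the transformer, which predicts molecular features rather than a single label.

```python
from collections import Counter
from math import hypot

# Hypothetical toy data: (x, y, cell_type) per cell. Not taken from the study.
cells = [
    (0.0, 0.0, "neuron"),
    (1.0, 0.0, "neuron"),
    (0.0, 1.0, "astrocyte"),
    (10.0, 10.0, "microglia"),
    (11.0, 10.0, "microglia"),
]


def neighborhood(cells, index, radius):
    """Return indices of the other cells within `radius` of cell `index`."""
    x0, y0, _ = cells[index]
    return [
        j
        for j, (x, y, _) in enumerate(cells)
        if j != index and hypot(x - x0, y - y0) <= radius
    ]


def composition(cells, indices):
    """Fraction of each cell type among the cells at `indices`."""
    counts = Counter(cells[j][2] for j in indices)
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()} if total else {}


def predict_type(cells, index, radius):
    """Majority-vote stand-in for the model: guess a hidden cell's type
    from the most common type among its neighbors."""
    comp = composition(cells, neighborhood(cells, index, radius))
    return max(comp, key=comp.get) if comp else None


# Hide cell 3 and guess its type from whatever lies within 2 distance units.
guess = predict_type(cells, 3, radius=2.0)
```

In this sketch the composition vector plays the role of the "types of buildings" in the city analogy: cells whose neighborhoods have similar compositions would be candidates for the same region.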
CellTransformer successfully reproduces well-known anatomical regions, such as the hippocampus, and also uncovers previously unrecognized subregions in less-characterized areas—for example, subdivisions within the midbrain reticular nucleus that may relate to movement control.
What makes this brain map distinct
This map emphasizes brain regions rather than individual cell types, and it is entirely data-driven: boundaries arise from cellular and molecular similarity instead of manual labeling. With roughly 1,300 regions and subregions, it ranks among the most granular automated maps of any animal brain produced to date.
Role of the Allen Institute’s Common Coordinate Framework (CCF)
The Allen Institute’s Common Coordinate Framework (CCF) served as the reference standard for validating CellTransformer’s results. By comparing CellTransformer’s automated regions to the CCF, the team confirmed that the AI-derived boundaries align closely with established, expert-defined anatomical structures.
“The strong agreement with the CCF gave us confidence that many of the newly discovered subregions are biologically meaningful,” said Alex Lee, a Ph.D. candidate at UCSF and first author of the study. The authors plan further computational and experimental validation to better understand these novel regions.
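One way to picture a comparison against a reference atlas such as the CCF is a per-cell overlap score. The article does not say which agreement measure the team used, so the measure below (the share of cells that carry the dominant reference label of their discovered region) is an assumption, as are the function name `majority_overlap` and the toy labels.

```python
from collections import Counter, defaultdict


def majority_overlap(discovered, reference):
    """Fraction of cells whose reference label is the most common
    reference label inside their discovered region."""
    if len(discovered) != len(reference):
        raise ValueError("label lists must have equal length")
    by_region = defaultdict(Counter)
    for d, r in zip(discovered, reference):
        by_region[d][r] += 1
    # For each discovered region, count the cells that carry its dominant label.
    matched = sum(c.most_common(1)[0][1] for c in by_region.values())
    return matched / len(discovered) if discovered else 0.0


# Hypothetical labels: regions "B" and "C" both sit inside reference area "MB".
discovered = ["A", "A", "A", "B", "B", "C"]
reference = ["HPF", "HPF", "HPF", "MB", "MB", "MB"]
nested_score = majority_overlap(discovered, reference)

# Discovered regions that cut across reference areas score lower.
crossing_score = majority_overlap(["A", "A", "B", "B"], ["X", "Y", "X", "Y"])
```

A measure of this kind does not penalize a discovered region for being a finer subdivision of a reference area, which fits a setting where the automated map is expected to be more granular than the atlas it is checked against.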
Beyond neuroscience, CellTransformer’s scalable, self-supervised approach can be applied to other organ systems and to tumor tissues when large-scale spatial transcriptomics datasets exist, helping reveal tissue organization and molecular heterogeneity relevant to health and disease.
Key Questions Answered:
Q: How does this brain map differ from traditional atlases?
A: It defines brain regions solely from cellular and molecular data without relying on human annotation, producing higher resolution and potentially more objective parcellations.
Q: How does CellTransformer work?
A: Built on a transformer encoder-decoder framework, it learns latent representations of tissue by modeling spatial relationships between cells, then clusters those representations to discover domains.
Q: Why does this research matter?
A: The map provides a precise blueprint for studying brain function, pathology, and cellular interactions, and the approach can be adapted to other organs or tumor tissues to support new discoveries.
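The clustering step mentioned above, in which learned representations are grouped into domains, can be sketched as follows. The source says only that the representations are clustered with GPU-accelerated clustering; the plain CPU k-means below, the function name `kmeans`, the first-k initialization, and the two-dimensional toy embeddings are assumptions chosen to keep the example short and deterministic.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain k-means, initialized deterministically from the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: give each point the label of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centroids


# Hypothetical per-cell embeddings; real ones would come from the trained model.
embeddings = [
    (0.0, 0.1), (5.0, 5.1), (0.2, 0.0),
    (5.2, 4.9), (0.1, 0.2), (4.9, 5.0),
]
labels, centroids = kmeans(embeddings, k=2)
```

Each resulting label would correspond to one candidate spatial domain. Asking for a larger k yields a finer parcellation, which is one way a single set of embeddings could support maps at several levels of granularity.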
About this brain mapping and artificial intelligence research news
Author: Peter Kim
Contact: Peter Kim – Allen Institute
Image: The image is credited to UCSF
Original Research: Open access. “Data-driven fine-grained region discovery in the mouse brain with transformers” by Alex Lee et al., published in Nature Communications.
Abstract
Data-driven fine-grained region discovery in the mouse brain with transformers
Spatial transcriptomics maps where molecular signatures and cell types occur across tissue. To scale analysis to organ-level, multimillion-cell datasets, the authors developed a self-supervised workflow for detecting spatial domains. The encoder-decoder architecture, CellTransformer, learns hierarchical tissue features from cellular and molecular patterns and integrates data across tissue sections. Coupled with GPU-accelerated clustering, the workflow scales to multimillion-cell MERFISH and Slide-seqV2 datasets where other methods struggle.
CellTransformer reproduces known brain domains consistent with established ontologies such as the Allen Mouse Brain Common Coordinate Framework while discovering hundreds of previously uncataloged subregions with preserved spatial coherence. The model recapitulates prior neuroanatomical findings in regions like the subiculum and superior colliculus and identifies likely uncataloged subregions in subcortical areas that lack fine annotations. Applied across multiple animals, the workflow achieved near-perfect consistency for up to 100 spatial domains in a dataset spanning four mice and nine million cells across more than 200 tissue sections. Overall, CellTransformer advances spatial transcriptomics by enabling robust detection of fine-grained tissue domains at scale.