Summary: A novel deep learning algorithm that assesses the burden of genomic variants achieves roughly 65–70% accuracy in distinguishing African American patients with mental health disorders from unaffected controls.
Source: CHOP
Researchers at the Children’s Hospital of Philadelphia (CHOP) report that a deep learning model applied to whole genome sequencing data shows promising accuracy for identifying and differentiating common mental health disorders in African American patients—an understudied population in genomic research.
The model can differentiate between single disorders and detect multiple co-occurring conditions, which could enable earlier and more tailored interventions. The study was published in the journal Molecular Psychiatry.
Diagnosing mental disorders is often complex, particularly for very young children who cannot complete standard questionnaires or rating scales. This diagnostic challenge is compounded in minority populations that have been under-represented in genomic and machine learning studies. While prior genetic studies have identified association signals for several mental health conditions—some suggesting potential therapeutic targets—most of those efforts have focused on non-Hispanic White populations.
Deep learning approaches have successfully supported diagnosis for complex conditions such as attention deficit hyperactivity disorder (ADHD) and certain cancers, but they have seldom been tested on large cohorts of African American patients. To address this gap, CHOP researchers generated blood-derived whole genome sequencing data from 4,179 African American individuals, including 1,384 patients diagnosed with at least one mental disorder.
The study concentrated on eight prevalent mental health diagnoses: ADHD, depression, anxiety, autism spectrum disorder, intellectual disability, speech and language disorders, developmental delay, and oppositional defiant disorder (ODD). Using the burden of genomic variants from both coding and non-coding regions as features, the deep learning model was trained to distinguish affected patients from unaffected controls and to label specific diagnoses, including multiple concurrent disorders.
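To make the idea of a variant-burden feature concrete, here is a minimal Python sketch. It assumes a hypothetical list of (sample, region) variant calls and simply counts variants per region for each sample. The sample IDs, the region names, and the function `burden_matrix` are illustrative assumptions, not the study's actual pipeline, which is not described in this article.

```python
from collections import Counter

def burden_matrix(variant_calls, samples, regions):
    """Count variants per (sample, region) and return one feature row per sample.

    variant_calls: iterable of (sample_id, region_id) pairs, one per variant.
    samples, regions: ordered lists that fix the row and column order.
    """
    counts = Counter(variant_calls)
    # A missing (sample, region) pair yields 0 because Counter defaults to 0.
    return [[counts[(s, r)] for r in regions] for s in samples]

# Hypothetical toy data: region labels mix coding and non-coding loci.
calls = [
    ("S1", "GENE_A_coding"),
    ("S1", "GENE_A_coding"),
    ("S1", "GENE_B_intronic"),
    ("S2", "intergenic_chr1_001"),
    ("S2", "GENE_B_intronic"),
]
samples = ["S1", "S2"]
regions = ["GENE_A_coding", "GENE_B_intronic", "intergenic_chr1_001"]

features = burden_matrix(calls, samples, regions)
for sample_id, row in zip(samples, features):
    print(sample_id, row)
```

Each row could then serve as the input vector for a classifier; how the study normalized or weighted these counts is not stated here, so the sketch leaves them as raw counts.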

“Most prior studies focus on a single disease, and minority populations have been under-represented in machine learning research on mental disorders,” said senior author Hakon Hakonarson, MD, PhD, Director of the Center for Applied Genomics at CHOP. “We wanted to evaluate whether a deep learning model could accurately separate patients from controls and correctly classify disorder types—even when patients have multiple diagnoses.”
The algorithm evaluated variant burden across the genome and achieved an overall accuracy of roughly 65–70% in distinguishing individuals with any of the studied mental disorders from controls. It also handled patients with multiple conditions: the exact set of diagnoses was matched in about 10% of cases, and the multi-label error remained low (Hamming loss < 0.3, where Hamming loss is the fraction of individual disorder labels predicted incorrectly).
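The two multi-label figures quoted above measure different things, and a small Python sketch may help show how they relate. It computes Hamming loss and the exact-match ratio on a hypothetical set of four patients with eight binary diagnosis labels each; the data are invented for illustration and the helper names `hamming_loss` and `exact_match_ratio` are assumptions, not code from the study.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual labels that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total

def exact_match_ratio(y_true, y_pred):
    """Fraction of patients whose full set of labels is predicted exactly."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)

# Hypothetical data: one row per patient, one column per diagnosis (8 labels).
y_true = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 0, 0],
]
y_pred = [
    [1, 0, 0, 0, 0, 0, 0, 0],  # all eight labels correct
    [1, 0, 0, 0, 0, 0, 0, 0],  # one label missed
    [0, 0, 1, 0, 0, 1, 0, 0],  # one missed, one spurious
    [1, 1, 0, 0, 0, 0, 0, 0],  # one missed, one spurious
]

loss = hamming_loss(y_true, y_pred)
exact = exact_match_ratio(y_true, y_pred)
print(f"Hamming loss: {loss:.3f}")
print(f"Exact match ratio: {exact:.2f}")
```

In this toy case only one patient in four is matched exactly even though most individual labels are right, because a single wrong label among eight is enough to break an exact match. That is one way a low Hamming loss and a low exact-match rate can coexist.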
Beyond classification performance, the model highlighted genomic regions and genes enriched for associations with mental disorders. Enriched biological pathways included immune-related responses, antigen and nucleic acid binding, chemokine signaling, and G-protein–coupled receptor activity. Notably, variants located in non-coding genomic regions—such as intronic, intergenic, and non-coding RNA loci—were implicated at similar or higher frequency than coding variants, suggesting they may serve as alternative genetic markers even though they do not form clear “hotspots” like some coding regions do.
“Identifying these variants and the pathways they converge on provides targets for future functional studies,” Hakonarson added. “Such work can offer mechanistic insight into how these disorders develop and guide more personalized approaches to treatment in African American populations.”
About this deep learning, genetics, and mental health research news
Author: Press Office
Contact: Press Office – CHOP
Original Research: Open access.
“Application of deep learning algorithm on whole genome sequencing data uncovers structural variants associated with multiple mental disorders in African American patients” by Yichuan Liu et al. Molecular Psychiatry
Abstract
Mental disorders represent a significant global health challenge, and accurately diagnosing them can be difficult—especially when patients have multiple co-occurring conditions or are very young and cannot complete standardized assessments. Over the past decade, genomic studies have reported association signals for several mental disorders, some of which suggest possible therapeutic targets. At the same time, machine learning methods, particularly deep learning, have shown promise in classifying complex diseases.
In this study, blood-derived whole genome sequencing data from 4,179 African American individuals, including 1,384 patients with at least one diagnosed mental disorder, were analyzed. Researchers used the burden of genomic variants from both coding and non-coding regions as feature vectors for a deep learning model. The model achieved approximately 65–70% accuracy in differentiating patients from controls and demonstrated effective multi-label performance for patients with multiple disorders (Hamming loss < 0.3), with exact diagnostic matches in around 10% of cases.
Genes and genomic regions weighted most heavily by the model were enriched for pathways involved in immune response, antigen and nucleic acid binding, chemokine signaling, and G-protein-coupled receptor functions. Importantly, non-coding variants (such as those in ncRNA, intronic, and intergenic regions) contributed to the model’s predictive ability about as much as coding variants did. They did not form the same concentrated hotspots and instead showed narrower variability, which suggests they may act as alternative markers for disease risk.