Study Links Autism to Mutations in Noncoding DNA

Summary: Using artificial intelligence, researchers have identified spontaneous mutations in noncoding regions of the human genome that may contribute to autism. These regulatory mutations are associated with altered gene expression in the brains of children with autism spectrum disorder (ASD) and overlap genes already linked to neuronal development, migration and synaptic function.

Source: Simons Foundation

Researchers using deep learning, a form of artificial intelligence, have provided the first clear functional link between autism and mutations in “noncoding” DNA, the portion of the genome once dismissed as “junk.” The study, published May 27 in Nature Genetics, shows how regulatory changes outside protein-coding regions can contribute to ASD risk.

The study was led by Olga Troyanskaya in collaboration with Robert Darnell. Troyanskaya is deputy director for genomics at the Flatiron Institute’s Center for Computational Biology (CCB) and a professor of computer science at Princeton University. Darnell is a professor at Rockefeller University and an investigator at the Howard Hughes Medical Institute.

The team applied machine learning models to whole-genome sequences from 1,790 individuals diagnosed with autism and their unaffected parents and siblings. These families, drawn from the Simons Simplex Collection, were chosen because the affected child had no prior family history of autism, making de novo (spontaneous) mutations the most likely genetic cause.

Only 1–2% of the human genome encodes proteins; the rest, roughly 98%, includes regulatory elements that control when and where genes are expressed. Historically, research has focused on protein-coding mutations, which account for at most about 30% of sporadic ASD cases. The new analysis shows that noncoding mutations, by altering gene regulation, can have an impact comparable to that of disruptive protein-coding mutations.

Pinpointing causative noncoding mutations is challenging: each person carries many unique noncoding variants, and most are rare or private. To overcome this, the researchers trained a deep-learning model to predict how any given DNA sequence affects gene expression and regulatory activity. This allowed them to evaluate the functional impact of individual noncoding mutations, even when those mutations were rare or previously unseen.
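The idea of scoring a single mutation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_regulatory_score` is a hypothetical stand-in (a tiny motif matcher for an invented motif, TGAC) for the trained deep-learning model, and the sequence and variant are made up. The general pattern, predicting on the reference sequence and on the mutated sequence and taking the difference, is what the paragraph above describes.

```python
from typing import Callable, Dict, List

# ASSUMPTION: a hypothetical stand-in for a trained sequence model.
# It is a tiny position weight matrix for an invented motif, "TGAC".
# The study used a deep neural network; this toy only illustrates the idea.
TOY_MOTIF: List[Dict[str, float]] = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]


def toy_regulatory_score(seq: str) -> float:
    """Return the best motif-match probability over all windows in seq."""
    width = len(TOY_MOTIF)
    best = 0.0
    for start in range(len(seq) - width + 1):
        prob = 1.0
        for offset, column in enumerate(TOY_MOTIF):
            prob *= column[seq[start + offset]]
        best = max(best, prob)
    return best


def apply_snv(ref_seq: str, pos: int, ref: str, alt: str) -> str:
    """Return ref_seq with the base at 0-based index pos changed to alt."""
    if ref_seq[pos] != ref:
        raise ValueError(f"expected {ref!r} at index {pos}, found {ref_seq[pos]!r}")
    return ref_seq[:pos] + alt + ref_seq[pos + 1:]


def variant_effect(
    ref_seq: str,
    pos: int,
    ref: str,
    alt: str,
    model: Callable[[str], float] = toy_regulatory_score,
) -> float:
    """Predicted effect = model(mutated sequence) - model(reference sequence)."""
    alt_seq = apply_snv(ref_seq, pos, ref, alt)
    return model(alt_seq) - model(ref_seq)


# Made-up reference sequence containing the toy motif at indices 4..7.
reference = "ACGTTGACGTAC"
disruptive = variant_effect(reference, 5, "G", "T")  # hits the motif
neutral = variant_effect(reference, 0, "A", "C")     # misses the motif
print(f"disruptive: {disruptive:+.4f}, neutral: {neutral:+.4f}")
```

Because the score depends only on the sequence itself, the same procedure works for a mutation that has never been observed before, which is the property the researchers relied on.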

“This represents a shift in genetic studies,” says Chandra Theesfeld, a research scientist in Troyanskaya’s lab. “Instead of relying solely on common variants across many individuals, we can now predict the functional consequence of specific mutations, including rare or unique ones.”

The team applied the predictive model to the Simons Simplex Collection, which includes nearly 2,000 family quartets (an affected child, an unaffected sibling, and their parents). The unaffected sibling served as an internal control: the researchers compared the predicted regulatory effects of de novo noncoding mutations in each proband with those of the de novo mutations in the sibling, and asked whether the mutations in ASD cases had significantly higher predicted impact.
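A case-versus-sibling comparison of this kind can be sketched as a one-sided permutation test. This is an illustrative sketch, not necessarily the statistical test the authors used; the function name `permutation_test_greater` is hypothetical, and the scores below are invented numbers, not data from the study.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_test_greater(
    case_scores: Sequence[float],
    control_scores: Sequence[float],
    n_permutations: int = 10_000,
    seed: int = 0,
) -> float:
    """One-sided permutation p-value for mean(case) > mean(control)."""
    rng = random.Random(seed)
    observed = mean(case_scores) - mean(control_scores)
    pooled = list(case_scores) + list(control_scores)
    n_case = len(case_scores)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the labels: under the null hypothesis, case and control
        # scores are exchangeable.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_case]) - mean(pooled[n_case:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# ASSUMPTION: made-up predicted impact scores, for illustration only.
proband_scores = [0.42, 0.55, 0.61, 0.38, 0.70, 0.49, 0.66, 0.58]
sibling_scores = [0.21, 0.35, 0.28, 0.40, 0.19, 0.33, 0.30, 0.25]

p_value = permutation_test_greater(proband_scores, sibling_scores)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value here would indicate that the proband scores are higher than expected if the proband and sibling labels were interchangeable, which is the logic behind using siblings as a built-in control.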

Genes predicted to be disrupted by regulatory mutations in people with autism tended to be involved in brain cell functioning and fell into two categories. One category relates to synapses, communication hubs between neurons, and the other relates to chromatin, the highly structured form of DNA and proteins required for proper gene expression from chromosomes. Image courtesy of the Troyanskaya lab.

The analysis revealed that a significant subset of de novo noncoding mutations in ASD probands were predicted to disrupt transcriptional and post-transcriptional regulation. These high-impact noncoding changes are enriched near genes involved in neuronal development, migration and synaptic transmission—processes central to brain wiring and function and previously implicated in autism.

“The design of the Simons Simplex Collection is what allowed us to do this study,” says co-author Jian Zhou. “Having unaffected siblings provides a built-in control for predicting mutation effects.”

To validate the computational predictions, the researchers tested several high-impact noncoding mutations in laboratory cell assays. Introducing predicted disruptive variants caused allele-specific changes in gene expression, confirming the model’s ability to identify functionally relevant regulatory mutations.

These results point to a convergent genetic landscape for ASD in which both coding and noncoding mutations affect overlapping biological pathways. The work also suggests noncoding variants contribute to the heterogeneity seen in ASD, including differences in cognitive outcomes among affected individuals.

Troyanskaya and colleagues emphasize the broader implications: the same deep-learning framework can be applied to other complex diseases—such as cancer and cardiovascular conditions—to reveal contributions from noncoding regions that standard analyses often ignore. “Right now, 98 percent of the genome is usually being thrown away,” Troyanskaya notes. “Our work opens the possibility of using that information to improve diagnosis and, ultimately, treatment.”

About this neuroscience research article

Media Contacts:
Anastasia Greenebaum – Simons Foundation

Original Research: Closed access
Article title: Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk
Authors: Jian Zhou, Christopher Y. Park, Chandra L. Theesfeld, Aaron K. Wong, Yuan Yuan, Claudia Scheckel, John J. Fak, Julien Funk, Kevin Yao, Yoko Tajima, Alan Packer, Robert B. Darnell & Olga G. Troyanskaya
Journal: Nature Genetics. doi: 10.1038/s41588-019-0420-0

Abstract

Using a deep-learning framework that predicts regulatory effects of DNA sequences, the authors analyzed whole-genome data from 1,790 ASD simplex families to detect the contribution of noncoding mutations to autism. The study finds that probands carry de novo transcriptional- and post-transcriptional-regulation-disrupting mutations with significantly higher predicted functional impact than those found in unaffected siblings. Further analyses implicate synaptic transmission and neuronal development pathways and reveal a convergent picture that combines coding and noncoding genetic risk in ASD. Experimental assays confirmed allele-specific regulatory activity for prioritized mutations and highlighted links between noncoding variation and variability in IQ among probands. The framework prioritizes high-impact noncoding mutations for follow-up and is broadly applicable to other complex human diseases.
