Summary: A large genome-wide analysis of five reading- and language-related skills reveals shared genetic foundations that contribute to these abilities.
Source: Max Planck Institute
What is the biological basis of our uniquely human capacity to speak, read and write?
An extensive genome-wide study of five reading and language skills, conducted across many thousands of people and published in PNAS, identifies overlapping biology underlying these traits. The research was led by teams at the Max Planck Institute for Psycholinguistics and the Donders Institute in Nijmegen, the Netherlands, and is the first major output from the international GenLang consortium.
Previous smaller genetic studies of language and literacy often reported associations that did not hold up in larger samples. By combining data from 22 cohorts worldwide, the GenLang team assembled datasets large enough to investigate the many common DNA variants that each exert very small effects on complex cognitive traits like reading and language.
Most participants were native English speakers, but the pooled samples also included individuals with Dutch, Spanish, German, Finnish, French and Hungarian as mother tongues. With up to 34,000 participants per trait, the study had sufficient power to assess the contribution of several million single-nucleotide variants using methods that have proven effective in biomedical genetics.
Assessing reading and language traits
Each cohort had measured participants on psychometric tests tapping key aspects of reading and language. Three tasks assessed reading and spelling: reading real words aloud (e.g., “horse”), reading pronounceable nonwords (e.g., “chove”) and spelling accuracy. A fourth measure, phoneme awareness, evaluated the ability to detect and manipulate speech sounds—tasks like removing the initial sound from a word or producing spoonerisms. The fifth trait, nonword repetition, required participants to repeat unfamiliar spoken nonwords of varying length and complexity, a task that engages speech perception, verbal short-term memory and articulation.
Genetic data were available for all cohorts, enabling a genome-wide association study (GWAS). The researchers used genetic correlation and multivariate modeling to determine how far the genetic influences on the five skills overlapped and how these genetic signals related to other cognitive and brain imaging measures.
“We have long known that individual differences in language and reading skills are influenced by genetic variation,” says Else Eising from the Max Planck Institute for Psycholinguistics, first author of the study. “By bringing together tens of thousands of participants, we can now more reliably identify the many DNA variants that contribute.”
Shared genetic architecture
The GenLang analyses revealed that the five reading- and language-related traits share a substantial genetic basis. Structural equation modeling showed a common genetic factor that explains most of the variation in word reading, nonword reading, spelling and phoneme awareness. Nonword repetition shared some genetic influences with the other traits but also showed distinct genetic components.
While the study detected correlations with general cognitive ability (both verbal and nonverbal), genetic overlap with nonverbal IQ was relatively modest. The researchers also note that many previously reported candidate gene associations from much smaller studies did not replicate in this larger, better-powered analysis, suggesting that some earlier findings were likely false positives.

The team found that genetic influences on these skills correlated with individual variation in the surface area of the banks of the left superior temporal sulcus, a brain region strongly implicated in processing spoken and written language. Heritability was also enriched in genomic regions that regulate gene expression in the fetal brain, highlighting developmental influences on later language and literacy skills.
Biology and environment together
“This research demonstrates the value of large-scale collaborative science for uncovering the molecular genetics of complex human traits like language,” says Simon Fisher, director at the Max Planck Institute for Psycholinguistics and a founder of the GenLang consortium. The authors emphasize that the biological architecture of language and reading is complex and multifaceted. Genetic predispositions interact with exposure to spoken language and formal instruction in reading, so both nature and nurture are essential for developing these skills.
Looking ahead, the GenLang team plans to expand genetically informative datasets to include a broader range of language-relevant abilities, such as grammatical processing. They also aim to develop scalable, online tests that can efficiently assess reading and language skills in very large samples—an important step for future genetic research on language and literacy.
About this language and genetics research news
Author: Press Office
Contact: Press Office – Max Planck Institute
Original Research: Open access. “Genome-wide analyses of individual differences in quantitatively assessed reading- and language-related skills in up to 34,000 people” by Else Eising et al., PNAS
Abstract
Genome-wide analyses of individual differences in quantitatively assessed reading- and language-related skills in up to 34,000 people
Spoken and written language is a defining human capacity. Twin studies indicate that individual differences in reading- and language-related skills are heritable, with estimates ranging from roughly 30 to 80% depending on the trait. However, the genetic architecture is complex and multifactorial, and prior investigations of single-nucleotide polymorphisms (SNPs) were underpowered.
We report a multicohort GWAS of five psychometrically assessed traits (word reading, nonword reading, spelling, phoneme awareness and nonword repetition) in 13,633 to 33,959 participants aged 5 to 26 years. We observed a genome-wide significant association for word reading at rs11208009 (P = 1.098 × 10⁻⁸), at a locus not previously linked to intelligence or educational attainment. All five traits showed robust SNP heritability, accounting for 13 to 26% of trait variance.
Genomic structural equation modeling identified a shared genetic factor explaining most of the variation in word and nonword reading, spelling and phoneme awareness; this factor only partially overlapped with genetic influences on nonword repetition, intelligence and educational attainment. A multivariate GWAS combining word/nonword reading, spelling and phoneme awareness increased power for downstream analyses.
Genetic correlation with neuroimaging traits highlighted an association with surface area of the banks of the left superior temporal sulcus, a region involved in processing spoken and written language. Heritability was enriched in genomic elements regulating gene expression in the fetal brain and in chromosomal regions depleted of Neanderthal variants.
Together, these findings open routes to understanding the biological foundations of uniquely human language and literacy abilities.