Study: Infants Hear More Speech Than Music at Home

Summary: A new University of Washington study compared how much music and speech infants hear at home. Using full-day audio recordings, researchers found that infants consistently hear more spoken language than music, and that the gap increases as infants grow older.

The study also shows that most of the music in infants’ soundscapes comes from electronic sources—background streaming, radio, or other recorded media—whereas speech is much more likely to be delivered in person. By analyzing daylong recordings and crowd-sourced annotations, the team aimed to clarify how music and speech differently populate infants’ everyday auditory environments and what that might mean for early development.

Key Facts:

  1. Across the first two years, infants hear substantially more spoken language than music, and the disparity grows with age.
  2. Music reaching infants most often originates from electronic sources; in contrast, speech typically comes from in-person interactions.
  3. The findings are based on daylong LENA audio recordings annotated by volunteers, offering a detailed look at infants’ natural auditory exposure.

Source: University of Washington

Speech and music are the dominant auditory signals in infants’ daily lives, but their roles may differ. While previous research has firmly established how important spoken input is for language development, comparatively little is known about the music infants routinely encounter at home.

Published May 21 in Developmental Science, this University of Washington study is the first to directly compare the amount and characteristics of music versus speech captured in infants’ home environments over time. The researchers analyzed daylong recordings from English-learning infants at 6, 10, 14, 18 and 24 months. At every age examined, the recordings contained more spoken language than music, and that gap widened as infants aged.

Image: A baby wearing headphones. Because audio recordings do not capture visual or situational context, researchers are interested in when and how musical moments occur in infants’ daily lives. Credit: Neuroscience News

“We wanted a realistic snapshot of what infants hear in their homes,” said corresponding author Christina Zhao, a UW research assistant professor in speech and hearing sciences. “Many studies measure the number of words directed to a baby and show that infant-directed speech strongly supports language outcomes. But we knew much less about the frequency, sources, and intent of music that babies encounter.”

The recordings revealed a clear pattern: at every recorded age, music detected in the home was more likely to come from electronic devices than from in-person singing or live performance. For speech, the reverse held true—most spoken input came from people physically present with the infant. While the proportion of speech that was infant-directed increased over time, the share of music that appeared to be intended specifically for the baby remained roughly constant across ages.

“We were surprised by how little live, intentional music showed up in these naturalistic recordings,” said Zhao, who directs the Lab for Early Auditory Perception (LEAP) at I-LABS. “Much of the music seems ambient (background radio, streaming playlists, or other electronic sources), not necessarily songs sung directly for the child.”

These real-world findings contrast with the laboratory-based music interventions Zhao and colleagues have run, where caregivers play music, babies handle instruments, and movements are synchronized with sound. In those controlled interventions, music enhanced infants’ neural responses to speech sounds. The current observational study was designed to begin addressing whether similar benefits occur within everyday home environments.

To avoid relying on parental recall, which prior work shows can overestimate how much adults talk or sing to infants, the team used Language Environment Analysis (LENA) recording devices to capture up to 16 hours of audio per day over two days at each age. Ten-second clips from these recordings were then annotated through a crowdsourcing workflow on the Zooniverse platform: volunteers identified whether a clip contained speech or music, judged whether the sound source was in-person or electronic, and noted whether the audio appeared to be directed to the infant.
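As a rough illustration of how several volunteers' answers for one clip might be combined into a single label, here is a minimal majority-vote sketch. The article does not describe the study's actual aggregation rule, so the function name `aggregate_labels`, the `min_agreement` threshold, and the sample rows are all hypothetical, chosen only to show the idea.

```python
from collections import Counter, defaultdict

# Hypothetical volunteer responses: (clip_id, question, answer).
# These rows are made-up placeholders, not data from the study.
responses = [
    ("clip_001", "content", "speech"),
    ("clip_001", "content", "speech"),
    ("clip_001", "content", "music"),
    ("clip_001", "source", "in-person"),
    ("clip_001", "source", "in-person"),
    ("clip_002", "content", "music"),
    ("clip_002", "content", "music"),
    ("clip_002", "source", "electronic"),
    ("clip_002", "source", "in-person"),
]


def aggregate_labels(rows, min_agreement=0.6):
    """Return {clip_id: {question: label or None}} by majority vote.

    A label is kept only when the most common answer's share of the
    votes reaches min_agreement (an assumed threshold); otherwise the
    question is marked None, meaning the volunteers did not agree.
    """
    votes = defaultdict(Counter)
    for clip_id, question, answer in rows:
        votes[(clip_id, question)][answer] += 1

    consensus = defaultdict(dict)
    for (clip_id, question), counter in votes.items():
        answer, count = counter.most_common(1)[0]
        share = count / sum(counter.values())
        consensus[clip_id][question] = answer if share >= min_agreement else None
    return dict(consensus)


consensus = aggregate_labels(responses)
```

Marking low-agreement clips as unresolved, rather than forcing a label, is one common design choice for crowd-sourced annotation; the threshold would need to be tuned to the number of volunteers per clip.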

Because the dataset is limited to a North American, English-learning sample, the authors plan to expand their approach to other cultural and linguistic groups. A forthcoming follow-up will analyze similar recordings from Latinx families to test whether patterns of music and speech exposure differ across populations. Researchers also want to map music moments to daily contexts—mealtimes, car rides, play sessions—to better understand when music is most likely to reach infants and how it might support development.

“We are interested in whether musical exposure predicts later developmental outcomes independently from speech input,” Zhao said. “In our current data, speech and music exposure are not strongly correlated, so families who talk more do not necessarily provide more musical input. That raises the question of whether music contributes uniquely to aspects of social, emotional, or cognitive development.”
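One way to check whether two exposure measures move together across families is a Pearson correlation coefficient. The sketch below is illustrative only: the article does not say which statistic the authors used, and the per-family values are invented placeholders, not results from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-family shares of clips containing speech and music.
# Placeholder numbers for illustration; not taken from the paper.
speech_share = [0.62, 0.48, 0.71, 0.55, 0.66]
music_share = [0.10, 0.12, 0.08, 0.15, 0.11]

r = pearson_r(speech_share, music_share)
print(f"Pearson r between speech and music shares: {r:.2f}")
```

A coefficient near zero would indicate that families who talk more do not systematically provide more or less music, which is the kind of pattern the quote describes.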

About this music, language, and neurodevelopment research news

Author: Lauren Kirschman
Source: University of Washington
Contact: Lauren Kirschman – University of Washington
Image: The image is credited to Neuroscience News

Original Research: Closed access.
Title: “Comparison of speech and music input in North American infants’ home environment over the first 2 years of life” by Christina Zhao et al., Developmental Science.


Abstract

From the earliest months, infants are immersed in a range of sounds that shape auditory processing and learning. Across cultures, speech and music are among the most pervasive acoustic signals in young children’s daily lives. While research over decades has documented the critical roles of speech quantity and quality for language development, comparatively little quantitative work has examined how much music infants routinely hear.

This longitudinal study analyzed daylong audio recordings from English-learning infants at 6, 10, 14, 18 and 24 months. Using crowd-sourced annotations from 643 naïve listeners who labeled 12,000 ten-second audio clips via Zooniverse, the study found that infants receive significantly more speech input than music, with the gap widening over the first two years. At every age, music was more often electronic in origin, while speech was more often in-person. The share of infant-directed speech rose over time; the proportion of music directed to infants remained stable.

The authors discuss possible reasons for the relatively limited live musical input observed in this North American sample and outline directions for future research, including cross-cultural comparisons and the potential links between musical exposure and later developmental outcomes. The paper also reflects on the strengths and limitations of using crowdsourcing methods to analyze large-scale audio datasets.