AI Tool Distinguishes Baby Cries: Fussy, Hungry, or in Pain

Summary: A new artificial intelligence method, adapted from automatic speech recognition techniques, can accurately detect meaningful features in an infant’s cry and distinguish normal cry signals from abnormal ones.

Source: Chinese Association of Automation

Every parent knows the uncertainty of responding to a baby’s cry—wondering if the infant is hungry, tired, wet, seeking comfort, or experiencing distress. Researchers in the United States have developed an artificial intelligence method that identifies and classifies infant cries, separating normal signals from abnormal ones that may indicate illness or other medical concerns. This cry-analysis technique, based on cry language recognition and automatic speech recognition principles, has potential use both at home and in clinical settings where quicker, objective assessment of an infant’s condition can aid caregivers and clinicians.

The full study appears in the May issue of IEEE/CAA Journal of Automatica Sinica (JAS), a joint publication of the IEEE and the Chinese Association of Automation.

Experienced caregivers and pediatric professionals can often infer meaning from a baby’s cry, because cries caused by similar needs share common acoustic features. However, extracting and interpreting those subtle patterns from noisy recordings has presented a long-standing challenge. The research team applied machine learning and signal-processing techniques to analyze the acoustic features of cries and demonstrate that automated cry recognition is feasible and effective for classifying different cry types.

The researchers built their system using methods from automatic speech recognition, adapting them to the infant cry domain. To handle the large, noisy audio datasets typical of domestic and clinical environments, the team used compressed sensing, a signal-processing technique that reconstructs a signal from a small number of measurements when the signal is sparse in some representation. Compressed sensing helps the algorithm focus on meaningful cry features even when recordings include background noise or are made under imperfect conditions.
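To make the idea of compressed sensing concrete, here is a minimal sketch of sparse recovery with orthogonal matching pursuit (OMP), a common greedy reconstruction algorithm. It is an illustration only: the article does not say which reconstruction algorithm the researchers used, and the Gaussian sensing matrix, the signal sizes, and the function name `omp` are all assumptions chosen for the demo.

```python
import numpy as np

def omp(A, y, k):
    """Greedy k-sparse estimate of x in y = A @ x (orthogonal matching pursuit)."""
    residual = y.copy()
    support = []
    for _ in range(k):
        # Pick the column most correlated with what is still unexplained.
        idx = int(np.argmax(np.abs(A.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit on the chosen columns, then refresh the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

# Synthetic demo (assumed sizes): a length-256 signal with 5 nonzero entries,
# observed through only 128 random linear measurements.
rng = np.random.default_rng(0)
n, m, k = 256, 128, 5
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = (
    rng.choice([-1.0, 1.0], k) * rng.uniform(1.0, 2.0, k)
)
A = rng.normal(size=(m, n)) / np.sqrt(m)   # random Gaussian sensing matrix
y = A @ x_true                             # compressed measurements
x_hat = omp(A, y, k)
print("max reconstruction error:", np.max(np.abs(x_hat - x_true)))
```

The point of the sketch is that half as many measurements as samples can be enough when the underlying signal is sparse. Real cry recordings are not sparse in the time domain, so a practical system would first need a representation (for example, a feature dictionary) in which they are.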

The resulting cry language recognition algorithm is designed to be speaker-independent, meaning it does not rely on the identity of an individual infant. That makes it more broadly applicable in real-world scenarios, where systems must generalize across many different babies. In tests using practical data, the approach showed promising accuracy at distinguishing normal cries from abnormal ones and at recognizing features associated with different cry meanings.

Although each baby’s cry is unique, cries that arise from the same cause share common features. The image is in the public domain.

“Like a special language, there are lots of health-related information in various cry sounds. The differences between sound signals actually carry the information. These differences are represented by different features of the cry signals. To recognize and leverage the information, we have to extract the features and then obtain the information in it,” says Lichuan Liu, corresponding author and Associate Professor of Electrical Engineering and Director of the Digital Signal Processing Laboratory, whose group led the study.

The study identifies and extracts a range of audio features in both the time and frequency domains. Feature extraction techniques used include linear predictive coding (LPC), linear predictive cepstral coefficients (LPCC), Bark frequency cepstral coefficients (BFCC), and Mel frequency cepstral coefficients (MFCC). After feature extraction, the compressed sensing framework was applied to classify cry signals, and the team validated their approach with real-world recordings.
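As a rough illustration of two of the features named above, the following sketch computes LPC coefficients for a single audio frame with the autocorrelation method (Levinson-Durbin recursion) and converts them to LPCC with the standard cepstral recursion. It is not the authors' implementation: the sample rate, frame length, model order, and the synthetic two-tone frame standing in for a cry recording are all assumptions.

```python
import numpy as np

def lpc(frame, order):
    """LPC coefficients a[1..order] for the model s[n] ~ sum_k a[k] * s[n-k]."""
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([frame[: n - lag] @ frame[lag:] for lag in range(order + 1)])
    a = np.zeros(order + 1)
    err = r[0]                               # prediction error energy
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        k = (r[i] - a[1:i] @ r[i - 1:0:-1]) / err
        a_prev = a.copy()
        a[i] = k
        a[1:i] = a_prev[1:i] - k * a_prev[i - 1:0:-1]
        err *= 1.0 - k * k
    return a[1:], err

def lpcc(a):
    """Cepstral coefficients c[1..p] derived from LPC coefficients a[1..p]."""
    p = len(a)
    c = np.zeros(p)
    for m in range(1, p + 1):
        acc = a[m - 1]
        for k in range(1, m):
            acc += (k / m) * c[k - 1] * a[m - k - 1]
        c[m - 1] = acc
    return c

# Synthetic stand-in for one windowed frame of a recording (assumed values).
rng = np.random.default_rng(0)
fs = 8000                                    # assumed sample rate in Hz
t = np.arange(256) / fs
frame = np.sin(2 * np.pi * 450 * t) + 0.5 * np.sin(2 * np.pi * 1300 * t)
frame = (frame + 0.1 * rng.normal(size=t.size)) * np.hamming(t.size)

a, err = lpc(frame, order=8)
c = lpcc(a)
print("LPC: ", a.round(3))
print("LPCC:", c.round(3))
```

In a full pipeline, vectors like `a` and `c` would be computed frame by frame and passed to the classifier. MFCC and BFCC follow a different route (a filter bank on the spectrum followed by a cosine transform) and are not shown here.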

The authors emphasize that this research could extend to other healthcare scenarios where diagnostic decisions often rely on practitioner experience. By providing an objective, data-driven tool for interpreting infant vocalizations, the technology aims to reduce uncertainty for parents and support clinical triage in pediatric care. “The ultimate goals are healthier babies and less pressure on parents and caregivers,” Liu adds. The research team is pursuing collaborations with hospitals and medical research centers to gather more data, refine their models, and explore potential clinical applications.

About this neuroscience research article

Media Contacts:
Yan Ou – Chinese Association of Automation

Original Research: Open access
“Infant Cry Language Analysis and Recognition: An Experimental Approach”. Lichuan Liu, Wei Li, Xianwen Wu, Benjamin X. Zhou.
IEEE/CAA Journal of Automatica Sinica. doi: 10.1109/JAS.2019.1911435

Abstract

Infant Cry Language Analysis and Recognition: An Experimental Approach

Although natural language processing has advanced considerably, infant crying—an early and primary form of communication—remains underexplored because it is not an easily interpreted verbal language. Cry signals do convey information about an infant’s wellbeing and, to some degree, can be interpreted by experienced caregivers and clinical experts. This study analyzes audio features of infant cries in time and frequency domains and uses those features to classify cry signals by meaning. Extracted features include LPC, LPCC, BFCC, and MFCC. Compressed sensing was employed for classification, and practical datasets were used to design and verify the proposed methods. Experimental results indicate that the proposed infant cry recognition approaches achieve accurate and promising performance for distinguishing cry types in noisy, real-world conditions.
