How AI Is Transforming Child Abuse Detection

Summary: A new multicenter study shows that artificial intelligence (AI) and machine learning can estimate how often physical child abuse occurs among children evaluated in emergency departments more accurately than conventional diagnostic coding alone. By combining injury codes with abuse-specific diagnostic codes in a LASSO logistic regression model, researchers produced more reliable prevalence estimates and reduced the errors associated with relying solely on ICD-10-CM abuse codes.

The AI-driven method outperformed approaches that depend only on provider- or administrator-assigned diagnostic codes, which on average produced an 8.5% absolute estimation error. These results indicate that machine learning tools could strengthen early detection, inform clinical decision-making, and improve safety for children seen in emergency settings.

Key Facts:

Code-only error: Site-level prevalence estimates based only on abuse-specific ICD-10-CM codes had an average absolute error of 8.5% compared with child abuse pediatrician determinations.
AI advantage: Predictive models incorporating injury codes and abuse-specific codes reduced average absolute error to 1.8% across study sites.
Study population: Analysis included 3,317 emergency department encounters for children under 10 years old; median age was 8.4 months, with 74% of patients younger than two years.

Source: Pediatric Academic Societies (PAS 2025)

Overview

Researchers presented these findings at the Pediatric Academic Societies (PAS) 2025 meeting in Honolulu. The team developed and tested a machine-learning approach to improve estimation of physical abuse (PA) prevalence among children evaluated in emergency departments. Their work addresses documented limitations of International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) codes for accurately capturing cases of child physical abuse in ED data.

Image: outline of children. Caption: Relying on abuse codes alone produced an average absolute error of 8.5% in site-level prevalence estimates. Credit: Neuroscience News

The investigators linked data from CAPNET (a multicenter child abuse research network) and the Pediatric Health Information System (PHIS) across seven children’s hospitals. They included encounters from February 2021 through December 2022 for children under 10 who were evaluated by a child abuse pediatrician (CAP) because of concern for physical abuse. True PA (the study reference standard) was defined using CAP-assigned likelihood ratings on a seven-point scale.

Instead of relying only on abuse-specific ICD-10-CM codes, the researchers incorporated all 4-digit injury ICD-10-CM codes together with abuse-related codes, including suspected abuse codes adapted from the Centers for Disease Control and Prevention (CDC) surveillance definitions. They then trained LASSO logistic regression models to predict CAP-determined physical abuse for encounters both with and without abuse-specific codes.
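The modeling step described above can be illustrated with a minimal sketch. It is not the study's implementation: the toy encounters, the code vocabulary, the penalty strength `lam`, and the helper names (`code_features`, `fit_lasso_logistic`, `predict_proba`) are all assumptions made for illustration, and the fitting routine is a plain proximal-gradient solver standing in for whatever software the researchers used. It turns each encounter's ICD-10-CM codes into binary indicators of 4-character code stems, fits an L1-penalized (LASSO) logistic regression, and averages the predicted probabilities to get a model-based prevalence.

```python
import math

def code_features(encounter_codes, vocabulary):
    """Binary indicators: 1.0 if the encounter has a code with that 4-character stem."""
    stems = {code.replace(".", "")[:4] for code in encounter_codes}
    return [1.0 if stem in stems else 0.0 for stem in vocabulary]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(value, threshold):
    """Proximal step for the L1 penalty: shrink toward zero, small values become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_logistic(X, y, lam, lr=0.5, n_iter=2000):
    """L1-penalized logistic regression by proximal gradient descent.

    The intercept b is left unpenalized; lam is the (assumed) penalty strength.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        p = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]
        resid = [pi - yi for pi, yi in zip(p, y)]
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(d)]
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def predict_proba(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]

# Hypothetical toy encounters: (codes on the encounter, CAP-determined PA label).
# The code strings are illustrative only and are not data from the study.
toy = [
    (["T74.12XA", "S06.5X0A"], 1),
    (["T76.12XA", "S06.5X0A"], 1),
    (["T76.12XA", "S42.412A"], 0),
    (["S06.5X0A"], 1),
    (["S42.412A"], 0),
    (["S42.412A"], 0),
    (["S00.83XA"], 0),
    (["T74.12XA"], 1),
]
vocabulary = ["T741", "T761", "S065", "S424"]  # assumed 4-character stems
X = [code_features(codes, vocabulary) for codes, _ in toy]
y = [label for _, label in toy]

w, b = fit_lasso_logistic(X, y, lam=0.05)
model_prevalence = sum(predict_proba(X, w, b)) / len(X)
print("coefficients:", [round(wj, 3) for wj in w], "intercept:", round(b, 3))
print("model-based prevalence:", round(model_prevalence, 3))
```

The L1 penalty is what makes this a LASSO model: a large enough `lam` drives uninformative code coefficients to exactly zero, which matters when the candidate feature set is every injury code stem. In practice the penalty would be tuned and the model evaluated on held-out data rather than on the encounters used to fit it.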

Findings and significance

Of 6,178 CAPNET encounters, 3,317 met inclusion criteria and were successfully linked with PHIS records for ED visits. CAPs diagnosed physical abuse in 35% (n=1,145) of these encounters: 63.4% (n=905) of encounters that had abuse-specific codes and 12.7% (n=240) of encounters without abuse-specific codes. Overall, at least one abuse-specific code was assigned in 43% of encounters.
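The reported counts can be cross-checked with a few lines of arithmetic. In the sketch below, the subgroup sizes (`coded_encounters`, `uncoded_encounters`) are not reported in the article; they are back-calculated from the published percentages and should be read as approximations.

```python
total_encounters = 3317
pa_with_abuse_codes = 905      # CAP-diagnosed PA among encounters with abuse-specific codes
pa_without_abuse_codes = 240   # CAP-diagnosed PA among encounters without them

pa_total = pa_with_abuse_codes + pa_without_abuse_codes
cap_prevalence = pa_total / total_encounters

# Back-calculated (assumed) subgroup sizes from the reported 63.4% and 12.7%.
coded_encounters = round(pa_with_abuse_codes / 0.634)
uncoded_encounters = round(pa_without_abuse_codes / 0.127)
coded_share = coded_encounters / total_encounters

print("CAP-diagnosed PA:", pa_total, f"({cap_prevalence:.1%})")
print("Encounters with abuse-specific codes:", coded_encounters, f"({coded_share:.1%})")
print("Subgroups sum to total:", coded_encounters + uncoded_encounters == total_encounters)
```

The pieces agree with one another: the two PA counts sum to 1,145, and the back-calculated subgroups sum to the 3,317 linked encounters. The pooled gap between the share of encounters carrying an abuse-specific code (about 43%) and CAP-determined prevalence (about 35%) is a different quantity from the site-level average absolute error, which is computed per hospital.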

When site-level PA prevalence was estimated from abuse-specific code assignment alone, estimation errors ranged from 2.0% to 14.3% (average absolute error 8.5%). Every error was positive, so the code-only approach overestimated prevalence at all seven sites. By contrast, the predictive models produced site-specific prevalence estimates with much smaller errors, ranging from -3.0% to 2.6%, with an average absolute error of 1.8%. Error decreased for six of seven sites compared with code-only estimates.
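To make the error metric concrete, here is a small sketch of the calculation as the abstract defines it: estimated prevalence minus CAP-determined prevalence, summarized across sites as an average of absolute values. The three sites and all of their numbers are hypothetical, since the article reports only ranges and averages, not per-site figures.

```python
def estimation_error(estimated_prevalence, reference_prevalence):
    """Signed error: positive means the estimate is above the CAP-determined prevalence."""
    return estimated_prevalence - reference_prevalence

def average_absolute_error(errors):
    return sum(abs(e) for e in errors) / len(errors)

# Hypothetical site-level prevalences in percent (not the study's data).
sites = {
    "site_a": {"cap": 30.0, "code_only": 40.0, "model": 28.0},
    "site_b": {"cap": 40.0, "code_only": 44.0, "model": 41.0},
    "site_c": {"cap": 35.0, "code_only": 36.0, "model": 38.0},
}

code_errors = [estimation_error(s["code_only"], s["cap"]) for s in sites.values()]
model_errors = [estimation_error(s["model"], s["cap"]) for s in sites.values()]
sites_improved = sum(abs(m) < abs(c) for m, c in zip(model_errors, code_errors))

print("code-only errors:", code_errors, "average absolute:", average_absolute_error(code_errors))
print("model errors:", model_errors, "average absolute:", average_absolute_error(model_errors))
print("sites where the model reduced absolute error:", sites_improved, "of", len(sites))
```

Taking absolute values before averaging keeps overestimates and underestimates from cancelling out, which is why the model's errors can span negative and positive values while its average absolute error remains a meaningful summary.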

These results demonstrate that a data-driven, machine-learning approach that considers both injury patterns and abuse-related diagnostic codes can substantially improve the accuracy of PA prevalence estimates derived from administrative ED data. More accurate prevalence estimates can guide quality improvement, resource allocation, and research on prevention and detection strategies.

Implications for clinicians and researchers

Accurate identification of child physical abuse in emergency departments is critical for timely protection and appropriate care. Administrative diagnostic codes are practical for surveillance but can misclassify cases when used alone. Integrating injury code patterns with abuse-specific codes through machine-learning models offers a more reliable approach for surveillance, epidemiologic research, and evaluation of interventions. This method may support clinicians, hospital systems, and public health teams in better recognizing trends and targeting efforts to protect vulnerable children.

About this research

Author: PAS 2025
Source: Pediatric Academic Societies (PAS)
Contact: PAS 2025 – Pediatric Academic Societies
Image credit: Neuroscience News

Abstract

Title: A Machine Learning Approach to Improve Estimation of Physical Abuse

Background: ICD-10-CM codes alone are imperfect for determining physical abuse prevalence in emergency department settings. Including injury codes alongside abuse-specific codes may yield more accurate prevalence estimates.

Objective: To develop and validate a coding schema using machine learning to improve estimation of physical abuse prevalence.

Design/Methods: Secondary analysis of children under 10 evaluated by child abuse pediatricians for suspected physical abuse from Feb 2021–Dec 2022 across seven children’s hospitals contributing to CAPNET and PHIS. True PA was defined by CAP-assigned ratings indicating high likelihood of abuse. All 4-digit injury ICD-10-CM codes and abuse-specific codes were used to build LASSO logistic regression models predicting CAP-determined PA and to calculate site-specific prevalence estimates. Estimation error was defined as the difference between model-based or code-based estimated prevalence and the CAP-determined prevalence.

Results: Of 3,317 linked ED encounters, median age was 8.4 months (74% < 2 years). CAP diagnosed PA in 35% of encounters. Abuse-specific codes were assigned in 43% of encounters. Site estimates based only on abuse-specific codes overestimated PA prevalence (errors from 2.0% to 14.3%, average absolute error 8.5%), while LASSO model-based estimates narrowed errors to between -3.0% and 2.6% (average absolute error 1.8%).

Conclusions: Predictive models that incorporate injury and abuse codes provide more accurate estimates of physical abuse prevalence in emergency department data than reliance on abuse-specific diagnostic codes alone.