Machine Learning for Early Disease Outbreak Detection in the United States: An Integrative Systematic Review of Multisource Data Integration
Akinyemi Sadeeq Akintola*
Nova Information Management School (NOVA IMS), Universidade Nova de Lisboa, Portugal.
Prosper Aimé Tchoumo
Department of Statistics, Iowa State University, Ames, Iowa, USA.
Habeeb Abdulrauf
School of Communication, Western Michigan University, United States.
Okonkwo Emmanuella
College of Professional Studies, Northeastern University, United States of America.
Ayeoribe Abiodun Olorunfemi
Ivy College of Business, Iowa State University, Ames, Iowa, United States.
*Author to whom correspondence should be addressed.
Abstract
Rapid outbreak response depends on early detection, yet detection is hindered by delayed and fragmented surveillance. This study evaluates the landscape, performance, and implementation readiness of machine-learning (ML) models for early outbreak detection in U.S. public health systems. An integrative systematic review was conducted in accordance with the PRISMA 2020 guidelines, with a comprehensive search of major bibliographic and grey literature sources from 2015 to the present. A total of 18 studies satisfied the inclusion criteria, all of which applied ML algorithms to detect potential syndromes, perform detection-relevant nowcasting, or identify variant emergence. Studies reported several families of detection metrics, including sensitivity/recall, specificity, positive predictive value (PPV)/false-alarm rate, and timeliness (lead time in days), alongside accuracy proxies for nowcasting. Data sources spanned electronic health records (EHR), emergency department (ED) free text, web search and social media, mobility, IoT thermometry and wearables, wastewater, and genomic sequences. Current evidence indicates high potential for earlier situational awareness. Public health use, however, hinges on standardized prospective validation, clear governance for alert thresholds, and sustained model monitoring (MLOps). We also outline stakeholder-specific actions to guide procurement and deployment.
Keywords: Emergency, disease, detection, variant