TB Research

An Open-Source Data Driven Hybrid Modeling System for Infectious Disease Surveillance and Early Warning

Jianyi Zhang, Cui Haoliang, Xing Yiwen, Wang Zekun, Luo Wenkai, Wei Chaozhuo, Jia Zhongwei

Division of Surveillance, Early Warning and Emergency Response, Heilongjiang Provincial Center for Disease Control and Prevention, Harbin City, Heilongjiang Province, China

China CDC Weekly · 2026-01

Abstract

Introduction: The increasing trend of globalization has led to a heightened risk of imported epidemics; however, existing surveillance systems remain fragmented and reliant on laboratory confirmation. We developed an open-source data-driven hybrid modeling system to provide earlier and more reliable alerts, designed to complement China's multipoint trigger early-warning framework.

Methods: This system integrates heterogeneous signals, including official epidemiology, digital traces, mobility, meteorology, and pathogen genomics, using semantic harmonization and a hybrid analytic stack. Seasonality-adjusted baselines with anomaly detection, mobility- and climate-aware SEIR models, and short-horizon learners generated calibrated early-warning scores. Thresholds were constrained by positive predictive value. Pilot studies were conducted for coronavirus disease 2019 (COVID-19) in Yantai and severe fever with thrombocytopenia syndrome virus (SFTSV) in Shandong and Henan, with tuberculosis indicators embedded for programmatic use.

Results: Across deployments, the system achieved 83.3% sensitivity and 76.9% positive predictive value, providing a median lead time of 9.3 days before official confirmation. Forecasting accuracy reached 92.1% for COVID-19 in Yantai, 90.3% for SFTSV in Shandong, and 89.8% for SFTSV in Henan. Early warnings were aligned with subsequent confirmations and supported targeted screening and resource allocation.

Conclusion: An open-source data-driven hybrid modeling system can deliver calibrated and timely alerts across diverse pathogens. By broadening inputs, enabling cross-agency linkage, and offering operator-oriented dashboards, it serves as a practical complement to China's national early-warning system and has the potential for scaling out with One Health inputs.
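The abstract does not say how the seasonality-adjusted baselines or the PPV-constrained thresholds were computed. As an illustration only, the following minimal Python sketch shows one way the two steps could fit together. It is not the authors' implementation. The function names (`seasonal_zscores`, `pick_threshold`), the 52-week default period, the standard-deviation floor `min_sd`, and the 0.75 PPV target are all hypothetical choices made for this example.

```python
from statistics import mean, stdev


def seasonal_zscores(counts, period=52, min_sd=1.0):
    """Score each count against earlier counts at the same seasonal phase.

    Assumption: a plain same-phase mean/SD baseline; the source does not
    specify which baseline method the system uses.
    """
    scores = []
    for i, c in enumerate(counts):
        # Earlier observations at the same phase of the seasonal cycle.
        history = counts[i % period:i:period]
        if len(history) < 2:
            scores.append(0.0)  # not enough history to form a baseline
            continue
        sd = max(stdev(history), min_sd)  # floor avoids division by zero
        scores.append((c - mean(history)) / sd)
    return scores


def pick_threshold(scores, labels, min_ppv=0.75):
    """Return (threshold, sensitivity, ppv) with the highest sensitivity
    among thresholds whose alerts reach at least min_ppv, or None.

    labels[i] is 1 if period i was later confirmed as a true event, else 0.
    """
    best = None
    for t in sorted(set(scores)):
        alerts = [s >= t for s in scores]
        tp = sum(1 for a, y in zip(alerts, labels) if a and y)
        fp = sum(1 for a, y in zip(alerts, labels) if a and not y)
        fn = sum(1 for a, y in zip(alerts, labels) if not a and y)
        if tp + fp == 0:
            continue  # no alerts at this threshold, PPV undefined
        ppv = tp / (tp + fp)
        sens = tp / (tp + fn) if tp + fn else 0.0
        if ppv >= min_ppv and (best is None or sens > best[1]):
            best = (t, sens, ppv)
    return best
```

In this sketch the PPV target acts as a hard constraint and sensitivity is maximized beneath it, which is one plausible reading of "thresholds were constrained by positive predictive value". A deployed system would also need to calibrate scores across data sources, which is omitted here.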

MeSH terms

  • Infectious disease (medical specialty)
  • Data-driven
  • Computer science
  • Software deployment
  • Hybrid system
  • Anomaly detection
  • Pneumococcal disease
  • Coronavirus disease 2019 (COVID-19)
  • Harmonization
  • Healthcare system
  • Pandemic
  • Warning system
  • Systems modeling
  • Risk analysis (engineering)
  • System dynamics
  • Expert system
  • Disease surveillance