Development and validation of prediction algorithm to identify tuberculosis in two large California health systems
Fischer H, Qian L, Li Z, Bruxvoort K, Skarbinski J, Ni Y, Ku JH, Lewin B, et al. (13 authors)
Nature Communications · 2025-04
Abstract
California data demonstrate failures in latent tuberculosis screening to prevent progression to tuberculosis disease. We therefore developed a clinical risk prediction model for tuberculosis disease using electronic health records. This study included Kaiser Permanente Southern California and Northern California members aged ≥18 years during 2008-2019. Models were fit with Cox proportional hazards regression and evaluated with Harrell's C-statistic. The outcome was a simulated tuberculosis disease outcome that accounts for cases prevented by current screening and therefore includes both observed and simulated cases. We compared sensitivity and number-needed-to-screen for model-identified high-risk individuals with those of current screening. Among 4,032,619 Southern California and 4,051,873 Northern California members, tuberculosis disease incidence was 4.1 and 3.3 cases per 100,000 person-years, respectively. The final model C-statistic was 0.816 (95% simulation interval 0.805-0.824). When screening model-identified high-risk individuals, sensitivity was 0.70 (0.68-0.71) and number-needed-to-screen was 662 (646-679) persons per tuberculosis disease case, compared with a sensitivity of 0.36 (0.34-0.38) and number-needed-to-screen of 1632 (1485-1774) with current screening. Here, we show our predictive model improves tuberculosis screening efficiency in California.
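The three evaluation quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the toy inputs, and the tie handling are assumptions made for illustration. Sensitivity is taken as the share of all cases that fall in the screened group, number-needed-to-screen as persons screened per case found among them, and Harrell's C as the share of comparable pairs in which the person with the earlier event has the higher risk score.

```python
from itertools import combinations


def screening_metrics(high_risk, has_tb):
    """Return (sensitivity, number_needed_to_screen) for one screening rule.

    high_risk: 0/1 flags, 1 if the rule selects the person for screening.
    has_tb:    0/1 flags, 1 if the person developed tuberculosis disease.
    """
    screened = sum(high_risk)
    total_cases = sum(has_tb)
    detected = sum(1 for h, t in zip(high_risk, has_tb) if h and t)
    sensitivity = detected / total_cases if total_cases else float("nan")
    nns = screened / detected if detected else float("inf")
    return sensitivity, nns


def harrell_c(times, events, scores):
    """Harrell's C for right-censored data (simplified illustration).

    Assumption: pairs with tied follow-up times are skipped, and tied
    risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this sketch
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier time is censored, so the pair is not comparable
        comparable += 1
        if scores[first] > scores[second]:
            concordant += 1.0
        elif scores[first] == scores[second]:
            concordant += 0.5
    return concordant / comparable if comparable else float("nan")


if __name__ == "__main__":
    # Toy inputs, invented for illustration only (not study data).
    flags = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    cases = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
    print(screening_metrics(flags, cases))
    print(harrell_c([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1]))
```

The pairwise loop is quadratic in cohort size, so a cohort of several million members would in practice call for a sorted or library-based implementation.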
MeSH terms
- Humans
- Tuberculosis
- Mass Screening
- Risk Factors
- Adolescent
- Adult
- Aged
- Middle Aged
- Regional Health Planning
- California
- Female
- Male
- Young Adult
- Electronic Health Records
- Datasets as Topic
- Prediction Algorithms