TB Research

Development and validation of prediction algorithm to identify tuberculosis in two large California health systems.

Heidi Fischer, Lei Qian, Zhuoxin Li, Katia Bruxvoort, Jacek Skarbinski, Yuching Ni, Jennifer H Ku, Bruno Lewin, et al. (13 authors)

Nature communications · 2025-04

Abstract

California data demonstrate failures in latent tuberculosis screening to prevent progression to tuberculosis disease. Therefore, we developed a clinical risk prediction model for tuberculosis disease using electronic health records. This study included Kaiser Permanente Southern California and Northern California members ≥18 years during 2008-2019. Models were fit with Cox proportional hazards regression and evaluated with Harrell's C-statistic, using a simulated tuberculosis disease outcome that combines observed cases with simulated cases to account for those prevented by current screening. We compared sensitivity and number-needed-to-screen for model-identified high-risk individuals with current screening. Among 4,032,619 Southern California and 4,051,873 Northern California members, tuberculosis disease incidence was 4.1 and 3.3 cases per 100,000 person-years, respectively. The final model C-statistic was 0.816 (95% simulation interval 0.805-0.824). When screening model-identified high-risk individuals, sensitivity was 0.70 (0.68-0.71) and number-needed-to-screen was 662 (646-679) persons per tuberculosis disease case, compared with a sensitivity of 0.36 (0.34-0.38) and number-needed-to-screen of 1632 (1485-1774) with current screening. Here, we show our predictive model improves tuberculosis screening efficiency in California.
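
The three metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a simplified Harrell's C that skips pairs with tied follow-up times, and it assumes number-needed-to-screen is the number of persons screened divided by the number of tuberculosis disease cases among them. All function names and the toy inputs are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Simplified Harrell's C-statistic.

    Among comparable pairs, returns the share in which the person with the
    shorter observed time also has the higher predicted risk (risk ties
    count as half). Pairs with tied times are skipped (a simplification).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if times differ and the shorter time is an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    return concordant / comparable


def sensitivity(cases_flagged, total_cases):
    """Fraction of all disease cases that fall in the screened group."""
    return cases_flagged / total_cases


def number_needed_to_screen(persons_screened, cases_flagged):
    """Persons screened per disease case found (assumed definition)."""
    return persons_screened / cases_flagged


# Toy inputs (hypothetical, not study data).
toy_times = [2, 4, 6, 8]           # follow-up time per person
toy_events = [1, 0, 1, 0]          # 1 = disease observed, 0 = censored
toy_risk = [0.9, 0.3, 0.1, 0.5]    # model-predicted risk score

print(harrell_c(toy_times, toy_events, toy_risk))   # 0.75
print(sensitivity(4, 5))                            # 0.8
print(number_needed_to_screen(1000, 4))             # 250.0
```

In the toy data, four pairs are comparable and three are concordant, giving C = 0.75. A higher sensitivity together with a lower number-needed-to-screen is the pattern the abstract reports for the model relative to current screening.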

MeSH terms

  • Humans
  • Male
  • Female
  • Adolescent
  • Young Adult
  • Adult
  • Middle Aged
  • Aged
  • California
  • Regional Health Planning
  • Tuberculosis
  • Mass Screening
  • Risk Factors
  • Electronic Health Records
  • Datasets as Topic
  • Prediction Algorithms