A machine learning-based prediction model for treatment efficacy in smear and/or chest X-ray positive tuberculosis patients.
Xiaohua Cui, Wei Fu, Xuan Wu, Zhe Peng, Wentao Wu
BMC infectious diseases · 2026-03
Abstract
OBJECTIVE: To develop a machine learning (ML)-based prediction model for tuberculosis (TB) treatment failure and to evaluate its predictive performance and clinical utility.
METHODS: Patients were randomly allocated to a training set and a validation set in a 7:3 ratio. Data collected included demographic characteristics, clinical features, and laboratory parameters. Univariate analysis and binary logistic regression were applied to the training set to identify factors associated with treatment outcome. Three prediction models were constructed: Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). Model performance was evaluated based on accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUC). Based on common predictive modeling standards, an AUC > 0.8 was considered good and an AUC > 0.9 was considered excellent.
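The evaluation metrics named in the methods can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names (`classification_metrics`, `roc_auc`, `auc_grade`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. Only the AUC cut-offs (> 0.8 good, > 0.9 excellent) come from the abstract.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = treatment failure)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a random
    negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_grade(auc: float) -> str:
    """Apply the cut-offs stated in the abstract; the label for lower values is ours."""
    if auc > 0.9:
        return "excellent"
    if auc > 0.8:
        return "good"
    return "below the 'good' threshold"


# Toy data, invented for illustration only (not study data).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.5]
y_pred = [int(s >= 0.5) for s in scores]  # assumed decision threshold of 0.5

print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores), auc_grade(roc_auc(y_true, scores)))
```

The pairwise AUC definition is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based implementation would be preferable for larger data sets.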
RESULTS: Among 541 enrolled patients, 133 (24.58%) experienced treatment failure (92 [24.27%] in the training set and 41 [25.31%] in the validation set). Cavitation, diabetes comorbidity, radiographic disease extent, TB type (pulmonary vs. extrapulmonary), lymphocyte percentage (LYMPH%), and serum albumin (ALB) level were identified as significant predictors of treatment outcome (P < 0.05). The RF, SVM, and KNN models achieved AUC values of 0.783, 0.707, and 0.668, respectively.
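As a quick arithmetic check, the reported counts and percentages can be reconciled with the 7:3 split. The sketch below is an illustration only: the set sizes it prints are back-calculated from the reported failure counts and percentages and are not stated in the abstract, and the helper name `candidate_sizes` is hypothetical.

```python
from typing import List

# Figures reported in the abstract.
TOTAL, FAILURES = 541, 133
TRAIN_FAIL, TRAIN_PCT = 92, 24.27
VAL_FAIL, VAL_PCT = 41, 25.31


def candidate_sizes(failures: int, pct: float, upper: int) -> List[int]:
    """Set sizes n for which failures / n rounds to the reported percentage
    (two decimal places)."""
    return [n for n in range(failures, upper + 1)
            if abs(100 * failures / n - pct) < 0.005]


overall_pct = round(100 * FAILURES / TOTAL, 2)
train_sizes = candidate_sizes(TRAIN_FAIL, TRAIN_PCT, TOTAL)
val_sizes = candidate_sizes(VAL_FAIL, VAL_PCT, TOTAL)

print("overall failure rate (%):", overall_pct)
print("inferred training set size(s):", train_sizes)
print("inferred validation set size(s):", val_sizes)
if len(train_sizes) == 1 and len(val_sizes) == 1:
    # Share of patients in the training set implied by the inferred sizes.
    print("training fraction:", round(train_sizes[0] / TOTAL, 3))
```

Under these assumptions, each percentage is matched by a single set size, the two sizes sum to the enrolled total, and the training share is close to 0.7.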
CONCLUSION: The ML-based prediction model shows fair to good predictive performance (highest AUC 0.783, achieved by the RF model), suggesting potential clinical utility pending further validation. The model may assist in early risk stratification and support individualized treatment planning for tuberculosis patients.
CLINICAL TRIAL NUMBER: Not applicable.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12879-026-13173-1.