IFN-γ and IL-6 as Key Predicting Biomarkers for Active TB Among PLWH: Results from Four Machine Learning Methods.
Juan Wan, Virasakdi Chongsuvivatwong, Pei Zhang, Qiujing Li, Liping Zhao, Lingqing Zou, Jun Zhao, Jingyi Dai
International journal of general medicine · 2026-01
Abstract
PURPOSE: Tuberculosis (TB) remains a major cause of mortality in people living with HIV (PLWH), yet early diagnosis is still challenging. This study aimed to identify novel biomarker combinations and to develop machine learning models that predict active TB in PLWH, evaluated on both a randomly split and a chronologically split dataset.
PATIENTS AND METHODS: We enrolled 760 PLWH with pulmonary symptoms. Demographic data, clinical data, and cytokine profiles were analyzed. Participants were first randomly split into training and validation sets. Subsequently, the whole dataset was re-analysed using the first 609 records as the training set and the subsequent 151 records as the test set. Four models were developed with 10-fold cross-validation, incorporating feature selection and hyperparameter optimization. Model performance was assessed through ROC-AUC, sensitivity, specificity, and variable importance analysis.
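The two evaluation designs described above differ only in how records are assigned to the training and test sets. The following is a minimal sketch of that difference, not the authors' code: the function names, the seed, the use of record indices in place of patient data, and the reuse of the 609/151 sizes for the random split are all assumptions made for illustration. The rank-based `roc_auc` helper is a generic AUC calculation and is not taken from the study.

```python
import random

def random_split(records, n_train, seed=0):
    """Shuffle a copy of the records, then cut at n_train (hypothetical helper)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def chronological_split(records, n_train):
    """Keep enrolment order: the first n_train records train, the rest test."""
    ordered = list(records)
    return ordered[:n_train], ordered[n_train:]

def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Record indices stand in for the 760 participants, in enrolment order.
records = list(range(760))
train_r, test_r = random_split(records, 609)         # sizes assumed for illustration
train_c, test_c = chronological_split(records, 609)  # sizes as stated in the abstract
```

In the chronological design every test record was enrolled after every training record, so a drift in the patient population or in measurement over time shows up as a drop in test AUC; a random split mixes early and late records and can hide that drift.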
RESULTS: In the randomly split datasets, patients with active TB showed significantly elevated IFN-γ (median 5.7 vs 3.9 pg/mL, P<0.001) and IL-6 levels (25.3 vs 13.2 pg/mL, P<0.001) compared with those without active TB. These two biomarkers were strong predictors in the gradient boosting machine (GBM) model. The AUC (95% CI) on the randomly selected training dataset was 0.96 (0.95, 0.97), and that on the randomly selected test dataset was 0.73 (0.65, 0.81). However, under the chronological split, the GBM model trained on the first 609 records, with a training AUC of 0.92 (0.91, 0.94), poorly predicted the final 151 records, with an AUC of 0.66 (0.58, 0.75).
CONCLUSION: TB might have activated the two inflammatory biomarkers among PLWH. Even the best-performing machine learning method still has limited generalizability when predicting outcomes on other datasets.