Artificial intelligence as a promising tool for predicting drug-induced liver injury in patients receiving tuberculosis medication: A systematic review
Stanley Bulain, Trisna Belani Pamarta, Revina Maharani, Dhila Aulia Rahmah Mukti
Digital Medicine · 2025-09
Abstract
Background: Tuberculosis (TB) remains a significant challenge because of its high health-care burden and the difficulty of treatment. Approximately 5% to 28% of TB patients undergoing therapy experience antituberculosis drug-induced liver injury (TB-DILI), which can lead to treatment discontinuation. The use of artificial intelligence (AI) in medical practice enables the processing of large datasets, which can help health-care workers predict TB-DILI and may improve TB management outcomes. This study therefore aimed to evaluate the diagnostic performance of AI in predicting TB-DILI.
Methods: This systematic review included studies that evaluated AI performance in predicting TB-DILI. Literature searches across PubMed, ScienceDirect, Google Scholar, SpringerLink, ProQuest, and Scopus followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and focused on observational studies published between 2015 and 2025. Risk of bias was assessed using the Risk Of Bias In Non-randomized Studies of Exposure (ROBINS-E) tool. The extracted data included study characteristics, AI type, parameters used, and diagnostic accuracy.
Results: Twelve studies, all conducted in China and involving 15,645 patients, were included in the analysis. Of the 9 AI types included, eXtreme Gradient Boosting (XGBoost), random forest, and support vector machine were the most frequently used. The risk factors identified included age, sex, weight, TB medication, comorbidity (diabetes, hypertension, and liver disease), and aspartate aminotransferase and alanine aminotransferase levels. The reported sensitivity, specificity, and accuracy ranged from 15.2% to 86.0%, 71.4% to 98.0%, and 71.0% to 92.9%, respectively. All reported area under the curve (AUC) values were above 0.7, indicating moderate-to-excellent diagnostic performance. Half of the AI models (50%) had an AUC between 0.8 and 0.9 (good diagnostic performance), and 26.92% had an AUC between 0.9 and 1.0 (excellent), including XGBoost and random forest. The remaining 23.07% had an AUC between 0.7 and 0.8 (fair but clinically acceptable).
Conclusion: The moderate-to-high diagnostic performance of the AI models supports the reliability of AI tools for early risk prediction of TB-DILI.
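As an illustration of the metrics summarized in the Results, the following is a minimal Python sketch of how sensitivity, specificity, and accuracy follow from a binary confusion matrix, and how an AUC value maps onto the fair/good/excellent bands used above. The names `ConfusionCounts`, `sensitivity`, `specificity`, `accuracy`, and `auc_band` are hypothetical, the example counts are invented for demonstration and are not taken from any included study, and treating each band's lower bound as inclusive is an assumption, since the abstract does not specify boundary handling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary prediction outcomes (hypothetical container, not from the review)."""
    tp: int  # patients with TB-DILI correctly flagged
    fp: int  # patients without TB-DILI incorrectly flagged
    tn: int  # patients without TB-DILI correctly cleared
    fn: int  # patients with TB-DILI missed by the model


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true TB-DILI cases that the model detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of non-cases that the model correctly clears."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all predictions that are correct."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def auc_band(auc: float) -> str:
    """Map an AUC to the bands named in the abstract.

    Assumption: lower bounds are inclusive (0.8 counts as "good").
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie in [0, 1]")
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    return "below the range reported in the review"


if __name__ == "__main__":
    # Invented counts, for demonstration only.
    example = ConfusionCounts(tp=30, fp=10, tn=150, fn=10)
    print(f"sensitivity={sensitivity(example):.3f}")
    print(f"specificity={specificity(example):.3f}")
    print(f"accuracy={accuracy(example):.3f}")
    print(f"AUC 0.85 -> {auc_band(0.85)}")
```

A model can combine high specificity and accuracy with low sensitivity when TB-DILI cases are a minority of the cohort, which is one way a range such as 15.2% to 86.0% for sensitivity can coexist with accuracy above 71%.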
MeSH terms
- Medicine
- Observational study
- Tuberculosis
- MEDLINE
- Machine learning
- Intensive care medicine
- Artificial intelligence
- Risk assessment
- Liver injury
- Comorbidity
- Decision tree
- Systematic review
- Alanine aminotransferase
- Diagnostic accuracy
- Support vector machine
- Boosting (machine learning)
- Clinical practice