TB Research

Prediction of Tuberculosis Patients’ Treatment Outcomes Using Multinomial Naive Bayes Algorithm and Class-Imbalanced Data

Wei Lian Willian Foh, Sau Loong Ang, Chia Yean Lim, Arvindran Alaga, Gik Hong Yeap

Abstract

Tuberculosis (TB) is a severe and highly contagious disease that affects millions of people worldwide. Many patients struggle to complete current TB treatment programs because of numerous factors, including limited human and financial resources. Addressing these challenges requires a tool that supports resource allocation. This study proposes a machine learning methodology for predicting the treatment outcomes of TB patients, enabling healthcare facilities to allocate resources according to the predicted outcomes. A large multivariate TB patient dataset from the Brazilian Information System for Notifiable Diseases (SINAN) was used, containing attributes related to patient characteristics, clinical information, and laboratory data. The proposed model used the Naive Bayes algorithm because of its simplicity and efficiency in predicting treatment outcomes. The dataset was preprocessed, and the Synthetic Minority Oversampling Technique (SMOTE) was applied to address class imbalance. Among the combinations evaluated, borderline SMOTE paired with the Naive Bayes algorithm on the preprocessed dataset achieved the highest accuracy, demonstrating the capability of the proposed model to predict the treatment outcomes of TB patients. The model could help healthcare providers implement more targeted follow-ups and more appropriate resource allocation strategies to improve the overall treatment outcomes of TB patients.
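The pairing of borderline SMOTE with a multinomial Naive Bayes classifier described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the toy records, the labels `cured` and `not_cured`, the two count features, and the function names `borderline_smote`, `fit_multinomial_nb`, and `predict` are all hypothetical, and the oversampler is a simplified borderline variant that interpolates only from minority samples whose nearest neighbours are mostly, but not entirely, from another class. In practice one would more likely use an established library such as imbalanced-learn (`BorderlineSMOTE`) with scikit-learn (`MultinomialNB`).

```python
import math
import random
from collections import Counter

# Hypothetical toy records: (count features, outcome label). Not SINAN data.
TOY_DATA = [
    ((1, 5), "cured"), ((2, 6), "cured"), ((1, 7), "cured"), ((2, 5), "cured"),
    ((3, 6), "cured"), ((2, 7), "cured"), ((3, 5), "cured"), ((4, 4), "cured"),
    ((6, 1), "not_cured"), ((7, 2), "not_cured"),
    ((5, 3), "not_cured"), ((4, 3), "not_cured"),
]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def borderline_smote(data, minority, n_new, k=3, seed=0):
    """Return n_new synthetic minority rows built from 'danger' samples only."""
    rng = random.Random(seed)
    minority_idx = [i for i, (_, y) in enumerate(data) if y == minority]
    danger = []
    for i in minority_idx:
        x = data[i][0]
        others = [row for j, row in enumerate(data) if j != i]
        nearest = sorted(others, key=lambda row: sq_dist(x, row[0]))[:k]
        n_other = sum(1 for _, label in nearest if label != minority)
        # "Danger" sample: at least half, but not all, of its k nearest
        # neighbours belong to another class.
        if k / 2 <= n_other < k:
            danger.append(i)
    if not danger or len(minority_idx) < 2:
        return []
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(danger)
        base = data[i][0]
        # Interpolate towards one of the k nearest minority neighbours.
        peers = sorted((data[j][0] for j in minority_idx if j != i),
                       key=lambda m: sq_dist(base, m))[:k]
        peer = rng.choice(peers)
        gap = rng.random()
        point = tuple(b + gap * (p - b) for b, p in zip(base, peer))
        synthetic.append((point, minority))
    return synthetic


def fit_multinomial_nb(data, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log feature probabilities."""
    n_features = len(data[0][0])
    class_counts = Counter(y for _, y in data)
    feature_sums = {c: [0.0] * n_features for c in class_counts}
    for x, y in data:
        for j, value in enumerate(x):
            feature_sums[y][j] += value
    model = {}
    for c, sums in feature_sums.items():
        total = sum(sums) + alpha * n_features
        model[c] = (
            math.log(class_counts[c] / len(data)),
            [math.log((s + alpha) / total) for s in sums],
        )
    return model


def predict(model, x):
    """Return the class with the highest multinomial log-likelihood score."""
    def score(c):
        log_prior, log_probs = model[c]
        return log_prior + sum(v * lp for v, lp in zip(x, log_probs))
    return max(model, key=score)


if __name__ == "__main__":
    extra = borderline_smote(TOY_DATA, "not_cured", n_new=4)
    model = fit_multinomial_nb(TOY_DATA + extra)
    print(predict(model, (6, 2)), predict(model, (1, 6)))
```

Oversampling is applied to the training rows only; in a real evaluation the held-out test set should keep its original class distribution so that reported accuracy is not inflated by synthetic samples.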

MeSH terms

  • Oversampling
  • Naive Bayes classifier
  • Machine learning
  • Computer science
  • Artificial intelligence
  • Outcome (game theory)
  • Tuberculosis
  • Resource allocation
  • Health care
  • Algorithm
  • Data mining