TB Research

AcousticTB: A Hybrid Deep Learning and Gradient Boosting Framework for Noise-Robust Tuberculosis Screening from Cough Audio

Saatvik Kesarwani

Abstract

Tuberculosis (TB), a contagious respiratory disease, is estimated to infect roughly one in four people worldwide. Over 1.2 million people died of TB in 2023 alone. Most people who die of TB were not diagnosed correctly or in time because of barriers to accessing healthcare facilities. To address this gap, this paper presents AcousticTB, a low-cost, non-invasive, and accessible digital screening tool that relies on coughing, a common, objective biomarker for TB. The model uses cough recordings collected across seven countries, totaling 29,316 samples after augmentation. AcousticTB combines Mel-spectrogram features with demographic and clinical data in a novel hybrid architecture. A convolutional neural network (CNN) first extracts acoustic embeddings from 64x64 log-Mel spectrograms. These embeddings are concatenated with the clinical features and passed to a tuned XGBoost classifier. Finally, a logistic regression (LR) model is stacked on the XGBoost probabilities to improve predictive performance. The model achieved a receiver operating characteristic area under the curve (ROC-AUC) of 0.850, a precision-recall area under the curve (PR-AUC) of 0.748, 80.4% specificity (surpassing the World Health Organization's specificity requirement for triage tests), 72.9% sensitivity, and an overall accuracy of 78.1%. AcousticTB outperforms existing CODA baselines, with a higher AUC than cough-only models (0.69-0.74) and published cough-plus-metadata approaches (0.81). Its performance remained stable under realistic environmental noise, with ROC-AUC values of 0.862 (clean), 0.855 (Gaussian noise), and 0.833 (environmental noise overlays). Overall, AcousticTB shows promise as a noise-resilient TB screening tool aligned with global public health needs.
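The reported sensitivity, specificity, and accuracy are linked by a simple identity: accuracy is the prevalence-weighted average of sensitivity and specificity. The Python sketch below uses that identity to back out the TB-positive fraction implied by the three figures. It assumes all three were measured at the same decision threshold on the same evaluation set, which the abstract does not state. The function names are illustrative and not part of AcousticTB.

```python
def implied_prevalence(sensitivity, specificity, accuracy):
    """Solve accuracy = p * sensitivity + (1 - p) * specificity for p,
    the fraction of positive cases in the evaluation set."""
    if sensitivity == specificity:
        # Accuracy then equals both values for any p, so p cannot be recovered.
        raise ValueError("prevalence is not identifiable when sensitivity equals specificity")
    return (specificity - accuracy) / (specificity - sensitivity)


def accuracy_from(sensitivity, specificity, prevalence):
    """Accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Values reported in the abstract.
SENS, SPEC, ACC = 0.729, 0.804, 0.781

# Assumption: one threshold and one evaluation set for all three metrics.
p = implied_prevalence(SENS, SPEC, ACC)
print(f"implied TB-positive fraction: {p:.3f}")
print(f"accuracy recomputed from p:   {accuracy_from(SENS, SPEC, p):.3f}")
```

Under that assumption, the figures are mutually consistent with an evaluation set in which roughly 31% of samples are TB-positive. A fraction outside the range 0 to 1 would have indicated that the three metrics could not come from a single threshold on a single set.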

MeSH terms

  • Artificial intelligence
  • Deep learning
  • Machine learning
  • Triage
  • Discriminative model
  • Convolutional neural network
  • Computer science
  • Receiver operating characteristic
  • Feature extraction
  • Gradient boosting
  • Logistic regression
  • Artificial neural network
  • Margin (machine learning)
  • Tuberculosis
  • Pattern recognition (psychology)
  • Boosting (machine learning)
  • Feature (linguistics)
  • Noise (video)
  • Medicine
  • Health care
  • Digital health
  • Binary classification