Federated learning framework for predicting multi-drug resistant tuberculosis across regional databases.
Harsha Avinash Bhute, Avinash N Bhute, Kishor B Waghulde, Bharati P Vasgi, Reshma Sonar, Shalaka Prasad Deore
The Indian journal of tuberculosis · 2025-12
Abstract
BACKGROUND: Multi-drug resistant tuberculosis (MDR-TB) is a major global public health problem, especially in low- and middle-income countries, where rapid and accurate testing is needed for effective treatment and control. Conventional diagnostic methods, such as culture-based drug resistance tests and genetic tests, are slow, expensive, and hard to access in resource-limited settings. Machine learning models that combine genomic and clinical data show promise for predicting MDR-TB, but their progress is slowed by data privacy concerns and by regulations that prevent regional healthcare centres from pooling data in one place.
METHODS: This study proposes a federated learning framework that lets healthcare organizations jointly train MDR-TB prediction models without sharing patient-level data. The framework combines genomic and clinical data from different regional sources, protects patient privacy, and accounts for differences between local datasets. It performs comparably to centralised methods, with an AUC close to 91 % and accuracy close to 89 %.
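The abstract does not state which aggregation rule the framework uses. As a rough illustration of the general idea (each site trains locally and only model parameters leave the site), the following is a minimal sketch of federated averaging (FedAvg) with a logistic regression model. The choice of FedAvg, the model, the synthetic data, and all function names are assumptions made for illustration and are not taken from the study.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent epochs on one site's private data."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-(X @ w)))  # sigmoid of the linear score
        grad = X.T @ (preds - y) / len(y)       # gradient of the mean log-loss
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its number of samples."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()

def train_federated(clients, n_features, rounds=20):
    """Server loop: broadcast the global model, collect updates, aggregate."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        # Only model weights are exchanged; raw (X, y) never leave a client.
        updates = [local_update(global_w, X, y) for X, y in clients]
        global_w = federated_average(updates, [len(y) for _, y in clients])
    return global_w

def make_synthetic_clients(seed=0, sizes=(200, 120, 80), n_features=5):
    """Hypothetical stand-in for regional datasets (not real patient data)."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=n_features)
    clients = []
    for i, n in enumerate(sizes):
        # A per-site shift in the feature mean mimics dataset heterogeneity.
        X = rng.normal(loc=0.2 * i, size=(n, n_features))
        y = (X @ true_w + rng.normal(scale=0.5, size=n) > 0).astype(float)
        clients.append((X, y))
    return clients

clients = make_synthetic_clients()
global_model = train_federated(clients, n_features=5)
```

In this sketch the server only ever sees weight vectors, which is the property that allows sites to collaborate without pooling records. A real deployment would also need secure transport and, depending on the threat model, additional protections that are outside the scope of this example.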
RESULTS: The system was evaluated on a real-world MDR-TB dataset containing whole-genome sequencing and clinical data from several healthcare centres. With accuracy above 88 %, precision and recall above 86 %, and AUC-ROC close to 91 %, the federated model achieved predictive performance comparable to that of a centralised model. Client-wise analysis showed consistent results despite differences between the local datasets.
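For readers unfamiliar with the reported metrics, the sketch below shows one way accuracy, precision, recall, and AUC-ROC can be computed for a single client from true labels and predicted scores. The function name, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions and do not reproduce the study's evaluation code or data.

```python
import numpy as np

def binary_metrics(y_true, y_score, threshold=0.5):
    """Compute accuracy, precision, recall and AUC-ROC for binary labels."""
    y_true = np.asarray(y_true, dtype=int)
    y_score = np.asarray(y_score, dtype=float)
    y_pred = (y_score >= threshold).astype(int)

    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))

    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    # AUC-ROC as the probability that a random positive scores higher than
    # a random negative, counting ties as one half.
    diff = pos[:, None] - neg[None, :]
    auc = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "auc_roc": float(auc),
    }

# Toy example for one hypothetical client (not data from the study).
toy = binary_metrics([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Running the same function separately on each client's held-out data gives a client-wise comparison of the kind the abstract describes.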
CONCLUSION: The proposed federated learning approach provides a secure, flexible, and accurate way to predict MDR-TB across multiple regional databases. It balances the need for shared data use against strict privacy rules, which supports wider adoption in the management of dangerous diseases. Future work will look into adding more types of data and making data transmission faster and more secure, in order to make the model more reliable and useful in clinical settings.
MeSH terms
- Humans
- Tuberculosis, Multidrug-Resistant
- Machine Learning
- Databases, Factual
- Whole Genome Sequencing
- Federated Learning