TB Research

Predicting pyrazinamide resistance in Mycobacterium tuberculosis using a graph convolutional network.

Dylan Dissanayake, Viktoria Brunner, Dylan Adlard, Joseph A Morrone, Philip W Fowler

BMC Microbiology · 2026-03

Abstract

BACKGROUND: Pyrazinamide is an important first-line antibiotic for treating tuberculosis, with resistance primarily driven by mutations in the pncA gene. Traditional machine learning models are able to predict pyrazinamide resistance with some success but are limited in their ability to incorporate 3-dimensional protein structural information. Graph neural networks offer the potential to integrate protein structure and residue-level features to better predict the impact of mutations on drug resistance.

RESULTS: We trained a graph convolutional network on PncA variants containing missense mutations and evaluated its ability to classify resistance to pyrazinamide. Each PncA variant was represented as an amino acid-level graph, with edges calculated from 3-dimensional spatial proximity, and node features derived from chemical properties and mutation meta-predictors. We used AlphaFold2 to generate predicted structures of the PncA variants, which we used to create the protein graphs. The predicted structures of resistant PncA variants showed greater deviation from the wild-type structure compared to susceptible variants. Our model achieved an F1 score of 81.6%, sensitivity of 81.6% and specificity of 80.4% on the test set and either matched or exceeded the performance of a published set of traditional machine learning models. We show that both structural graph connectivity and node features contribute significantly to model performance. Furthermore, we employ additional train/test dataset splits to demonstrate the GCN’s ability to generalise and predict resistance in samples with mutations in unseen positions and structural regions.
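The graph representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance cutoff, the atoms used to measure proximity, the node features, or the network architecture. The sketch assumes one representative coordinate per residue, a hypothetical 8 Å cutoff, random placeholder node features and weights, the commonly used symmetric-normalised graph convolution, and mean pooling to obtain a single vector per protein. The names `contact_graph` and `gcn_layer` are illustrative.

```python
import numpy as np


def contact_graph(coords, cutoff=8.0):
    """Adjacency matrix linking residues whose coordinates lie within `cutoff`.

    coords: (n_residues, 3) array, one representative point per residue
    (assumed here; the source does not say which atoms were used).
    cutoff: distance threshold in angstroms (hypothetical value).
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adj = (dist <= cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added inside the layer
    return adj


def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weights, 0.0)


# Toy "protein" of four residues; the last one is far from the others.
coords = np.array(
    [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0], [30.0, 0.0, 0.0]]
)
adj = contact_graph(coords)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))  # placeholder node features
weights = rng.normal(size=(3, 2))   # placeholder (untrained) layer weights

hidden = gcn_layer(adj, features, weights)
graph_embedding = hidden.mean(axis=0)  # mean pooling over residues
```

In a trained classifier, the pooled vector would feed a final layer that outputs a resistant or susceptible call, and the coordinates would come from a predicted structure of each variant rather than a toy array.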

CONCLUSIONS: Our study demonstrates that graph-based deep learning can leverage protein structure and biochemical features to accurately predict antimicrobial resistance, despite being trained on a small dataset with little variation. We present this as a proof of concept for applying these methods to resistance phenotype prediction in more genetically diverse pathogens, where the observed patterns of antimicrobial resistance are more complex.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12866-026-04876-1.