Complete genomes reveal the full extent of <i>Mycobacterium tuberculosis</i> complex diversity across evolutionary scales
Ana María García García, Manuela Torres‐Puente, Llúcia Martínez-Priego, Griselda De Marco, Miguel Moreno-Molina, Martin Hunt, Zamin Iqbal, Ana Gil-Brusola, et al. (13 authors)
bioRxiv (Cold Spring Harbor Laboratory) · 2025-08
Abstract
Advances in short-read sequencing have enhanced our understanding of the Mycobacterium tuberculosis complex (MTBC), but short reads fail to capture its complete genomic diversity. We applied long-read sequencing to 216 isolates from the Valencia Region (Spain) and generated high-quality, complete genomes, revealing detailed insights into MTBC evolution across timescales. Complete genome comparisons increased the estimated evolutionary rate by 1.5-fold, resulting in a median of 312 (–1 to 792) additional SNPs per pairwise comparison. Multiple diversity hotspots were identified, mostly in the pe/ppe genes and driven by gene conversion. However, most PE/PPE epitopes were hyperconserved, with notable exceptions involving vaccine candidates. Incorporating previously undetected SNPs and indels improved resolution in transmission analyses. Furthermore, patient-specific reference mapping validated only 5–10% of the within-host diversity detected by standard pipelines, indicating substantial overestimation in previous studies. These findings expand our view of MTBC diversity and have important implications for understanding host-pathogen interactions, epidemiology, and transmission dynamics.
MeSH terms
- Diversity (politics)
- Mycobacterium tuberculosis complex
- Mycobacterium tuberculosis
- Evolutionary biology
- Genome
- Biology
- Tuberculosis
- Genetics