Large-scale proteogenomics characterization of microproteins in Mycobacterium tuberculosis
Eduardo Vieira de Souza, Pedro Ferrari Dalberto, Adriana C. Miranda, Alan Saghatelian, Antônio F. M. Pinto, Luiz Augusto Basso, Pablo Machado, Cristiano Valim Bizarro
Scientific Reports · 2024-12
Abstract
Tuberculosis remains a burden to this day due to the rise of multidrug-resistant and extensively drug-resistant bacterial strains. The genome of Mycobacterium tuberculosis (Mtb) strain H37Rv underwent an annotation process that excluded small open reading frames (smORFs), which encode a class of peptides and small proteins collectively known as microproteins. As a result, part of its proteome has been overlooked, and this part is a rich source of potentially essential, druggable molecular targets. Here, we applied our recently developed proteogenomics pipeline to hundreds of mass spectrometry experiments to identify, at large scale, novel microproteins encoded by non-canonical smORFs in the Mtb genome. We found protein evidence for hundreds of unannotated microproteins and identified smORFs that are essential for bacterial survival and involved in bacterial growth and virulence. Moreover, many smORFs are co-expressed with, and share operons with, a wide range of biologically relevant genes, and play a role in the antibiotic response. Together, our data present a resource of previously unknown genes that play a role in the success of Mtb as a widespread pathogen.
MeSH terms
- Proteogenomics
- Proteome
- Biology
- Computational biology
- Genome
- Druggability
- Mycobacterium tuberculosis
- Proteomics
- Open reading frame
- Bacterial genome size
- Gene
- Genetics
- Tuberculosis
- Genomics