TY - JOUR
T1 - A comprehensive update to the Mycobacterium tuberculosis H37Rv reference genome
AU - Chitale, Poonam
AU - Lemenze, Alexander D.
AU - Fogarty, Emily C.
AU - Shah, Avi
AU - Grady, Courtney
AU - Odom-Mabey, Aubrey R.
AU - Johnson, W. Evan
AU - Yang, Jason H.
AU - Eren, A. Murat
AU - Brosch, Roland
AU - Kumar, Pradeep
AU - Alland, David
N1 - Publisher Copyright: © 2022, The Author(s).
PY - 2022/12
Y1 - 2022/12
N2 - H37Rv is the most widely used Mycobacterium tuberculosis strain, and its genome is globally used as the M. tuberculosis reference sequence. Here, we present Bact-Builder, a pipeline that uses consensus building to generate complete and accurate bacterial genome sequences and apply it to three independently cultured and sequenced H37Rv aliquots of a single laboratory stock. Two of the 4,417,942 base-pair long H37Rv assemblies are 100% identical, with the third differing by a single nucleotide. Compared to the existing H37Rv reference, the new sequence contains ~6.4 kb additional base pairs, encoding ten new regions that include insertions in PE/PPE genes and new paralogs of esxN and esxJ, which are differentially expressed compared to the reference genes. New sequencing and de novo assemblies with Bact-Builder confirm that all 10 regions, plus small additional polymorphisms, are also present in the commonly used H37Rv strains NR123, TMC102, and H37Rv1998. Thus, Bact-Builder shows promise as an improved method to perform accurate and reproducible de novo assemblies of bacterial genomes, and our work provides important updates to the primary M. tuberculosis reference genome.
UR - http://www.scopus.com/inward/record.url?scp=85142255556&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142255556&partnerID=8YFLogxK
U2 - 10.1038/s41467-022-34853-x
DO - 10.1038/s41467-022-34853-x
M3 - Article
C2 - 36400796
AN - SCOPUS:85142255556
SN - 2041-1723
VL - 13
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 7068
ER -