Concordance and discordance of sequence survey methods for molecular epidemiology

Eduardo Castro-Nallar, Nur A. Hasan, Thomas A. Cebula, Rita R. Colwell, Richard A. Robison, W. Evan Johnson, Keith A. Crandall

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


The post-genomic era is characterized by the direct acquisition and analysis of genomic data with many applications, including the enhancement of the understanding of microbial epidemiology and pathology. However, there are a number of molecular approaches to survey pathogen diversity, and the impact of these different approaches on parameter estimation and inference is not entirely clear. We sequenced whole genomes of bacterial pathogens, Burkholderia pseudomallei, Yersinia pestis, and Brucella spp. (60 new genomes), and combined them with 55 genomes from GenBank to address how different molecular survey approaches (whole genomes, SNPs, and MLST) impact downstream inferences on molecular evolutionary parameters, evolutionary relationships, and trait character associations. We selected isolates for sequencing to represent temporal, geographic origin, and host range variability. We found that substitution rate estimates vary widely among approaches, and that SNP and genomic datasets yielded different but strongly supported phylogenies. MLST yielded poorly supported phylogenies, especially in our low diversity dataset, i.e., Y. pestis. Trait associations showed that B. pseudomallei and Y. pestis phylogenies are significantly associated with geography, irrespective of the molecular survey approach used, while Brucella spp. phylogeny appears to be strongly associated with geography and host origin. We contrast inferences made among monomorphic (clonal) and non-monomorphic bacteria, and between intra- and inter-specific datasets. We also discuss our results in light of underlying assumptions of different approaches.

Original language: English (US)
Article number: e761
Issue number: 2
State: Published - 2015
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • General Neuroscience
  • General Biochemistry, Genetics and Molecular Biology
  • General Agricultural and Biological Sciences


Keywords

  • Biological weapons
  • Bioterrorism
  • Data type
  • Genomes
  • High-throughput sequencing
  • MLST
  • Molecular epidemiology
  • Phylogenomics
  • Phylogeography
  • SNP

