Evidence That Inconsistent Gene Prediction Can Mislead Analysis of Dinoflagellate Genomes

Yibi Chen, Raúl A. González-Pech, Timothy G. Stephens, Debashish Bhattacharya, Cheong Xin Chan

Research output: Contribution to journal (Letter)


Comparative algal genomics often relies on predicted genes from de novo assembled genomes. However, the artifacts introduced by different gene-prediction approaches, and their impact on comparative genomic analysis, remain poorly understood. Here, using available genome data from six dinoflagellate species in the Symbiodiniaceae, we identified methodological biases in the published genes that were predicted using different approaches, as well as putative contaminant sequences in the published genome assemblies. We developed and applied a comprehensive customized workflow to predict genes from these genomes. The observed variation among predicted genes resulting from our workflow agreed with current understanding of phylogenetic relationships among these taxa, whereas the variation among the previously published genes was largely biased by the distinct approaches used in each instance. Importantly, these biases affect the inference of homologous gene families and synteny among genomes, thus impacting biological interpretation of these data. Our results demonstrate that a consistent gene-prediction approach is critical for comparative analysis of dinoflagellate genomes.

Original language: English (US)
Pages (from-to): 6-10
Number of pages: 5
Journal: Journal of Phycology
Issue number: 1
State: Published - Feb 1 2020


All Science Journal Classification (ASJC) codes

  • Aquatic Science
  • Plant Science
