Nearly tight bounds on the learnability of evolution

Andris Ambainis, Richard Desper, Martin Farach, Sampath Kannan

Research output: Contribution to journal › Conference article › peer-review

15 Scopus citations

Abstract

Evolution is often modeled as a stochastic process that modifies DNA. One of the most popular and successful such models is the Cavender-Farris (CF) tree, represented as an edge-weighted tree. The Phylogeny Construction Problem is, given k samples drawn from a CF tree, to output a CF tree that is close to the original. Each CF tree naturally defines a random variable, and the gold standard for reconstructing such trees is the Maximum Likelihood Estimator of this variable. This approach is notoriously computationally expensive. In this paper, we show that a very simple algorithm, a variant of one of the most popular algorithms used by practitioners, converges on the true tree at a rate that differs from the optimum by only a constant factor. We do this by analyzing upper and lower bounds on the convergence rate of learning very simple CF trees, and then showing that the learnability of each CF tree is sandwiched between those of two such simpler trees. Our results rely on the fact that, if the right metric is used, the likelihood space of CF trees is smooth.
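The CF model referenced in the abstract is commonly formulated as a two-state symmetric process on an edge-weighted tree: a binary character at the root is propagated toward the leaves, flipping along each edge with a probability tied to that edge's weight. The sketch below is not taken from the paper; it is a minimal illustration, under that assumed formulation, of drawing k leaf samples from a hypothetical CF tree and computing log-corrected pairwise distances, the kind of statistics a simple distance-based reconstruction algorithm would consume. All class and function names are illustrative.

```python
import math
import random

# Minimal sketch of a two-state Cavender-Farris tree (assumed formulation):
# each edge carries a flip probability p, and a binary character changes
# state along that edge with probability p.

class CFNode:
    def __init__(self, name=None, children=None):
        self.name = name                  # leaf label, None for internal nodes
        self.children = children or []    # list of (child, flip_probability)

def sample_leaves(root, rng):
    """Draw one character at every leaf by propagating a root state downward."""
    leaves = {}
    def walk(node, state):
        if not node.children:
            leaves[node.name] = state
            return
        for child, p in node.children:
            child_state = -state if rng.random() < p else state
            walk(child, child_state)
    walk(root, rng.choice([+1, -1]))
    return leaves

def estimate_distance(samples, a, b):
    """Log-corrected distance between leaves a and b from k sampled characters."""
    k = len(samples)
    mismatch_rate = sum(1 for s in samples if s[a] != s[b]) / k
    # Standard correction for the two-state symmetric model; it diverges as
    # the observed disagreement rate approaches 1/2 (saturation).
    if mismatch_rate >= 0.5:
        return float("inf")
    return -0.5 * math.log(1.0 - 2.0 * mismatch_rate)

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical four-leaf tree ((A,B),(C,D)) with per-edge flip probabilities.
    root = CFNode(children=[
        (CFNode(children=[(CFNode("A"), 0.05), (CFNode("B"), 0.10)]), 0.08),
        (CFNode(children=[(CFNode("C"), 0.05), (CFNode("D"), 0.10)]), 0.08),
    ])
    samples = [sample_leaves(root, rng) for _ in range(2000)]
    print(estimate_distance(samples, "A", "B"))  # neighbors: small distance
    print(estimate_distance(samples, "A", "D"))  # across the root: larger distance
```

Under this formulation the corrected distances are additive along tree paths, which is why the estimated pairwise distances (and hence the number of samples k needed to estimate them well) govern how quickly a distance-based method converges on the true tree.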

Original language: English (US)
Pages (from-to): 524-533
Number of pages: 10
Journal: Annual Symposium on Foundations of Computer Science - Proceedings
State: Published - 1997
Externally published: Yes
Event: Proceedings of the 1997 38th IEEE Annual Symposium on Foundations of Computer Science - Miami Beach, FL, USA
Duration: Oct 20 1997 - Oct 22 1997

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
