Maximum A posteriori classification of DNA structure from sequence information.

D. M. Loewenstern, H. M. Berman, H. Hirsh

Research output: Contribution to journal > Article > peer-review

9 Scopus citations

Abstract

We introduce an algorithm, LLLAMA, which combines simple pattern recognizers into a general method for estimating the entropy of a sequence. Each pattern recognizer exploits a partial match between subsequences to build a model of the sequence. Since the primary features of interest in biological sequence domains are subsequences with small variations in exact composition, LLLAMA is particularly suited to such domains. We describe two methods, LLLAMA-length and LLLAMA-alone, which use this entropy estimate to perform maximum a posteriori classification. We apply these methods to several problems in three-dimensional structure classification of short DNA sequences. The results include a surprisingly low 3.6% error rate in predicting helical conformation of oligonucleotides. We compare our results to those obtained using more traditional methods for automated generation of classifiers.
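The maximum a posteriori step described in the abstract can be illustrated with a minimal sketch. An entropy estimate for a sequence under a class-specific model can be read as a code length in bits, roughly -log2 P(sequence | class), so MAP classification amounts to choosing the class with the smallest code length after adding the cost of the class prior, -log2 P(class). The abstract does not specify how LLLAMA's pattern recognizers work, so the sketch below substitutes a first-order Markov model with add-one smoothing purely as a stand-in. The names `MarkovModel`, `code_length_bits`, and `map_classify`, the class labels, and the toy training strings are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACGT"


class MarkovModel:
    """Stand-in sequence model (NOT LLLAMA): order-k Markov chain, add-one smoothing."""

    def __init__(self, order=1):
        self.order = order
        # Maps each context string to a Counter of the symbols that followed it.
        self.counts = defaultdict(Counter)

    def fit(self, sequences):
        for seq in sequences:
            padded = "^" * self.order + seq  # "^" marks the start of a sequence
            for i in range(self.order, len(padded)):
                context = padded[i - self.order:i]
                self.counts[context][padded[i]] += 1
        return self

    def code_length_bits(self, seq):
        """Estimated code length of seq in bits, i.e. -log2 P(seq | this model)."""
        padded = "^" * self.order + seq
        bits = 0.0
        for i in range(self.order, len(padded)):
            context = self.counts.get(padded[i - self.order:i], Counter())
            p = (context[padded[i]] + 1) / (sum(context.values()) + len(ALPHABET))
            bits -= math.log2(p)
        return bits


def map_classify(seq, models, priors):
    """Return the class minimizing -log2 P(seq | class) - log2 P(class)."""
    return min(
        models,
        key=lambda c: models[c].code_length_bits(seq) - math.log2(priors[c]),
    )


# Toy, made-up training strings (hypothetical; not real structural data).
training = {
    "class_1": ["GGGCCC", "GGCCGG", "CCCGGG"],
    "class_2": ["ATATAT", "TATATA", "AATTAA"],
}
models = {label: MarkovModel(order=1).fit(seqs) for label, seqs in training.items()}
priors = {"class_1": 0.5, "class_2": 0.5}

if __name__ == "__main__":
    for query in ("GGCC", "ATTA"):
        print(query, map_classify(query, models, priors))
```

Working in bits keeps the decision rule additive: a class with a smaller prior pays a fixed extra cost, which the sequence model must offset with a shorter code length. Any other estimator of per-class code length could be dropped in for `MarkovModel` without changing `map_classify`.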

Original language: English (US)
Pages (from-to): 669-680
Number of pages: 12
Journal: Pacific Symposium on Biocomputing
State: Published - 1998

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
  • Computational Theory and Mathematics
