Fast motif selection for biological sequences

Pavel Kuksa, Vladimir Pavlovic

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

We consider the problem of identifying motifs (recurring or conserved patterns) in sets of biological sequences. To solve this task, we present new deterministic, exact algorithms for finding patterns that are embedded as exact or inexact instances in all or most of the input strings. The proposed algorithms (1) improve search efficiency compared to existing exact algorithms by focusing the search on a selected set of potential motif instances, and (2) scale well with the input length and the size of the alphabet. Our algorithms are orders of magnitude faster than existing exact algorithms for common pattern identification. We evaluate our algorithms on benchmark motif-finding problems and real applications in biological sequence analysis and show that they exhibit significant running-time improvements compared to state-of-the-art approaches.
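To make the problem statement concrete, the following is a minimal sketch of a naive exhaustive baseline. It is not the authors' algorithm, whose details are not given in this record. It assumes that an "inexact instance" means a window within a fixed number of mismatches (Hamming distance) of the pattern, and that "all or most" means at least `q` of the input strings. The function names, the parameters `k`, `d`, `q`, and the sample sequences are all hypothetical.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(pattern: str, seq: str, d: int) -> bool:
    """True if some window of seq is within d mismatches of pattern."""
    k = len(pattern)
    return any(hamming(pattern, seq[i:i + k]) <= d
               for i in range(len(seq) - k + 1))


def naive_motifs(seqs, k, d, q, alphabet="ACGT"):
    """Exhaustively test every k-mer over the alphabet (illustrative baseline).

    Returns the patterns that occur, with at most d mismatches,
    in at least q of the input sequences.
    """
    found = []
    for letters in product(alphabet, repeat=k):
        pattern = "".join(letters)
        support = sum(occurs(pattern, s, d) for s in seqs)
        if support >= q:
            found.append(pattern)
    return found


# Hypothetical toy input: three short DNA strings.
seqs = ["ACGTACGT", "TTACGATT", "GGGACGTG"]
result = naive_motifs(seqs, k=4, d=1, q=3)
print("ACGT" in result)  # True: exact in two strings, one mismatch in the third
```

This baseline enumerates all |alphabet|^k candidate patterns, so its cost grows exponentially with pattern length and quickly with alphabet size. That growth is the kind of cost the abstract says the proposed algorithms avoid by restricting the search to a selected set of potential motif instances.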

Original language: English (US)
Title of host publication: 2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009
Pages: 79-82
Number of pages: 4
DOI: https://doi.org/10.1109/BIBM.2009.41
State: Published - Dec 1 2009
Event: 2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009 - Washington, D.C., United States
Duration: Nov 1 2009 to Nov 4 2009

Publication series

Name: 2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009

Other

Other: 2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009
Country: United States
City: Washington, D.C.
Period: 11/1/09 to 11/4/09

Fingerprint

Benchmarking
Sequence Analysis

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Biomedical Engineering
  • Health Informatics

Keywords

  • Algorithms
  • Sequences
  • Tree searching

Cite this

Kuksa, P., & Pavlovic, V. (2009). Fast motif selection for biological sequences. In 2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009 (pp. 79-82). [5341854] (2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009). https://doi.org/10.1109/BIBM.2009.41
@inproceedings{14ef27d785c449c2895413820789e826,
title = "Fast motif selection for biological sequences",
abstract = "We consider the problem of identifying motifs, recurring or conserved patterns, in the sets of biological sequences. To solve this task, we present new deterministic and exact algorithms for finding patterns that are embedded as exact or inexact instances in all or most of the input strings. The proposed algorithms (1) improve search efficiency compared to existing exact algorithms by focusing search on a selected set of potential motif instances, and (2) scale well with the input length and the size of alphabet. Our algorithms are orders of magnitude faster than existing exact algorithms for common pattern identification. We evaluate our algorithms on benchmark motif finding problems and real applications in biological sequence analysis and show that they exhibit significant running time improvements compared to the state-of-the-art approaches.",
keywords = "Algorithms, Sequences, Tree searching",
author = "Pavel Kuksa and Vladimir Pavlovic",
year = "2009",
month = "12",
day = "1",
doi = "10.1109/BIBM.2009.41",
language = "English (US)",
isbn = "9780769538853",
series = "2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009",
pages = "79--82",
booktitle = "2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009",

}

TY - GEN

T1 - Fast motif selection for biological sequences

AU - Kuksa, Pavel

AU - Pavlovic, Vladimir

PY - 2009/12/1

Y1 - 2009/12/1

AB - We consider the problem of identifying motifs, recurring or conserved patterns, in the sets of biological sequences. To solve this task, we present new deterministic and exact algorithms for finding patterns that are embedded as exact or inexact instances in all or most of the input strings. The proposed algorithms (1) improve search efficiency compared to existing exact algorithms by focusing search on a selected set of potential motif instances, and (2) scale well with the input length and the size of alphabet. Our algorithms are orders of magnitude faster than existing exact algorithms for common pattern identification. We evaluate our algorithms on benchmark motif finding problems and real applications in biological sequence analysis and show that they exhibit significant running time improvements compared to the state-of-the-art approaches.

KW - Algorithms

KW - Sequences

KW - Tree searching

UR - http://www.scopus.com/inward/record.url?scp=74549215063&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=74549215063&partnerID=8YFLogxK

U2 - 10.1109/BIBM.2009.41

DO - 10.1109/BIBM.2009.41

M3 - Conference contribution

AN - SCOPUS:74549215063

SN - 9780769538853

T3 - 2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009

SP - 79

EP - 82

BT - 2009 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2009

ER -
