A feature extraction technique using bi-gram probabilities of position specific scoring matrix for protein fold recognition

Alok Sharma, James Lyons, Abdollah Dehzangi, Kuldip K. Paliwal

Research output: Contribution to journal › Article › peer-review

114 Scopus citations

Abstract

Determining the three-dimensional structure of a protein is a challenging task in biological science. Classifying a protein into one of its folds is an intermediate step toward deciphering that structure. Protein fold recognition can be performed by developing feature extraction techniques that accurately extract all the relevant information from a protein sequence, and then employing a suitable classifier to label an unknown protein. Several feature extraction techniques have been developed in the past, but their recognition accuracy has been limited. In this work, we develop a feature extraction technique based on bi-grams computed directly from Position Specific Scoring Matrices and demonstrate its effectiveness on a benchmark dataset. The proposed technique exhibits an absolute improvement of around 10% compared with existing feature extraction techniques.
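The abstract does not spell out how the bi-grams are computed, so the following is only a minimal sketch under stated assumptions: the PSSM is taken to be an L x 20 matrix of already-normalised per-position scores, and the bi-gram feature for amino-acid columns m and n is assumed to be the sum over consecutive positions of the product of the two scores. The function name `bigram_features` is hypothetical, and the normalisation and classifier steps are omitted.

```python
def bigram_features(pssm):
    """Flattened bi-gram feature vector from a PSSM-like matrix.

    Assumption (not stated in the abstract): feature (m, n) is the sum over
    consecutive rows i, i+1 of pssm[i][m] * pssm[i+1][n].
    `pssm` is a list of rows, one row per sequence position, all rows having
    the same number of columns (20 for a standard PSSM).
    """
    if len(pssm) < 2:
        raise ValueError("need at least two sequence positions")
    n_cols = len(pssm[0])
    if any(len(row) != n_cols for row in pssm):
        raise ValueError("all rows must have the same number of columns")

    # Accumulate an n_cols x n_cols matrix of pairwise products.
    features = [[0.0] * n_cols for _ in range(n_cols)]
    for current, following in zip(pssm, pssm[1:]):
        for m, score_m in enumerate(current):
            for n, score_n in enumerate(following):
                features[m][n] += score_m * score_n

    # Flatten row by row: 20 columns yield a 400-dimensional vector.
    return [value for row in features for value in row]
```

Under these assumptions the vector length depends only on the number of columns, not on the sequence length, so proteins of different lengths map to feature vectors of the same size that a standard classifier can consume.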

Original language: English (US)
Pages (from-to): 41-46
Number of pages: 6
Journal: Journal of Theoretical Biology
Volume: 320
DOIs
State: Published - Mar 7 2013
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Modeling and Simulation
  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Agricultural and Biological Sciences (all)
  • Applied Mathematics

Keywords

  • Bi-gram features
  • Position specific scoring matrix (PSSM)
  • Protein fold recognition
  • Protein sequence
