mRNA polyadenylation is responsible for the 3′ end formation of most mRNAs in eukaryotic cells and is linked to termination of transcription. Prediction of mRNA polyadenylation sites [poly(A) sites] can help identify genes, define gene boundaries, and elucidate regulatory mechanisms. Current methods for poly(A) site prediction achieve moderate sensitivity and specificity. Here, we present a method using support vector machine for poly(A) site prediction. Using 15 cis -regulatory elements that are over-represented in various regions surrounding poly(A) sites, this method achieves higher sensitivity and similar specificity when compared with polyadq, a common tool for poly(A) site prediction. In addition, we found that while the polyadenylation signal AAUAAA and U-rich elements are primary determinants for poly(A) site prediction, other elements contribute to both sensitivity and specificity of the prediction, indicating a combinatorial mechanism involving multiple elements when choosing poly(A) sites in human cells.
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics