RNA molecules are an important component of the cellular machinery. They are now known to be essential for numerous biological processes, including protein synthesis, transcription regulation, chromosome replication, viral infection, and RNA interference. However, our knowledge of RNA molecules is still limited. This research project addresses important gaps in current RNA studies by introducing novel molecular models and efficient computational tools. Specifically, under the coherent theme of studying pseudoknotted RNA structures and understanding their properties, the research team aims to solve the following problems: (1) estimation of the entropy of key secondary elements of RNA molecules; (2) identification of stable pseudoknot motifs from RNA sequences and development of libraries of pseudoknot motifs for RNA families; (3) prediction of three-dimensional ensembles of pseudoknotted RNA molecules and characterization of their folding mechanisms. All of these problems involve exploring probability distributions on very large state spaces, for which novel mathematical and statistical tools must be developed. To that end, the research team studies and develops several techniques, including efficient constrained Sequential Monte Carlo (SMC) methods, efficient Markov Chain Monte Carlo (MCMC) methods, mixing rate acceleration schemes, and combinations of these. The methodological development provides a solid foundation for solving the underlying biological problems. In return, those problems serve as a testing ground and an inspiration for new statistical ideas and procedures. This cross-fertilization supports advances in both the biological and statistical sciences. It also provides an environment well suited to the education and training of the next generation of scientists and researchers in the interdisciplinary field of mathematics/statistics and biology. Integrated education and research activities are conducted at the postdoctoral, graduate, and undergraduate levels. A set of free software is produced to implement the developed algorithms.
This project intends to improve our understanding of RNA, an important class of biomolecules and an important component of the cellular machinery. RNA molecules are now known to be essential for numerous biological processes. A deeper understanding of RNA, its dynamics, and its functionality will increase our ability to develop new medicines and diagnostic procedures and will propel further technological advancement, to the benefit of society. Innovative statistical tools are developed to solve the underlying problems, and such tools can also be used in many other applications. The project is a cross-fertilization between statistical science and bioinformatics, computational biology, and biophysics. It provides an environment well suited to the education and training of the next generation of scientists and researchers in the interdisciplinary field of mathematics/statistics and biology. Integrated education and research activities are conducted at the postdoctoral, graduate, and undergraduate levels, and special attention is paid to attracting women and minority students to research careers in mathematical biology. A set of free, publicly available software is developed to implement the algorithms. It gives biologists and bioinformatics researchers new algorithms and software for their own research and discovery.
Effective start/end date: 7/15/08 → 6/30/12
- National Science Foundation (NSF)