Extensive sequencing is increasingly leaving us with a potential goldmine of 'not-quite-useful-yet' data. Existing tools that gauge the specifics of protein sequence-encoded functionality lack both functional specificity and sensitivity. Our long-range goal is to develop novel computational methods that can accurately identify residues specifically relevant for protein function, reveal the range of functions encoded by a given meta-, gen-, ex-, or transcript-ome, and reduce the experimental work needed to describe variome-mediated functional differences. The particular objective of this proposal is to elucidate the molecular functional makeup of the currently available meta-proteomes, using per-residue functional significance predictions to profile and cluster protein sequences. We suggest a three-tiered approach. First, predict functional sequence (FuSe) residues using in silico mutagenesis. Second, align experimentally annotated orthologues and close paralogues to extract FuSe Signatures (FuSeS), sets of FuSe residues representative of specific protein functions, and use FuSeS to gauge the functions of available un-annotated sequences. Finally, cluster the pool of FuSeS-less proteins to build a collection of new FuSeS defining as yet unknown functions. Note that while all aims are logically interconnected, the project is modular, and the completion of any one aim/module is sufficiently independent of the others; i.e., data collection and proofs of concept for all aims may proceed simultaneously, while modules that prove unsuccessful in development may be replaced. The expected outcome of this project is a database of protein functional signatures (FuSeS) and a corresponding computational tool (FuSeScanner) for protein function annotation from sequence alone. The innovation of FuSeS lies in building on established methodologies to create a novel and highly informative functional view of existing proteome data.
This is also highly significant, as FuSeS can be used to generate new, experimentally testable hypotheses about the makeup and optimization of specific microbiotic environments. Understanding the human gut microbiome, for instance, could facilitate research on food safety and childhood obesity. Deeper knowledge of electron transfer chains in microbial communities could potentially aid research and development of sustainable energy resources. FuSeS will also be easily, cheaply, and accurately applicable to any -omic study requiring a more succinct annotation of protein function.
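The signature-extraction and scanning steps of the approach described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the project's method: the function names (`extract_signature`, `signature_match`), the conservation threshold, and the toy sequences are all hypothetical, and the FuSe positions are taken as given (the proposal obtains them from in silico mutagenesis, which is not modeled here). Sequences are assumed to be pre-aligned to equal length, with `-` marking gaps.

```python
from collections import Counter


def extract_signature(aligned_seqs, fuse_columns, min_conservation=0.8):
    """Return {column: residue} for FuSe columns conserved across a family.

    aligned_seqs: equal-length aligned sequences sharing one annotated function.
    fuse_columns: alignment columns predicted as functional (assumed input).
    min_conservation: hypothetical cutoff; the proposal does not specify one.
    """
    if not aligned_seqs:
        return {}
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")

    signature = {}
    for col in sorted(fuse_columns):
        # Most common residue in this column and how often it occurs.
        residue, count = Counter(seq[col] for seq in aligned_seqs).most_common(1)[0]
        if residue != "-" and count / len(aligned_seqs) >= min_conservation:
            signature[col] = residue
    return signature


def signature_match(aligned_query, signature):
    """Fraction of signature positions at which the query carries the same residue."""
    if not signature:
        return 0.0
    hits = sum(1 for col, res in signature.items() if aligned_query[col] == res)
    return hits / len(signature)


# Toy family of three aligned sequences with FuSe columns 2, 3 and 4.
family = ["MKHCD-A", "MRHCDGA", "MKHCEGA"]
signature = extract_signature(family, {2, 3, 4})

# Score two un-annotated (already aligned) query sequences against the signature.
score_full = signature_match("MQHCDGA", signature)
score_partial = signature_match("MQHADGA", signature)
```

In this toy case, column 4 is dropped from the signature because its most common residue appears in only two of three sequences, below the assumed cutoff. A real implementation would need an alignment step for the query and a principled way to choose the threshold.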
Effective start/end date: 7/1/12 → 6/30/17
- National Institute of Food and Agriculture (NIFA)