Project Details


Microbes dominate life on Earth. They evolve under the pressure of environmental stresses such as climate change and pollution, and these changes have global impact. For both microbes and their communities (microbiomes), understanding the consequences of large-scale changes requires that we understand their starting point under normal conditions. In this project, normal baselines will be defined that show the functional abilities of microbes in a given community and environment. Microbes will be grouped into classes of similar function; the 'distance' between two classes will include all of the differences in their functional abilities, indicating their lifestyle preferences under existing conditions. In this way, a microbial class is described by a small set of shared functions, which forms the basis for a novel way to infer microbial and microbiome diversity. The shared functions will serve as a guide to discovering new functional pathways or new members of known pathways, which can lead to broader impacts in a number of industrial applications. The technical advances produced through this project will include developing new algorithms, and combining new and existing data types, for the exploration of the microbial world. Building on principles shared between biology and computer science, the project will contribute to advances in knowledge extraction and graph analysis. Through education and outreach with hands-on lab activities, the project will enhance bioinformatics education of undergraduates, urban high school students, and community lab members, while training postdocs to teach effectively.

Technical summary: The limited resolution of microbial taxonomy with regard to molecular functionality calls for novel, fast, and reliable classifications that capture microbial processes, diversity, and interactions. Researchers will computationally analyze existing microbial genomic data using a new metric of whole-organism molecular function similarity.
Using this similarity metric, a new classification scheme will be built, defining microbial clades according to their functional capacities. This scheme will reflect heredity as well as other forms of genetic transfer and environmental factors. It will also provide an opportunity to explore the influence of inheritance versus horizontal gene transfer in acquiring new functions. Based on the consistent co-occurrence of functions in microorganisms, common molecular pathways and the minimal pathways required for life within specific environmental niches will be elucidated. The power of this approach is its ability to assign proteins to pathways that remain experimentally uncharacterized. Overlaying available annotations of microorganism habitat preferences (e.g. temperature, oxygen requirements, and pH) will broadly explain metabolic end-points and unique environmental adaptations. The newly developed tools will be used to describe niche-specific microbiomes for precise analysis of environmental effects, whether in natural processes or for synthetic function design and industrial process optimization. The specific goals of the project are to: (i) develop a function-based organism similarity metric, facilitating a functional clade definition; (ii) identify core sets of functions most discriminative of clade assignment and, thus, most descriptive of environmental requirements; and (iii) enable fast and accurate analysis of metagenome diversity and functional ability. All tools and other resources will be publicly accessible. Students will be included in formulating relevant scientific questions and trained in methods for answering them by building, evaluating, and applying computational methods. Two postdocs will be trained to conduct this research and in educational methods for leading an associated course, the first bioinformatics methods design course at this institution.
Undergraduate students will acquire skills in quantitative biology, and the impact of this training will be measured in terms of both their understanding of bioinformatics and their perception of research as a whole. The postdocs will assist in outreach activities at a New York City high school and at Genspace, New York City's community biolab.
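
As a rough illustration of what a function-based organism similarity metric could look like, the sketch below treats each organism as a set of molecular function labels and compares sets with a Jaccard distance. The abstract does not specify the project's actual metric, so the Jaccard choice, the function names `functional_distance` and `core_functions`, and the toy organisms and labels are all assumptions made for illustration only.

```python
from itertools import combinations
from typing import Dict, List, Set


def functional_distance(a: Set[str], b: Set[str]) -> float:
    """Jaccard distance between two function sets (assumed metric, not the project's)."""
    union = a | b
    if not union:
        # Two organisms with no annotated functions are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def core_functions(members: List[Set[str]]) -> Set[str]:
    """Functions shared by every member of a hypothetical functional class."""
    return set.intersection(*members) if members else set()


# Toy, made-up annotations; real input would come from genome annotation.
organisms: Dict[str, Set[str]] = {
    "org_A": {"glycolysis", "nitrogen_fixation", "sporulation"},
    "org_B": {"glycolysis", "nitrogen_fixation"},
    "org_C": {"glycolysis", "photosynthesis"},
}

for x, y in combinations(sorted(organisms), 2):
    d = functional_distance(organisms[x], organisms[y])
    print(f"{x} vs {y}: distance = {d:.3f}")

print("shared by all:", sorted(core_functions(list(organisms.values()))))
```

In this toy setup, organisms with a small pairwise distance would be candidates for the same functional class, and the intersection of their function sets stands in for the "small set of shared functions" that describes the class.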
Effective start/end date: 4/15/16 to 3/31/21


  • National Science Foundation (NSF)

