Most scientific conclusions rely on statistical analysis of data collected from some physical process. The statistical analysis, in turn, relies on assumptions that, if violated, can lead to incorrect or misleading outcomes. In many modern applications, the statistical modeling is further complicated by inherent structural inhomogeneities in the population. Specific examples include the spread of information in social networks, genetic variation among mitochondrial DNA sequences from different species, collisions of particles suspended in a non-uniform medium, and identification of suspicious activity in a criminal or terrorist network. With these applications in mind, the project initiates a systematic study of probabilistic models and inferential principles for data taken from heterogeneous or specially structured populations. Attainment of these goals requires techniques from several mathematical and scientific areas and should lead to progress at the intersection of applied science, statistics, probability, and mathematics. Project outcomes will lead to a better understanding of how fundamental statistical assumptions affect the validity of scientific conclusions. The mathematical theory and statistical methods developed should have a broad societal impact, as combinatorial stochastic processes are used throughout modern applications in physical, biological, and social sciences, national security, and beyond. The PI has extensive plans for training graduate and undergraduate students in the methods to be used, via courses on the foundations of statistics at both levels, the supervision of Ph.D. theses, and conferences and workshops on these topics for students and other early-career researchers. The plans are innovative and the PI will devote significant time and resources to them.
In particular, a strong emphasis will be placed on attracting and hiring students from underprivileged backgrounds who are first-generation college students.

The specific technical objective is a rigorous mathematical theory for understanding random structures that exhibit relative exchangeability and other invariance principles. A hallmark of the project is the in-depth study and rigorous development of theory and applications for edge exchangeable network models, which were first introduced by the PI as a novel invariance principle for network analysis. Desired outcomes include relative invariance principles, structural properties, and characterization theorems for evolving combinatorial structures. The project will also exploit deep connections between combinatorics, algebra, logic, and probability theory to refine prior work by de Finetti, Kingman, Aldous, Hoover, and Kallenberg on graph limits and Lévy-Itô-type representations of Feller processes on combinatorial state spaces. Theoretical developments should guide methodological advances in specific applications, including climate science, network science, and phylogenetics.
Effective start/end date: 8/1/16 → 7/31/21
- National Science Foundation (NSF)