CIF: III: SMALL: HIGH-DIMENSIONAL LINEAR MODELS? BRING 'EM ON!

    Project Details

    Description

    One of the fundamental problems in statistical data analysis is to learn the relationship between the samples of a dependent variable (e.g., the malignancy of a tumor) and the samples of predictor variables (e.g., the expression data of genes). This problem was relatively easy in the data-starved world of yesteryear. Our inability to observe too many variables meant that a single sample had dimensions in the tens or hundreds. Times have changed. We now live in a data-rich world. DNA microarrays, for example, can provide us with the expression data for hundreds of thousands of genes (predictors) per tissue sample. This is just one of the countless examples in modern statistics where a single sample comprises thousands or billions of predictors, while there are only hundreds or thousands of samples available for analysis. Computational and analytical tools developed in the 20th century, however, were not designed to work in such high-dimensional settings. The challenge then is developing new sets of computationally efficient methods that analyze the high-dimensional data in an optimal manner.

    This research addresses the challenge of high-dimensional data analysis within the context of linear models by developing low-complexity inference methods based on marginal correlations of predictors with the response variable. One of the distinguishing features of this research is its emphasis on mathematical characterization of the performance of developed methods in the most general of terms. This is accomplished by drawing connections with the literature on finite frame theory. Because of the fairly general nature of this research, it significantly advances the state-of-the-art in inference problems arising in myriad areas, such as genomics, tumor classification, network monitoring and computed tomography. In addition, the frame-theoretic focus of this research lays the foundations for future cross-fertilization of ideas between statistical inference and frame theory.
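The description does not spell out the methods themselves. As a rough illustration only, the sketch below assumes the simplest form of marginal-correlation screening: center each predictor column, scale it to unit norm, score it by the absolute inner product with the centered response, and keep the k highest-scoring predictors. The function names, the toy data generator, and all parameter values are hypothetical and are not taken from the project.

```python
import random


def marginal_correlation_screen(X, y, k):
    """Return the indices of the k columns of X with the largest
    absolute marginal correlation with y (illustrative sketch only).

    X is a list of n rows, each a list of p floats; y is a list of n floats.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    y_centered = [v - y_mean for v in y]

    scores = []
    for j in range(p):
        col = [X[i][j] for i in range(n)]
        col_mean = sum(col) / n
        col = [v - col_mean for v in col]
        norm = sum(v * v for v in col) ** 0.5
        if norm == 0.0:
            # A constant column carries no information about y.
            scores.append(0.0)
            continue
        inner = sum(c * r for c, r in zip(col, y_centered))
        scores.append(abs(inner) / norm)

    # One pass over the data plus a sort: O(n * p + p log p).
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def make_toy_data(n=100, p=500, support=(3, 47, 120), seed=0):
    """Hypothetical sparse linear model with far more predictors than samples."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(5.0 * row[j] for j in support) + rng.gauss(0.0, 0.1) for row in X]
    return X, y


X, y = make_toy_data()
selected = marginal_correlation_screen(X, y, k=20)
print(sorted(selected))
```

Each predictor is scored independently of the others, which is what keeps the cost low when the number of predictors far exceeds the number of samples. The screening step only narrows the candidate set; any guarantee that the true predictors survive it depends on properties of the design matrix, which is the kind of question the description says the project studies.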
    Status: Finished
    Effective start/end date: 9/1/12 to 8/31/15

    Funding

    • National Science Foundation (NSF)
