Statistical Methods and Theory in Some High-Dimensional Problems

Project Details

Description

The research project will focus on developing practical methods, efficient algorithms and solid theory for the selection of important features, estimation of unknown parameters and prediction of responses with high-dimensional data, especially in the case where the number of features is much larger than the number of samples. It will further develop recently proposed methodologies and algorithms for feature selection in linear regression, extend them to more general high-dimensional statistical models, and investigate their consistency and optimality properties in selection and estimation. The methodologies developed in the project will be directly relevant to many applications. The project will specifically investigate applications in two important areas. The first is signal processing, including efficient sampling, representation, transmission and recovery of data objects. The second is communications networks, including detection and estimation of significant patterns in volume and changes in data streams.

High-dimensional data is an area of intense current interest in statistical research and practice due to the rapid development of information technologies and their applications to modern scientific experiments. Important fields with an abundance of high-dimensional data include bioinformatics, signal processing, neural imaging, communications networks and more. In many such scientific and engineering applications, the size of the problem is measured by the number of features: genetic components in bioinformatics, brain regions or voxels in neural imaging, or computers and routers in the Internet. A main challenge in high-dimensional data is that the size of the problem is often much larger than the size of the data to be used. The project is motivated by, and will be directly applicable to, signal processing and the monitoring of communications networks. Due to mathematical and statistical commonalities of problems involving high-dimensional data, the project will also be directly applicable to bioinformatics, neural imaging and many more disciplines where modern information technologies prosper. Furthermore, the project will have significant educational impact.

Status: Finished
Effective start/end date: 9/1/09 to 8/31/13

Funding

  • National Science Foundation: $221,627.00
