Abstract
In this paper, we study high-dimensional sparse Quadratic Discriminant Analysis (QDA) and aim to establish the optimal convergence rates for the classification error. Minimax lower bounds are established to demonstrate the necessity of structural assumptions such as sparsity conditions on the discriminating direction and differential graph for the possible construction of consistent high-dimensional QDA rules. We then propose a classification algorithm called SDAR using constrained convex optimization under the sparsity assumptions. Both minimax upper and lower bounds are obtained and this classification rule is shown to be simultaneously rate optimal over a collection of parameter spaces, up to a logarithmic factor. Simulation studies demonstrate that SDAR performs well numerically. The algorithm is also illustrated through an analysis of prostate cancer data and colon tissue data. The methodology and theory developed for high-dimensional QDA for two groups in the Gaussian setting are also extended to multigroup classification and to classification under the Gaussian copula model.
Original language | English (US) |
---|---|
Pages (from-to) | 1537-1568 |
Number of pages | 32 |
Journal | Annals of Statistics |
Volume | 49 |
Issue number | 3 |
DOIs | |
State | Published - Jun 2021 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty
Keywords
- Classification
- Constrained l minimization
- High-dimensional data
- Minimax lower bound
- Optimal rate of convergence
- Quadratic discriminant analysis