Variable selection and estimation in generalized linear models with the seamless $L_0$ penalty

Zilin Li, Sijian Wang, Xihong Lin

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

In this paper, we propose variable selection and estimation in generalized linear models using the seamless $L_0$ (SELO) penalized likelihood approach. The SELO penalty is a smooth function that very closely resembles the discontinuous $L_0$ penalty. We develop an efficient algorithm to fit the model, and show that the SELO-GLM procedure has the oracle property in the presence of a diverging number of variables. We propose a Bayesian information criterion (BIC) to select the tuning parameter. We show that under some regularity conditions, the proposed SELO-GLM/BIC procedure consistently selects the true model. We perform simulation studies to evaluate the finite-sample performance of the proposed methods. Our simulation studies show that the proposed SELO-GLM procedure has better finite-sample performance than several existing methods, especially when the number of variables is large and the signals are weak. We apply the SELO-GLM to analyze a breast cancer genetic dataset to identify the SNPs that are associated with breast cancer risk.
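
To make the phrase "very closely resembles the discontinuous $L_0$ penalty" concrete, the following is a minimal sketch. The abstract does not state the penalty formula, so the sketch assumes the form commonly given for SELO in the related literature, $p(\beta) = \frac{\lambda}{\log 2}\log\left(\frac{|\beta|}{|\beta|+\tau}+1\right)$. The function names `selo_penalty` and `l0_penalty` and the parameter values are illustrative, not taken from the paper.

```python
import math


def selo_penalty(beta, lam, tau):
    """Assumed SELO form: (lam / log 2) * log(|beta| / (|beta| + tau) + 1).

    lam > 0 sets the penalty height; tau > 0 controls how sharply the
    curve rises from 0 toward lam as |beta| moves away from zero.
    """
    a = abs(beta)
    return (lam / math.log(2.0)) * math.log(a / (a + tau) + 1.0)


def l0_penalty(beta, lam):
    """Discontinuous L0 penalty: lam for any nonzero coefficient, else 0."""
    return lam if beta != 0 else 0.0


if __name__ == "__main__":
    lam = 1.0
    # Compare the two penalties at a few coefficient values for shrinking tau.
    for tau in (1.0, 0.1, 0.01):
        row = [round(selo_penalty(b, lam, tau), 4) for b in (0.0, 0.05, 0.5, 2.0)]
        print(f"tau={tau}: {row}")
    print("L0:", [l0_penalty(b, lam) for b in (0.0, 0.05, 0.5, 2.0)])
```

Under this assumed form the penalty is exactly zero at $\beta = 0$, stays strictly below $\lambda$ for every finite $\beta$, and approaches the $L_0$ step as $\tau$ shrinks, while remaining continuous in $\beta$.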

Original language: English (US)
Pages (from-to): 745-769
Number of pages: 25
Journal: Canadian Journal of Statistics
Volume: 40
Issue number: 4
State: Published - Dec 2012
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Keywords

  • BIC
  • Consistency
  • Coordinate descent algorithm
  • Model selection
  • Oracle property
  • Penalized likelihood methods
  • SELO penalty
  • Tuning parameter selection
