Derivatives and residual distribution of regularized M-estimators with application to adaptive tuning

Pierre C. Bellec, Yiwei Shen

Research output: Contribution to journal › Conference article › peer-review

5 Scopus citations


This paper studies M-estimators with a gradient-Lipschitz loss function and a convex penalty in linear models with a Gaussian design matrix and arbitrary noise distribution. A practical example is the robust M-estimator built from the Huber loss and the Elastic-Net penalty when the noise distribution has heavy tails. Our main contributions are three-fold. (i) We provide general formulae for the derivatives of regularized M-estimators β̂(y, X), where differentiation is taken with respect to both y and X; this reveals a simple differentiability structure shared by all convex regularized M-estimators. (ii) Using these derivatives, we characterize the distribution of the residuals r_i = y_i − x_i^T β̂ in the intermediate high-dimensional regime where the dimension and the sample size are of the same order. (iii) Motivated by the distribution of the residuals, we propose a novel adaptive criterion to select the tuning parameters of regularized M-estimators. The criterion approximates the out-of-sample error up to an additive constant independent of the estimator, so that minimizing the criterion provides a proxy for minimizing the out-of-sample error. The proposed adaptive criterion requires no knowledge of the noise distribution or of the covariance of the design. Simulated data confirm the theoretical findings, regarding both the distribution of the residuals and the success of the criterion as a proxy for the out-of-sample error. Finally, our results reveal new relationships between the derivatives of β̂(y, X) and the effective degrees of freedom of the M-estimator, which are of independent interest.
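The abstract's running example is the M-estimator combining the Huber loss with the Elastic-Net penalty. A minimal sketch of such an estimator via proximal gradient descent is given below; the function names, tuning values, and heavy-tailed data-generating setup are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def huber_psi(r, delta=1.0):
    # Derivative of the Huber loss: identity near zero, constant in the tails.
    return np.clip(r, -delta, delta)

def soft_threshold(b, t):
    # Proximal operator of the l1 norm (componentwise soft-thresholding).
    return np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def huber_enet(X, y, lam=0.1, l1_ratio=0.5, delta=1.0, step=None, n_iter=500):
    """Proximal-gradient sketch of the Huber + Elastic-Net M-estimator:
    minimize (1/n) sum_i huber(y_i - x_i^T b)
             + lam * (l1_ratio * |b|_1 + (1 - l1_ratio)/2 * |b|_2^2).
    """
    n, p = X.shape
    if step is None:
        # 1/L step size: the smooth part has Lipschitz gradient with L = ||X||_2^2 / n.
        step = n / (np.linalg.norm(X, 2) ** 2)
    b = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ b
        grad = -X.T @ huber_psi(r, delta) / n   # gradient of the Huber data-fit term
        # Elastic-Net proximal step: soft-threshold (l1), then shrink (l2).
        b = soft_threshold(b - step * grad, step * lam * l1_ratio)
        b /= 1.0 + step * lam * (1.0 - l1_ratio)
    return b
```

In the paper's setting one would then examine the residuals y − X β̂ and tune (lam, l1_ratio, delta) by minimizing an adaptive criterion; the sketch above only produces the estimator β̂ itself.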

Original language: English (US)
Pages (from-to): 1912-1947
Number of pages: 36
Journal: Proceedings of Machine Learning Research
State: Published - 2022
Event: 35th Conference on Learning Theory, COLT 2022 - London, United Kingdom
Duration: Jul 2 2022 – Jul 5 2022

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability


Keywords

  • Adaptive tuning
  • High-dimensional statistics
  • M-estimator
  • Residual distribution
  • Robust estimation


