The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy

T. Tony Cai, Yichen Wang, Linjun Zhang

Research output: Contribution to journalArticlepeer-review

Abstract

Privacy-preserving data analysis is a rising challenge in contemporary statistics, as the privacy guarantees of statistical methods are often achieved at the expense of accuracy. In this paper, we investigate the tradeoff between statistical accuracy and privacy in mean estimation and linear regression, under both the classical low-dimensional and modern high-dimensional settings. A primary focus is to establish minimax optimality for statistical estimation with the (ε, δ)-differential privacy constraint. By refining the “tracing adversary” technique for lower bounds in the theoretical computer science literature, we improve existing minimax lower bound for low-dimensional mean estimation and establish new lower bounds for high-dimensional mean estimation and linear regression problems. We also design differentially private algorithms that attain the minimax lower bounds up to logarithmic factors. In particular, for high-dimensional linear regression, a novel private iterative hard thresholding algorithm is proposed. The numerical performance of differentially private algorithms is demonstrated by simulation studies and applications to real data sets.

Original languageEnglish (US)
Pages (from-to)2825-2850
Number of pages26
JournalAnnals of Statistics
Volume49
Issue number5
DOIs
StatePublished - Oct 2021
Externally publishedYes

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Keywords

  • Differential privacy
  • High-dimensional data
  • Linear regression
  • Mean estimation
  • Minimax optimality

Fingerprint

Dive into the research topics of 'The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy'. Together they form a unique fingerprint.

Cite this