Optimal Estimation of Genetic Relatedness in High-Dimensional Linear Models

Zijian Guo, Wanjie Wang, T. Tony Cai, Hongzhe Li

Research output: Contribution to journal › Article › peer-review

9 Scopus citations


Estimating the genetic relatedness between two traits from genome-wide association data is an important problem in genetics research. In the framework of high-dimensional linear models, we introduce two measures of genetic relatedness and develop optimal estimators for them. One is genetic covariance, defined as the inner product of the two regression vectors; the other is genetic correlation, defined as that inner product normalized by the lengths of the two vectors. We propose functional de-biased estimators (FDEs), which consist of an initial estimation step with the plug-in scaled Lasso estimator followed by a bias correction step. We also develop estimators of the quadratic functionals of the regression vectors, which can be used to estimate the heritability of each trait. The estimators are shown to be minimax rate-optimal and can be implemented efficiently. Simulation results show that FDEs provide better estimates of genetic relatedness than simple plug-in estimates. The FDE is also applied to a yeast segregant dataset with multiple traits to estimate the genetic relatedness among these traits. Supplementary materials for this article are available online.
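
To make the two measures concrete, the following minimal Python sketch evaluates them for known regression vectors. It illustrates only the definitions stated in the abstract, not the FDE procedure itself (no scaled Lasso step and no bias correction). The function names, the example vectors, and the choice to return 0.0 when either vector is all zeros are assumptions made for this illustration.

```python
import math


def genetic_covariance(beta, gamma):
    """Inner product of two regression vectors of equal length."""
    if len(beta) != len(gamma):
        raise ValueError("regression vectors must have the same length")
    return sum(b * g for b, g in zip(beta, gamma))


def genetic_correlation(beta, gamma):
    """Inner product normalized by the Euclidean lengths of both vectors.

    Returning 0.0 when either vector is all zeros is an assumption
    of this sketch, made so the function is defined everywhere.
    """
    norm_beta = math.sqrt(genetic_covariance(beta, beta))
    norm_gamma = math.sqrt(genetic_covariance(gamma, gamma))
    if norm_beta == 0.0 or norm_gamma == 0.0:
        return 0.0
    return genetic_covariance(beta, gamma) / (norm_beta * norm_gamma)


# Hypothetical sparse regression vectors for two traits (illustrative values only).
beta = [0.5, 0.0, -0.2, 0.0]
gamma = [0.4, 0.1, 0.0, 0.0]

print("genetic covariance:", genetic_covariance(beta, gamma))
print("genetic correlation:", genetic_correlation(beta, gamma))
```

In the same notation, genetic_covariance(beta, beta) is the squared length of a single regression vector, which is the kind of quadratic functional the abstract connects to heritability. In practice the true vectors are unknown; substituting fitted coefficients into these functions gives the simple plug-in estimate that the abstract reports as less accurate than the FDEs.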

Original language: English (US)
Pages (from-to): 358-369
Number of pages: 12
Journal: Journal of the American Statistical Association
Issue number: 525
State: Published - Jan 2 2019

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty


  • Genetic correlations
  • Genome-wide association studies
  • Inner product
  • Minimax rate of convergence
  • Quadratic functional

