Maximum likelihood estimations and EM algorithms with length-biased data

Jing Qin, Jing Ning, Hao Liu, Yu Shen

Research output: Contribution to journal › Article › peer-review

50 Scopus citations


Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, and epidemiological, genetic, and cancer screening studies. Length-biased right-censored data have a unique structure that differs from traditional survival data, so the nonparametric and semiparametric estimation and inference methods for traditional survival data are not directly applicable to length-biased right-censored data. We propose new expectation-maximization algorithms for estimation based on full likelihoods involving infinite-dimensional parameters under three settings for length-biased data: estimating the nonparametric distribution function, estimating the nonparametric hazard function under an increasing failure rate constraint, and jointly estimating the baseline hazard function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and are more efficient than estimators from the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency of the estimators and establish the asymptotic normality of the semiparametric maximum likelihood estimators under the Cox model using modern empirical process theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online.
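For readers unfamiliar with length bias, the following is a minimal sketch of the simplest special case only: uncensored length-biased observations. It is not the EM algorithm proposed in the article, which handles right censoring. Under length-biased sampling a duration t is observed with density proportional to t f(t), so the classical nonparametric estimate of the underlying distribution weights each observation by 1/t. The function names `length_biased_cdf` and `mean_estimate` are hypothetical and chosen for illustration.

```python
from itertools import accumulate


def length_biased_cdf(samples):
    """Estimate the underlying (unbiased) CDF from uncensored
    length-biased durations.

    Assumption: no censoring. Each observation t_i receives mass
    proportional to 1 / t_i, which undoes the sampling density
    t * f(t) / mu.
    """
    t = sorted(float(x) for x in samples)
    if not t or t[0] <= 0:
        raise ValueError("need at least one strictly positive duration")
    inverse = [1.0 / x for x in t]
    total = sum(inverse)
    # Cumulative normalized weights give the CDF at each sorted time.
    cdf = [c / total for c in accumulate(inverse)]
    return t, cdf


def mean_estimate(samples):
    """Harmonic mean of the observations, which estimates the mean
    of the underlying distribution in the uncensored case."""
    values = [float(x) for x in samples]
    return len(values) / sum(1.0 / x for x in values)


times, cdf = length_biased_cdf([4, 1, 2])
mu_hat = mean_estimate([4, 1, 2])
```

In this toy input the shortest duration receives the largest weight, reflecting that short durations are under-represented in a length-biased sample. With right censoring these closed-form weights no longer apply, which is the situation the article addresses.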

Original language: English (US)
Pages (from-to): 1434-1449
Number of pages: 16
Journal: Journal of the American Statistical Association
Issue number: 496
State: Published - Dec 2011
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty


Keywords

  • Cox regression model
  • Increasing failure rate
  • Nonparametric likelihood
  • Profile likelihood
  • Right-censored data

