TY - JOUR
T1 - Maximum likelihood estimations and EM algorithms with length-biased data
AU - Qin, Jing
AU - Ning, Jing
AU - Liu, Hao
AU - Shen, Yu
N1 - Funding Information:
Jing Qin is Mathematical Statistician, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20817 (E-mail: jingqin@niaid.nih.gov). Jing Ning is Assistant Professor, Department of Biostatistics, M. D. Anderson Cancer Center, Houston, TX 77030 (E-mail: jning@mdanderson.org). Hao Liu is Associate Professor, Division of Biostatistics, Dan L. Duncan Cancer Center, Baylor College of Medicine, Houston, TX 77030 (E-mail: haol@bcm.edu). Yu Shen is Professor, Department of Biostatistics, M. D. Anderson Cancer Center, Houston, TX 77030 (E-mail: yshen@mdanderson.org). We thank one associate editor and two referees for their very constructive comments. We also thank Professor Masoud Asgharian and investigators of the Canadian Study of Health and Aging (CSHA) for providing us the dementia data from CSHA. The data reported in the example were collected as part of the CSHA. The core study was funded by the Seniors’ Independence Research Program, through the National Health Research and Development Program of Health Canada (Project 6606-3954-MC(S)). Additional funding was provided by Pfizer Canada Incorporated through the Medical Research Council/Pharmaceutical Manufacturers Association of Canada Health Activity Program, NHRDP Project 6603-1417-302(R), Bayer Incorporated, and the British Columbia Health Research Foundation Projects 38 (93-2) and 34 (96-1). The study was coordinated through the University of Ottawa and the Division of Aging and Seniors, Health Canada. This research was supported in part by National Institutes of Health grant R01-CA079466.
PY - 2011/12
Y1 - 2011/12
N2 - Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, and epidemiological, genetic, and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimation and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite-dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semiparametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online.
AB - Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, and epidemiological, genetic, and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimation and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite-dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semiparametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online.
KW - Cox regression model
KW - Increasing failure rate
KW - Nonparametric likelihood
KW - Profile likelihood
KW - Right-censored data
UR - http://www.scopus.com/inward/record.url?scp=84862953577&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862953577&partnerID=8YFLogxK
U2 - 10.1198/jasa.2011.tm10156
DO - 10.1198/jasa.2011.tm10156
M3 - Article
AN - SCOPUS:84862953577
SN - 0162-1459
VL - 106
SP - 1434
EP - 1449
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 496
ER -