Missing covariates in longitudinal data with informative dropouts

Bias analysis and inference

Jason Roy, Xihong Lin

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

We consider estimation in generalized linear mixed models (GLMM) for longitudinal data with informative dropouts. At the time a unit drops out, time-varying covariates are often unobserved in addition to the missing outcome. However, existing informative dropout models typically require covariates to be completely observed. This assumption is not realistic in the presence of time-varying covariates. In this article, we first study the asymptotic bias that would result from applying existing methods, where missing time-varying covariates are handled using naive approaches, which include: (1) using only baseline values; (2) carrying forward the last observation; and (3) assuming the missing data are ignorable. Our asymptotic bias analysis shows that these naive approaches yield inconsistent estimators of model parameters. We next propose a selection/transition model that allows covariates to be missing in addition to the outcome variable at the time of dropout. The EM algorithm is used for inference in the proposed model. Data from a longitudinal study of human immunodeficiency virus (HIV)-infected women are used to illustrate the methodology.
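As a rough illustration of the three naive approaches listed in the abstract, the following minimal sketch applies each one to a single subject's time-varying covariate. The toy values and the function names `baseline_only`, `carry_forward`, and `observed_only` are hypothetical and are not taken from the article. The third function is only a simplified stand-in for an analysis that assumes ignorable missingness: it keeps the visits where the covariate was observed. None of this represents the selection/transition model or the EM-based inference the authors propose.

```python
from typing import List, Optional, Tuple

# Hypothetical toy record: one subject's time-varying covariate over 5 visits.
# None marks visits at or after dropout, where the covariate was not observed.
observed: List[Optional[float]] = [410.0, 385.0, 350.0, None, None]


def baseline_only(values: List[Optional[float]]) -> List[Optional[float]]:
    """Naive approach (1): use the baseline value at every visit."""
    return [values[0]] * len(values)


def carry_forward(values: List[Optional[float]]) -> List[Optional[float]]:
    """Naive approach (2): last observation carried forward (LOCF)."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v  # remember the most recent observed value
        filled.append(last)
    return filled


def observed_only(values: List[Optional[float]]) -> List[Tuple[int, float]]:
    """Naive approach (3), simplified: keep only (visit, value) pairs that
    were observed, as an analysis assuming ignorable missingness would."""
    return [(t, v) for t, v in enumerate(values) if v is not None]


baseline_filled = baseline_only(observed)
locf_filled = carry_forward(observed)
complete_visits = observed_only(observed)

print(baseline_filled)   # covariate frozen at its baseline value
print(locf_filled)       # gaps filled with the last observed value
print(complete_visits)   # unobserved visits dropped
```

Each function produces a covariate history that a standard informative dropout model could accept, which is why these shortcuts are tempting. The abstract's bias analysis concerns what happens to the resulting estimators, not to the filled-in values themselves.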

Original language: English (US)
Journal: Biometrics
Volume: 61
Issue number: 3
DOI: 10.1111/j.1541-0420.2005.00340.x
State: Published - Sep 1 2005
Externally published: Yes

Fingerprint

Informative Dropout
Time-varying Covariates
dropouts
Missing Covariates
Longitudinal Data
Asymptotic Bias
Drop out
Covariates
Generalized Linear Mixed Model
Transition Model
Selection Model
Longitudinal Study
EM Algorithm
Missing Data
Inconsistent
Virus
Baseline
Model
Human immunodeficiency virus
Estimator

All Science Journal Classification (ASJC) codes

  • Agricultural and Biological Sciences(all)
  • Public Health, Environmental and Occupational Health
  • Agricultural and Biological Sciences (miscellaneous)
  • Applied Mathematics
  • Statistics and Probability

Cite this

@article{6871c4421b854e97b74c56982f5c8ca5,
title = "Missing covariates in longitudinal data with informative dropouts: Bias analysis and inference",
abstract = "We consider estimation in generalized linear mixed models (GLMM) for longitudinal data with informative dropouts. At the time a unit drops out, time-varying covariates are often unobserved in addition to the missing outcome. However, existing informative dropout models typically require covariates to be completely observed. This assumption is not realistic in the presence of time-varying covariates. In this article, we first study the asymptotic bias that would result from applying existing methods, where missing time-varying covariates are handled using naive approaches, which include: (1) using only baseline values; (2) carrying forward the last observation; and (3) assuming the missing data are ignorable. Our asymptotic bias analysis shows that these naive approaches yield inconsistent estimators of model parameters. We next propose a selection/transition model that allows covariates to be missing in addition to the outcome variable at the time of dropout. The EM algorithm is used for inference in the proposed model. Data from a longitudinal study of human immunodeficiency virus (HIV)-infected women are used to illustrate the methodology.",
author = "Jason Roy and Xihong Lin",
year = "2005",
month = "9",
day = "1",
doi = "10.1111/j.1541-0420.2005.00340.x",
language = "English (US)",
volume = "61",
journal = "Biometrics",
issn = "0006-341X",
publisher = "Wiley-Blackwell",
number = "3",

}

Missing covariates in longitudinal data with informative dropouts: Bias analysis and inference. / Roy, Jason; Lin, Xihong.

In: Biometrics, Vol. 61, No. 3, 01.09.2005.

TY - JOUR

T1 - Missing covariates in longitudinal data with informative dropouts

T2 - Bias analysis and inference

AU - Roy, Jason

AU - Lin, Xihong

PY - 2005/9/1

Y1 - 2005/9/1

AB - We consider estimation in generalized linear mixed models (GLMM) for longitudinal data with informative dropouts. At the time a unit drops out, time-varying covariates are often unobserved in addition to the missing outcome. However, existing informative dropout models typically require covariates to be completely observed. This assumption is not realistic in the presence of time-varying covariates. In this article, we first study the asymptotic bias that would result from applying existing methods, where missing time-varying covariates are handled using naive approaches, which include: (1) using only baseline values; (2) carrying forward the last observation; and (3) assuming the missing data are ignorable. Our asymptotic bias analysis shows that these naive approaches yield inconsistent estimators of model parameters. We next propose a selection/transition model that allows covariates to be missing in addition to the outcome variable at the time of dropout. The EM algorithm is used for inference in the proposed model. Data from a longitudinal study of human immunodeficiency virus (HIV)-infected women are used to illustrate the methodology.

UR - http://www.scopus.com/inward/record.url?scp=27744463539&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=27744463539&partnerID=8YFLogxK

U2 - 10.1111/j.1541-0420.2005.00340.x

DO - 10.1111/j.1541-0420.2005.00340.x

M3 - Article

VL - 61

JO - Biometrics

JF - Biometrics

SN - 0006-341X

IS - 3

ER -