Background: Heckman-type selection models have been used to control HIV prevalence estimates for selection bias when participation in HIV testing and HIV status are associated after controlling for observed variables. These models typically rely on the strong assumption that the error terms in the participation and the outcome equations that comprise the model are distributed as bivariate normal.
Methods: We introduce a novel approach for relaxing the bivariate normality assumption in selection models using copula functions. We apply this method to estimating HIV prevalence and new confidence intervals (CI) in the 2007 Zambia Demographic and Health Survey (DHS) by using interviewer identity as the selection variable that predicts participation (consent to test) but not the outcome (HIV status).
Results: We show in a simulation study that selection models can generate biased results when the bivariate normality assumption is violated. In the 2007 Zambia DHS, HIV prevalence estimates are similar irrespective of the structure of the association assumed between participation and outcome. For men, we estimate a population HIV prevalence of 21% (95% CI = 16%-25%) compared with 12% (11%-13%) among those who consented to be tested; for women, the corresponding figures are 19% (13%-24%) and 16% (15%-17%). Conclusions: Copula approaches to Heckman-type selection models are a useful addition to the methodological toolkit of HIV epidemiology and of epidemiology in general. We develop the use of this approach to systematically evaluate the robustness of HIV prevalence estimates based on selection models, both empirically and in a simulation study.
All Science Journal Classification (ASJC) codes