Efficient discovery of confounders in large data sets

Wenjun Zhou, Hui Xiong

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

Given a large transaction database, association analysis is concerned with efficiently finding strongly related objects. Unlike traditional association analysis, where relationships among variables are searched at a global level, we examine confounding factors at a local level. Indeed, many real-world phenomena are localized to specific regions and times, and these relationships may not be visible when the entire data set is analyzed. In particular, confounding effects that change the direction of correlation are the most significant. Along this line, we propose to efficiently find confounding effects attributable to local associations. Specifically, we derive an upper bound from a necessary condition of confounders, which helps prune the search space and identify confounders efficiently. Experimental results show that the proposed CONFOUND algorithm can effectively identify confounders and runs an order of magnitude faster than benchmark methods.
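
The abstract does not give the paper's exact definition of a confounder or its upper bound, so the following is only a minimal sketch of the two quantities named in the keywords: the φ correlation coefficient of two binary items and the first-order partial correlation after controlling for a third item. The function names `phi` and `partial_corr` and the 20-row toy data are assumptions made for illustration; this is not the CONFOUND algorithm and includes no pruning step.

```python
import math

def phi(x, y):
    """Standard phi correlation coefficient of two equal-length 0/1 sequences."""
    n = len(x)
    n11 = sum(1 for a, b in zip(x, y) if a and b)  # rows where both items occur
    nx, ny = sum(x), sum(y)                         # marginal counts
    denom = math.sqrt(nx * (n - nx) * ny * (n - ny))
    if denom == 0:
        return 0.0  # a constant column has no defined correlation; 0 by convention here
    return (n * n11 - nx * ny) / denom

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given Z (textbook formula)."""
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("partial correlation undefined when |r_xz| or |r_yz| is 1")
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical toy data: rows are (X, Y, Z). X and Y both co-occur with Z,
# but within each value of Z they are negatively associated.
rows = ([(1, 1, 1)] * 6 + [(1, 0, 1)] * 2 + [(0, 1, 1)] * 2 +
        [(0, 0, 0)] * 6 + [(1, 0, 0)] * 2 + [(0, 1, 0)] * 2)
x, y, z = zip(*rows)

r_xy = phi(x, y)                                        # global correlation
r_xy_given_z = partial_corr(r_xy, phi(x, z), phi(y, z))  # after controlling for Z

print(f"phi(X, Y)          = {r_xy:+.2f}")
print(f"partial(X, Y | Z)  = {r_xy_given_z:+.2f}")
```

On this toy data the global correlation between X and Y is positive while the partial correlation given Z is negative, which is the kind of direction-changing effect the abstract singles out as most significant.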

Original language: English (US)
Title of host publication: ICDM 2009 - The 9th IEEE International Conference on Data Mining
Pages: 647-656
Number of pages: 10
State: Published - 2009
Event: 9th IEEE International Conference on Data Mining, ICDM 2009 - Miami, FL, United States
Duration: Dec 6 2009 → Dec 9 2009

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Other

Other: 9th IEEE International Conference on Data Mining, ICDM 2009
Country/Territory: United States
City: Miami, FL
Period: 12/6/09 → 12/9/09

All Science Journal Classification (ASJC) codes

  • Engineering (all)

Keywords

  • Confounder
  • Correlation
  • Local association
  • Partial correlation
  • φ correlation coefficient
