Privacy-preserving imputation of missing data

Geetha Jagannathan, Rebecca N. Wright

Research output: Contribution to journal › Article › peer-review

28 Scopus citations

Abstract

Handling missing data is a critical step to ensuring good results in data mining. Like most data mining algorithms, existing privacy-preserving data mining algorithms assume data is complete. In order to maintain privacy in the data mining process while cleaning data, privacy-preserving methods of data cleaning are required. In this paper, we address the problem of privacy-preserving data imputation of missing data. We present a privacy-preserving protocol for filling in missing values using a lazy decision-tree imputation algorithm for data that is horizontally partitioned between two parties. The participants of the protocol learn only the imputed values. The computed decision tree is not learned by either party.
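
To make the imputation step concrete, here is a minimal sketch of lazy decision-tree imputation. It is not the authors' protocol: it uses no cryptography and runs in the clear on the pooled rows of both parties, which is exactly the disclosure the protocol is designed to avoid. It only illustrates the "lazy" idea, in which a single root-to-leaf path is built for the instance being imputed instead of a full tree. The ID3-style conditional-entropy split criterion, the function names, and the toy records are all assumptions made for this illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def conditional_entropy(rows, attr, target):
    """Weighted entropy of `target` after splitting `rows` on `attr`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    n = len(rows)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def lazy_tree_impute(rows, instance, target, attributes):
    """Impute instance[target] by following one decision-tree path.

    Assumption: ID3-style attribute selection (lowest conditional entropy).
    Only the branch that matches `instance` is ever expanded.
    """
    if not rows:
        raise ValueError("need at least one complete row")
    current = list(rows)
    remaining = list(attributes)
    while True:
        labels = [row[target] for row in current]
        majority = Counter(labels).most_common(1)[0][0]
        # Stop at a pure node or when no attributes are left to test.
        if len(set(labels)) == 1 or not remaining:
            return majority
        best = min(remaining,
                   key=lambda a: conditional_entropy(current, a, target))
        remaining.remove(best)
        matching = [row for row in current if row[best] == instance[best]]
        # No row shares the instance's value: fall back to the majority.
        if not matching:
            return majority
        current = matching


# Hypothetical horizontally partitioned data: each party holds whole rows.
alice_rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
]
bob_rows = [
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
]

# A record whose "play" value is missing.
incomplete = {"outlook": "sunny", "windy": "yes", "play": None}
imputed = lazy_tree_impute(alice_rows + bob_rows, incomplete,
                           "play", ["outlook", "windy"])
print(imputed)
```

In the setting the abstract describes, neither party would see the other's rows or the path that was followed; each would learn only the value of `imputed`.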

Original language: English (US)
Pages (from-to): 40-56
Number of pages: 17
Journal: Data and Knowledge Engineering
Volume: 65
Issue number: 1
State: Published - Apr 2008

All Science Journal Classification (ASJC) codes

  • Information Systems and Management

Keywords

  • Data cleaning
  • Data imputation
  • Privacy-preserving protocols
