Privacy-preserving imputation of missing data

Geetha Jagannathan, Rebecca N. Wright

Research output: Contribution to journal (Article)

16 Scopus citations


Handling missing data is a critical step in ensuring good results in data mining. Like most data mining algorithms, existing privacy-preserving data mining algorithms assume that the data is complete. To maintain privacy in the data mining process while cleaning data, privacy-preserving methods of data cleaning are required. In this paper, we address the problem of privacy-preserving imputation of missing data. We present a privacy-preserving protocol for filling in missing values using a lazy decision-tree imputation algorithm for data that is horizontally partitioned between two parties. The participants in the protocol learn only the imputed values; neither party learns the computed decision tree.

Original language: English (US)
Pages (from-to): 40-56
Number of pages: 17
Journal: Data and Knowledge Engineering
Issue number: 1
Publication status: Published - Apr 1 2008
Externally published: Yes


All Science Journal Classification (ASJC) codes

  • Information Systems and Management


Keywords

  • Data cleaning
  • Data imputation
  • Privacy-preserving protocols
