Evaluating the privacy implications of frequent itemset disclosure

Edoardo Serra, Jaideep Vaidya, Haritha Akella, Ashish Sharma

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Frequent itemset mining is a fundamental data analytics task. In many cases, due to privacy concerns, only the frequent itemsets are released instead of the underlying data. However, it is not clear how to evaluate the privacy implications of disclosing the frequent itemsets. To this end, we define the k-distant-IFM-solutions problem, which aims to find k transaction datasets whose pairwise distance is maximized. The degree of difference between the reconstructed datasets provides a way to evaluate the privacy risk. Since the problem is NP-hard, we propose a 2-approximate solution as well as faster heuristics, and evaluate them on real data.
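The idea in the abstract can be illustrated with a toy brute-force sketch. It is not the paper's 2-approximation or its heuristics, and several choices are assumptions made only for illustration: the released information is taken to be exact supports for a few itemsets, the distance between two datasets is the size of their multiset symmetric difference, and "pairwise distance" for k datasets is read as the sum over all pairs. The names `support`, `consistent`, `distance`, and `k_distant_solutions` are hypothetical.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement

ITEMS = ("a", "b")
# Every possible transaction over ITEMS: {}, {a}, {b}, {a, b}.
TRANSACTIONS = [frozenset(c)
                for r in range(len(ITEMS) + 1)
                for c in combinations(ITEMS, r)]


def support(dataset, itemset):
    """Number of transactions in the dataset that contain the itemset."""
    return sum(1 for t in dataset if itemset <= t)


def consistent(dataset, released):
    """True if the dataset reproduces every released support exactly (assumption)."""
    return all(support(dataset, s) == n for s, n in released.items())


def distance(d1, d2):
    """Assumed distance: size of the multiset symmetric difference."""
    c1, c2 = Counter(d1), Counter(d2)
    return sum(abs(c1[t] - c2[t]) for t in set(c1) | set(c2))


def k_distant_solutions(released, size, k):
    """Brute force: enumerate all datasets of the given size, keep the
    consistent ones, and return the k with the largest summed pairwise distance."""
    candidates = [d for d in combinations_with_replacement(TRANSACTIONS, size)
                  if consistent(d, released)]

    def spread(group):
        return sum(distance(x, y) for x, y in combinations(group, 2))

    best = max(combinations(candidates, k), key=spread)
    return best, spread(best)


# Released supports: item a appears in 2 transactions, item b in 2.
released = {frozenset("a"): 2, frozenset("b"): 2}
solutions, total = k_distant_solutions(released, size=4, k=2)
for d in solutions:
    print([sorted(t) for t in d])
print("distance:", total)
```

With these released supports and four transactions, three datasets are consistent, and the two furthest apart share no transaction at all: one places a and b in separate transactions, the other always places them together. A large spread like this suggests the released supports reveal little about which items co-occur in the original data. The enumeration is exponential in the dataset size, which is why a sketch like this only works on toy inputs.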

Original language: English (US)
Title of host publication: ICT Systems Security and Privacy Protection - 32nd IFIP TC 11 International Conference, SEC 2017, Proceedings
Editors: Sabrina De Capitani di Vimercati, Fabio Martinelli
Publisher: Springer New York LLC
Pages: 506-519
Number of pages: 14
ISBN (Print): 9783319584683
DOIs: https://doi.org/10.1007/978-3-319-58469-0_34
State: Published - Jan 1 2017
Event: 32nd International Conference on ICT Systems Security and Privacy Protection, IFIP SEC 2017 - Rome, Italy
Duration: May 29 2017 - May 31 2017

Publication series

Name: IFIP Advances in Information and Communication Technology
Volume: 502
ISSN (Print): 1868-4238

Other

Other: 32nd International Conference on ICT Systems Security and Privacy Protection, IFIP SEC 2017
Country: Italy
City: Rome
Period: 5/29/17 - 5/31/17

Fingerprint

Privacy
Disclosure
Privacy concerns
Heuristics
NP-hard

All Science Journal Classification (ASJC) codes

  • Information Systems and Management

Keywords

  • Column generation
  • Inverse frequent itemset mining

Cite this

Serra, E., Vaidya, J., Akella, H., & Sharma, A. (2017). Evaluating the privacy implications of frequent itemset disclosure. In S. De Capitani di Vimercati, & F. Martinelli (Eds.), ICT Systems Security and Privacy Protection - 32nd IFIP TC 11 International Conference, SEC 2017, Proceedings (pp. 506-519). (IFIP Advances in Information and Communication Technology; Vol. 502). Springer New York LLC. https://doi.org/10.1007/978-3-319-58469-0_34
@inproceedings{67f0f89c5bf748448fbcd3d53237c4ba,
title = "Evaluating the privacy implications of frequent itemset disclosure",
abstract = "Frequent itemset mining is a fundamental data analytics task. In many cases, due to privacy concerns, only the frequent itemsets are released instead of the underlying data. However, it is not clear how to evaluate the privacy implications of disclosing the frequent itemsets. To this end, we define the k-distant-IFM-solutions problem, which aims to find k transaction datasets whose pairwise distance is maximized. The degree of difference between the reconstructed datasets provides a way to evaluate the privacy risk. Since the problem is NP-hard, we propose a 2-approximate solution as well as faster heuristics, and evaluate them on real data.",
keywords = "Column generation, Inverse frequent itemset mining",
author = "Edoardo Serra and Jaideep Vaidya and Haritha Akella and Ashish Sharma",
year = "2017",
month = "1",
day = "1",
doi = "10.1007/978-3-319-58469-0_34",
language = "English (US)",
isbn = "9783319584683",
series = "IFIP Advances in Information and Communication Technology",
publisher = "Springer New York LLC",
pages = "506--519",
editor = "{De Capitani di Vimercati}, Sabrina and Fabio Martinelli",
booktitle = "ICT Systems Security and Privacy Protection - 32nd IFIP TC 11 International Conference, SEC 2017, Proceedings",

}

TY - GEN
T1 - Evaluating the privacy implications of frequent itemset disclosure
AU - Serra, Edoardo
AU - Vaidya, Jaideep
AU - Akella, Haritha
AU - Sharma, Ashish
PY - 2017/1/1
Y1 - 2017/1/1
N2 - Frequent itemset mining is a fundamental data analytics task. In many cases, due to privacy concerns, only the frequent itemsets are released instead of the underlying data. However, it is not clear how to evaluate the privacy implications of disclosing the frequent itemsets. To this end, we define the k-distant-IFM-solutions problem, which aims to find k transaction datasets whose pairwise distance is maximized. The degree of difference between the reconstructed datasets provides a way to evaluate the privacy risk. Since the problem is NP-hard, we propose a 2-approximate solution as well as faster heuristics, and evaluate them on real data.
KW - Column generation
KW - Inverse frequent itemset mining
UR - http://www.scopus.com/inward/record.url?scp=85019773847&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85019773847&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-58469-0_34
DO - 10.1007/978-3-319-58469-0_34
M3 - Conference contribution
AN - SCOPUS:85019773847
SN - 9783319584683
T3 - IFIP Advances in Information and Communication Technology
SP - 506
EP - 519
BT - ICT Systems Security and Privacy Protection - 32nd IFIP TC 11 International Conference, SEC 2017, Proceedings
A2 - De Capitani di Vimercati, Sabrina
A2 - Martinelli, Fabio
PB - Springer New York LLC
ER -
