Integrity verification of K-means clustering outsourced to infrastructure as a service (IaaS) providers

Ruilin Liu, Hui Wang, Philippos Mordohai, Hui Xiong

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

11 Scopus citations

Abstract

The Cloud-based infrastructure-as-a-service (IaaS) paradigm (e.g., Amazon EC2) enables a client who lacks computational resources to outsource her dataset and data mining tasks to the Cloud. However, because the Cloud may not be fully trusted, outsourcing raises serious concerns about the integrity of the mining results that the Cloud returns. To this end, in this paper we study how to verify the integrity of the κ-means clustering task outsourced to an IaaS provider. We consider an untrusted, sloppy IaaS service provider that intends to return wrong clustering results by terminating the iterations early to save computational cost. We develop both probabilistic and deterministic verification methods to catch incorrect clustering results returned by the service provider. The deterministic method provides a 100% integrity guarantee at a cost much lower than that of executing κ-means clustering locally, while the probabilistic method provides a probabilistic integrity guarantee at an even lower computational cost than the deterministic approach. Our experimental results show that our verification methods can effectively and efficiently catch the sloppy service provider.

Original language: English (US)
Title of host publication: Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013
Editors: Joydeep Ghosh, Zoran Obradovic, Jennifer Dy, Zhi-Hua Zhou, Chandrika Kamath, Srinivasan Parthasarathy
Publisher: Siam Society
Pages: 632-640
Number of pages: 9
ISBN (Electronic): 9781611972627
State: Published - 2013
Event: SIAM International Conference on Data Mining, SDM 2013 - Austin, United States
Duration: May 2 2013 to May 4 2013

Publication series

Name: Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013

Conference

Conference: SIAM International Conference on Data Mining, SDM 2013
Country/Territory: United States
City: Austin
Period: 5/2/13 to 5/4/13

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Software
  • Theoretical Computer Science
  • Information Systems
  • Signal Processing

Keywords

  • Cloud computing
  • Data-mining-as-a-service
  • Infrastructure as a Service (IaaS)
  • Integrity
  • κ-means clustering
