Uncertainty propagation in data processing systems

Ioannis Manousakis, Sandro Rigo, Íñigo Goiri, Ricardo Bianchini, Thu D. Nguyen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations

Abstract

We are seeing an explosion of uncertain data—i.e., data that is more properly represented by probability distributions or estimated values with error bounds rather than exact values—from sensors in IoT, sampling-based approximate computations and machine learning algorithms. In many cases, performing computations on uncertain data as if it were exact leads to incorrect results. Unfortunately, developing applications for processing uncertain data is a major challenge from both the mathematical and performance perspectives. This paper proposes and evaluates an approach for tackling this challenge in DAG-based data processing systems. We present a framework for uncertainty propagation (UP) that allows developers to modify precise implementations of DAG nodes to process uncertain inputs with modest effort. We implement this framework in a system called UP-MapReduce, and use it to modify ten applications, including AI/ML, image processing and trend analysis applications to process uncertain data. Our evaluation shows that UP-MapReduce propagates uncertainties with high accuracy and, in many cases, low performance overheads. For example, a social network trend analysis application that combines data sampling with UP can reduce execution time by 2.3x when the user can tolerate a maximum relative error of 5% in the final answer. These results demonstrate that our UP framework presents a compelling approach for handling uncertain data in DAG processing.
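To make the abstract's central point concrete, here is a minimal sketch of what propagating uncertainty through a single computation can look like. It is not UP-MapReduce's code or API. It assumes a generic first-order (delta-method) approximation with independent inputs, each described by a mean and a variance. The names `propagate` and `ratio` and the sample numbers are hypothetical, chosen only for illustration.

```python
def propagate(f, means, variances, rel_step=1e-6):
    """First-order uncertainty propagation through f (illustrative sketch).

    Assumes the inputs are independent and that f is smooth near `means`.
    Returns (approximate output mean, approximate output variance).
    """
    y = f(*means)
    var_y = 0.0
    for i, (m, v) in enumerate(zip(means, variances)):
        # Central finite difference for the partial derivative df/dx_i.
        step = rel_step * max(1.0, abs(m))
        hi = list(means)
        lo = list(means)
        hi[i] = m + step
        lo[i] = m - step
        grad = (f(*hi) - f(*lo)) / (2.0 * step)
        # Each independent input contributes (df/dx_i)^2 * Var(x_i).
        var_y += grad * grad * v
    return y, var_y


def ratio(a, b):
    # A "precise" node body that knows nothing about uncertainty.
    return a / b


# Hypothetical uncertain inputs: a ~ (mean 10, var 0.25), b ~ (mean 4, var 0.04).
mean, var = propagate(ratio, [10.0, 4.0], [0.25, 0.04])
print(mean, var)
```

Treating the inputs as exact would report only the mean; the propagated variance is what lets a downstream consumer decide whether the result falls within a given error tolerance.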

Original language: English (US)
Title of host publication: SoCC 2018 - Proceedings of the 2018 ACM Symposium on Cloud Computing
Publisher: Association for Computing Machinery, Inc
Pages: 95-106
Number of pages: 12
ISBN (Electronic): 9781450360111
DOIs
State: Published - Oct 11 2018
Event: 2018 ACM Symposium on Cloud Computing, SoCC 2018 - Carlsbad, United States
Duration: Oct 11 2018 → Oct 13 2018

Publication series

Name: SoCC 2018 - Proceedings of the 2018 ACM Symposium on Cloud Computing

Other

Other: 2018 ACM Symposium on Cloud Computing, SoCC 2018
Country/Territory: United States
City: Carlsbad
Period: 10/11/18 → 10/13/18

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Information Systems
  • Software

Keywords

  • DAG Data Processing
  • Uncertainty Propagation
