Staging Based Task Execution for Data-driven, In-Situ Scientific Workflows

Zhe Wang, Pradeep Subedi, Matthieu Dorier, Philip E. Davis, Manish Parashar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

As scientific workflows increasingly use extreme-scale resources, the imbalance between higher computational capabilities, generated data volumes, and available I/O bandwidth is limiting the ability to translate these scales into insights. In-situ workflows (and the in-situ approach) are leveraging storage levels close to the computation in novel ways in order to reduce the required I/O. However, to be effective, it is important that the mapping and execution of such in-situ workflows adopt a data-driven approach, enabling in-situ tasks to be executed flexibly based upon data content. This paper first explores the design space for data-driven in-situ workflows. Specifically, it presents a model that captures different factors that influence the mapping, execution, and performance of data-driven in-situ workflows and experimentally studies the impact of different mapping decisions and execution patterns. The paper then presents the design, implementation, and experimental evaluation of a data-driven in-situ workflow execution framework that leverages in-memory distributed data management and user-defined task-triggers to enable efficient and scalable in-situ workflow execution.

Original language: English (US)
Title of host publication: Proceedings - 2020 IEEE International Conference on Cluster Computing, CLUSTER 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 209-220
Number of pages: 12
ISBN (Electronic): 9781728166773
State: Published - Sep 2020
Event: 22nd IEEE International Conference on Cluster Computing, CLUSTER 2020 - Kobe, Japan
Duration: Sep 14 2020 to Sep 17 2020

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
Volume: 2020-September
ISSN (Print): 1552-5244

Conference

Conference: 22nd IEEE International Conference on Cluster Computing, CLUSTER 2020
Country/Territory: Japan
City: Kobe
Period: 9/14/20 to 9/17/20

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Signal Processing

Keywords

  • data staging
  • dynamic task triggers
  • in-situ workflows
  • in-situ/in-transit execution
