RISE: Reducing I/O Contention in Staging-based Extreme-Scale In-situ Workflows

Pradeep Subedi, Philip E. Davis, Manish Parashar

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

While in-situ workflow formulations have addressed some of the data-related challenges associated with extreme-scale scientific workflows, these workflows involve complex interactions and different modes of data exchange. In the context of increasing system complexity, such workflows present significant resource management challenges, requiring complex cost-performance tradeoffs. This paper presents RISE, an intelligent staging-based data management middleware, which builds on the DataSpaces framework and performs intelligent scheduling of data management operations to reduce I/O contention. In RISE, data are always written immediately to local buffers to reduce the impact of data transfers on application performance. RISE identifies applications' data access patterns and moves data towards data consumers only when the network is expected to be idle, reducing the impact of asynchronous background data movement on critical data read/write requests. We experimentally demonstrate that RISE can take advantage of staging nodes to offload data during writes without degrading application data movement performance.
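
The scheduling idea described in the abstract (write to a local buffer first, then move data only while the network is expected to be idle) can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not RISE's implementation and not the DataSpaces API: the class `DeferredStager`, its methods, and the fixed periodic busy/idle pattern are all hypothetical stand-ins for whatever access-pattern prediction the paper actually uses.

```python
from collections import deque


class DeferredStager:
    """Toy model: buffer writes locally, forward them only in predicted-idle steps."""

    def __init__(self, period, busy_steps):
        # Assumption: the application uses the network during `busy_steps`
        # of every `period`-step cycle, and this pattern is already known.
        self.period = period
        self.busy_steps = set(busy_steps)
        self.local_buffer = deque()  # data written but not yet moved
        self.staged = []             # (step, item) pairs that reached staging

    def write(self, item):
        # A write completes against the local buffer and never waits on the network.
        self.local_buffer.append(item)

    def network_expected_idle(self, step):
        # Hypothetical predictor: idle whenever the step is outside the busy window.
        return (step % self.period) not in self.busy_steps

    def tick(self, step):
        # Move at most one buffered item per step, and only when idle is predicted.
        if self.local_buffer and self.network_expected_idle(step):
            self.staged.append((step, self.local_buffer.popleft()))


def simulate():
    # Two cycles of four steps; steps 0 and 1 of each cycle are busy.
    stager = DeferredStager(period=4, busy_steps=[0, 1])
    for step in range(8):
        if step % 4 == 0:
            stager.write(f"block{step // 4}")  # one write at the start of each cycle
        stager.tick(step)
    return stager.staged


if __name__ == "__main__":
    print(simulate())
```

In this sketch each block is written at the start of a cycle but leaves the local buffer only at the first step outside the busy window, so background movement never overlaps the steps marked as busy.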

Original language: English (US)
Title of host publication: Proceedings - 2021 IEEE International Conference on Cluster Computing, Cluster 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 146-156
Number of pages: 11
ISBN (Electronic): 9781728196664
DOIs
State: Published - 2021
Externally published: Yes
Event: 2021 IEEE International Conference on Cluster Computing, Cluster 2021 - Virtual, Portland, United States
Duration: Sep 7 2021 to Sep 10 2021

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
Volume: 2021-September
ISSN (Print): 1552-5244

Conference

Conference: 2021 IEEE International Conference on Cluster Computing, Cluster 2021
Country/Territory: United States
City: Virtual, Portland
Period: 9/7/21 to 9/10/21

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Signal Processing

Keywords

  • Data Management
  • Extreme Scale Data Staging
  • High Performance Computing
  • Machine Learning
