WA-Dataspaces: Exploring the Data Staging Abstractions for Wide-Area Distributed Scientific Workflows

Mehmet Fatih Aktas, Javier Diaz-Montes, Ivan Rodero, Manish Parashar

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Data staging has been shown to be very effective for supporting data-intensive in-situ workflows and the coupling of applications. Experimental sciences are increasingly collaborative efforts among geographically distributed teams, and involve both experimental instruments and HPC facilities. This new way of doing science poses new challenges due to data sizes, the complexity of the computation, and the use of wide-area networks between couplings. In this paper, we explore how the staging abstraction can be extended to support such workflows. Specifically, we develop a NUMA-like abstraction that orchestrates multiple distributed local-area staging abstractions and provides asynchronous data put/get semantics to enable data sharing across them. To mask data movement overhead and provide in-time data access, we propose the use of predictive prefetching approaches that leverage the iterative nature of the coupling. We evaluate our prototype implementation using a fusion workflow and show that our design can effectively and transparently support wide-area coupled workflows. Additionally, the results show that prefetching leads to significant gains in access times for data that needs to be moved over the wide-area network.
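
The put/get and prefetching ideas described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' implementation or the DataSpaces API: the names `Site`, `WideAreaSpace`, `blocking_fetches`, and `prefetches` are hypothetical, the predictor is a simple "repeat the last version stride" rule, and the prefetch runs synchronously here, whereas a real system would issue it asynchronously over the wide-area network.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One local-area staging area (hypothetical stand-in, not the DataSpaces API)."""
    name: str
    store: dict = field(default_factory=dict)


class WideAreaSpace:
    """Toy global view over several staging areas, as seen from one local site."""

    def __init__(self, local, remotes):
        self.local = local
        self.remotes = remotes
        self.last_version = {}     # variable name -> last version requested
        self.blocking_fetches = 0  # wide-area transfers on the critical path
        self.prefetches = 0        # wide-area transfers issued ahead of a request

    def put(self, var, version, data):
        # Writes always land in the local staging area.
        self.local.store[(var, version)] = data

    def _pull(self, key):
        # Copy the object from whichever remote site holds it, if any.
        for site in self.remotes:
            if key in site.store:
                self.local.store[key] = site.store[key]
                return True
        return False

    def get(self, var, version):
        key = (var, version)
        if key not in self.local.store:
            if not self._pull(key):
                raise KeyError(key)
            self.blocking_fetches += 1
        self._prefetch_next(var, version)
        return self.local.store[key]

    def _prefetch_next(self, var, version):
        # Assumed predictor: the next request repeats the last version stride,
        # which fits an iterative coupling that reads one version per step.
        stride = version - self.last_version.get(var, version - 1)
        self.last_version[var] = version
        nxt = (var, version + stride)
        if nxt not in self.local.store and self._pull(nxt):
            self.prefetches += 1


# Usage: a producer site stages five versions; a consumer site reads them in order.
producer = Site("site_a")
consumer = Site("site_b")
writer = WideAreaSpace(producer, [])
for step in range(5):
    writer.put("temperature", step, [step] * 4)

space = WideAreaSpace(consumer, [producer])
values = [space.get("temperature", step) for step in range(5)]
print(space.blocking_fetches, space.prefetches)
```

In this toy run only the first read waits on a remote transfer; the later reads find their data already staged locally because the predictor pulled it one step ahead. The example stages all versions before any read, which is a simplification: in a live coupling a prefetch can only succeed once the producer has written the predicted version.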

Original language: English (US)
Title of host publication: Proceedings - 46th International Conference on Parallel Processing, ICPP 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 251-260
Number of pages: 10
ISBN (Electronic): 9781538610428
State: Published - Sep 1 2017
Event: 46th International Conference on Parallel Processing, ICPP 2017 - Bristol, United Kingdom
Duration: Aug 14 2017 to Aug 17 2017

Publication series

Name: Proceedings of the International Conference on Parallel Processing
ISSN (Print): 0190-3918

Other

Other: 46th International Conference on Parallel Processing, ICPP 2017
Country: United Kingdom
City: Bristol
Period: 8/14/17 to 8/17/17

All Science Journal Classification (ASJC) codes

  • Software
  • Mathematics (all)
  • Hardware and Architecture

Keywords

  • Dataspaces
  • Staging
  • Wide area network
