TY - GEN
T1 - RISE
T2 - 2021 IEEE International Conference on Cluster Computing, Cluster 2021
AU - Subedi, Pradeep
AU - Davis, Philip E.
AU - Parashar, Manish
N1 - Funding Information:
We would like to thank all of the reviewers for their valuable feedback and comments. The research presented in this paper is based upon work by the RAPIDS2 Institute supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program. This research was also supported by the Exascale Computing Project (17-SC-20-SC) and used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.
Publisher Copyright:
©2021 IEEE.
PY - 2021
Y1 - 2021
AB - While in-situ workflow formulations have addressed some of the data-related challenges associated with extreme-scale scientific workflows, these workflows involve complex interactions and different modes of data exchange. In the context of increasing system complexity, such workflows present significant resource management challenges, requiring complex cost-performance tradeoffs. This paper presents RISE, an intelligent staging-based data management middleware, which builds on the DataSpaces framework and performs intelligent scheduling of data management operations to reduce I/O contention. In RISE, data are always written immediately to local buffers to reduce the impact of data transfer on application performance. RISE identifies applications' data access patterns and moves data towards data consumers only when the network is expected to be idle, reducing the impact of asynchronous background data movement on critical data read/write requests. We experimentally demonstrate that RISE can take advantage of staging nodes to offload data during writes without degrading application data movement performance.
KW - Data Management
KW - Extreme Scale Data Staging
KW - High Performance Computing
KW - Machine Learning
UR - http://www.scopus.com/inward/record.url?scp=85126047109&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126047109&partnerID=8YFLogxK
U2 - 10.1109/Cluster48925.2021.00021
DO - 10.1109/Cluster48925.2021.00021
M3 - Conference contribution
AN - SCOPUS:85126047109
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
SP - 146
EP - 156
BT - Proceedings - 2021 IEEE International Conference on Cluster Computing, Cluster 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 7 September 2021 through 10 September 2021
ER -