NVStream: Accelerating HPC workflows with NVRAM-based transport for streaming objects

Pradeep Fernando, Ada Gavrilovska, Sudarsun Kannan, Greg Eisenhauer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Nonvolatile memory technologies (NVRAM) with larger capacity relative to DRAM and faster persistence relative to block-based storage technologies are expected to play a crucial role in accelerating I/O performance for HPC scientific workflows. Typically, a scientific workflow includes a simulation process (producer of data) and an analytics application process (consumer of data) that stream, share, and exchange data supported by an underlying OS-level file system. However, using an OS-level file system for data sharing adds substantial software overheads due to frequent system calls, journaling cost (for crash consistency), and file-system metadata update cost. To overcome these challenges, we design NVStream, a lightweight user-level data management system that exploits NVRAM’s byte addressability and fast persistence to support streaming I/O in scientific workflows. First, NVStream reduces I/O-related software overheads with a memory-based persistent object store and a log-structured heap manager that exploit NVRAM’s large capacity. Second, NVStream incorporates hardware-assisted non-temporal stores for crash-consistent updates at near hardware data copy (memory copy) speeds. Finally, NVStream reduces data written to NVRAM with delta compression, which further reduces I/O cost for workflows with higher write locality. The evaluation of NVStream using I/O benchmarks and scientific applications demonstrates a 10× reduction in I/O compared to NVRAM-optimized file systems while also guaranteeing crash-consistent data movement.
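
The delta compression idea mentioned in the abstract can be illustrated with a minimal sketch. This is not NVStream's implementation: the chunk size and the function names delta_encode and delta_apply are hypothetical, chosen only to show how a writer might persist just the chunks that changed between two versions of an object when write locality is high.

```python
CHUNK = 64  # hypothetical chunk granularity in bytes (an assumption, not from the paper)


def delta_encode(prev: bytes, curr: bytes, chunk: int = CHUNK):
    """Return (offset, data) pairs for the chunks of curr that differ from prev."""
    delta = []
    for off in range(0, len(curr), chunk):
        new = curr[off:off + chunk]
        # Only chunks whose contents changed are recorded for writing.
        if prev[off:off + chunk] != new:
            delta.append((off, new))
    return delta


def delta_apply(prev: bytes, delta, length: int) -> bytes:
    """Rebuild the new version from the previous version plus the recorded delta."""
    # Start from prev, truncated or zero-padded to the new object's length.
    buf = bytearray(prev[:length].ljust(length, b"\0"))
    for off, data in delta:
        buf[off:off + len(data)] = data
    return bytes(buf)


if __name__ == "__main__":
    v1 = bytes(256)
    v2 = bytearray(v1)
    v2[70] = 1  # a single byte changes between versions
    d = delta_encode(v1, bytes(v2))
    print(len(d), "chunk(s) written instead of", 256 // CHUNK)
    assert delta_apply(v1, d, len(v2)) == bytes(v2)
```

In this sketch a one-byte update to a 256-byte object results in one 64-byte chunk being recorded rather than the whole object, which is the kind of saving the abstract attributes to workflows with higher write locality.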

Original language: English (US)
Title of host publication: HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing
Publisher: Association for Computing Machinery, Inc
Pages: 231-242
Number of pages: 12
ISBN (Electronic): 9781450357852
DOI: https://doi.org/10.1145/3208040.3208061
State: Published - Jun 11 2018
Externally published: Yes
Event: 27th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2018 - Tempe, United States
Duration: Jun 11 2018 to Jun 15 2018

Publication series

Name: HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing

Other

Other: 27th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2018
Country: United States
City: Tempe
Period: 6/11/18 to 6/15/18

Fingerprint

Data storage equipment
Costs
Dynamic random access storage
Electronic data interchange
Metadata
Information management
Computer hardware
Managers
Hardware

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Software

Cite this

Fernando, P., Gavrilovska, A., Kannan, S., & Eisenhauer, G. (2018). NVStream: Accelerating HPC workflows with NVRAM-based transport for streaming objects. In HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing (pp. 231-242). (HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing). Association for Computing Machinery, Inc. https://doi.org/10.1145/3208040.3208061
Fernando, Pradeep ; Gavrilovska, Ada ; Kannan, Sudarsun ; Eisenhauer, Greg. / NVStream : Accelerating HPC workflows with NVRAM-based transport for streaming objects. HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing. Association for Computing Machinery, Inc, 2018. pp. 231-242 (HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing).
@inproceedings{986764ee842643fdafb22cfc2f9431b0,
title = "NVStream: Accelerating HPC workflows with NVRAM-based transport for streaming objects",
abstract = "Nonvolatile memory technologies (NVRAM) with larger capacity relative to DRAM and faster persistence relative to block-based storage technologies are expected to play a crucial role in accelerating I/O performance for HPC scientific workflows. Typically, a scientific workflow includes a simulation process (producer of data) and an analytics application process (consumer of data) that stream, share, and exchange data supported by an underlying OS-level file system. However, using an OS-level file system for data sharing adds substantial software overheads due to frequent system calls, journaling cost (for crash consistency), and file-system metadata update cost. To overcome these challenges, we design NVStream, a lightweight user-level data management system that exploits NVRAM’s byte addressability and fast persistence to support streaming I/O in scientific workflows. First, NVStream reduces I/O-related software overheads with a memory-based persistent object store and a log-structured heap manager that exploit NVRAM’s large capacity. Second, NVStream incorporates hardware-assisted non-temporal stores for crash-consistent updates at near hardware data copy (memory copy) speeds. Finally, NVStream reduces data written to NVRAM with delta compression, which further reduces I/O cost for workflows with higher write locality. The evaluation of NVStream using I/O benchmarks and scientific applications demonstrates a 10× reduction in I/O compared to NVRAM-optimized file systems while also guaranteeing crash-consistent data movement.",
author = "Pradeep Fernando and Ada Gavrilovska and Sudarsun Kannan and Greg Eisenhauer",
year = "2018",
month = "6",
day = "11",
doi = "10.1145/3208040.3208061",
language = "English (US)",
series = "HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing",
publisher = "Association for Computing Machinery, Inc",
pages = "231--242",
booktitle = "HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing",

}

Fernando, P, Gavrilovska, A, Kannan, S & Eisenhauer, G 2018, NVStream: Accelerating HPC workflows with NVRAM-based transport for streaming objects. in HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing. HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing, Association for Computing Machinery, Inc, pp. 231-242, 27th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2018, Tempe, United States, 6/11/18. https://doi.org/10.1145/3208040.3208061

NVStream : Accelerating HPC workflows with NVRAM-based transport for streaming objects. / Fernando, Pradeep; Gavrilovska, Ada; Kannan, Sudarsun; Eisenhauer, Greg.

HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing. Association for Computing Machinery, Inc, 2018. p. 231-242 (HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing).

TY - GEN

T1 - NVStream

T2 - Accelerating HPC workflows with NVRAM-based transport for streaming objects

AU - Fernando, Pradeep

AU - Gavrilovska, Ada

AU - Kannan, Sudarsun

AU - Eisenhauer, Greg

PY - 2018/6/11

Y1 - 2018/6/11

AB - Nonvolatile memory technologies (NVRAM) with larger capacity relative to DRAM and faster persistence relative to block-based storage technologies are expected to play a crucial role in accelerating I/O performance for HPC scientific workflows. Typically, a scientific workflow includes a simulation process (producer of data) and an analytics application process (consumer of data) that stream, share, and exchange data supported by an underlying OS-level file system. However, using an OS-level file system for data sharing adds substantial software overheads due to frequent system calls, journaling cost (for crash consistency), and file-system metadata update cost. To overcome these challenges, we design NVStream, a lightweight user-level data management system that exploits NVRAM’s byte addressability and fast persistence to support streaming I/O in scientific workflows. First, NVStream reduces I/O-related software overheads with a memory-based persistent object store and a log-structured heap manager that exploit NVRAM’s large capacity. Second, NVStream incorporates hardware-assisted non-temporal stores for crash-consistent updates at near hardware data copy (memory copy) speeds. Finally, NVStream reduces data written to NVRAM with delta compression, which further reduces I/O cost for workflows with higher write locality. The evaluation of NVStream using I/O benchmarks and scientific applications demonstrates a 10× reduction in I/O compared to NVRAM-optimized file systems while also guaranteeing crash-consistent data movement.

UR - http://www.scopus.com/inward/record.url?scp=85050080679&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85050080679&partnerID=8YFLogxK

U2 - 10.1145/3208040.3208061

DO - 10.1145/3208040.3208061

M3 - Conference contribution

AN - SCOPUS:85050080679

T3 - HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing

SP - 231

EP - 242

BT - HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing

PB - Association for Computing Machinery, Inc

ER -

Fernando P, Gavrilovska A, Kannan S, Eisenhauer G. NVStream: Accelerating HPC workflows with NVRAM-based transport for streaming objects. In HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing. Association for Computing Machinery, Inc. 2018. p. 231-242. (HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing). https://doi.org/10.1145/3208040.3208061