ZHT: A light-weight reliable persistent dynamic scalable zero-hop distributed hash table

Tonglin Li, Xiaobing Zhou, Kevin Brandstatter, Dongfang Zhao, Ke Wang, Anupam Rajendran, Zhao Zhang, Ioan Raicu

Research output: Contribution to conferencePaperpeer-review

94 Scopus citations

Abstract

This paper presents ZHT, a zero-hop distributed hash table, which has been tuned for the requirements of high-end computing systems. ZHT aims to be a building block for future distributed systems, such as parallel and distributed file systems, distributed job management systems, and parallel programming systems. The goals of ZHT are delivering high availability, good fault tolerance, high throughput, and low latencies, at extreme scales of millions of nodes. ZHT has some important properties, such as being light-weight, dynamically allowing nodes join and leave, fault tolerant through replication, persistent, scalable, and supporting unconventional operations such as append (providing lock-free concurrent key/value modifications) in addition to insert/lookup/remove. We have evaluated ZHT's performance under a variety of systems, ranging from a Linux cluster with 512-cores, to an IBM Blue Gene/P supercomputer with 160K-cores. Using micro-benchmarks, we scaled ZHT up to 32K-cores with latencies of only 1.1ms and 18M operations/sec throughput. This work provides three real systems that have integrated with ZHT, and evaluate them at modest scales. 1) ZHT was used in the Fusion FS distributed file system to deliver distributed meta-data management at over 60K operations (e.g. file create) per second at 2K-core scales. 2) ZHT was used in the IStore, an information dispersal algorithm enabled distributed object storage system, to manage chunk locations, delivering more than 500 chunks/sec at 32-nodes scales. 3) ZHT was also used as a building block to MATRIX, a distributed job scheduling system, delivering 5000 jobs/sec throughputs at 2K-core scales. We compared ZHT against other distributed hash tables and key/value stores and found it offers superior performance for the features and portability it supports.

Original languageEnglish (US)
Pages775-787
Number of pages13
DOIs
StatePublished - 2013
Externally publishedYes
Event27th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2013 - Boston, MA, United States
Duration: May 20 2013May 24 2013

Other

Other27th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2013
Country/TerritoryUnited States
CityBoston, MA
Period5/20/135/24/13

All Science Journal Classification (ASJC) codes

  • Software

Keywords

  • Distributed hash tables
  • high-end computing
  • key/value stores

Fingerprint

Dive into the research topics of 'ZHT: A light-weight reliable persistent dynamic scalable zero-hop distributed hash table'. Together they form a unique fingerprint.

Cite this