TY - GEN
T1 - Durable transactional memory can scale with TimeStone
AU - Madhava Krishnan, R.
AU - Kim, Jaeho
AU - Mathew, Ajit
AU - Fu, Xinwei
AU - Demeri, Anthony
AU - Min, Changwoo
AU - Kannan, Sudarsun
N1 - Funding Information:
the database log optimization techniques primarily focus on reducing the durability cost; in TimeStone, we propose TOC logging, which is geared not only towards reducing the durability cost but also towards reducing write amplification, to achieve better performance and scalability. FASE Techniques. Another line of research on developing crash-consistent NVMM applications uses a failure-atomic critical section (FASE) approach, guaranteeing atomicity at critical-section granularity [12, 23, 27, 49, 58]. This approach focuses on providing failure atomicity to legacy lock-based code, with little or no focus on scalability and write amplification. Moreover, traditional FASE-based techniques such as [12, 23, 27, 49] suffer from complex runtime dependency tracking. While the state-of-the-art iDO logging [58] reduces the dependency-tracking overhead, it still needs specialized compiler support. Hardware-Assisted Techniques. This class leverages STM- or FASE-based approaches and proposes new hardware support for guaranteeing atomic durability [22, 35, 37, 39, 50, 51, 64, 67, 76, 80, 95]. These techniques interface with hardware buffers to speed up logging [39, 80, 95] or delegate the ordering of stores to hardware [22, 50, 51, 64], clearly demanding significant hardware changes and introducing new logging instructions. Some approaches in this class propose extending hardware transactional memory (like Intel RTM) to make atomic updates to NVMM [38, 38, 53]. The performance of these techniques is bounded by the L1-L3 cache sizes, and they require changes to the existing cache-coherence protocol [38]. Unlike these techniques, TimeStone is completely software-based and capable of running on modern hardware. 7 Conclusion In this paper, we propose TimeStone, a scalable and high-performing DTM framework. We propose TOC logging to keep write amplification in check.
The MVCC-based design helps TimeStone achieve better scalability and full-data consistency. We also support three different isolation levels to improve the applicability of TimeStone. We evaluated TimeStone against all of the latest DTM works and showed that it outperforms all of them by up to 40× and scales better. While prior DTM systems suffer from 2×-6× write amplification, TimeStone keeps it below 1. We also presented the real-world impact of TimeStone by evaluating it with KyotoCabinet and YCSB workloads. The TimeStone-enabled KyotoCabinet and B+-tree show better performance and scalability. We will open source TimeStone. Acknowledgment We thank the anonymous reviewers for their helpful feedback. This work was partially supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2014-3-00035).
Publisher Copyright:
© 2020 Association for Computing Machinery.
PY - 2020/3/9
Y1 - 2020/3/9
N2 - Non-volatile main memory (NVMM) technologies promise byte addressability and near-DRAM access that allows developers to build persistent applications with common load and store instructions. However, it is difficult to realize these promises because NVMM software should also provide crash consistency while providing high performance and scalability. Durable transactional memory (DTM) systems address these challenges. However, none of them scale beyond 16 cores. The poor scalability stems either from the underlying STM layer or from employing limited write parallelism (single writer or dual version). In addition, other fundamental issues with guaranteeing crash consistency are high write amplification and memory footprint in existing approaches. To address these challenges, we propose TimeStone: a highly scalable DTM system with low write amplification and minimal memory footprint. TimeStone uses a novel multi-layered hybrid logging technique, called TOC logging, to guarantee crash consistency. TimeStone further relies on a Multi-Version Concurrency Control (MVCC) mechanism to achieve high scalability and to support different isolation levels on the same data set. Our evaluation of TimeStone against state-of-the-art DTM systems shows that it significantly outperforms other systems for a wide range of workloads with varying data-set size and contention levels, up to 112 hardware threads. In addition, with our TOC logging, TimeStone achieves a write amplification of less than 1, while existing DTM systems suffer from 2×-6× overhead.
AB - Non-volatile main memory (NVMM) technologies promise byte addressability and near-DRAM access that allows developers to build persistent applications with common load and store instructions. However, it is difficult to realize these promises because NVMM software should also provide crash consistency while providing high performance and scalability. Durable transactional memory (DTM) systems address these challenges. However, none of them scale beyond 16 cores. The poor scalability stems either from the underlying STM layer or from employing limited write parallelism (single writer or dual version). In addition, other fundamental issues with guaranteeing crash consistency are high write amplification and memory footprint in existing approaches. To address these challenges, we propose TimeStone: a highly scalable DTM system with low write amplification and minimal memory footprint. TimeStone uses a novel multi-layered hybrid logging technique, called TOC logging, to guarantee crash consistency. TimeStone further relies on a Multi-Version Concurrency Control (MVCC) mechanism to achieve high scalability and to support different isolation levels on the same data set. Our evaluation of TimeStone against state-of-the-art DTM systems shows that it significantly outperforms other systems for a wide range of workloads with varying data-set size and contention levels, up to 112 hardware threads. In addition, with our TOC logging, TimeStone achieves a write amplification of less than 1, while existing DTM systems suffer from 2×-6× overhead.
KW - Logging
KW - Multi-version
KW - Scalability
KW - Transactional memory
KW - Write amplification
UR - http://www.scopus.com/inward/record.url?scp=85082389613&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082389613&partnerID=8YFLogxK
U2 - 10.1145/3373376.3378483
DO - 10.1145/3373376.3378483
M3 - Conference contribution
AN - SCOPUS:85082389613
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 335
EP - 349
BT - ASPLOS 2020 - 25th International Conference on Architectural Support for Programming Languages and Operating Systems
PB - Association for Computing Machinery
T2 - 25th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2020
Y2 - 16 March 2020 through 20 March 2020
ER -