Cache-oblivious string B-trees

Michael A. Bender, Martin Farach-Colton, Bradley C. Kuszmaul

Research output: Chapter in Book/Report/Conference proceedingConference contribution

60 Scopus citations

Abstract

B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally when keys are long or of variable length,when keys are compressed, even when using front compression, the standard B-tree compression scheme,for range queries, andwith respect to memory effects such as disk prefetching.This paper presents a cache-oblivious string B-tree (COSB-tree) data structure that is efficient in all these ways: The COSB-tree searches asymptotically optimally and inserts and deletes nearly optimally.It maintains an index whose size is proportional to the front-compressed size of the dictionary. Furthermore, unlike standard front-compressed strings, keys can be decompressed in a memory-efficient manner.It performs range queries with no extra disk seeks; in contrast, B-trees incur disk seeks when skipping from leaf block to leaf block.It utilizes all levels of a memory hierarchy efficiently and makes good use of disk locality by using cache-oblivious layout strategies.

Original languageEnglish (US)
Title of host publicationProceedings of the Twenty-Fifth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2006
Pages233-242
Number of pages10
DOIs
StatePublished - 2006
Event25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2006 - Chicago, IL, United States
Duration: Jun 26 2006Jun 28 2006

Publication series

NameProceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems

Other

Other25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2006
Country/TerritoryUnited States
CityChicago, IL
Period6/26/066/28/06

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Hardware and Architecture

Keywords

  • Cache oblivious string B-tree
  • Locality preserving front compression
  • Packed-memory array
  • Range query
  • Rebalance

Fingerprint

Dive into the research topics of 'Cache-oblivious string B-trees'. Together they form a unique fingerprint.

Cite this