Cache-oblivious streaming B-trees

Michael A. Bender, Martin Farach-Colton, Jeremy T. Fineman, Yonatan R. Fogel, Bradley C. Kuszmaul, Jelani Nelson

Research output: Chapter in Book/Report/Conference proceedingConference contribution

90 Citations (Scopus)

Abstract

A streaming B-tree is a dictionary that efficiently implements insertions and range queries. We present two cache-oblivious streaming B-trees, the shuttle tree, and the cache-oblivious lookahead array (COLA). For block-transfer size B and on N elements, the shuttle tree implements searches in optimal O(log B+1N) transfers, range queries of L successive elements in optimal O(log B+1N +L/B) transfers, and insertions in O((log B+1N)/B(1/(log log B)2)+(log2N)/B) transfers, which is an asymptotic speedup over traditional B-trees if B (log N) 1+c log log log2N for any constant c >1. A COLA implements searches in O(log N) transfers, range queries in O(log N + L/B) transfers, and insertions in amortized O((log N)/B) transfers, matching the bounds for a (cache-aware) buffered repository tree. A partially deamortized COLA matches these bounds but reduces the worst-case insertion cost to O(log N) if memory size M = (log N). We also present a cache-aware version of the COLA, the lookahead array, which achieves the same bounds as Brodal and Fagerberg's (cache-aware) Bε-tree. We compare our COLA implementation to a traditional B-tree. Our COLA implementation runs 790 times faster for random inser-tions, 3.1 times slower for insertions of sorted data, and 3.5 times slower for searches.

Original languageEnglish (US)
Title of host publicationSPAA'07
Subtitle of host publicationProceedings of the Nineteenth Annual Symposium on Parallelism in Algorithms and Architectures
Pages81-92
Number of pages12
DOIs
StatePublished - Oct 18 2007
EventSPAA'07: 19th Annual Symposium on Parallelism in Algorithms and Architectures - San Diego, CA, United States
Duration: Jun 9 2007Jun 11 2007

Publication series

NameAnnual ACM Symposium on Parallelism in Algorithms and Architectures

Other

OtherSPAA'07: 19th Annual Symposium on Parallelism in Algorithms and Architectures
CountryUnited States
CitySan Diego, CA
Period6/9/076/11/07

Fingerprint

Glossaries
Data storage equipment
Costs

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Keywords

  • Buffered repository tree
  • Cache-oblivious B-tree
  • Cascading array
  • Deamortized
  • Lookahead array
  • Shuttle tree

Cite this

Bender, M. A., Farach-Colton, M., Fineman, J. T., Fogel, Y. R., Kuszmaul, B. C., & Nelson, J. (2007). Cache-oblivious streaming B-trees. In SPAA'07: Proceedings of the Nineteenth Annual Symposium on Parallelism in Algorithms and Architectures (pp. 81-92). (Annual ACM Symposium on Parallelism in Algorithms and Architectures). https://doi.org/10.1145/1248377.1248393
Bender, Michael A. ; Farach-Colton, Martin ; Fineman, Jeremy T. ; Fogel, Yonatan R. ; Kuszmaul, Bradley C. ; Nelson, Jelani. / Cache-oblivious streaming B-trees. SPAA'07: Proceedings of the Nineteenth Annual Symposium on Parallelism in Algorithms and Architectures. 2007. pp. 81-92 (Annual ACM Symposium on Parallelism in Algorithms and Architectures).
@inproceedings{80a9846ab59d4a57ba8f966f9ad189ea,
title = "Cache-oblivious streaming B-trees",
abstract = "A streaming B-tree is a dictionary that efficiently implements insertions and range queries. We present two cache-oblivious streaming B-trees, the shuttle tree, and the cache-oblivious lookahead array (COLA). For block-transfer size B and on N elements, the shuttle tree implements searches in optimal O(log B+1N) transfers, range queries of L successive elements in optimal O(log B+1N +L/B) transfers, and insertions in O((log B+1N)/B(1/(log log B)2)+(log2N)/B) transfers, which is an asymptotic speedup over traditional B-trees if B (log N) 1+c log log log2N for any constant c >1. A COLA implements searches in O(log N) transfers, range queries in O(log N + L/B) transfers, and insertions in amortized O((log N)/B) transfers, matching the bounds for a (cache-aware) buffered repository tree. A partially deamortized COLA matches these bounds but reduces the worst-case insertion cost to O(log N) if memory size M = (log N). We also present a cache-aware version of the COLA, the lookahead array, which achieves the same bounds as Brodal and Fagerberg's (cache-aware) Bε-tree. We compare our COLA implementation to a traditional B-tree. Our COLA implementation runs 790 times faster for random inser-tions, 3.1 times slower for insertions of sorted data, and 3.5 times slower for searches.",
keywords = "Buffered repository tree, Cache-oblivious B-tree, Cascading array, Deamortized, Lookahead array, Shuttle tree",
author = "Bender, {Michael A.} and Martin Farach-Colton and Fineman, {Jeremy T.} and Fogel, {Yonatan R.} and Kuszmaul, {Bradley C.} and Jelani Nelson",
year = "2007",
month = "10",
day = "18",
doi = "10.1145/1248377.1248393",
language = "English (US)",
isbn = "159593667X",
series = "Annual ACM Symposium on Parallelism in Algorithms and Architectures",
pages = "81--92",
booktitle = "SPAA'07",

}

Bender, MA, Farach-Colton, M, Fineman, JT, Fogel, YR, Kuszmaul, BC & Nelson, J 2007, Cache-oblivious streaming B-trees. in SPAA'07: Proceedings of the Nineteenth Annual Symposium on Parallelism in Algorithms and Architectures. Annual ACM Symposium on Parallelism in Algorithms and Architectures, pp. 81-92, SPAA'07: 19th Annual Symposium on Parallelism in Algorithms and Architectures, San Diego, CA, United States, 6/9/07. https://doi.org/10.1145/1248377.1248393

Cache-oblivious streaming B-trees. / Bender, Michael A.; Farach-Colton, Martin; Fineman, Jeremy T.; Fogel, Yonatan R.; Kuszmaul, Bradley C.; Nelson, Jelani.

SPAA'07: Proceedings of the Nineteenth Annual Symposium on Parallelism in Algorithms and Architectures. 2007. p. 81-92 (Annual ACM Symposium on Parallelism in Algorithms and Architectures).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - Cache-oblivious streaming B-trees

AU - Bender, Michael A.

AU - Farach-Colton, Martin

AU - Fineman, Jeremy T.

AU - Fogel, Yonatan R.

AU - Kuszmaul, Bradley C.

AU - Nelson, Jelani

PY - 2007/10/18

Y1 - 2007/10/18

N2 - A streaming B-tree is a dictionary that efficiently implements insertions and range queries. We present two cache-oblivious streaming B-trees, the shuttle tree, and the cache-oblivious lookahead array (COLA). For block-transfer size B and on N elements, the shuttle tree implements searches in optimal O(log B+1N) transfers, range queries of L successive elements in optimal O(log B+1N +L/B) transfers, and insertions in O((log B+1N)/B(1/(log log B)2)+(log2N)/B) transfers, which is an asymptotic speedup over traditional B-trees if B (log N) 1+c log log log2N for any constant c >1. A COLA implements searches in O(log N) transfers, range queries in O(log N + L/B) transfers, and insertions in amortized O((log N)/B) transfers, matching the bounds for a (cache-aware) buffered repository tree. A partially deamortized COLA matches these bounds but reduces the worst-case insertion cost to O(log N) if memory size M = (log N). We also present a cache-aware version of the COLA, the lookahead array, which achieves the same bounds as Brodal and Fagerberg's (cache-aware) Bε-tree. We compare our COLA implementation to a traditional B-tree. Our COLA implementation runs 790 times faster for random inser-tions, 3.1 times slower for insertions of sorted data, and 3.5 times slower for searches.

AB - A streaming B-tree is a dictionary that efficiently implements insertions and range queries. We present two cache-oblivious streaming B-trees, the shuttle tree, and the cache-oblivious lookahead array (COLA). For block-transfer size B and on N elements, the shuttle tree implements searches in optimal O(log B+1N) transfers, range queries of L successive elements in optimal O(log B+1N +L/B) transfers, and insertions in O((log B+1N)/B(1/(log log B)2)+(log2N)/B) transfers, which is an asymptotic speedup over traditional B-trees if B (log N) 1+c log log log2N for any constant c >1. A COLA implements searches in O(log N) transfers, range queries in O(log N + L/B) transfers, and insertions in amortized O((log N)/B) transfers, matching the bounds for a (cache-aware) buffered repository tree. A partially deamortized COLA matches these bounds but reduces the worst-case insertion cost to O(log N) if memory size M = (log N). We also present a cache-aware version of the COLA, the lookahead array, which achieves the same bounds as Brodal and Fagerberg's (cache-aware) Bε-tree. We compare our COLA implementation to a traditional B-tree. Our COLA implementation runs 790 times faster for random inser-tions, 3.1 times slower for insertions of sorted data, and 3.5 times slower for searches.

KW - Buffered repository tree

KW - Cache-oblivious B-tree

KW - Cascading array

KW - Deamortized

KW - Lookahead array

KW - Shuttle tree

UR - http://www.scopus.com/inward/record.url?scp=35248857461&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=35248857461&partnerID=8YFLogxK

U2 - 10.1145/1248377.1248393

DO - 10.1145/1248377.1248393

M3 - Conference contribution

AN - SCOPUS:35248857461

SN - 159593667X

SN - 9781595936677

T3 - Annual ACM Symposium on Parallelism in Algorithms and Architectures

SP - 81

EP - 92

BT - SPAA'07

ER -

Bender MA, Farach-Colton M, Fineman JT, Fogel YR, Kuszmaul BC, Nelson J. Cache-oblivious streaming B-trees. In SPAA'07: Proceedings of the Nineteenth Annual Symposium on Parallelism in Algorithms and Architectures. 2007. p. 81-92. (Annual ACM Symposium on Parallelism in Algorithms and Architectures). https://doi.org/10.1145/1248377.1248393