Efficient locality-sensitive hashing over high-dimensional data streams

Chengcheng Yang, Dong Deng, Shuo Shang, Ling Shao

Research output: Chapter in Book/Report/Conference proceedingConference contribution

4 Scopus citations

Abstract

Approximate Nearest Neighbor (ANN) search in high-dimensional space is a fundamental task in many applications. Locality-Sensitive Hashing (LSH) is a well-known methodology to solve the ANN problem with theoretical guarantees and empirical performance. We observe that existing LSH-based approaches target at the problem of designing search optimized indexes, which require a number of separate indexes and high index maintenance overhead, and hence impractical for high-dimensional streaming data processing. In this paper, we present PDA-LSH, a novel and practical disk-based LSH index that can offer efficient support for both updates and searches. Experiments on real-world datasets show that our proposal outperforms the state-of-the-art schemes by up to 10× on update performance and up to 2× on search performance.

Original languageEnglish (US)
Title of host publicationProceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020
PublisherIEEE Computer Society
Pages1986-1989
Number of pages4
ISBN (Electronic)9781728129037
DOIs
StatePublished - Apr 2020
Event36th IEEE International Conference on Data Engineering, ICDE 2020 - Dallas, United States
Duration: Apr 20 2020Apr 24 2020

Publication series

NameProceedings - International Conference on Data Engineering
Volume2020-April
ISSN (Print)1084-4627

Conference

Conference36th IEEE International Conference on Data Engineering, ICDE 2020
Country/TerritoryUnited States
CityDallas
Period4/20/204/24/20

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Information Systems

Fingerprint

Dive into the research topics of 'Efficient locality-sensitive hashing over high-dimensional data streams'. Together they form a unique fingerprint.

Cite this