Locality Sensitive Hashing revisited: Filling the gap between theory and algorithm analysis

Hongya Wang, Jiao Cao, Lih-Chyun Shu, Davood Rafiei

Research output: Chapter in Book/Report/Conference proceedingConference contribution

12 Citations (Scopus)

Abstract

Locality Sensitive Hashing (LSH) is widely recognized as one of the most promising approaches to similarity search in high-dimensional spaces. Based on LSH, a considerable number of nearest neighbor search algorithms have been proposed in the past, with some of them having been used in many real-life applications. Apart from their demonstrated superior performance in practice, the popularity of the LSH algorithms is mainly due to their provable performance bounds on query cost, space consumption and failure probability. In this paper, we show that a surprising gap exists between the LSH theory and widely practiced algorithm analysis techniques. In particular, we discover that a critical assumption made in the classical LSH algorithm analysis does not hold in practice, which suggests that using the existing methods to analyze the performance of practical LSH algorithms is a conceptual mismatch. To address this problem, a novel analysis model is developed that bridges the gap between the LSH theory and the method for analyzing the LSH algorithm performance. With the help of this model, we identify some important flaws in the commonly used analysis methods in the LSH literature. The validity of this model is verified through extensive experiments with real datasets.

Original languageEnglish
Title of host publicationCIKM 2013 - Proceedings of the 22nd ACM International Conference on Information and Knowledge Management
Pages1969-1978
Number of pages10
DOIs
Publication statusPublished - 2013 Dec 11
Event22nd ACM International Conference on Information and Knowledge Management, CIKM 2013 - San Francisco, CA, United States
Duration: 2013 Oct 272013 Nov 1

Publication series

NameInternational Conference on Information and Knowledge Management, Proceedings

Other

Other22nd ACM International Conference on Information and Knowledge Management, CIKM 2013
CountryUnited States
CitySan Francisco, CA
Period13-10-2713-11-01

All Science Journal Classification (ASJC) codes

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Fingerprint Dive into the research topics of 'Locality Sensitive Hashing revisited: Filling the gap between theory and algorithm analysis'. Together they form a unique fingerprint.

  • Cite this

    Wang, H., Cao, J., Shu, L-C., & Rafiei, D. (2013). Locality Sensitive Hashing revisited: Filling the gap between theory and algorithm analysis. In CIKM 2013 - Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (pp. 1969-1978). (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.1145/2505515.2505765