Locality Sensitive Hashing revisited: Filling the gap between theory and algorithm analysis

Hongya Wang, Jiao Cao, Lih Chyun Shu, Davood Rafiei

研究成果: Conference contribution

19 引文 斯高帕斯(Scopus)

摘要

Locality Sensitive Hashing (LSH) is widely recognized as one of the most promising approaches to similarity search in high-dimensional spaces. Based on LSH, a considerable number of nearest neighbor search algorithms have been proposed in the past, with some of them having been used in many real-life applications. Apart from their demonstrated superior performance in practice, the popularity of the LSH algorithms is mainly due to their provable performance bounds on query cost, space consumption and failure probability. In this paper, we show that a surprising gap exists between the LSH theory and widely practiced algorithm analysis techniques. In particular, we discover that a critical assumption made in the classical LSH algorithm analysis does not hold in practice, which suggests that using the existing methods to analyze the performance of practical LSH algorithms is a conceptual mismatch. To address this problem, a novel analysis model is developed that bridges the gap between the LSH theory and the method for analyzing the LSH algorithm performance. With the help of this model, we identify some important flaws in the commonly used analysis methods in the LSH literature. The validity of this model is verified through extensive experiments with real datasets.

原文English
主出版物標題CIKM 2013 - Proceedings of the 22nd ACM International Conference on Information and Knowledge Management
頁面1969-1978
頁數10
DOIs
出版狀態Published - 2013
事件22nd ACM International Conference on Information and Knowledge Management, CIKM 2013 - San Francisco, CA, United States
持續時間: 2013 10月 272013 11月 1

出版系列

名字International Conference on Information and Knowledge Management, Proceedings

Other

Other22nd ACM International Conference on Information and Knowledge Management, CIKM 2013
國家/地區United States
城市San Francisco, CA
期間13-10-2713-11-01

All Science Journal Classification (ASJC) codes

  • 一般決策科學
  • 一般商業,管理和會計

指紋

深入研究「Locality Sensitive Hashing revisited: Filling the gap between theory and algorithm analysis」主題。共同形成了獨特的指紋。

引用此