SAH: Shifting-aware Asymmetric Hashing for Reverse $k$-Maximum Inner Product Search (2211.12751v1)

Published 23 Nov 2022 in cs.IR, cs.DB, cs.DS, and cs.LG

Abstract: This paper investigates a new yet challenging problem called Reverse $k$-Maximum Inner Product Search (R$k$MIPS). Given a query (item) vector, a set of item vectors, and a set of user vectors, the problem of R$k$MIPS aims to find a set of user vectors whose inner products with the query vector are one of the $k$ largest among the query and item vectors. We propose the first subquadratic-time algorithm, i.e., Shifting-aware Asymmetric Hashing (SAH), to tackle the R$k$MIPS problem. To speed up the Maximum Inner Product Search (MIPS) on item vectors, we design a shifting-invariant asymmetric transformation and develop a novel sublinear-time Shifting-Aware Asymmetric Locality Sensitive Hashing (SA-ALSH) scheme. Furthermore, we devise a new blocking strategy based on the Cone-Tree to effectively prune user vectors (in a batch). We prove that SAH achieves a theoretical guarantee for solving the R$k$MIPS problem. Experimental results on five real-world datasets show that SAH runs 4$\sim$8$\times$ faster than the state-of-the-art methods for R$k$MIPS while achieving F1-scores of over 90\%. The code is available at \url{https://github.com/HuangQiang/SAH}.
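Stated operationally, a user vector $u$ belongs to the R$k$MIPS answer set for query $q$ exactly when fewer than $k$ item vectors achieve a strictly larger inner product with $u$ than $q$ does. The sketch below is a minimal brute-force reference in NumPy, not the paper's SAH/SA-ALSH implementation (the released code is at the GitHub link above); the function name and toy data are illustrative only, and the point is to make explicit the quadratic baseline that SAH is designed to beat.

```python
import numpy as np

def rkmips_bruteforce(q, items, users, k):
    """Brute-force Reverse k-MIPS (illustrative baseline, not the SAH algorithm).

    q:     (d,)   query item vector
    items: (n, d) item vectors
    users: (m, d) user vectors
    Returns indices of users u for which <u, q> ranks among the k largest
    inner products of u over the item set augmented with q.
    """
    answers = []
    for idx, u in enumerate(users):
        score_q = float(u @ q)
        # number of existing items that strictly beat the query for this user
        beaten_by = int(np.sum(items @ u > score_q))
        if beaten_by < k:  # q would land in u's top-k MIPS result
            answers.append(idx)
    return answers

# Toy usage on random data.
rng = np.random.default_rng(0)
items = rng.normal(size=(1000, 32))
users = rng.normal(size=(200, 32))
q = rng.normal(size=32)
print(rkmips_bruteforce(q, items, users, k=10))
```

This scan costs $O(m \cdot n \cdot d)$ per query for $m$ users, $n$ items, and dimension $d$. Per the abstract, SAH replaces the per-user item scan with sublinear-time SA-ALSH lookups and prunes user vectors in batches via the Cone-Tree blocking strategy, which is what yields the reported subquadratic time and the 4$\sim$8$\times$ speedups over prior methods.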
