GUITAR: Gradient Pruning toward Fast Neural Ranking (2312.16828v1)
Abstract: With the growing popularity of deep learning and representation learning, fast vector search has become a vital task in ranking and retrieval applications such as recommendation, ads ranking, and question answering. Neural network based ranking is widely adopted because of its capacity to model complex relationships, such as those between users and items or between questions and answers. However, because it is computationally expensive, it is usually applied offline or in a re-ranking stage. Online neural network ranking, so-called fast neural ranking, is considered challenging because neural network measures are usually non-convex and asymmetric. Traditional Approximate Nearest Neighbor (ANN) search, which usually focuses on metric ranking measures, is not applicable to these advanced measures. In this paper, we introduce a novel graph searching framework to accelerate search in the fast neural ranking problem. The proposed graph searching algorithm is bi-level: we first construct a probable candidate set, and then evaluate the neural network measure only over that set instead of over all neighbors. Specifically, we propose a gradient-based algorithm that approximates the rank of the neural network matching score to construct the probable candidate set, and we present an angle-based heuristic procedure to adaptively identify the proper size of the probable candidate set. Empirical results on public data confirm the effectiveness of our proposed algorithms.
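The bi-level idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy measure `score`, its weights `Wq`, `Wx`, `w2`, and the helper `pruned_best_neighbor` are all hypothetical, and the candidate-set size `k` is fixed by hand instead of being chosen by the angle-based heuristic. The sketch assumes that a first-order expansion of the measure around the current graph node gives a usable ranking of that node's neighbors, so the full network is evaluated only on the top-ranked few.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, HIDDEN = 8, 16

# Hypothetical toy measure f(q, x) = w2 . tanh(Wq q + Wx x).
# It stands in for a trained neural ranking network.
Wq = rng.normal(size=(HIDDEN, DIM))
Wx = rng.normal(size=(HIDDEN, DIM))
w2 = rng.normal(size=HIDDEN)


def score(q, x):
    """Full (expensive) neural measure between query q and item x."""
    return float(w2 @ np.tanh(Wq @ q + Wx @ x))


def grad_x(q, x):
    """Analytic gradient of score(q, x) with respect to the item vector x."""
    h = np.tanh(Wq @ q + Wx @ x)
    return Wx.T @ (w2 * (1.0 - h ** 2))


def pruned_best_neighbor(q, current, neighbors, k):
    """One search step: rank neighbors by a first-order estimate, then
    evaluate the full measure only on the k most promising ones.
    The fixed k is an assumption of this sketch."""
    g = grad_x(q, current)
    # f(q, n) ~ f(q, current) + g . (n - current); the constant term does
    # not affect the ranking, so only the estimated gain is computed.
    approx_gain = (neighbors - current) @ g
    candidates = np.argsort(-approx_gain)[:k]
    true_scores = [score(q, neighbors[i]) for i in candidates]
    best = int(np.argmax(true_scores))
    return int(candidates[best]), true_scores[best], candidates


q = rng.normal(size=DIM)
current = rng.normal(size=DIM)
# Neighbors of the current node, placed close to it for illustration.
neighbors = current + 0.05 * rng.normal(size=(32, DIM))

best_idx, best_score, candidates = pruned_best_neighbor(q, current, neighbors, k=4)
print(best_idx, best_score)
```

With 32 neighbors and `k=4`, the step costs one gradient evaluation plus four full evaluations of the measure instead of 32. The pruned step is approximate, so the selected neighbor is not guaranteed to be the one an exhaustive evaluation would pick.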