Vector Quantization for Recommender Systems: A Review and Outlook (2405.03110v1)
Abstract: Vector quantization, renowned for its unparalleled feature compression capabilities, has been a prominent topic in signal processing and machine learning research for several decades and remains widely utilized today. With the emergence of large models and generative AI, vector quantization has gained popularity in recommender systems, establishing itself as a preferred solution. This paper starts with a comprehensive review of vector quantization techniques. It then explores systematic taxonomies of vector quantization methods for recommender systems (VQ4Rec), examining their applications from multiple perspectives. Further, it provides a thorough introduction to research efforts in diverse recommendation scenarios, including efficiency-oriented approaches and quality-oriented approaches. Finally, the survey analyzes the remaining challenges and anticipates future trends in VQ4Rec, including the challenges associated with the training of vector quantization, the opportunities presented by LLMs, and emerging trends in multimodal recommender systems. We hope this survey can pave the way for future researchers in the recommendation community and accelerate their exploration in this promising field.
Authors:
- Qijiong Liu
- Xiaoyu Dong
- Jiaren Xiao
- Nuo Chen
- Hengchang Hu
- Jieming Zhu
- Chenxu Zhu
- Tetsuya Sakai
- Xiao-Ming Wu