
Bridging Language and Items for Retrieval and Recommendation (2403.03952v1)

Published 6 Mar 2024 in cs.IR

Abstract: This paper introduces BLaIR, a series of pretrained sentence embedding models specialized for recommendation scenarios. BLaIR is trained to learn correlations between item metadata and potential natural language context, which is useful for retrieving and recommending items. To pretrain BLaIR, we collect Amazon Reviews 2023, a new dataset comprising over 570 million reviews and 48 million items from 33 categories, significantly expanding beyond the scope of previous versions. We evaluate the generalization ability of BLaIR across multiple domains and tasks, including a new task named complex product search, referring to retrieving relevant items given long, complex natural language contexts. Leveraging LLMs like ChatGPT, we correspondingly construct a semi-synthetic evaluation set, Amazon-C4. Empirical results on the new task, as well as conventional retrieval and recommendation tasks, demonstrate that BLaIR exhibits strong text and item representation capacity. Our datasets, code, and checkpoints are available at: https://github.com/hyp1231/AmazonReviews2023.


Summary

  • The paper introduces BLaIR, a series of pretrained sentence embedding models that learn correlations between item metadata and natural language context for retrieval and recommendation.
  • BLaIR is pretrained with a contrastive objective on the newly collected Amazon Reviews 2023 dataset (over 570 million reviews, 48 million items, 33 categories), enabling tasks such as complex product search.
  • Empirical results on conventional retrieval and recommendation tasks, and on the new Amazon-C4 complex product search benchmark, show consistent improvements over existing methods.

Bridging Language and Items for Retrieval and Recommendation

Introduction to BLaIR

Recent advances in LLMs have amplified interest in exploiting their capabilities for recommendation systems. A significant challenge, however, lies in connecting these models to item catalogs that often number in the millions without extensive retraining or complex engineering. Addressing this challenge, we introduce BLaIR (Bridging Language and Items for Retrieval and Recommendation), a series of pretrained sentence embedding models specialized for recommendation scenarios. BLaIR learns the correlations between item metadata and the natural language contexts in which items are discussed, a capability central to retrieval and recommendation tasks.

The Amazon Reviews 2023 Dataset

A cornerstone of our research is the newly curated Amazon Reviews 2023 dataset, which substantially expands on its predecessors in both scope and richness. With over 570 million reviews spanning 33 categories and linked to 48 million items, the dataset provides a broad foundation for training and evaluating recommendation models. Compared to the 2018 version, it adds finer-grained timestamps for temporal recommendation tasks, cleaner and richer item metadata, and a large expansion in item categories and user reviews.
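
For concreteness, a per-category slice of the reviews could be streamed with the Hugging Face `datasets` library; the hub id, config, and split names below are assumptions for illustration, and the linked repository documents the actual identifiers and loading instructions.

```python
# Minimal sketch: stream one category of reviews with the Hugging Face `datasets` library.
# The hub id, config, and split names are assumptions for illustration; see the linked
# GitHub repository for the actual access paths.
from datasets import load_dataset

reviews = load_dataset(
    "McAuley-Lab/Amazon-Reviews-2023",   # assumed hub id
    "raw_review_All_Beauty",             # assumed per-category config
    split="full",                        # assumed split name
    streaming=True,                      # avoids materializing 570M+ reviews locally
    trust_remote_code=True,
)

for i, record in enumerate(reviews):
    # Each record pairs free-form review text with the item it describes.
    print(record.get("parent_asin"), record.get("rating"), str(record.get("text"))[:80])
    if i == 2:
        break
```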

Architecture and Training Objective

At its core, BLaIR uses a contrastive learning objective to embed item metadata and user reviews into a shared embedding space, so that an item's metadata lies close to the natural language contexts that discuss it. Because reviews supply rich, naturally occurring language about items, this pretraining tailors the embeddings to the specific requirements of the recommendation domain.
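
As a concrete illustration of this kind of objective (not the paper's exact recipe), the sketch below pairs each review with its item's metadata text and applies a symmetric InfoNCE-style loss with in-batch negatives; the base encoder, mean pooling, and temperature are assumptions made for the example.

```python
# Sketch of an InfoNCE-style contrastive objective over (review, item metadata) pairs
# with in-batch negatives. The base encoder, pooling, and temperature are illustrative
# assumptions rather than the paper's exact training recipe.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean over non-padding tokens
    return F.normalize(pooled, dim=-1)

def contrastive_loss(review_texts, metadata_texts, temperature=0.05):
    # Row i of each list describes the same item; all other rows act as negatives.
    c = embed(review_texts)
    m = embed(metadata_texts)
    logits = c @ m.t() / temperature                       # (B, B) cosine similarities
    labels = torch.arange(len(review_texts))
    # Symmetric loss: match reviews to metadata and metadata to reviews.
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2

loss = contrastive_loss(
    ["Great waterproof boots, kept my feet dry on a rainy hike.",
     "The blender is loud but crushes ice in seconds."],
    ["Men's Waterproof Hiking Boot, leather upper, rubber sole.",
     "700W Countertop Blender with stainless-steel blades."],
)
```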

Evaluating BLaIR's Performance

Our experiments across multiple domains and tasks show that BLaIR is both effective and versatile. In particular, we introduce a new task, complex product search, which reflects real-world scenarios where a query is a long, detailed natural language description of what the user needs. Leveraging LLMs such as ChatGPT, we construct a semi-synthetic evaluation set, Amazon-C4, to benchmark models on this task. The empirical results show marked improvements over existing methods on complex product search as well as on conventional retrieval and recommendation tasks.
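
At inference time, complex product search can be framed as embedding the long natural-language query and ranking items by the similarity of their metadata embeddings. A minimal sketch is shown below; the checkpoint id and pooling choice are assumptions, and the linked repository lists the actually released BLaIR checkpoints.

```python
# Sketch of complex product search: embed a long query and rank item metadata by
# cosine similarity. The checkpoint id and first-token pooling are assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "hyp1231/blair-roberta-base"   # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        emb = model(**batch).last_hidden_state[:, 0]      # first-token ([CLS]/<s>) pooling
    return F.normalize(emb, dim=-1)

query = ("I need quiet over-ear headphones for long flights that stay comfortable "
         "with glasses and last at least 20 hours on a charge.")
item_metadata = [
    "Noise cancelling over-ear headphones, 30-hour battery, plush ear cushions.",
    "Wired in-ear earbuds with inline microphone and tangle-free cable.",
    "Portable Bluetooth speaker, waterproof, 12-hour playtime.",
]

scores = (encode([query]) @ encode(item_metadata).t()).squeeze(0)
for idx in scores.argsort(descending=True):
    i = int(idx)
    print(f"{scores[i].item():.3f}  {item_metadata[i]}")
```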

Future Directions

The emergence of BLaIR opens several avenues for future research in AI and recommendation systems. The model's ability to generalize across tasks and domains suggests potential for further exploration into other language-heavy recommendation scenarios. Moreover, the Amazon Reviews 2023 dataset itself, with its unprecedented scale and depth, offers a fertile ground for advancing research in recommendation systems and natural language processing.

Concluding Remarks

In conclusion, BLaIR represents a significant step toward aligning the capabilities of LLMs with the demands of modern recommendation systems. By pretraining on the large and rich Amazon Reviews 2023 dataset, BLaIR sets a strong new baseline for integrating language and item data in the recommendation domain. The methodologies, datasets, and checkpoints released with this work provide a foundation for the next generation of language-driven recommendation systems.
