Representation Learning in Low-rank Slate-based Recommender Systems (2309.08622v2)
Published 10 Sep 2023 in cs.IR and cs.AI
Abstract: Reinforcement learning (RL) in recommendation systems offers the potential to optimize recommendations for long-term user engagement. However, the environment often involves large state and action spaces, which makes it hard to efficiently learn and explore. In this work, we propose a sample-efficient representation learning algorithm, using the standard slate recommendation setup, to treat this as an online RL problem with low-rank Markov decision processes (MDPs). We also construct the recommender simulation environment with the proposed setup and sampling method.
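To make the low-rank MDP assumption in the abstract concrete, here is a minimal sketch of the standard definition: the transition kernel factorizes as P(s' | s, a) = <phi(s, a), mu(s')> for some d-dimensional feature maps phi and mu. The example builds such a kernel in the simple latent-variable case, where phi(s, a) is a distribution over d latent factors and each factor has its own next-state distribution. The function name `make_low_rank_transitions`, the sizes, and the Dirichlet sampling are illustrative assumptions, not the paper's simulation environment or algorithm.

```python
import numpy as np


def make_low_rank_transitions(n_states, n_actions, rank, seed=0):
    """Build a random transition kernel with rank at most `rank` (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # phi[s, a] is a probability vector over `rank` latent factors (assumption).
    phi = rng.dirichlet(np.ones(rank), size=(n_states, n_actions))
    # mu[k] is a probability vector over next states for latent factor k (assumption).
    mu = rng.dirichlet(np.ones(n_states), size=rank)
    # P[s, a, s'] = sum_k phi[s, a, k] * mu[k, s'], the low-rank factorization.
    transitions = phi @ mu
    return phi, mu, transitions


phi, mu, P = make_low_rank_transitions(n_states=50, n_actions=10, rank=3)

# Flatten (state, action) pairs into rows to inspect the matrix rank.
flat = P.reshape(-1, P.shape[-1])
print("kernel shape:", P.shape)
print("rows sum to one:", np.allclose(P.sum(axis=-1), 1.0))
print("numerical rank:", np.linalg.matrix_rank(flat))
```

Although the flattened kernel has 500 rows and 50 columns, its rank is bounded by the number of latent factors. This is the structure a representation learning method can exploit: it needs to recover only a d-dimensional phi rather than a full table over states and actions.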