Graph-Based Model-Agnostic Data Subsampling for Recommendation Systems (2305.16391v2)
Abstract: Data subsampling is widely used to speed up the training of large-scale recommendation systems. Most subsampling methods are model-based and often require a pre-trained pilot model to measure data importance, for example via sample hardness. However, when the pilot model is misspecified, model-based subsampling methods deteriorate. Since model misspecification is persistent in real recommendation systems, we instead propose model-agnostic data subsampling methods that explore only the structure of the input data, represented as graphs. Specifically, we study the topology of the user-item graph to estimate the importance of each user-item interaction (an edge in the user-item graph) via graph conductance, followed by a propagation step on the network to smooth out the estimated importance values. Since our proposed method is model-agnostic, we can marry the merits of both model-agnostic and model-based subsampling methods. Empirically, we show that combining the two consistently improves over either method alone on the datasets we evaluate. Experimental results on the KuaiRec and MIND datasets demonstrate that our proposed methods achieve superior results compared to baseline approaches.
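The two steps named in the abstract, scoring each user-item edge by a conductance quantity and then smoothing the scores by propagation, can be illustrated with a minimal sketch. It is not the authors' implementation, and every detail below is an assumption made for illustration: the score is taken to be the effective conductance of an unweighted bipartite graph (the reciprocal of effective resistance, computed from the Laplacian pseudoinverse), the smoothing is a simple averaging over edges that share a user or an item, and the names `effective_conductance`, `smooth_over_edges`, `alpha`, and `n_steps` are hypothetical. The dense pseudoinverse is only practical for toy graphs.

```python
import numpy as np

def effective_conductance(n_users, n_items, edges):
    """Effective conductance of every (user, item) edge in an unweighted bipartite graph."""
    n = n_users + n_items
    lap = np.zeros((n, n))
    for u, i in edges:
        a, b = u, n_users + i  # item nodes are indexed after the user nodes
        lap[a, a] += 1.0
        lap[b, b] += 1.0
        lap[a, b] -= 1.0
        lap[b, a] -= 1.0
    lap_pinv = np.linalg.pinv(lap)  # dense pseudoinverse: toy scale only
    scores = []
    for u, i in edges:
        a, b = u, n_users + i
        resistance = lap_pinv[a, a] + lap_pinv[b, b] - 2.0 * lap_pinv[a, b]
        scores.append(1.0 / resistance)
    return np.array(scores)

def smooth_over_edges(edges, scores, alpha=0.5, n_steps=10):
    """Assumed smoothing rule: blend each edge score with the mean score of
    the edges that share its user or its item (including the edge itself)."""
    m = len(edges)
    adj = np.eye(m)  # self-loops keep every row sum positive
    for p in range(m):
        for q in range(p + 1, m):
            if edges[p][0] == edges[q][0] or edges[p][1] == edges[q][1]:
                adj[p, q] = adj[q, p] = 1.0
    trans = adj / adj.sum(axis=1, keepdims=True)  # row-stochastic averaging matrix
    smoothed = scores.copy()
    for _ in range(n_steps):
        smoothed = (1.0 - alpha) * scores + alpha * (trans @ smoothed)
    return smoothed

# Toy graph: users 0 and 1 both interact with items 0 and 1 (a 4-cycle),
# and user 2 interacts only with item 1 (a bridge edge).
edges = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]
scores = effective_conductance(n_users=3, n_items=2, edges=edges)
smoothed = smooth_over_edges(edges, scores)
print(scores)    # the bridge edge has conductance 1, the cycle edges 4/3
print(smoothed)
```

In the toy graph the bridge edge is the only path to user 2, so its conductance is lower than that of the edges on the cycle, which have an alternative three-hop route. How such scores are mapped to sampling rates is not stated in the abstract and is deliberately left out of the sketch.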
- Xiaohui Chen
- Jiankai Sun
- Taiqing Wang
- Ruocheng Guo
- Li-Ping Liu
- Aonan Zhang