Initializing Services in Interactive ML Systems for Diverse Users (2312.11846v2)
Abstract: This paper investigates ML systems that serve a group of users with multiple models/services, each aimed at specializing to a sub-group of users. We consider settings where, upon deploying a set of services, users choose the one minimizing their personal losses, and the learner iteratively learns by interacting with diverse users. Prior research shows that the outcomes of these learning dynamics, which comprise both the services' adjustments and the users' service selections, hinge significantly on the initialization. However, finding good initializations faces two main challenges: (i) Bandit feedback: data on user preferences are typically not available before deploying services and observing user behavior; (ii) Suboptimal local solutions: the total loss landscape (i.e., the sum of loss functions across all users and services) is not convex, so gradient-based algorithms can get stuck in poor local minima. We address these challenges with a randomized algorithm that adaptively selects a minimal set of users for data collection in order to initialize the set of services. Under mild assumptions on the loss functions, we prove that our initialization achieves a total loss within a factor of the globally optimal total loss attainable with complete user preference data, and that this factor scales logarithmically in the number of services. This result generalizes the well-known $k$-means++ guarantee to a broad problem class and is also of independent interest. The theory is complemented by experiments on real as well as semi-synthetic datasets.
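The classical special case that the abstract's initialization generalizes is $k$-means++ seeding (so-called $D^2$-sampling): pick the first center uniformly, then sample each subsequent center with probability proportional to a point's current minimum loss. As a point of reference, here is a minimal sketch of that classical procedure; the function and argument names are illustrative, not taken from the paper.

```python
import random

def seed_services(points, k, loss, rng=None):
    """k-means++-style seeding (D^2-sampling when `loss` is squared
    Euclidean distance): the first center is chosen uniformly; each
    later center is sampled with probability proportional to a point's
    current minimum loss, so poorly served points are likelier picks."""
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]
    for _ in range(k - 1):
        # each point's loss under its best current center
        min_losses = [min(loss(p, c) for c in centers) for p in points]
        # sample the next center proportionally to those losses
        r = rng.uniform(0.0, sum(min_losses))
        acc = 0.0
        for p, w in zip(points, min_losses):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers
```

In the paper's setting, `points` would correspond to users and `loss` to a per-user service loss under the stated assumptions; the sketch above shows only the classical squared-distance special case.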
- Better guarantees for k-means and Euclidean k-median by primal-dual algorithms. SIAM Journal on Computing, 49(4):FOCS17–97, 2019.
- NP-hardness of Euclidean sum-of-squares clustering. Machine Learning, 75(2):245–248, 2009.
- Analysis of k-means and k-medoids algorithm for big data. Procedia Computer Science, 78:507–512, 2016.
- k-means++: the advantages of careful seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1027–1035, 2007.
- Scalable distributional robustness in a class of non-convex optimization with guarantees. Advances in Neural Information Processing Systems, 35:13826–13837, 2022.
- One for all: Simultaneous metric and preference learning over multiple users. arXiv preprint, 2022.
- How fine-tuning allows for effective meta-learning. In Proc. of Advances in Neural Information Processing Systems, volume 34, 2021.
- Sanjoy Dasgupta. The hardness of k-means clustering. Department of Computer Science and Engineering, University of California, San Diego, 2008.
- Multi-learner risk reduction under endogenous participation dynamics. arXiv preprint arXiv:2206.02667, 2022.
- Retiring adult: New datasets for fair machine learning. Advances in Neural Information Processing Systems, 34:6478–6490, 2021.
- Decoupled classifiers for group-fair and efficient machine learning. In Conference on Fairness, Accountability and Transparency, pages 119–133. PMLR, 2018.
- A metric for covariance matrices. Geodesy-the Challenge of the 3rd Millennium, pages 299–309, 2003.
- Socially fair k-means clustering. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pages 438–448, 2021.
- Eigenvalue and generalized eigenvalue problems: Tutorial. arXiv preprint arXiv:1903.11240, 2019.
- An efficient framework for clustered federated learning. Advances in Neural Information Processing Systems, 33:19586–19597, 2020.
- Competing ai: How does competition feedback affect machine learning? In International Conference on Artificial Intelligence and Statistics, pages 1693–1701. PMLR, 2021.
- The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TiiS), 5(4):1–19, 2015.
- Fairness without demographics in repeated loss minimization. In International Conference on Machine Learning, pages 1929–1938. PMLR, 2018.
- Learning user preferences to incentivize exploration in the sharing economy. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018.
- Tight bounds for the expected risk of linear classifiers and pac-bayes finite-sample guarantees. In Artificial Intelligence and Statistics, pages 384–392. PMLR, 2014.
- Nicolas Hug. Surprise: A python library for recommender systems. Journal of Open Source Software, 5(52):2174, 2020. doi: 10.21105/joss.02174. URL https://doi.org/10.21105/joss.02174.
- A local search approximation algorithm for k-means clustering. In Proceedings of the Eighteenth Annual Symposium on Computational Geometry, pages 10–18, 2002.
- Meta-learning for mixed linear regression. In International Conference on Machine Learning, pages 5394–5404. PMLR, 2020.
- A better k-means++ algorithm via local search. In International Conference on Machine Learning, pages 3662–3671. PMLR, 2019.
- Federated learning: Challenges, methods, and future directions. IEEE Signal Processing Magazine, 37(3):50–60, 2020.
- Stuart Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.
- Improved guarantees for k-means++ and k-means++ parallel. Advances in Neural Information Processing Systems, 33:16142–16152, 2020.
- Three approaches for personalization with applications to federated learning. arXiv preprint arXiv:2002.10619, 2020.
- Mixture of experts: a literature survey. The Artificial Intelligence Review, 42(2):275, 2014.
- The effectiveness of lloyd-type methods for the k-means problem. Journal of the ACM (JACM), 59(6):1–22, 2013.
- Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints. IEEE Transactions on Neural Networks and Learning Systems, 32(8):3710–3722, 2020.
- Online learning in large-scale contextual recommender systems. IEEE Transactions on Services Computing, 9(3):433–445, 2014.
- Avoiding imposters and delinquents: Adversarial crowdsourcing and peer prediction. Advances in Neural Information Processing Systems, 29, 2016.
- Towards sample-efficient overparameterized meta-learning. In Proc. of Advances in Neural Information Processing Systems, volume 34, 2021.
- Learning preference distributions from distance measurements. In 2022 58th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 1–8. IEEE, 2022.
- Fairness without harm: Decoupled classifiers with preference guarantees. In International Conference on Machine Learning, pages 6373–6382. PMLR, 2019.
- Roman Vershynin. High-dimensional probability: An introduction with applications in data science, volume 47. Cambridge University Press, 2018.
- Group retention when using machine learning in sequential decision making: the interplay between user dynamics and fairness. Advances in Neural Information Processing Systems, 32, 2019.