Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits (2402.11156v2)
Abstract: We study low-rank matrix trace regression and the related problem of low-rank matrix bandits. Assuming access to the distribution of the covariates, we propose a novel low-rank matrix estimation method called LowPopArt and provide a recovery guarantee that depends on a novel quantity, denoted B(Q), that characterizes the hardness of the problem, where Q is the covariance matrix of the measurement distribution. We show that our method can provide tighter recovery guarantees than classical nuclear norm penalized least squares (Koltchinskii et al., 2011) in several problems. To perform efficient estimation with a limited number of measurements from an arbitrarily given measurement set A, we also propose a novel experimental design criterion that minimizes B(Q) with computational efficiency. We leverage our novel estimator and design of experiments to derive two low-rank linear bandit algorithms for general arm sets that enjoy improved regret upper bounds. This improves over previous works on low-rank bandits, which make the somewhat restrictive assumption that the arm set is the unit ball or that an efficient exploration distribution is given. To our knowledge, our experimental design criterion is the first one tailored to low-rank matrix estimation beyond the naive reduction to linear regression, which can be of independent interest.
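The pipeline the abstract describes (a known measurement covariance Q, an unbiased vectorized estimate, then a low-rank projection) can be illustrated with a simplified numpy sketch. This is not the actual LowPopArt estimator, which uses a more robust matrix-Catoni-style construction; the dimensions, noise level, and identity-covariance measurement distribution below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2, r, n = 5, 4, 2, 2000

# Ground-truth rank-r parameter matrix Theta.
U = rng.standard_normal((d1, r))
V = rng.standard_normal((d2, r))
Theta = U @ V.T / np.sqrt(r)

# Measurements X_i drawn from a known distribution; responses
# y_i = <X_i, Theta> + noise (trace regression model).
X = rng.standard_normal((n, d1, d2))
y = np.einsum('nij,ij->n', X, Theta) + 0.1 * rng.standard_normal(n)

# Known second-moment matrix Q of vec(X); identity here because the
# entries of X are i.i.d. standard Gaussian (an assumption of this sketch).
Q = np.eye(d1 * d2)

# E[y vec(X)] = Q vec(Theta), so Q^{-1} times the empirical average
# of y_i vec(X_i) is an unbiased estimate of vec(Theta).
vecs = X.reshape(n, -1)
theta_hat = np.linalg.solve(Q, (vecs * y[:, None]).mean(axis=0))
Theta_hat = theta_hat.reshape(d1, d2)

# Exploit the low-rank structure: project onto rank r via truncated SVD.
Uh, s, Vh = np.linalg.svd(Theta_hat, full_matrices=False)
Theta_r = (Uh[:, :r] * s[:r]) @ Vh[:r]

err = np.linalg.norm(Theta_r - Theta) / np.linalg.norm(Theta)
```

The paper's experimental design step then chooses the sampling distribution over the measurement set A so that the resulting Q makes the quantity B(Q), and hence the error of an estimate like the one above, as small as possible; here Q was simply fixed by the Gaussian measurement model.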
- Improved Algorithms for Linear Stochastic Bandits. In Advances in Neural Information Processing Systems (NeurIPS), pp. 1–19, 2011.
- Associative reinforcement learning using linear probabilistic concepts. In Proceedings of the International Conference on Machine Learning (ICML), pp. 3–11, 1999.
- Auer, P. Using Confidence Bounds for Exploitation-Exploration Trade-offs. Journal of Machine Learning Research, 3:397–422, 2002.
- Certifying the restricted isometry property is hard. IEEE Transactions on Information Theory, 59(6):3448–3450, 2013.
- The Netflix Prize. In Proceedings of KDD Cup and Workshop, volume 2007, pp. 35. New York, 2007.
- High-dimensional experimental design and kernel bandits. In International Conference on Machine Learning, pp. 1227–1237. PMLR, 2021.
- Stochastic Linear Optimization under Bandit Feedback. In Proceedings of the Conference on Learning Theory (COLT), pp. 355–366, 2008.
- CVXPY: A Python-embedded modeling language for convex optimization. Journal of Machine Learning Research, 17(83):1–5, 2016.
- Design of c-optimal experiments for high-dimensional linear models. Bernoulli, 29(1):652–668, 2023.
- Norm-agnostic linear bandits. In International Conference on Artificial Intelligence and Statistics, pp. 73–91. PMLR, 2022.
- On low-rank trace regression under general sampling distribution. The Journal of Machine Learning Research, 23(1):14424–14472, 2022.
- On worst-case regret of linear Thompson sampling. arXiv preprint arXiv:2006.06790, 2020.
- High-dimensional sparse linear bandits. Advances in Neural Information Processing Systems, 33:10753–10763, 2020.
- The noisy power method: A meta algorithm with applications. Advances in neural information processing systems, 27, 2014.
- Matrix analysis. Cambridge university press, 2012.
- Optimal gradient-based algorithms for non-concave bandit optimization. Advances in Neural Information Processing Systems, 34:29101–29115, 2021.
- Following the leader and fast rates in linear prediction: Curved constraint sets and other regularities. Advances in Neural Information Processing Systems, 29, 2016.
- Provable inductive matrix completion. arXiv preprint arXiv:1306.0626, 2013.
- Improved Regret Bounds of Bilinear Bandits using Action Space Dimension Analysis. In Proceedings of the International Conference on Machine Learning (ICML), 2021.
- Popart: Efficient sparse regression and experimental design for optimal sparse linear bandits. In Advances in Neural Information Processing Systems (NeurIPS), pp. 2102–2114. Curran Associates, Inc., 2022.
- On verifiable sufficient conditions for sparse signal recovery via ℓ1 minimization. Mathematical Programming, 127:57–88, 2011.
- Bilinear Bandits with Low-rank Structure. In Proceedings of the International Conference on Machine Learning (ICML), volume 97, pp. 3163–3172, 2019.
- Efficient frameworks for generalized low-rank matrix bandit problems. Advances in Neural Information Processing Systems, 35:19971–19983, 2022.
- Bernoulli Rank-1 Bandits for Click Feedback. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp. 2001–2007, 2017.
- Matrix Completion from Noisy Entries. J. Mach. Learn. Res., 11:2057–2078, 2010. ISSN 1532-4435.
- Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Annals of Statistics, 39(5):2302–2329, 2011.
- Bandit Principal Component Analysis. In Beygelzimer, A. and Hsu, D. (eds.), Proceedings of the Thirty-Second Conference on Learning Theory, volume 99 of Proceedings of Machine Learning Research, pp. 1994–2024, Phoenix, USA, 2019. PMLR.
- Stochastic Low-Rank Bandits. arXiv:1712.04644, 2017.
- Bandit Phase Retrieval, 2021.
- Bandit Algorithms. Cambridge University Press, 2020.
- A simple unified framework for high dimensional bandit problems. In International Conference on Machine Learning, pp. 12619–12655. PMLR, 2022.
- Low-rank generalized linear bandit problems. In International Conference on Artificial Intelligence and Statistics, pp. 460–468. PMLR, 2021.
- A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nature communications, 8(1):573, 2017.
- Nearly optimal algorithms for level set estimation. arXiv preprint arXiv:2111.01768, 2021.
- Minsker, S. Sub-gaussian estimators of the mean of a random matrix with heavy-tailed entries. The Annals of Statistics, 46(6A):2871–2903, 2018.
- Inductive matrix completion for predicting gene–disease associations. Bioinformatics, 30(12):i60–i68, 2014.
- Efficient and robust algorithms for adversarial linear contextual bandits. In Conference on Learning Theory, pp. 3049–3068. PMLR, 2020.
- Estimation of high-dimensional low-rank matrices. 2011.
- Linearly Parameterized Bandits. Math. Oper. Res., 35(2):395–411, 2010.
- Best-arm identification in linear bandits. Advances in Neural Information Processing Systems (NeurIPS), 27:828–836, 2014.
- Matrix perturbation theory. Academic press, 1990.
- Solving Bernoulli rank-one bandits with unimodal Thompson sampling. In Algorithmic Learning Theory, pp. 862–889. PMLR, 2020.
- Generalized low rank models. Foundations and Trends® in Machine Learning, 9(1):1–118, 2016.
- Spectral Bandits for Smooth Graph Functions. In Proceedings of the International Conference on Machine Learning (ICML), pp. 46–54, 2014.
- Wainwright, M. J. High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2019. doi: 10.1017/9781108627771.
- Kyoungseok Jang
- Chicheng Zhang
- Kwang-Sung Jun