Adaptive Estimation of Random Vectors with Bandit Feedback: A mean-squared error viewpoint (2203.16810v3)

Published 31 Mar 2022 in cs.LG

Abstract: We consider the problem of sequentially learning to estimate, in the mean squared error (MSE) sense, a Gaussian $K$-vector of unknown covariance by observing only $m < K$ of its entries in each round. We first establish a concentration bound for MSE estimation. We then frame the estimation problem with bandit feedback, and propose a variant of the successive elimination algorithm. We also derive a minimax lower bound to understand the fundamental limit on the sample complexity of this problem.
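
To make the MSE criterion concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a zero-mean Gaussian vector whose covariance is known, whereas in the bandit setting of the abstract the covariance is unknown and would have to be learned from the observed entries. Under that assumption, the MMSE estimate of the unobserved entries given the observed ones is linear, and its error is the trace of the conditional covariance (a Schur complement). The function names `subset_mse` and `best_subset` are hypothetical and chosen only for illustration.

```python
import itertools

import numpy as np


def subset_mse(cov, observed):
    """MSE of the MMSE estimate of the unobserved entries given the observed ones.

    Assumes a zero-mean jointly Gaussian vector with known covariance `cov`
    (an illustrative assumption; the abstract treats the covariance as unknown).
    """
    K = cov.shape[0]
    S = list(observed)
    U = [i for i in range(K) if i not in S]
    cov_ss = cov[np.ix_(S, S)]
    cov_us = cov[np.ix_(U, S)]
    cov_uu = cov[np.ix_(U, U)]
    # Conditional covariance of X_U given X_S (Schur complement of cov_ss).
    cond = cov_uu - cov_us @ np.linalg.solve(cov_ss, cov_us.T)
    return float(np.trace(cond))


def best_subset(cov, m):
    """Brute-force search over all size-m subsets for the smallest MSE."""
    K = cov.shape[0]
    return min(itertools.combinations(range(K), m),
               key=lambda s: subset_mse(cov, s))


# Toy example: entries 0 and 1 are strongly correlated, entry 2 is independent.
cov = np.array([[1.0, 0.9, 0.0],
                [0.9, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
print(best_subset(cov, 2), subset_mse(cov, (0, 2)))
```

In this toy covariance, observing one of the two correlated entries together with the independent one leaves only the residual variance of the other correlated entry, which is why such a pair beats observing both correlated entries. The brute-force search is exponential in $K$ and is shown only to illustrate the quantity being minimized.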
