Improved Regret Bounds for Online Kernel Selection under Bandit Feedback (2303.05018v2)
Abstract: In this paper, we improve the regret bounds for online kernel selection under bandit feedback. The previous algorithm enjoys an $O((\Vert f\Vert^2_{\mathcal{H}_i}+1)K^{\frac{1}{3}}T^{\frac{2}{3}})$ expected bound for Lipschitz loss functions. We prove two types of regret bounds improving the previous bound. For smooth loss functions, we propose an algorithm with an $O(U^{\frac{2}{3}}K^{-\frac{1}{3}}(\sum^{K}_{i=1}L_T(f^\ast_i))^{\frac{2}{3}})$ expected bound, where $L_T(f^\ast_i)$ is the cumulative loss of the optimal hypothesis in $\mathbb{H}_{i}=\{f\in\mathcal{H}_i:\Vert f\Vert_{\mathcal{H}_i}\leq U\}$. This data-dependent bound retains the previous worst-case bound and is smaller if most of the candidate kernels match the data well. For Lipschitz loss functions, we propose an algorithm with an $O(U\sqrt{KT}\ln^{\frac{2}{3}}{T})$ expected bound, asymptotically improving the previous bound. We apply the two algorithms to online kernel selection with a time constraint and prove new regret bounds matching or improving the previous $O(\sqrt{T\ln{K}}+\Vert f\Vert^2_{\mathcal{H}_i}\max\{\sqrt{T},\frac{T}{\sqrt{\mathcal{R}}}\})$ expected bound, where $\mathcal{R}$ is the time budget. Finally, we empirically verify our algorithms on online regression and classification tasks.
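To see why the data-dependent bound retains the worst-case rate (a sanity check not spelled out in the abstract, assuming losses bounded by a constant so that $L_T(f^\ast_i)=O(T)$ for every $i$):

```latex
% Plugging the worst case L_T(f^\ast_i) = O(T) into the data-dependent bound:
\sum_{i=1}^{K} L_T(f^\ast_i) \le K T
\quad\Longrightarrow\quad
U^{\frac{2}{3}} K^{-\frac{1}{3}} \Big(\sum_{i=1}^{K} L_T(f^\ast_i)\Big)^{\frac{2}{3}}
\le U^{\frac{2}{3}} K^{-\frac{1}{3}} (K T)^{\frac{2}{3}}
= U^{\frac{2}{3}} K^{\frac{1}{3}} T^{\frac{2}{3}}.
```

When the candidate kernels fit the data well, each $L_T(f^\ast_i)$ can be much smaller than $T$, and the bound improves accordingly.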