Improved Regret Bounds for Online Kernel Selection under Bandit Feedback (2303.05018v2)

Published 9 Mar 2023 in cs.LG

Abstract: In this paper, we improve the regret bound for online kernel selection under bandit feedback. The previous algorithm enjoys an $O((\Vert f\Vert^2_{\mathcal{H}_i}+1)K^{\frac{1}{3}}T^{\frac{2}{3}})$ expected bound for Lipschitz loss functions. We prove two types of regret bounds improving the previous bound. For smooth loss functions, we propose an algorithm with an $O(U^{\frac{2}{3}}K^{-\frac{1}{3}}(\sum^K_{i=1}L_T(f^\ast_i))^{\frac{2}{3}})$ expected bound, where $L_T(f^\ast_i)$ is the cumulative loss of the optimal hypothesis in $\mathbb{H}_{i}=\{f\in\mathcal{H}_i:\Vert f\Vert_{\mathcal{H}_i}\leq U\}$. The data-dependent bound keeps the previous worst-case bound and is smaller if most of the candidate kernels match well with the data. For Lipschitz loss functions, we propose an algorithm with an $O(U\sqrt{KT}\ln^{\frac{2}{3}}{T})$ expected bound that asymptotically improves the previous bound. We apply the two algorithms to online kernel selection with a time constraint and prove new regret bounds matching or improving the previous $O(\sqrt{T\ln{K}}+\Vert f\Vert^2_{\mathcal{H}_i}\max\{\sqrt{T},\frac{T}{\sqrt{\mathcal{R}}}\})$ expected bound, where $\mathcal{R}$ is the time budget. Finally, we empirically verify our algorithms on online regression and classification tasks.
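
For context, online kernel selection under bandit feedback is typically framed as maintaining $K$ online kernel learners and, on each round, sampling a single kernel to evaluate, then updating that learner and the sampling distribution from an importance-weighted loss estimate. The sketch below is a minimal, generic illustration of this setting (EXP3-style sampling over Gaussian-kernel regressors with squared loss); it is not the algorithms proposed in the paper, and the kernel widths, step sizes, and toy data stream are assumptions chosen only for illustration.

```python
# Minimal, generic sketch of online kernel selection under bandit feedback.
# NOT the paper's algorithm: kernels, step sizes, and loss are illustrative.
import numpy as np

rng = np.random.default_rng(0)

K = 5                                  # number of candidate kernels
widths = np.logspace(-1, 1, K)         # hypothetical Gaussian kernel widths
T = 1000                               # number of rounds
eta_f = 0.1                            # step size for the functional update
eta_p = np.sqrt(np.log(K) / (K * T))   # learning rate for the bandit weights

# Each learner i keeps its kernel expansion: support points and coefficients.
supports = [[] for _ in range(K)]
coefs = [[] for _ in range(K)]
weights = np.ones(K)

def predict(i, x):
    """Prediction of learner i: f_i(x) = sum_s alpha_s * k_i(x_s, x)."""
    if not supports[i]:
        return 0.0
    X = np.array(supports[i])
    k = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * widths[i] ** 2))
    return float(np.dot(coefs[i], k))

for t in range(T):
    # Toy data stream: y = sin(x0) + noise.
    x = rng.uniform(-3, 3, size=2)
    y = np.sin(x[0]) + 0.1 * rng.standard_normal()

    # Sample one kernel; under bandit feedback only its loss is observed.
    probs = weights / weights.sum()
    i = rng.choice(K, p=probs)

    y_hat = predict(i, x)
    loss = (y_hat - y) ** 2               # squared (smooth) loss

    # Online functional gradient step for the sampled learner only.
    grad = 2.0 * (y_hat - y)
    supports[i].append(x)
    coefs[i].append(-eta_f * grad)

    # Importance-weighted loss estimate updates the sampling distribution.
    loss_est = min(loss, 1.0) / probs[i]
    weights[i] *= np.exp(-eta_p * loss_est)
```

Because only the sampled learner is updated each round, the per-round cost stays close to that of a single kernel learner, which is what makes the bandit-feedback setting attractive under a time budget.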

Citations (1)
