Lower bounds on learning rates under partial safety and no-WER
Derive nontrivial lower bounds on the achievable learning rate (for example, on the growth of regret as a function of the time horizon T) for any online learning algorithm that simultaneously satisfies two properties: partial safety, meaning that no adaptive environment can extract nearly the entire surplus from the learner in the long run, and no-weak-external-regret (no-WER), meaning vanishing regret against stationary environments.
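One way to picture the kind of statement being sought is a minimax lower bound. The display below is only an illustrative sketch under assumed notation: the algorithm class \mathcal{A} (algorithms that are both partially safe and no-WER), the class \mathcal{E}_{\mathrm{stat}} of stationary environments, the regret quantity \mathrm{WER}_T, the constant c, and the rate function f are all placeholders introduced here, not definitions taken from the source.

```latex
% Illustrative notation only (assumed, not from the source):
%   \mathcal{A}                  algorithms satisfying partial safety and no-WER
%   \mathcal{E}_{\mathrm{stat}}  stationary environments
%   \mathrm{WER}_T(A, E)         weak external regret of A against E after T rounds
\[
  \inf_{A \in \mathcal{A}} \; \sup_{E \in \mathcal{E}_{\mathrm{stat}}} \;
  \mathbb{E}\bigl[\mathrm{WER}_T(A, E)\bigr] \;\ge\; c \, f(T)
  \qquad \text{for all sufficiently large } T,
\]
% with a constant c > 0 and a rate f(T) = o(T), since no-WER requires
% average regret to vanish.
```

In this reading, the open problem asks which rate f, if any beyond a trivial one, is forced on every algorithm once partial safety is imposed alongside no-WER.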
References
Additionally, deriving lower bounds on the learning rate for algorithms that satisfy both partial safety and no-WER remains an open challenge.
— Robust Online Learning with Private Information
(2505.05341 - Okumura, 8 May 2025) in Section: Concluding remarks