Strategizing against No-regret Learners (1909.13861v1)

Published 30 Sep 2019 in cs.GT and cs.LG

Abstract: How should a player who repeatedly plays a game against a no-regret learner strategize to maximize his utility? We study this question and show that under some mild assumptions, the player can always guarantee himself a utility of at least what he would get in a Stackelberg equilibrium of the game. When the no-regret learner has only two actions, we show that the player cannot get any higher utility than the Stackelberg equilibrium utility. But when the no-regret learner has more than two actions and plays a mean-based no-regret strategy, we show that the player can get strictly higher than the Stackelberg equilibrium utility. We provide a characterization of the optimal game-play for the player against a mean-based no-regret learner as a solution to a control problem. When the no-regret learner's strategy also guarantees him a no-swap regret, we show that the player cannot get anything higher than a Stackelberg equilibrium utility.
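
For reference, the Stackelberg benchmark named in the abstract can be written as below. This is a sketch under the standard textbook definition, not notation taken from this page: the symbols $u_O$ and $u_L$ (the player's and the learner's utilities), $\Delta(A)$ (the player's mixed strategies), $B$ (the learner's actions), $\mathrm{BR}$, and $V$ are all assumptions introduced here, and ties in the learner's best response are assumed to be broken in the player's favor.

```latex
\mathrm{BR}(\alpha) = \arg\max_{b \in B} u_L(\alpha, b),
\qquad
V = \max_{\alpha \in \Delta(A)} \; \max_{b \in \mathrm{BR}(\alpha)} u_O(\alpha, b)
```

Under this reading, the first result says the player can secure roughly $V$ per round on average against a no-regret learner, and the later results say when that average can or cannot be exceeded.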

Authors (3)

Yuan Deng (21 papers)
Jon Schneider (50 papers)
Balasubramanian Sivan (1 paper)

Citations (51)
