A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits (1601.03855v1)

Published 15 Jan 2016 in cs.LG

Abstract: We study the K-armed dueling bandit problem, a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms. We propose a new algorithm called Relative Exponential-weight algorithm for Exploration and Exploitation (REX3) to handle the adversarial utility-based formulation of this problem. This algorithm is a non-trivial extension of the Exponential-weight algorithm for Exploration and Exploitation (EXP3). We prove a finite-time expected regret upper bound of order O(sqrt(K ln(K) T)) for this algorithm and a general lower bound of order Omega(sqrt(KT)). Finally, we provide experimental results using real data from information retrieval applications.
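
For orientation, the following is a minimal sketch of the classical EXP3 baseline that the abstract says REX3 extends. It is not the paper's REX3 algorithm: it plays a single arm per round and observes an absolute reward in [0, 1], whereas REX3 selects pairs of arms and sees only relative feedback. The function name `exp3`, the `reward_fn` callback, and the default `gamma` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def exp3(num_arms, horizon, reward_fn, gamma=0.1, seed=0):
    """Classical EXP3 sketch (single-arm play, absolute rewards in [0, 1]).

    reward_fn(t, arm) is a hypothetical callback standing in for the
    adversary; gamma is the exploration rate in (0, 1].
    """
    rng = random.Random(seed)
    weights = [1.0] * num_arms
    total_reward = 0.0
    for t in range(horizon):
        total_weight = sum(weights)
        # Mix the exponential-weight distribution with uniform exploration.
        probs = [
            (1.0 - gamma) * w / total_weight + gamma / num_arms
            for w in weights
        ]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        total_reward += reward
        # Importance-weighted estimate: unbiased for the chosen arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        # Rescale so the largest weight is 1; probabilities are unchanged
        # and the weights cannot overflow on long horizons.
        largest = max(weights)
        weights = [w / largest for w in weights]
    return weights, total_reward
```

As a usage example, `exp3(3, 2000, lambda t, arm: 1.0 if arm == 2 else 0.0)` runs the sketch against a fixed environment in which only arm 2 pays off; the returned weights should then concentrate on that arm.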

Authors (3)
  1. Pratik Gajane (19 papers)
  2. Tanguy Urvoy (14 papers)
  3. Fabrice Clérot (12 papers)
Citations (43)
