Dominate or Delete: Decentralized Competing Bandits in Serial Dictatorship (2006.15166v2)

Published 26 Jun 2020 in cs.LG, cs.DS, cs.GT, and stat.ML

Abstract: Online learning in a two-sided matching market, with demand side agents continuously competing to be matched with supply side (arms), abstracts the complex interactions under partial information on matching platforms (e.g. UpWork, TaskRabbit). We study the decentralized serial dictatorship setting, a two-sided matching market where the demand side agents have unknown and heterogeneous valuation over the supply side (arms), while the arms have known uniform preference over the demand side (agents). We design the first decentralized algorithm, UCB with Decentralized Dominant-arm Deletion (UCB-D3), for the agents, that does not require any knowledge of reward gaps or time horizon. UCB-D3 works in phases, where in each phase, agents delete dominated arms (the arms preferred by higher ranked agents) and play only from the non-dominated arms according to the UCB. At the end of the phase, agents broadcast in a decentralized fashion their estimated preferred arms through pure exploitation. We prove both a new regret lower bound for the decentralized serial dictatorship model, and that UCB-D3 is order optimal.

Summary

  • The paper introduces the UCB-D3 algorithm that enables decentralized agents to discard dominated arms and focus on non-dominated choices using a phased UCB strategy.
  • It models a two-sided matching market where demand-side agents with partial information compete for tasks, mirroring platforms like UpWork and TaskRabbit.
  • The authors establish a new regret lower bound and demonstrate that UCB-D3 achieves order-optimal performance in decentralized, competitive settings.

The paper "Dominate or Delete: Decentralized Competing Bandits in Serial Dictatorship" studies online learning in a two-sided matching market where agents compete for resources. The setting resembles platforms like UpWork or TaskRabbit, where demand-side agents (such as freelancers or service providers) continuously compete to be matched with supply-side entities (tasks or clients), which the paper calls arms.

Key Concepts:

  1. Two-Sided Matching Market:
    • These markets involve two groups: the demand side (agents), whose valuations of the arms are unknown and differ from agent to agent, and the supply side (arms), which share a single known preference ranking over the agents.
  2. Decentralized Serial Dictatorship:
    • Agents act without a central coordinator, each deciding which arm to compete for using only partial information about the market. Because all arms rank the agents identically, a higher-ranked agent always wins a contested arm.
  3. UCB-D3 Algorithm:
    • The authors introduce the UCB with Decentralized Dominant-arm Deletion (UCB-D3), a novel decentralized algorithm for demand side agents.
    • Phased Approach: The algorithm operates in phases, where agents identify and discard dominated arms (those preferred by higher-ranked agents) and focus on non-dominated arms using the Upper Confidence Bound (UCB) strategy.
    • Decentralized Communication: At the end of each phase, agents broadcast their preferred arms through a pure exploitation phase, enabling decentralized learning.
  4. Regret Analysis:
    • The paper establishes a new regret lower bound specific to the decentralized serial dictatorship model.
    • UCB-D3 is shown to be order optimal: its regret matches the order of this lower bound, and it achieves this without knowing the reward gaps or the time horizon.
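
To make the phased delete-then-play idea above concrete, here is a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration: rewards are Bernoulli, phase lengths double, the index is the standard UCB1 formula, and agents read each other's announcements from a shared list, whereas the paper has agents broadcast through pure exploitation. All function names (`stable_match`, `resolve_collisions`, `ucb_index`, `simulate`) are hypothetical.

```python
import math
import random


def stable_match(mu):
    """Matching reached if every agent knew its means: agents pick in rank order."""
    taken, match = set(), []
    for prefs in mu:  # agent 0 is ranked highest by every arm
        best = max((k for k in range(len(prefs)) if k not in taken),
                   key=lambda k: prefs[k])
        taken.add(best)
        match.append(best)
    return match


def resolve_collisions(choices):
    """Give each requested arm to its highest-ranked requester; others are blocked."""
    winners, taken = {}, set()
    for agent, arm in enumerate(choices):  # lower index = higher rank
        if arm not in taken:
            taken.add(arm)
            winners[agent] = arm
    return winners


def ucb_index(mean, pulls, t):
    """Standard UCB1 index; arms never pulled are tried first."""
    if pulls == 0:
        return math.inf
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def simulate(mu, num_phases=8, seed=0):
    """Simplified phased run; assumes at least as many arms as agents."""
    rng = random.Random(seed)
    n_agents, n_arms = len(mu), len(mu[0])
    pulls = [[0] * n_arms for _ in range(n_agents)]
    means = [[0.0] * n_arms for _ in range(n_agents)]
    announced = [None] * n_agents  # arm each agent announced after the last phase
    t = 1
    for phase in range(num_phases):
        # Deletion step: agent j drops arms announced by higher-ranked agents.
        active = []
        for j in range(n_agents):
            dominated = {announced[i] for i in range(j)}
            active.append([k for k in range(n_arms) if k not in dominated])
        # Play step: UCB restricted to non-dominated arms (doubling length is an assumption).
        for _ in range(n_arms * 2 ** phase):
            choices = [
                max(active[j], key=lambda k: ucb_index(means[j][k], pulls[j][k], t))
                for j in range(n_agents)
            ]
            for j, k in resolve_collisions(choices).items():
                reward = 1.0 if rng.random() < mu[j][k] else 0.0
                pulls[j][k] += 1
                means[j][k] += (reward - means[j][k]) / pulls[j][k]
            t += 1
        # Announcement step (shortcut): each agent reports its most-played active arm.
        announced = [max(active[j], key=lambda k: pulls[j][k]) for j in range(n_agents)]
    return announced


if __name__ == "__main__":
    # Hypothetical instance: rows are agents (rank order), columns are arm means.
    mu = [[0.9, 0.5, 0.1], [0.9, 0.2, 0.5], [0.8, 0.7, 0.3]]
    print("stable match:", stable_match(mu))
    print("announced after learning:", simulate(mu))
```

On this instance `stable_match` gives agent 0 arm 0, agent 1 arm 2, and agent 2 arm 1 (arms are 0-indexed). With well-separated means, the announcements from `simulate` would be expected to approach that matching as phases grow, though the sketch carries no guarantee of its own.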

Significance:

This research shows that agents making autonomous decisions from limited information can learn efficiently in a competitive market without a central coordinator, and without knowing the reward gaps or the time horizon in advance. Because the model abstracts matching platforms such as UpWork and TaskRabbit, the results are relevant to gig economy marketplaces.