Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 85 tok/s
Gemini 2.5 Pro 55 tok/s Pro
GPT-5 Medium 35 tok/s Pro
GPT-5 High 35 tok/s Pro
GPT-4o 123 tok/s Pro
Kimi K2 203 tok/s Pro
GPT OSS 120B 457 tok/s Pro
Claude Sonnet 4.5 35 tok/s Pro
2000 character limit reached

EBaReT: Expert-guided Bag Reward Transformer for Auto Bidding (2507.16186v1)

Published 22 Jul 2025 in cs.LG and cs.IR

Abstract: Reinforcement learning has been widely applied in automated bidding. Traditional approaches model bidding as a Markov Decision Process (MDP). Recently, some studies have explored using generative reinforcement learning methods to address long-term dependency issues in bidding environments. Although effective, these methods typically rely on supervised learning approaches, which are vulnerable to low data quality due to the amount of sub-optimal bids and low probability rewards resulting from the low click and conversion rates. Unfortunately, few studies have addressed these challenges. In this paper, we formalize the automated bidding as a sequence decision-making problem and propose a novel Expert-guided Bag Reward Transformer (EBaReT) to address concerns related to data quality and uncertainty rewards. Specifically, to tackle data quality issues, we generate a set of expert trajectories to serve as supplementary data in the training process and employ a Positive-Unlabeled (PU) learning-based discriminator to identify expert transitions. To ensure the decision also meets the expert level, we further design a novel expert-guided inference strategy. Moreover, to mitigate the uncertainty of rewards, we consider the transitions within a certain period as a "bag" and carefully design a reward function that leads to a smoother acquisition of rewards. Extensive experiments demonstrate that our model achieves superior performance compared to state-of-the-art bidding methods.

Summary

We haven't generated a summary for this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube