AuctionNet: Ad Auction Simulation Benchmark
- AuctionNet Simulation Benchmark is a comprehensive and scalable research platform for evaluating ad auction decision-making using realistic simulation and extensive datasets.
- It integrates a full ad auction environment featuring latent diffusion for feature synthesis, cross-attention for value prediction, and multiple baseline bidding algorithms.
- The platform employs rigorous evaluation protocols with metrics like mean episode reward, RMSE, and Pearson correlation to benchmark both RL and traditional approaches.
AuctionNet Simulation Benchmark is a comprehensive and scalable research platform for evaluating decision-making algorithms in large-scale ad auction environments. AuctionNet provides an open-source implementation of an ad auction simulator, a massive pre-generated dataset reflecting real-world distributions, and multiple baseline algorithms with reproducible evaluation protocols. It is directly anchored to empirical ad auctions but supports general multi-agent decision-making research. Its extensibility and fidelity make it a reference standard for simulation-based performance analysis and off-policy evaluation in both academic and industrial settings (Su et al., 14 Dec 2024, Yeom et al., 3 Dec 2025).
1. Architecture and Components
AuctionNet is structured as follows (Su et al., 14 Dec 2024):
- Ad Auction Environment:
- Ad-opportunity generation: Employs a latent diffusion model (LDM) and VAE-style encoding/decoding for feature synthesis. This models realistic user/context distributions while protecting sensitive attributes.
- Value prediction: Stacked cross/self-attention blocks predict opportunity value, conditioned on latent features, category, and temporal embedding.
- Bidding module: Supports 48 agent types, including PID controllers, Offline RL agents (IQL), Behavior Cloning, Decision Transformers, and Online LP solvers.
- Auction module: Implements the Generalized Second-Price (GSP) mechanism with multi-slot extension and plug-in support for alternative rules.
- Pre-generated Dataset:
- Contains 21 episodes ("days"), each with ≈500 K ad opportunities and 48 time steps, yielding over 500 million records.
- Each record encodes advertiser index, category, budget, bid, impression flag, cost, conversion, and more.
- Distributional properties verifiably match real ad logs in PCA overlap and long-tailed features, supporting robust statistical modeling.
- Baseline Algorithms:
- Includes PID controller algorithms, Online LP (budget-constrained knapsack), Behavior Cloning, IQL (offline RL with expectile regression), and Decision Transformer.
- Performance is normalized against the “Abid” heuristic (uniform multiplier).
- API and Integration:
- Python library with Gym-style interface: agent registration, environment stepping, agent observing.
- Flexible agent extension via subclassing `BaseAgent`.
- Datasets provided via an MIT-licensed GitHub repository and downloadable after competition.
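The register/step/observe pattern described above can be illustrated with a small, self-contained sketch. Only the name `BaseAgent` comes from the text; the method names (`bid`, `observe`), the `UniformMultiplierAgent` class, and the toy `run_episode` loop are assumptions made for illustration and are not the library's actual API.

```python
import random
from abc import ABC, abstractmethod


class BaseAgent(ABC):
    """Hypothetical stand-in for the library's BaseAgent; method names are assumed."""

    def __init__(self, budget: float):
        self.budget = budget
        self.remaining = budget

    @abstractmethod
    def bid(self, values):
        """Return one bid per predicted opportunity value."""

    def observe(self, costs):
        """Update internal state from the costs charged in the last step."""
        self.remaining -= sum(costs)


class UniformMultiplierAgent(BaseAgent):
    """Bids value * alpha and stops bidding once the budget is used up."""

    def __init__(self, budget: float, alpha: float = 1.0):
        super().__init__(budget)
        self.alpha = alpha

    def bid(self, values):
        if self.remaining <= 0:
            return [0.0 for _ in values]
        return [self.alpha * v for v in values]


def run_episode(agent, steps=48, opportunities_per_step=5, seed=0):
    """Toy loop: the agent wins whenever its bid beats a random market price."""
    rng = random.Random(seed)
    total_value = 0.0
    for _ in range(steps):
        values = [rng.random() for _ in range(opportunities_per_step)]
        bids = agent.bid(values)
        costs = []
        for value, bid in zip(values, bids):
            market_price = rng.random()
            if bid > market_price:
                total_value += value
                costs.append(market_price)  # second-price style payment
        agent.observe(costs)
    return total_value
```

In this sketch the budget check happens once per step, so spend can slightly overshoot within a step; a real environment would enforce the constraint itself.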
2. Mathematical Foundations
AuctionNet formalizes auction allocation, pricing, and optimization as follows (Su et al., 14 Dec 2024, Yeom et al., 3 Dec 2025):
- Allocation and Pricing (GSP mechanism):
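- For a set of bids $\{b_i\}$, the slot is allocated to the highest bidder $i^* = \arg\max_i b_i$, who is charged the second-highest bid $p = \max_{j \neq i^*} b_j$.
- Utility: $u_{i^*} = v_{i^*} - p$ for the winner, $u_i = 0$ for all other bidders.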
- Multi-Slot Extension:
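- Bidders indexed by $i$, slots indexed by $k$, with exposure rates $\alpha_k$.
- Objective for agent $i$: maximize the total value of the opportunities it wins, $\max \sum_t x_{i,t}\, v_{i,t}$, subject to the budget constraint $\sum_t x_{i,t}\, c_{i,t} \le B_i$, where $x_{i,t}$ indicates whether agent $i$ wins opportunity $t$, $v_{i,t}$ is its value, $c_{i,t}$ is its cost, and $B_i$ is the budget.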
- Baseline Optimization and RL Objective:
- Linear Programming/Knapsack: maximize allocated value under budget.
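- RL: maximize the expected discounted return $J(\pi) = \mathbb{E}_{\pi}\left[\sum_t \gamma^t r_t\right]$, with the policy updated by policy gradient.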
- Generative models (VAE/diffusion) estimated via corresponding objectives.
- Simulation Benchmarks:
- Mean episode reward is normalized to Abid=1.0.
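As a minimal sketch of the multi-slot GSP rule outlined above: bidders are ranked by bid, slot k goes to the k-th highest bidder, and each winner pays the next-highest bid. The function name `gsp_allocate`, the result layout, and the zero price when no lower bid exists are illustrative assumptions, not the simulator's implementation.

```python
def gsp_allocate(bids, exposure_rates):
    """Rank bidders by bid; slot k goes to the k-th highest bidder,
    who pays the (k+1)-th highest bid (0.0 if there is no lower bid)."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    results = []
    for k, rate in enumerate(exposure_rates):
        if k >= len(order):
            break  # more slots than bidders
        winner = order[k]
        price = bids[order[k + 1]] if k + 1 < len(order) else 0.0
        results.append(
            {"slot": k, "bidder": winner, "price": price, "exposure": rate}
        )
    return results


# Usage: four bidders competing for three slots.
for row in gsp_allocate([3.0, 5.0, 1.0, 4.0], [1.0, 0.8, 0.6]):
    print(row)
```

With a single exposure rate the function reduces to the single-slot second-price case.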
3. Evaluation Protocols and Metrics
AuctionNet provides rigorous protocols for offline evaluation, policy comparison, and model selection (Su et al., 14 Dec 2024, Yeom et al., 3 Dec 2025):
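- Performance Metrics: mean episode reward (normalized to the Abid baseline), MDA, RMSE, and Pearson correlation, as defined in the Metrics Table.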
- Experimental Setup:
- 80/20 train-validation split for RL agents.
- PID controller hyperparameters: the proportional, integral, and derivative gains $K_p$, $K_i$, $K_d$.
- RL/baseline models trained for 100K gradient steps, batch size 256.
- Dataset is used in full for bid landscape modeling in OPE (no held-out split for reward modeling).
- Key Results:
- Online LP method achieves normalized episode reward ≈1.3, followed by IQL ≈1.15, BC ≈1.1, DT ≈1.05.
- RL and transformer models have headroom for further tuning.
- In OPE experiments (Yeom et al., 3 Dec 2025), DPM-based SNIPS estimator yields MDA = 100%, RMSE = 4.87pp, with substantially lower error than parametric baselines.
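To make the role of the PID gains in the setup above concrete, here is a rough sketch of a PID budget pacer that nudges a bid multiplier so that cumulative spend tracks a target. The class name `PIDPacer`, the additive update, the clamping at `alpha_min`, and any gain values passed in are assumptions for illustration; the benchmark's actual controller and settings are not reproduced here.

```python
class PIDPacer:
    """Adjusts a bid multiplier so cumulative spend tracks a target trajectory."""

    def __init__(self, kp, ki, kd, alpha0=1.0, alpha_min=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.alpha = alpha0
        self.alpha_min = alpha_min
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, target_spend, actual_spend):
        # Positive error means under-spending, so the multiplier should rise.
        error = target_spend - actual_spend
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        self.alpha += (
            self.kp * error + self.ki * self.integral + self.kd * derivative
        )
        # Never let the multiplier fall below the configured floor.
        self.alpha = max(self.alpha_min, self.alpha)
        return self.alpha
```

A bidder using this pacer would call `update` once per time step and bid the predicted value times the returned multiplier.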
Metrics Table
| Metric | Definition | Application Context |
|---|---|---|
| Mean Reward | Mean episode reward, normalized to Abid baseline (= 1.0) | Bidding-agent benchmarking |
| MDA | Sign match on predicted vs. observed lift | OPE policy selection |
| RMSE | Root-mean-square error on lift | OPE, policy evaluation |
| Pearson | Linear corr of lifts | OPE, estimator validation |
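The three OPE metrics in the table can be computed from paired lists of predicted and observed lifts. The sketch below uses only the standard library; the function names are illustrative, and MDA is read here simply as the share of pairs whose signs agree, which is an interpretation of the table rather than a quoted definition.

```python
import math


def _sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def mda(pred, true):
    """Fraction of pairs whose predicted and observed lifts share a sign."""
    return sum(_sign(p) == _sign(t) for p, t in zip(pred, true)) / len(pred)


def rmse(pred, true):
    """Root-mean-square error between predicted and observed lifts."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))


def pearson(pred, true):
    """Pearson correlation; assumes both inputs have nonzero variance."""
    n = len(pred)
    mean_p, mean_t = sum(pred) / n, sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    std_t = math.sqrt(sum((t - mean_t) ** 2 for t in true))
    return cov / (std_p * std_t)
```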
4. Off-Policy Evaluation on AuctionNet
AuctionNet has enabled fundamental advances in reliable OPE for deterministic ad auctions (Yeom et al., 3 Dec 2025):
- Challenge: Winner-take-all setting yields zero propensity for non-winning actions, causing standard IPS estimators to fail.
- Bid Landscape Model: A Discrete Price Model (DPM) estimates unobserved market price by discretizing scores into quantile bins, computing densities and survival functions, and deriving instantaneous win probabilities as approximate propensity scores.
- SNIPS Estimator: Self-normalized inverse propensity scoring with APS, stabilized by weight-capping, applied on logged deterministic auction data.
- Empirical Findings: DPM-OPE matches online A/B test results with 92.9% MDA in CTR prediction, outperforming parametric OPE and maintaining scale-invariant error.
A plausible implication is that realistic simulators with empirical bid landscapes are necessary for credible OPE in deterministic mechanism design.
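The self-normalized estimator with weight capping mentioned above can be sketched as follows. The propensities are taken as given inputs (for instance, approximate propensity scores from a bid landscape model); the function name `snips_estimate`, the default cap, and the choice to skip records with zero logging propensity are assumptions, not details from the cited work.

```python
def snips_estimate(rewards, target_probs, logging_probs, cap=10.0):
    """Self-normalized IPS: sum(w * r) / sum(w), with w = min(pt / pl, cap)."""
    numerator = 0.0
    denominator = 0.0
    for r, pt, pl in zip(rewards, target_probs, logging_probs):
        if pl <= 0.0:
            continue  # no logging support for this record; skip it
        w = min(pt / pl, cap)  # capping limits the influence of rare events
        numerator += w * r
        denominator += w
    return numerator / denominator if denominator > 0 else 0.0


# Usage: identical policies recover the plain average reward.
print(snips_estimate([1.0, 0.0], [0.5, 0.5], [0.5, 0.5]))
```

Capping trades a small bias for lower variance, and the self-normalization keeps the estimate within the range of observed rewards.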
5. Extensibility and Integration
AuctionNet is designed for extensibility at several abstraction layers (Su et al., 14 Dec 2024, Kushnir et al., 2022):
- Agent Modeling: Custom agents are defined by subclassing `BaseAgent` and implementing bidding/observation logic.
- Auction Mechanisms: The core Python API allows for plug-in mechanism customization, multi-slot allocation, and exposure rate modeling.
- Dataset Usage: Comprehensive schema allows statistical analysis, algorithm benchmarking, and supervised learning.
- Benchmark Expansion: Scenarios from multi-dimensional auction simulation (auctionsim) can be systematically integrated by adding:
- A parameter configuration for each scenario.
- Metrics per scenario: optimal revenue, exclusive revenue, runtime, ICC/Border violations, exclusion region masks.
- Visualization of allocation and exclusion regions.
- Reproducibility: Open-source codebase, full dataset access, documented API facilitate extension to new algorithms, auction formats, and statistical models.
6. Related Benchmarks and Mathematical Auctions
AuctionNet subsumes and integrates simulation benchmarks for optimal multi-dimensional auction mechanisms (Kushnir et al., 2022):
- auctionsim Library: Provides simulation experiments on optimal vs. exclusive-buyer mechanisms, supporting Uniform, Beta, TruncatedNormal, and Mixture distributions over types and grade values.
- Mathematical Formulation: LP-based mechanism encoding incentive compatibility (ICC), Border constraints, and direct-revelation IR; supports revenue gap analysis, exclusion region characterization.
- Benchmark Interface: Reports empirical scaling, instance runtimes of 2–15 s, and numerical accuracy confirmed by LP duality gap and ICC violation minimization.
- Conjecture Validation: Supports systematic policy and mechanism comparison, providing empirical evidence for or against literature conjectures.
AuctionNet’s integration of these libraries enables a unified evaluation framework for both industrial bidding algorithms and theoretical mechanism simulations.
7. Research Applications and Directions
AuctionNet is applicable to the following areas:
- Auction-based machine learning, RL-based auto-bidding, and bid landscape modeling.
- Off-policy evaluation, model selection, and safe deployment in online advertising systems.
- General large-scale game decision-making, multi-agent optimization, and POSG analysis.
- Mechanism design research, including optimal allocation, revenue maximization, and incentive analysis.
- Statistical benchmarking and generative modeling for synthetic data synthesis.
- Comparative study of allocation mechanisms, including second-price, GSP, and custom auction rules.
The design and empirical protocols of AuctionNet position it as a central benchmark for empirical, theoretical, and methodological advances in auction environments and algorithmic decision-making.