
GBB Semi-Feedback Fixed-Price Mechanisms

Updated 30 January 2026
  • The paper introduces a two-phase algorithm combining profit accumulation and Exp3 bandit learning to achieve near-optimal O(T^(2/3)) regret under strict GBB constraints.
  • The mechanism leverages partial feedback by using unbiased surrogate estimators for gains-from-trade while maintaining nonnegative cumulative profit.
  • The research delineates a regret landscape that clearly separates GBB protocols from stricter SBB/WBB models across independent and adversarial value settings.

Global-Budget-Balanced (GBB) semi-feedback fixed-price mechanisms are a class of online learning algorithms for repeated bilateral trade that minimize regret under strong budget constraints and limited information. In these protocols, a mechanism posts two prices, one for the buyer and one for the seller, receives only partial feedback about trade outcomes and the seller’s value, and is required to maintain a nonnegative ex post profit over $T$ rounds. Recent research has rigorously characterized the achievable regret rates in this setting for both independent and adversarial value models, culminating in tight $\widetilde{O}(T^{2/3})$ upper and $\Omega(T^{2/3})$ lower bounds for adversarial values and $\widetilde{\Theta}(T^{2/3})$ for independent values (Jin, 23 Jan 2026, Chen et al., 6 Apr 2025). This establishes a sharp separation from settings with less restrictive budget constraints or more informative feedback.

1. Formal Model and Problem Setting

The $T$-round bilateral trade protocol considered in GBB semi-feedback fixed-price mechanisms involves a seller and a buyer with private valuations $(v^S_t, v^B_t) \in [0,1]^2$ at each round $t$. The mechanism posts a pair of prices $(p_t, q_t)$, where $p_t$ is offered to the seller and $q_t$ to the buyer, both in $[0,1]$. Trade succeeds ($\mathbb{I}_t = 1$) exactly when $v^S_t \le p_t$ and $v^B_t \ge q_t$. The realized gains-from-trade (GFT) per round are $G_t(p_t, q_t) = (v^B_t - v^S_t)\,\mathbb{I}_t$, and the profit (surplus) is $\Pi_t(p_t, q_t) = (q_t - p_t)\,\mathbb{I}_t$.

The mechanism satisfies the global-budget-balanced (GBB) constraint:

$$\sum_{t=1}^T \Pi_t(p_t, q_t) \ge 0$$

This requires nonnegative cumulative profit over all rounds. It relaxes strong (per-round) budget balance (SBB), under which sublinear regret is unattainable in this feedback regime. Semi-feedback means that in each round the mechanism observes only $(v^S_t, \mathbb{I}_t)$: it learns the seller’s value and whether a trade occurred, but not the buyer’s value.
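As a minimal sketch of the protocol described above, the following Python snippet plays a single round and checks the GBB constraint over a sequence of rounds. The names `RoundOutcome`, `play_round`, and `is_gbb` are hypothetical and introduced only for illustration; they do not come from the cited papers.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class RoundOutcome:
    traded: bool                   # the indicator I_t
    gft: float                     # realized gains-from-trade G_t
    profit: float                  # mechanism profit Pi_t
    feedback: Tuple[float, bool]   # semi-feedback (v_s, I_t); v_b stays hidden


def play_round(v_s: float, v_b: float, p: float, q: float) -> RoundOutcome:
    """Post price p to the seller and q to the buyer for one round."""
    traded = (v_s <= p) and (v_b >= q)
    gft = (v_b - v_s) if traded else 0.0
    profit = (q - p) if traded else 0.0
    return RoundOutcome(traded, gft, profit, (v_s, traded))


def is_gbb(outcomes: Iterable[RoundOutcome]) -> bool:
    """Global budget balance: cumulative profit over all rounds is nonnegative."""
    return sum(o.profit for o in outcomes) >= 0.0
```

For instance, a round with `p < q` that trades earns profit `q - p`, which can later fund a round with `p > q` without violating GBB.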

Regret is measured against the best single (SBB) price $p^*$:

$$\mathrm{Regret}(T) = \max_{p^* \in [0,1]} \sum_{t=1}^T \left[ \mathbf{1}\{v^S_t \le p^* \le v^B_t\} - \mathbf{1}\{v^S_t \le p_t \le v^B_t\} \right]$$

where $\mathbf{1}\{v^S_t \le p^* \le v^B_t\}$ is the indicator that trade would succeed at the benchmark price $p^*$.
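The displayed regret can be evaluated in hindsight once all values are known. The sketch below follows the formula exactly as written, with a single posted price per round. The function name `hindsight_regret` is a hypothetical choice. It relies on the observation that the number of intervals $[v^S_t, v^B_t]$ containing a price changes only at the observed values, so it suffices to test those values as candidate benchmarks.

```python
from typing import Sequence, Tuple


def hindsight_regret(values: Sequence[Tuple[float, float]],
                     posted: Sequence[float]) -> int:
    """Regret of posted prices against the best fixed price, per the displayed formula.

    values: list of (v_s, v_b) pairs, one per round
    posted: list of prices p_t, one per round
    """
    # The count of rounds with v_s <= p <= v_b is maximized at some observed value.
    candidates = sorted({v for pair in values for v in pair})
    best = max(
        (sum(1 for v_s, v_b in values if v_s <= p <= v_b) for p in candidates),
        default=0,
    )
    achieved = sum(1 for (v_s, v_b), p in zip(values, posted) if v_s <= p <= v_b)
    return best - achieved
```

With `values = [(0.1, 0.5), (0.3, 0.9), (0.4, 0.6), (0.7, 0.8)]`, the fixed price 0.4 trades in three rounds, while posting 0.2 every round trades only once.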

2. Main Algorithmic Paradigm and Upper Bound Construction

The state-of-the-art GBB semi-feedback fixed-price mechanism is a two-phase algorithm (“ALG”) consisting of profit accumulation followed by bandit-style learning on a nearly-diagonal discretization (Jin, 23 Jan 2026).

Phase I: Profit Accumulation

  • Leverages a black-box subroutine (BCCF24) restricted to posting prices in the upper-left half-space (i.e., always $p \le q$), ensuring nonnegative per-round profit.
  • Stops once cumulative profit exceeds $\beta = O(T^{2/3})$ or after $T$ rounds.
  • Achieves $O(T^{2/3}\log^{5/3}T)$ regret with high probability and maintains GBB.

Phase II: Exp3-Type Bandit Learning

  • Discretizes the SBB diagonal into $K \sim T^{1/3}/\mathrm{polylog}(T)$ grid points: $\{(k/K, (k-1)/K)\}_{k=1}^K$.
  • In each subsequent round, maintains exponential weights (Exp3) over these grid points using importance-weighted unbiased estimators of a surrogate reward, based only on semi-feedback.
  • Mixes “exploitation” (sampling a grid point according to the weights) with “exploration” (sampling a pair of prices at random).
  • Ensures GBB by allowing $p > q$ only when a sufficient surplus buffer has been accumulated.
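The Phase II ingredients listed above can be illustrated with a generic Exp3 learner over the grid, gated by a profit buffer. This is only a sketch under stated assumptions and not the algorithm of (Jin, 23 Jan 2026): the reward fed to the learner is the observed trade indicator, used as a stand-in for the paper's surrogate estimator, and the loop simply stops when the buffer can no longer cover a loss. The names `make_grid`, `Exp3`, and `run_phase_two` are hypothetical.

```python
import math
import random


def make_grid(K):
    """Grid pair k is (p, q) = (k/K, (k-1)/K): seller price one step above buyer price."""
    return [(k / K, (k - 1) / K) for k in range(1, K + 1)]


class Exp3:
    """Textbook Exp3 with uniform exploration rate gamma (rewards assumed in [0, 1])."""

    def __init__(self, n_arms, gamma, rng=None):
        self.n = n_arms
        self.gamma = gamma
        self.log_w = [0.0] * n_arms          # log-weights, for numerical stability
        self.rng = rng or random.Random()

    def probabilities(self):
        m = max(self.log_w)
        w = [math.exp(x - m) for x in self.log_w]
        total = sum(w)
        return [(1 - self.gamma) * wi / total + self.gamma / self.n for wi in w]

    def draw(self):
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        # Importance-weighted estimate: reward / prob for the played arm, 0 elsewhere.
        self.log_w[arm] += self.gamma * (reward / prob) / self.n


def run_phase_two(values, K, gamma, buffer, seed=0):
    """Play grid pairs with p > q while the accumulated profit can absorb the loss."""
    grid = make_grid(K)
    learner = Exp3(K, gamma, random.Random(seed))
    profit = buffer                           # surplus assumed to come from Phase I
    for v_s, v_b in values:
        arm, prob = learner.draw()
        p, q = grid[arm]
        if profit + (q - p) < 0:              # buffer exhausted: stop to preserve GBB
            break
        traded = (v_s <= p) and (v_b >= q)
        if traded:
            profit += q - p                   # q - p is about -1/K for a grid pair
        learner.update(arm, prob, 1.0 if traded else 0.0)  # stand-in reward
    return profit, learner.probabilities()
```

Because each grid pair loses about $1/K$ per trade, the buffer check before every round is what keeps the cumulative profit nonnegative in this sketch.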

The main theorem asserts that for absolute constants $C, \alpha > 0$:

$$\mathrm{Regret}(T) \le C\, T^{2/3} \log^{\alpha} T$$

with GBB holding ex post (indeed, $\alpha = 5/3$ and $C \approx 310$) (Jin, 23 Jan 2026).
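To get a feel for the constants, the short sketch below evaluates the stated bound numerically and finds the smallest power of ten at which it drops below the trivial bound $T$. It assumes the logarithm is natural and takes $C = 310$ exactly, neither of which is confirmed by the text above. The function names are hypothetical.

```python
import math


def regret_bound(T, C=310.0, alpha=5.0 / 3.0):
    """Evaluate C * T^(2/3) * log(T)^alpha, assuming a natural logarithm."""
    return C * T ** (2.0 / 3.0) * math.log(T) ** alpha


def first_nontrivial_power_of_ten(max_exp=60):
    """Smallest e such that the bound at T = 10^e is below the trivial bound T."""
    for e in range(1, max_exp + 1):
        T = 10.0 ** e
        if regret_bound(T) < T:
            return e
    return None
```

Under these assumptions the guarantee is asymptotic in nature: for horizons such as $T = 10^6$ the evaluated bound still exceeds $T$.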

3. Tight Regret Lower Bound: Adversarial and Independent Values

Matching lower bounds have been established for GBB semi-feedback mechanisms. In particular, [CJLZ25, see (Jin, 23 Jan 2026)] proves that no GBB mechanism in this setting can obtain regret $o(T^{2/3})$, even for independent seller and buyer values.

The construction partitions the rounds into $K \approx T^{1/3}$ contiguous blocks. In each block $k$, value pairs $(v^S_t, v^B_t)$ are concentrated near two points such that the optimal SBB price is near $k/K$. Any exploration outside that diagonal incurs large local regret within the block, while information-theoretic constraints and the structure of the feedback signal preclude circumventing the exploration cost. A counting argument yields overall regret $\Omega(T^{2/3})$.

A plausible implication is that the $\Theta(T^{2/3})$ scaling is intrinsic to this feedback-budget regime. By contrast, for correlated or adversarial values under GBB with partial (one-bit or two-bit) feedback, prior work established a higher $\widetilde{\Theta}(T^{3/4})$ rate (Chen et al., 6 Apr 2025).

4. Regret Landscape Across Value, Feedback, and Balance Models

The latest research provides a unified minimax regret landscape for fixed-price bilateral trade mechanisms, covering all combinations of:

  • Value Models: Independent, correlated, adversarial.
  • Feedback: Full, two-bit/one-bit (partial), semi (semi-transparent).
  • Budget Balance: Strong (SBB), weak (WBB), global (GBB).

The following table, drawn from (Jin, 23 Jan 2026, Chen et al., 6 Apr 2025, Cesa-Bianchi et al., 2023), summarizes tight minimax regret rates (ignoring polylogarithmic factors):

| Feedback & BB | Independent Values | Correlated/Adversarial Values |
|---|---|---|
| Full + any BB | $\Theta(T^{1/2})$ | $\Theta(T^{1/2})$ |
| Partial + SBB/WBB | $\Theta(T)$ | $\Theta(T)$ |
| Partial + GBB | $\widetilde{\Theta}(T^{2/3})$ | $\widetilde{\Theta}(T^{3/4})$ |
| Semi + GBB | $\widetilde{\Theta}(T^{2/3})$ | $\widetilde{\Theta}(T^{2/3})$ |

This suggests GBB is the critical relaxation enabling sublinear regret in minimal-feedback mechanisms, distinguishing it from SBB/WBB, which suffer linear regret under partial or semi-feedback.

5. Technical Insights: Surrogate Estimation and Semi-Feedback

Semi-feedback presents a fundamental obstacle: the buyer’s value is never observed, and only trade success or failure and the seller’s value are revealed. Modern algorithms circumvent this by constructing unbiased surrogate estimators for gains-from-trade at candidate price pairs, using only the available signals and importance weighting. In Phase II, the surrogate reward $g^t_k$ at grid index $k$ combines the observed $v^S_t$ and $\mathbb{I}_t$:

$$g^t_k = \left[v^S_t - \frac{k-1}{K}\right]_+ \mathbf{1}\left\{v^S_t \le \frac{k}{K}\right\} + \left[\frac{k}{K} - v^S_t\right]_+ \mathbf{1}\left\{\frac{k-1}{K} \le v^S_t\right\}$$
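The displayed expression can be transcribed term by term. The sketch below does only that, with hypothetical names `positive_part` and `surrogate_reward`, and makes no claim about how the cited paper combines it with importance weights or with $\mathbb{I}_t$.

```python
def positive_part(x: float) -> float:
    """[x]_+ = max(x, 0)."""
    return max(x, 0.0)


def surrogate_reward(v_s: float, k: int, K: int) -> float:
    """Term-by-term transcription of the displayed g^t_k for grid index k in 1..K."""
    lo, hi = (k - 1) / K, k / K
    left = positive_part(v_s - lo) * (1.0 if v_s <= hi else 0.0)
    right = positive_part(hi - v_s) * (1.0 if lo <= v_s else 0.0)
    return left + right
```

As written, the two terms sum to $1/K$ whenever $v^S_t$ lies in $[(k-1)/K,\, k/K]$ and vanish otherwise, so the expression acts as a scaled indicator of the seller’s value falling in grid cell $k$.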

The analysis builds on the Exp3 framework for adversarial bandits, jointly controlling the discretization error, the exploration cost, and the surplus buffer to guarantee both GBB and near-optimal regret.
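A back-of-the-envelope calculation suggests why a grid of size $K \approx T^{1/3}$ is natural. It is a heuristic and not the analysis of the cited papers: it assumes a per-round discretization loss of order $1/K$, uses the standard $O(\sqrt{TK\log K})$ regret of Exp3 over $K$ arms, and ignores logarithmic factors.

```latex
% Heuristic balance of discretization error against bandit regret
\underbrace{\frac{T}{K}}_{\text{discretization}}
\;+\;
\underbrace{\sqrt{T K \log K}}_{\text{Exp3 over } K \text{ arms}}
\quad\Longrightarrow\quad
\frac{T}{K} \approx \sqrt{TK}
\;\Longleftrightarrow\;
K \approx T^{1/3},
\qquad
\text{total} \approx T^{2/3}.

% Profit drained by a grid pair (p, q) = (k/K, (k-1)/K) in a round with trade
\Pi_t = q_t - p_t = \frac{k-1}{K} - \frac{k}{K} = -\frac{1}{K}
\quad\Longrightarrow\quad
\sum_{t=1}^{T} |\Pi_t| \;\le\; \frac{T}{K} \approx T^{2/3}.
```

Under these assumptions the worst-case profit drained in Phase II has the same order as the Phase I target $\beta = O(T^{2/3})$, up to logarithmic factors.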

6. Broader Implications and Open Directions

The resolution of the $\widetilde{\Theta}(T^{2/3})$ regret rate for GBB semi-feedback mechanisms completes the theory of regret minimization in fixed-price bilateral trade under budget constraints and partial information (Jin, 23 Jan 2026, Chen et al., 6 Apr 2025). Extensions of interest include sharpening constants, incorporating richer feedback (for example, glimpses of the buyer’s value), and generalizing to settings with multi-unit or multi-dimensional trade and adversarial/budget constraints.

A plausible implication is that methodologies for surrogate reward estimation and profit-buffered two-phase algorithms may generalize to other settings where minimal feedback and tight budget constraints interact, such as dynamic markets, mechanism design for multi-agent scenarios, and combinatorial auctions.
