Minimalist Thompson Sampling (MINTS)

Updated 10 September 2025
  • MINTS is a minimalist Bayesian framework that targets only the optimizer using profile likelihood to eliminate nuisance parameters, leading to efficient sequential decision-making.
  • The algorithm bypasses full joint priors by focusing on key parameters, reducing computational complexity while maintaining near-optimal regret guarantees.
  • MINTS integrates classical convex optimization techniques with Bayesian inference to handle structured bandits, dynamic pricing, and high-dimensional optimization problems.

Minimalist Thompson Sampling (MINTS) is a streamlined Bayesian methodology for sequential decision-making and stochastic optimization. Eschewing the traditional requirement for a full probabilistic model over all parameters, MINTS targets only the component of direct interest—typically, the optimizer in bandit or function maximization problems—while utilizing profile likelihood to eliminate nuisance parameters. This enables efficient handling of problems with structural constraints and yields a concise yet robust version of Thompson Sampling, with strong theoretical guarantees and clarity of interpretation across multi-armed, contextual, and structured optimization domains (Wang, 7 Sep 2025).

1. Minimalist Bayesian Formulation

Traditional Bayesian methods for sequential optimization and decision-making typically require placing a prior on the full parameter space (e.g., all arm means in a bandit, or the entire function surface in optimization). The minimalist Bayesian framework (Wang, 7 Sep 2025) departs from this by specifying a prior only on the parameter of interest, such as the location of the optimizer $\gamma$ in a convex set $\mathcal{X}$. All other (“nuisance”) parameters are removed via a profile likelihood:

$$\pi_t(\gamma) = \frac{L(\gamma; D_t)\,\pi_0(\gamma)}{\int L(\gamma'; D_t)\,\pi_0(\gamma')\,d\gamma'}$$

where $D_t$ is the observed data up to round $t$, $\pi_0(\gamma)$ is the prior on $\gamma$, and

$$L(\gamma; D_t) = \sup\{\ell(\theta; D_t) : \theta \in \Theta_\gamma\}$$

is the profile likelihood, obtained by maximizing the likelihood over all nuisance parameters with $\gamma$ held fixed. This “minimizes” the Bayesian structure to just the component relevant for selection.

The resulting generalized posterior allows seamless incorporation of convexity, monotonicity, and Lipschitz constraints, as these are naturally encoded in the feasible set $\Theta_\gamma$ for the optimization (Wang, 7 Sep 2025).
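
As a small illustration of this update (a sketch with illustrative names, not the paper's implementation), the snippet below normalizes the generalized posterior over a finite set of candidate optimizers, given profile log-likelihood values computed elsewhere, working in log-space for numerical stability.

```python
import numpy as np

def generalized_posterior(log_profile_lik, log_prior):
    """pi_t(gamma) proportional to L(gamma; D_t) * pi_0(gamma) over a finite
    candidate set, given per-candidate profile log-likelihoods and log-priors."""
    log_w = log_profile_lik + log_prior
    log_w = log_w - log_w.max()      # shift before exponentiating, for stability
    w = np.exp(log_w)
    return w / w.sum()

# e.g., three candidate optimizers with a uniform prior
print(generalized_posterior(np.array([-1.2, -0.3, -4.0]), np.log(np.ones(3) / 3)))
```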

2. The MINTS Algorithm

The MINTS procedure builds on the generalized posterior to select actions via posterior sampling:

  1. Action selection: Sample $x_t \sim \pi_{t-1}$.
  2. Feedback: Observe $y_t$ and append $(x_t, y_t)$ to the data.
  3. Posterior update: Recompute $\pi_t(x)$ over $x \in \mathcal{X}$, using the updated data and the profile likelihood.

This sample-and-update cycle mirrors classical Thompson Sampling but avoids the computational and modeling overhead of full Bayesian inference in high-dimensional spaces, since only the profile likelihood of the optimizer is modeled directly. When implemented, the profile likelihood $L(x; D_t)$ is computed (potentially via convex programs) to check the consistency of $x$ as a putative maximizer given the structural constraints and observed data.
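
The following is a minimal end-to-end sketch of this cycle for a $K$-armed bandit with unit-variance Gaussian rewards and a uniform prior over "which arm is optimal"; the function names and the toy environment are illustrative assumptions, not the paper's reference code. Here the nuisance parameters are the arm means, and the profile likelihood for the hypothesis "arm $j$ is optimal" reduces to a one-dimensional convex problem in $m = \mu_j$.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)

def profile_loglik(j, counts, means):
    """Profile log-likelihood (up to a constant shared by all arms) that arm j
    is optimal, for unit-variance Gaussian rewards.  The arm means mu are
    maximized out subject to mu_j >= mu_k for all k: with m = mu_j fixed, the
    constrained MLE keeps mu_k = ybar_k when ybar_k <= m and clips the rest
    down to m, leaving a 1-D convex minimization over m."""
    def cost(m):
        c = counts[j] * (m - means[j]) ** 2
        for k in range(len(means)):
            if k != j and means[k] > m:
                c += counts[k] * (means[k] - m) ** 2
        return 0.5 * c
    lo, hi = means.min() - 1.0, means.max() + 1.0
    res = minimize_scalar(cost, bounds=(lo, hi), method="bounded")
    return -res.fun

def mints_step(counts, sums, prior):
    """One MINTS round: form pi_{t-1} via the profile likelihood, then sample an arm."""
    means = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    ll = np.array([profile_loglik(j, counts, means) for j in range(len(counts))])
    w = prior * np.exp(ll - ll.max())
    post = w / w.sum()
    return rng.choice(len(counts), p=post), post

# Toy run: 5 Gaussian arms, uniform prior over the identity of the optimal arm.
true_means = np.array([0.1, 0.3, 0.5, 0.2, 0.4])
K, T = len(true_means), 500
counts, sums = np.zeros(K), np.zeros(K)
prior = np.ones(K) / K
for t in range(T):
    arm, _ = mints_step(counts, sums, prior)     # action selection
    reward = rng.normal(true_means[arm], 1.0)    # feedback
    counts[arm] += 1                             # posterior update uses the
    sums[arm] += reward                          # augmented data next round
print("pulls per arm:", counts.astype(int))
```

Because the nuisance means are profiled out rather than integrated over, each round of this sketch requires only $K$ one-dimensional convex solves instead of sampling from a $K$-dimensional joint posterior.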

3. Applications to Structured Bandits and Optimization

MINTS is well-suited for problems with structural constraints or infinite action spaces. For instance:

  • Continuum-armed Lipschitz bandits: The action space is $[0,1]^d$ and the reward function is $M$-Lipschitz. In the noiseless case, the likelihood that $x$ is optimal is determined by solving a convex feasibility problem that checks whether all observed values are consistent with $x$ being an optimizer, subject to the Lipschitz constraint (a sketch of this check appears below).
  • Dynamic pricing: The feasible set for maximal demand probabilities encodes both monotonicity (in price) and Lipschitz restrictions. MINTS naturally incorporates these by maximizing the likelihood only over demand vectors consistent with an optimizer at the candidate price.

By focusing only on the optimal parameter and using convex optimization to eliminate nuisance variables, MINTS can efficiently handle nontrivial constraints without introducing high-dimensional hyperpriors.
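
For the noiseless Lipschitz case in the first bullet, the feasibility check admits a particularly simple form via a McShane–Whitney-type extension argument: an $M$-Lipschitz function interpolating the data and maximized at a candidate $x$ exists iff the observations are pairwise Lipschitz-consistent and $\max_i y_i \le \min_i \{y_i + M\,\|x - x_i\|\}$. The sketch below (illustrative names, not the paper's code) implements this test; under a uniform prior, the generalized posterior is then uniform over the feasible candidates.

```python
import numpy as np

def can_be_maximizer(x, X, y, M):
    """Noiseless continuum-armed Lipschitz bandit: can the candidate point x be
    a global maximizer of some M-Lipschitz function interpolating (X[i], y[i])?

    Such a function exists iff
      (i)  |y_i - y_j| <= M * ||x_i - x_j|| for all observed pairs, and
      (ii) max_i y_i  <= min_i (y_i + M * ||x - x_i||),
    i.e. the value at x can be raised to the largest observation without
    violating the Lipschitz constraint."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    if len(y) == 0:
        return True
    # (i) pairwise consistency of the observations themselves
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    if np.any(np.abs(y[:, None] - y[None, :]) > M * D + 1e-12):
        return False
    # (ii) x can carry a value at least as large as every observation
    upper_at_x = np.min(y + M * np.linalg.norm(X - x, axis=1))
    return y.max() <= upper_at_x + 1e-12

# Example: two observations of a 1-Lipschitz function on [0, 1]^2
X_obs = [[0.2, 0.2], [0.8, 0.4]]
y_obs = [0.3, 0.5]
print(can_be_maximizer(np.array([0.9, 0.9]), X_obs, y_obs, M=1.0))    # True
print(can_be_maximizer(np.array([0.21, 0.21]), X_obs, y_obs, M=1.0))  # False
```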

4. Connections to Classical Convex Optimization Methods

A distinctive feature of MINTS is its probabilistic reinterpretation of classical cutting-plane methods:

  • Center-of-gravity method: With a uniform prior and noiseless first-order information, the generalized posterior at each round is uniform over the intersection of half-spaces defined by observed subgradients. The posterior mean corresponds to the center-of-gravity iterate.
  • Ellipsoid method: Restricting to ellipsoidal distributional families and updating by KL projection of the generalized posterior yields the standard ellipsoid method recursion. This recasts these classical methods as variants of Bayesian posterior updating under the minimalist paradigm (Wang, 7 Sep 2025).
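
The center-of-gravity correspondence can be illustrated with a short Monte Carlo sketch (an illustration under stated assumptions, not the paper's construction): for convex minimization with exact subgradients $g_i$ at queried points $x_i$, every minimizer satisfies $g_i^\top (x - x_i) \le 0$, so with a uniform prior on a box the generalized posterior is uniform over the resulting localization polytope, and its mean, approximated here by rejection sampling, is the center-of-gravity iterate.

```python
import numpy as np

rng = np.random.default_rng(1)

def center_of_gravity_iterate(points, subgrads, box, n_samples=200_000):
    """Approximate the centroid of the localization polytope
    {x in box : g_i^T (x - x_i) <= 0 for all i} by rejection sampling.
    Under a uniform prior, this polytope carries the generalized posterior,
    and its mean is the center-of-gravity iterate (exact centroids are hard,
    so this is only a Monte Carlo sketch)."""
    lo, hi = box
    samples = rng.uniform(lo, hi, size=(n_samples, len(lo)))
    keep = np.ones(n_samples, dtype=bool)
    for x_i, g_i in zip(points, subgrads):
        keep &= samples @ g_i <= g_i @ x_i       # half-space cut from subgradient
    accepted = samples[keep]
    return accepted.mean(axis=0) if len(accepted) else None

# Example: minimize f(x) = ||x - c||^2 on [-1, 1]^2 with gradient 2 * (x - c).
c = np.array([0.3, -0.2])
box = (np.array([-1.0, -1.0]), np.array([1.0, 1.0]))
points, subgrads, x = [], [], np.zeros(2)
for t in range(8):
    points.append(x.copy())
    subgrads.append(2.0 * (x - c))               # exact (noiseless) gradient
    x = center_of_gravity_iterate(points, subgrads, box)
print("estimate:", x, "true minimizer:", c)
```

The estimate contracts toward the true minimizer as cuts accumulate, mirroring the volume-reduction guarantee of the classical center-of-gravity method.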

5. Regret and Performance Guarantees

MINTS achieves regret bounds in the multi-armed bandit setting matching state-of-the-art results up to logarithmic factors. For instance, with a Gaussian likelihood model and $K$ arms, the expected regret satisfies

$$\mathbb{E}[R(T)] \leq C \left( \min\left\{ \sum_{j:\Delta_j>0} \frac{\log T}{\Delta_j},\; \sqrt{KT\log K} \right\} + \sum_{j=1}^K \Delta_j \right)$$

where $\Delta_j$ is the optimality gap for arm $j$ and $C$ a constant depending on the reward noise scale (Wang, 7 Sep 2025). A more refined statement is given by

$$\mathbb{E}[R(T)] \leq C \inf_{\delta \geq 0} \left\{ \sum_{j:\Delta_j>\delta} \left( \frac{\log(\max\{T\Delta_j^2,\, e\})}{\Delta_j} + \Delta_j \right) + T \max_{j:\Delta_j \le \delta} \Delta_j \right\}$$

indicating near-optimality in both problem-dependent and minimax regimes. These bounds are derived without requiring a full joint prior on the reward vector, validating the statistical efficiency of the minimalist framework.

6. Implications and Research Directions

The minimalist Bayesian framework allows principled stochastic optimization while reducing model complexity and relaxing assumptions required by classical Bayesian approaches (e.g., specifying a full prior over all parameters). The MINTS algorithm extends naturally to high-dimensional, structured, and continuous-action problems, bridging the gap between Bayesian and convex optimization viewpoints. This approach opens avenues for designing acquisition rules and posterior computations customized to problem structure, and suggests new directions for integrating profile-likelihood-based Bayesian updating with reinforcement learning and contextual bandit models.

Methodologically, this framework underscores that strong theoretical guarantees and algorithmic flexibility can be achieved through targeted use of Bayesian inference, without incurring the computational burden and modeling constraints associated with fully probabilistic approaches.

7. Summary Table: MINTS Framework Overview

| Feature | Classical Bayesian TS | MINTS |
|---|---|---|
| Prior specification | Full parameter vector | Optimum only (or component of interest) |
| Nuisance parameter treatment | Marginalization | Profile likelihood (maximization) |
| Structural constraints | Difficult to encode | Natural via feasible set |
| Regret guarantees | Near-optimal (with full model) | Near-optimal (after profiling nuisance) |
| Computational cost | High in large parameter space | Moderate; convex program for profile |

The minimalist approach, as formalized in MINTS (Wang, 7 Sep 2025), thus delivers a scalable and theoretically sound methodology for sequential decision-making under uncertainty, aligned with modern applications in structured bandits and stochastic optimization.

References (1)