Minimalist Thompson Sampling (MINTS)
- MINTS is a minimalist Bayesian framework that places a prior only on the optimizer and uses profile likelihood to eliminate nuisance parameters, enabling efficient sequential decision-making.
- The algorithm bypasses full joint priors by focusing on key parameters, reducing computational complexity while maintaining near-optimal regret guarantees.
- MINTS integrates classical convex optimization techniques with Bayesian inference to handle structured bandits, dynamic pricing, and high-dimensional optimization problems.
Minimalist Thompson Sampling (MINTS) is a streamlined Bayesian methodology for sequential decision-making and stochastic optimization. Eschewing the traditional requirement for a full probabilistic model over all parameters, MINTS targets only the component of direct interest—typically, the optimizer in bandit or function maximization problems—while utilizing profile likelihood to eliminate nuisance parameters. This enables efficient handling of problems with structural constraints and yields a concise yet robust version of Thompson Sampling, with strong theoretical guarantees and clarity of interpretation across multi-armed, contextual, and structured optimization domains (Wang, 7 Sep 2025).
1. Minimalist Bayesian Formulation
Traditional Bayesian methods for sequential optimization and decision-making typically require placing a prior on the full parameter space (e.g., all arm means in a bandit, the entire function surface in optimization). The minimalist Bayesian framework (Wang, 7 Sep 2025) departs from this by specifying a prior only on the parameter of interest, such as the location $z$ of the optimizer in a convex set $\mathcal{Z}$. All other ("nuisance") parameters are removed via a profile likelihood, yielding the generalized posterior
$$
\pi_t(z \mid \mathcal{D}_t) \;\propto\; \pi_0(z)\,\widehat{L}_t(z), \qquad \widehat{L}_t(z) \;=\; \sup_{\theta \in \Theta(z)} L(\theta; \mathcal{D}_t),
$$
where $\mathcal{D}_t$ is the observed data up to round $t$, $\pi_0$ is the prior on $z$, and $\widehat{L}_t(z)$ is the profile likelihood, maximizing the likelihood over all nuisance parameters $\theta \in \Theta(z)$ (those consistent with $z$ being the optimizer) with $z$ held fixed. This "minimizes" the Bayesian structure to just the component relevant for selection.
The resulting generalized posterior allows seamless incorporation of convexity, monotonicity, and Lipschitz constraints, as these are naturally encoded in the feasible set $\Theta(z)$ over which the profile likelihood is maximized (Wang, 7 Sep 2025).
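To make this concrete, the following Python sketch computes the generalized posterior over "which arm is the optimizer" for a $K$-armed Gaussian bandit with known noise scale. The grid-based profiling step and the helper names (`profile_log_lik`, `mints_posterior`) are illustrative choices for exposition, not an implementation prescribed by the paper.

```python
import numpy as np

def profile_log_lik(z, means, counts, sigma):
    """Profile log-likelihood that arm z is optimal (Gaussian rewards, known sigma).

    The nuisance parameters (all arm means) are maximized out subject to the
    constraint mu_z >= mu_j for every j, rather than integrated over a prior.
    With mu_z = c fixed, the best nuisance choice is mu_j = min(xbar_j, c),
    so profiling reduces to a one-dimensional minimization over c.
    """
    pulled = counts > 0
    if not pulled.any():
        return 0.0                      # no data yet: every arm is fully plausible
    lo = means[z] if pulled[z] else means[pulled].min()
    hi = max(means[pulled].max(), lo)
    grid = np.linspace(lo, hi, 2001)    # candidate values c for mu_z
    cost = np.zeros_like(grid)
    if pulled[z]:
        cost += counts[z] * (grid - means[z]) ** 2
    for j in np.flatnonzero(pulled):
        if j != z:
            cost += counts[j] * np.maximum(means[j] - grid, 0.0) ** 2
    return -cost.min() / (2.0 * sigma ** 2)

def mints_posterior(means, counts, sigma=1.0, prior=None):
    """Generalized posterior over the identity of the optimal arm."""
    K = len(means)
    prior = np.full(K, 1.0 / K) if prior is None else prior
    log_w = np.array([profile_log_lik(z, means, counts, sigma) for z in range(K)])
    w = prior * np.exp(log_w - log_w.max())
    return w / w.sum()

# Toy example: three arms with empirical means 0.20, 0.50, 0.45 (illustrative data).
means = np.array([0.20, 0.50, 0.45])
counts = np.array([10, 12, 11])
print(mints_posterior(means, counts, sigma=0.5))
```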
2. The MINTS Algorithm
The MINTS procedure builds on the generalized posterior to select actions via posterior sampling:
- Action selection: Sample $z_t \sim \pi_{t-1}(\cdot \mid \mathcal{D}_{t-1})$ and play it.
- Feedback: Observe the reward $r_t$ and append $(z_t, r_t)$ to the data, forming $\mathcal{D}_t$.
- Posterior update: Recompute the generalized posterior $\pi_t$ over $\mathcal{Z}$, using the updated data and the profile likelihood.
This sample-and-update cycle mirrors classical Thompson Sampling but avoids the computational and modeling overhead of full Bayesian inference in high-dimensional spaces, as only the generalized posterior of the optimizer is modeled directly. When implemented, the profile likelihood is computed (potentially via convex programs) to check the consistency of a candidate $z$ as a putative maximizer given the structural constraints and observed data.
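A minimal sketch of this sample-and-update cycle for the Gaussian $K$-armed setting, reusing the `mints_posterior` helper sketched in Section 1; the arm means, horizon, and noise scale below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.20, 0.50, 0.45])   # unknown to the learner (illustrative)
sigma, K, T = 0.5, 3, 500

sums = np.zeros(K)       # running reward sums per arm
counts = np.zeros(K)     # pull counts per arm

for t in range(T):
    means = np.divide(sums, counts, out=np.zeros(K), where=counts > 0)
    # Action selection: draw a candidate optimizer from the generalized
    # (profile-likelihood) posterior and play it.
    probs = mints_posterior(means, counts, sigma)   # helper from the sketch above
    z_t = rng.choice(K, p=probs)
    # Feedback: observe a noisy reward and append it to the data.
    r_t = true_means[z_t] + sigma * rng.normal()
    sums[z_t] += r_t
    counts[z_t] += 1
    # Posterior update happens implicitly at the next call to mints_posterior.

print("pull counts per arm:", counts)
```

Note that only the posterior over the optimizer's identity is ever represented; no joint posterior over all arm means is maintained.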
3. Applications to Structured Bandits and Optimization
MINTS is well-suited for problems with structural constraints or infinite action spaces. For instance:
- Continuum-armed Lipschitz bandits: The action space is a convex compact set such as $[0,1]$, and the reward function is $L$-Lipschitz. For the noiseless case, the likelihood that a point $z$ is optimal is determined by solving a convex feasibility problem that enforces all observed values to be consistent with $z$ as a (possibly unique) optimizer, subject to the Lipschitz constraint (a one-dimensional sketch follows this list).
- Dynamic pricing: The feasible set for the demand probabilities encodes both monotonicity (in price) and Lipschitz restrictions. MINTS naturally incorporates these by maximizing the likelihood only over demand vectors consistent with an optimizer at the candidate price (see the linear-feasibility sketch after the next paragraph).
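In the noiseless one-dimensional case, the convex feasibility problem has a simple closed form: a candidate $z$ can be a global maximizer of some $L$-Lipschitz function agreeing with the data exactly when the Lipschitz upper envelope at $z$ is at least the best observed value. The data and the Lipschitz constant in the sketch below are illustrative.

```python
import numpy as np

def can_be_maximizer(z, xs, ys, L=1.0):
    """Noiseless Lipschitz bandit: is z consistent as a global maximizer?

    An L-Lipschitz function agreeing with the observations and maximized at z
    exists iff the Lipschitz upper envelope at z is at least the best observed
    value (a one-dimensional special case of the convex feasibility check).
    """
    upper_envelope = np.min(ys + L * np.abs(z - xs))   # largest value f(z) can take
    return upper_envelope >= np.max(ys)

# Toy observations from an (unknown) 1-Lipschitz reward function on [0, 1].
xs = np.array([0.1, 0.4, 0.8])
ys = np.array([0.30, 0.55, 0.35])

grid = np.linspace(0.0, 1.0, 101)
feasible = np.array([can_be_maximizer(z, xs, ys) for z in grid])
# Under a uniform prior, the generalized posterior is uniform on this feasible
# set, so MINTS would sample its next action uniformly from grid[feasible].
print(grid[feasible])
```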
By focusing only on the optimal parameter and using convex optimization to eliminate nuisance variables, MINTS can efficiently handle nontrivial constraints without introducing high-dimensional hyperpriors.
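For the dynamic-pricing case, profiling over demand vectors consistent with a candidate optimal price is a linear feasibility problem, sketched below with `scipy.optimize.linprog`. The price grid, observed demands, and Lipschitz constant are assumptions chosen for illustration, and the noiseless-observation setting keeps the check exact.

```python
import numpy as np
from scipy.optimize import linprog

def price_can_be_optimal(z, prices, observed, L=0.2):
    """Linear feasibility check: can price index z maximize revenue p * d?

    Decision variables are the demand probabilities d_1..d_m, required to be
    monotone non-increasing in price, L-Lipschitz, in [0, 1], to match the
    noiselessly observed demands, and to make p_z * d_z the largest revenue.
    """
    m = len(prices)
    A_ub, b_ub = [], []
    for i in range(m - 1):
        mono = np.zeros(m)
        mono[i + 1], mono[i] = 1.0, -1.0            # d_{i+1} - d_i <= 0
        A_ub.append(mono)
        b_ub.append(0.0)
        lips = np.zeros(m)
        lips[i], lips[i + 1] = 1.0, -1.0            # d_i - d_{i+1} <= L * (p_{i+1} - p_i)
        A_ub.append(lips)
        b_ub.append(L * (prices[i + 1] - prices[i]))
    for j in range(m):
        if j != z:
            rev = np.zeros(m)
            rev[j], rev[z] = prices[j], -prices[z]  # p_j d_j - p_z d_z <= 0
            A_ub.append(rev)
            b_ub.append(0.0)
    A_eq, b_eq = [], []
    for i, d_obs in observed.items():               # noiseless demand observations
        eq = np.zeros(m)
        eq[i] = 1.0
        A_eq.append(eq)
        b_eq.append(d_obs)
    res = linprog(np.zeros(m), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0.0, 1.0)] * m, method="highs")
    return res.status == 0    # feasible => the profile likelihood is maximal at z

prices = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
observed = {1: 0.6, 3: 0.3}   # demands observed at prices 2 and 4 (illustrative)
print([price_can_be_optimal(z, prices, observed) for z in range(len(prices))])
```

Under a uniform prior, MINTS would then sample the next posted price from the candidate prices for which this feasibility check succeeds.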
4. Connections to Classical Convex Optimization Methods
A distinctive feature of MINTS is its probabilistic reinterpretation of classical cutting-plane methods:
- Center-of-gravity method: With a uniform prior and noiseless first-order information, the generalized posterior at each round is uniform over the intersection of half-spaces defined by observed subgradients, and the posterior mean corresponds to the center-of-gravity iterate (a Monte Carlo illustration follows this list).
- Ellipsoid method: Restricting to ellipsoidal distributional families and updating by KL projection of the generalized posterior yields the standard ellipsoid method recursion. This recasts these classical methods as variants of Bayesian posterior updating under the minimalist paradigm (Wang, 7 Sep 2025).
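The center-of-gravity correspondence can be visualized with a Monte Carlo sketch for convex minimization under noiseless subgradients: the generalized posterior after each cut is uniform on the remaining region, and its mean is the next iterate. The objective, sample size, and iteration count below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
x_star = np.array([0.7, 0.3])            # minimizer of the illustrative objective

def subgrad(x):
    """Subgradient oracle for f(x) = ||x - x_star||^2 (illustrative, smooth)."""
    return 2.0 * (x - x_star)

# Uniform prior on the unit square, represented by Monte Carlo samples.
samples = rng.uniform(0.0, 1.0, size=(200_000, 2))
cuts = []                                # each cut: minimizer lies in {x : g . (x - x_q) <= 0}

x_t = samples.mean(axis=0)               # posterior mean = centroid of the prior region
for _ in range(8):
    g = subgrad(x_t)
    cuts.append((g, x_t.copy()))
    # Generalized posterior after the cuts: uniform on the intersection of half-spaces.
    keep = np.ones(len(samples), dtype=bool)
    for g_q, x_q in cuts:
        keep &= (samples - x_q) @ g_q <= 0.0
    if keep.sum() < 100:                 # too few samples left for a reliable centroid
        break
    x_t = samples[keep].mean(axis=0)     # next center-of-gravity iterate

print("center-of-gravity estimate of the minimizer:", x_t)
```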
5. Regret and Performance Guarantees
MINTS achieves regret bounds in the multi-armed bandit setting matching state-of-the-art results up to logarithmic factors. For instance, with a Gaussian likelihood model and $K$ arms, the expected regret over $T$ rounds satisfies a problem-dependent bound of the form
$$
\mathbb{E}[R(T)] \;\lesssim\; \sum_{i:\,\Delta_i > 0} \frac{\sigma^2 \log T}{\Delta_i},
$$
where $\Delta_i$ is the optimality gap for arm $i$ and $\sigma$ is a constant depending on the reward noise scale (Wang, 7 Sep 2025). A more refined statement additionally yields a gap-independent guarantee of order $\tilde{O}(\sigma\sqrt{KT})$, indicating near-optimality in both problem-dependent and minimax regimes. These bounds are derived without requiring a full joint prior on the reward vector, validating the statistical efficiency of the minimalist framework.
6. Implications and Research Directions
The minimalist Bayesian framework allows principled stochastic optimization while reducing model complexity and relaxing assumptions required by classical Bayesian approaches (e.g., a full prior over all parameters). The MINTS algorithm extends naturally to high-dimensional, structured, and continuous-action problems, bridging the gap between Bayesian and convex optimization viewpoints. This approach opens avenues for the design of acquisition rules and posterior computations customized to problem structure, and suggests new directions for integrating profile-likelihood-based Bayesian updating with reinforcement learning and contextual bandit models.
Methodologically, this framework underscores that strong theoretical guarantees and algorithmic flexibility can be achieved through targeted use of Bayesian inference, without incurring the computational burden and modeling constraints associated with fully specified probabilistic models.
7. Summary Table: MINTS Framework Overview
| Feature | Classical Bayesian TS | MINTS |
|---|---|---|
| Prior specification | Full parameter vector | Optimizer only (or component of interest) |
| Nuisance parameter treatment | Marginalization | Profile likelihood (maximization) |
| Structural constraints | Difficult to encode | Natural via feasible set |
| Regret guarantees | Near-optimal (with full model) | Near-optimal (after profiling out nuisances) |
| Computational cost | High in large parameter spaces | Moderate; convex program for the profile likelihood |
The minimalist approach, as formalized in MINTS (Wang, 7 Sep 2025), thus delivers a scalable and theoretically sound methodology for sequential decision-making under uncertainty, aligned with modern applications in structured bandits and stochastic optimization.