
Online Cost-sensitive Max-Entropy Sampling

Updated 7 September 2025
  • Online cost-sensitive maximum entropy sampling is a framework that sequentially selects measurements under budget constraints to maximize the log-determinant of a covariance submatrix.
  • It employs convex relaxations and dual formulations to transform combinatorial problems into efficient, scalable models with provable approximation guarantees.
  • The approach supports real-time adaptation via strategies like randomized rounding, local search, and branch-and-bound, proving effective in sensor placement, active learning, and feature selection.

The online cost-sensitive maximum entropy sampling problem concerns the sequential and adaptive selection of a subset of samples or measurements, subject to cost constraints, with the aim of maximizing information—formalized via the entropy (or log-determinant) of the corresponding submatrix of a given covariance matrix or an analogous uncertainty functional. This paradigm merges elements of discrete submodular optimization, cost-sensitive (or budget-aware) learning, and online decision-making under uncertainty. Core variants of this problem are central in sensor placement, adaptive experimental design, streaming feature selection, and active learning with resource or cost constraints.

1. Mathematical Formulation and Problem Scope

Given an $n \times n$ positive semidefinite covariance matrix $C$, one seeks to select (online or adaptively) a subset $S \subseteq \{1, \ldots, n\}$ of cardinality $s$ (or with budget $\sum_{i \in S} c_i \leq B$ for item costs $c_i$), such that the log-determinant of the principal submatrix $C[S,S]$ (i.e., $\log \det(C[S,S])$) is maximized. In the online cost-sensitive regime, the selection is incremental, with potential for real-time cost changes, necessitating rapid, budget-aware updates to the selection strategy.
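
As a concrete reference point, the following minimal Python sketch evaluates this objective for a candidate index set under a budget. The names (C, c, B, S) mirror the notation above; this is an illustration, not an implementation from the cited papers.

```python
import numpy as np

def mesp_objective(C, S):
    """log det C[S, S] for an index set S; -inf if the submatrix is singular."""
    sign, logdet = np.linalg.slogdet(C[np.ix_(S, S)])
    return logdet if sign > 0 else -np.inf

def is_feasible(S, c, B, s):
    """A selection is feasible if it has cardinality s and total cost at most B."""
    return len(S) == s and c[list(S)].sum() <= B
```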

More generally, for the generalized maximum-entropy sampling problem (GMESP), the objective is to maximize the sum of the logarithms of the $t \leq s$ largest eigenvalues of $C[S,S]$ (Ponte et al., 1 Apr 2024).

Key formalizations:

  • Standard MESP: $\max_{S:\, |S|=s,\ \sum_{i \in S} c_i \leq B} \log\det(C[S,S])$
  • GMESP: $\max_{S:\, |S|=s,\ \sum_{i \in S} c_i \leq B} \sum_{\ell=1}^{t} \log \lambda_\ell(C[S,S])$, where $\lambda_\ell$ denotes the $\ell$-th largest eigenvalue
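
A direct (if naive) evaluation of the GMESP objective forms the principal submatrix and sums the logs of its $t$ largest eigenvalues; the sketch below assumes $C[S,S]$ has at least $t$ strictly positive eigenvalues. For $t = s$ it reduces to the standard log-determinant objective.

```python
import numpy as np

def gmesp_objective(C, S, t):
    """Sum of logs of the t largest eigenvalues of C[S, S]."""
    eigvals = np.linalg.eigvalsh(C[np.ix_(S, S)])  # ascending order
    return float(np.sum(np.log(eigvals[-t:])))
```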

In online settings, $c_i$ and the budget $B$ may be revealed or adaptively updated sequentially, and the selection process should allow for efficient reoptimization.

2. Convex Relaxations and Dual Formulations

Recent advances have centered on deriving tight, scalable convex relaxations of MESP and GMESP, often via matrix factorization and duality (Li et al., 2020, Chen et al., 2021, Ponte et al., 1 Apr 2024, Fampa et al., 7 Jul 2025). The foundational strategy is to relax the original combinatorial optimization as follows:

  • Let $x \in \{0,1\}^n$ represent selection variables: $x_i = 1$ if $i \in S$.
  • Relax to $x \in [0,1]^n$, transforming the problem to:

$$\max \;\; \Gamma_t\left(F^\top \operatorname{Diag}(x)\, F\right)$$

subject to $e^\top x = s$, $Ax \leq b$, $0 \leq x \leq 1$, where $C = F F^\top$ and $\Gamma_t$ aggregates the $t$ largest eigenvalues appropriately, as in (Ponte et al., 1 Apr 2024); a simplified evaluation of this relaxed objective is sketched after this list.

  • The dual formulation (DGFact/DDGFact) leads to efficient convex programs, which are exact for MESP ($t = s$) and approximate otherwise, with a quantifiable additive gap (at most $t \log(s/t)$ to the spectral bound) (Ponte et al., 1 Apr 2024).
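
As referenced above, here is a simplified evaluation of the relaxed objective at a fractional point $x$, using the sum of logs of the $t$ largest eigenvalues of $F^\top \operatorname{Diag}(x) F$ as a stand-in for $\Gamma_t$. The actual $\Gamma_t$ of (Ponte et al., 1 Apr 2024) is a more careful concave construction, so treat this purely as an illustration of the relaxation's shape.

```python
import numpy as np

def relaxed_objective(F, x, t):
    """Surrogate for Gamma_t(F^T Diag(x) F): sum of logs of the t largest
    eigenvalues of the x-weighted Gram matrix (a simplification)."""
    M = F.T @ (x[:, None] * F)       # F^T Diag(x) F without forming Diag(x)
    eigvals = np.linalg.eigvalsh(M)  # ascending order
    return float(np.sum(np.log(eigvals[-t:])))
```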

These relaxations underpin the estimation of upper bounds for branch-and-bound search and enable principled variable-fixing schemes, in which dual multipliers determine whether certain variables must be set to $0$ or $1$ in any optimal solution, efficiently pruning the search space (Chen et al., 2021).

Convex relaxations also allow seamless introduction of cost constraints ($Ax \leq b$), making them well suited to the online cost-sensitive case where $A$ and $b$ adapt dynamically.

3. Algorithmic Schemes and Approximation Strategies

Several algorithmic approaches emerge from these relaxations:

  • Randomized Sampling Algorithms: A near-optimal fractional solution $x^*$ to the relaxed convex program yields a probability distribution for randomized rounding (sampling $S$ according to the marginals), with explicit approximation guarantees on the objective gap relative to the fractional optimum; a minimal sketch follows this list. Deterministic derandomization via conditional expectations is also supported (Li et al., 2020).
  • Local Search (Exchange) Algorithms: Iterative improvement by exchanging set elements is shown, via new mathematical tools for analyzing rank-one updates, to achieve explicit approximation ratios (scaling as $O(s \log s)$ under mild conditions) (Li et al., 2020). These schemes can efficiently refine selections under online cost or budget changes.
  • Branch-and-Bound (B&B) Frameworks: At each node, the convex relaxation provides an upper bound, and variable-fixing rules derived from dual multipliers accelerate convergence. Recent work highlights the efficacy of such schemes, especially in large-scale or cost-sensitive instances—including warm-starting convex programs as cost constraints are updated online (Ponte et al., 1 Apr 2024).
  • Advanced Bound-Improvement Techniques:
    • Linx and BQP convex relaxations as alternative upper bounds (Fampa et al., 7 Jul 2025).
    • Masking: Hadamard multiplication with a correlation matrix to tighten bounds.
    • Generalized Scaling: coordinate-wise scaling of variables in the bounds, convex in the $\log$-scaling factors, facilitating rapid adaptation to individual cost weights (Fampa et al., 7 Jul 2025).
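
The randomized-rounding step referenced in the first bullet can be sketched as follows: sample size-$s$ subsets with inclusion probabilities guided by a fractional optimum $x^*$, and keep the best budget-feasible set by log-determinant. The sampling distribution in (Li et al., 2020) is more refined and carries the stated guarantees; the function name and trial count here are illustrative.

```python
import numpy as np

def randomized_rounding(C, x_star, s, c, B, n_trials=100, seed=0):
    """Sample candidate size-s subsets proportionally to x*; keep the best
    budget-feasible one by log det of C[S, S]."""
    rng = np.random.default_rng(seed)
    p = x_star / x_star.sum()
    best_S, best_val = None, -np.inf
    for _ in range(n_trials):
        S = rng.choice(len(x_star), size=s, replace=False, p=p)
        if c[S].sum() > B:
            continue  # reject draws that violate the cost budget
        sign, val = np.linalg.slogdet(C[np.ix_(S, S)])
        if sign > 0 and val > best_val:
            best_S, best_val = S, val
    return best_S, best_val
```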

These algorithmic primitives are readily adapted for online or cost-aware situations by sequentially updating cost constraints and leveraging warm-starts or incremental computation.

4. Extensions to Cost-Sensitive and Online Regimes

The transition from static to cost-sensitive and online variants is enabled by formulating selection constraints as (potentially time-dependent) $Ax \leq b$, with cost coefficients adaptable as the process unfolds (Ponte et al., 1 Apr 2024). In streaming scenarios, this lends itself to:

  • Online Variable Updating: New data or changes in cost structure (for instance, device failure or budget replenishment) can be incorporated immediately by augmenting $A$ or $b$, re-solving the convex relaxation, and extending existing B&B trees (Ponte et al., 1 Apr 2024, Li et al., 2020).
  • Variable-Fixing for Rapid Adaptation: When dual multipliers are sufficiently large, variables can be irrevocably set (e.g., excluded due to excessive cost), which is particularly impactful for real-time systems (Chen et al., 2021); a generic sketch of this test follows the list.
  • Generalized Eigenvalue Objectives: In generalized MESP (GMESP), practical applications such as PCA-driven sensor selection require maximizing the sum of the top $t < s$ eigenvalues rather than the full log-determinant, and the new relaxations accommodate both cost sensitivity and this partial-eigenvalue objective (Ponte et al., 1 Apr 2024).
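
The variable-fixing test referenced above can be stated generically: if re-solving the relaxation with $x_i$ forced to $1$ (respectively $0$) yields an upper bound below the incumbent value, then $x_i$ can be fixed to $0$ (respectively $1$). The oracle upper_bound_with(i, v) below is a hypothetical helper; the cited papers derive these tests directly from dual multipliers, without a full re-solve.

```python
def fix_variables(n, incumbent_value, upper_bound_with):
    """Bound-based variable fixing; upper_bound_with(i, v) is assumed to
    return an upper bound on the problem with x_i fixed to v."""
    fixed = {}
    for i in range(n):
        if upper_bound_with(i, 1) < incumbent_value:
            fixed[i] = 0  # item i cannot appear in any optimal solution
        elif upper_bound_with(i, 0) < incumbent_value:
            fixed[i] = 1  # item i must appear in every optimal solution
    return fixed
```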

A plausible implication is that these dual and relaxation-driven algorithms enable high-frequency cost updates and real-time adaptation, with empirical effectiveness demonstrated for problems up to $n = 2000$ (Li et al., 2020, Chen et al., 2021).

5. Empirical Performance and Software Availability

Empirical investigations demonstrate that:

  • The presented relaxations and algorithms (randomized rounding, local search, B&B) efficiently solve medium and large problems to near-optimality, with local search yielding log-determinant gaps $< 0.1$ in large instances (Li et al., 2020).
  • On difficult benchmark and real-world problems, variable-fixing and mixing of multiple upper bounds (e.g., linx and factorization) substantially reduce search space and computational burden (Chen et al., 2021).
  • When integrated into online or adaptive workflows, these schemes accommodate continuous changes in cost while maintaining both scalability and solution quality.

Open-source implementations covering Frank–Wolfe, randomized rounding, local search, and associated relaxations are publicly available, facilitating broader application and reproducibility (Li et al., 2020).

6. Connections to Related Domains

Connections are drawn between MESP/GMESP and related domains:

  • 0/1 D-optimal Design, Data Fusion, and Principal Component Analysis: Every positive definite MESP instance can be reformulated as a data fusion problem, supporting methodological cross-pollination (Fampa et al., 7 Jul 2025).
  • Submodular Optimization and Experimental Design: Online cost-sensitive maximum entropy sampling inherits submodularity properties in certain settings, implying that greedy approximations may prove robust in low-noise or low-cost-heterogeneity regimes; a cost-benefit greedy baseline is sketched after this list.
  • Active Learning and Streaming Feature Selection: The core MESP ideas integrate naturally with adaptive active learning under resource constraints.
  • Generalized Scaling and Masking: New computational techniques (generalized per-item scaling, masking) facilitate further improvement of upper-bounding strategies, suggesting applicability in fine-grained, dynamically cost-sensitive settings (Fampa et al., 7 Jul 2025).
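
To illustrate the submodularity-motivated greedy baseline mentioned above: at each step, add the item with the largest marginal log-determinant gain per unit cost that still fits the budget. This is a generic heuristic sketch, not one of the algorithms from the cited papers.

```python
import numpy as np

def greedy_cost_benefit(C, c, B, s):
    """Cost-benefit greedy for budgeted MESP: repeatedly add the affordable
    item with the best marginal log-det gain per unit cost."""
    def value(T):
        if not T:
            return 0.0
        sign, ld = np.linalg.slogdet(C[np.ix_(T, T)])
        return ld if sign > 0 else -np.inf

    S, spent, base = [], 0.0, 0.0
    while len(S) < s:
        best_i, best_ratio = None, -np.inf
        for i in range(C.shape[0]):
            if i in S or spent + c[i] > B:
                continue  # already chosen or unaffordable
            ratio = (value(S + [i]) - base) / c[i]
            if ratio > best_ratio:
                best_i, best_ratio = i, ratio
        if best_i is None:
            break  # no affordable item remains
        S.append(best_i)
        spent += c[best_i]
        base = value(S)
    return S
```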

Open challenges include further development of online algorithms coupling entropy-optimality with explicit budget/policy constraints, extension to high-throughput deployments, and integration with hardware-adaptive or communication-constrained systems.

7. Summary Table: Core Relaxations and Algorithms

| Relaxation / Algorithm | Adaptivity to Cost/Budget | Scalability / Online Suitability |
|---|---|---|
| Factorization bound / $\Gamma_t$ | Direct via $Ax \leq b$ | High; amenable to warm-start |
| Linx bound | Per-variable scaling | Efficient with ADMM/quasi-Newton |
| Local search / exchange | Cost weighting in moves | Near-optimal, highly scalable |
| Randomized rounding | Cost via marginals | Efficient; batch/streaming |
| Variable-fixing | Dual-based, online-ready | Reduces computation in B&B |

The spectrum of recent advances in maximum-entropy sampling—anchored in convex relaxation theory, dual-based variable-fixing, and efficient combinatorial heuristics—now provides a rigorous and computationally practical foundation for online cost-sensitive maximum entropy sampling. These methods collectively support scalable, real-time, and adaptive decision-making under information-theoretic and resource constraints (Fampa et al., 7 Jul 2025, Ponte et al., 1 Apr 2024, Li et al., 2020, Chen et al., 2021).