Regularized Linear Randomize-Then-Optimize (RLRTO)
- RLRTO is a framework that recasts inference and sampling in high-dimensional problems as a series of randomized, regularized optimization tasks.
- It integrates regularization penalties such as ℓ₂ and ℓ₁ along with variable transformations to stabilize optimization and manage non-Gaussian priors.
- RLRTO enhances scalability and efficiency in applications like Bayesian inverse problems, experimental design, and statistical learning through subspace acceleration and randomized sketching.
Regularized Linear Randomize-Then-Optimize (RLRTO) is a computational framework that unites randomized optimization and regularization for efficient sampling, inference, and learning in high-dimensional linear and nearly linear problems. RLRTO methods recast statistical inference, constrained optimization, and learning with non-Gaussian priors as a sequence of randomized optimization problems, often combined with regularization penalties (such as ℓ₂ or ℓ₁ norms). This family of methods is used in Bayesian inverse problems, experimental design, stochastic optimization, statistical learning, and constrained Gaussian process modeling, among other applications.
1. Foundational Principles
RLRTO extends the “randomize-then-optimize” (RTO) paradigm initially developed for Bayesian inverse problems with Gaussian priors (1607.01904), generalizing it to incorporate regularized objectives and non-Gaussian priors. The central idea is to map difficult inference or sampling tasks into the solution of a sequence of randomized, regularized optimization problems, typically of the form

$$\hat{x}(\varepsilon) = \arg\min_{x} \; \ell\big(F(x),\, y + \varepsilon\big) + R(x), \qquad \varepsilon \sim \mathcal{N}(0, I),$$

where $\ell$ is a data-fitting term, $R$ is a regularizer (e.g., quadratic, ℓ₁, or Besov-type), $F$ is a (possibly linearized) feature mapping, and $\varepsilon$ is random noise, often drawn from a standard normal distribution.
Regularization serves both to encode prior information and to stabilize optimization or inference, especially in ill-posed or high-dimensional settings. The “randomization” step, which introduces noise into the right-hand side or the objective, enables posterior exploration and uncertainty quantification by transforming sampling into tractable, and often parallelizable, optimization tasks.
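In the linear-Gaussian case with a quadratic regularizer, each randomized optimization problem reduces to a regularized least-squares solve, and the resulting draws are exact posterior samples. The following minimal sketch illustrates this loop; the operator `A`, data `y`, and parameters `lam` and `noise_std` are illustrative assumptions, not a reference implementation.

```python
# Minimal sketch of one randomize-then-optimize loop for a linear model with a
# quadratic (ell_2) regularizer. All names and sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 20
A = rng.standard_normal((m, n))                        # linear(ized) feature mapping
x_true = rng.standard_normal(n)
noise_std = 0.1
y = A @ x_true + noise_std * rng.standard_normal(m)    # observed data

lam = 0.5                                              # prior precision / regularization strength
n_samples = 200
samples = np.empty((n_samples, n))

for k in range(n_samples):
    eps = rng.standard_normal(m)                       # perturb the data term
    xi = rng.standard_normal(n)                        # perturb the prior/regularization term
    # Solve min_x (1/noise_std^2)||A x - (y + noise_std*eps)||^2 + ||sqrt(lam) x - xi||^2
    # as one stacked least-squares problem; the minimizer is a posterior draw.
    A_aug = np.vstack([A / noise_std, np.sqrt(lam) * np.eye(n)])
    b_aug = np.concatenate([(y + noise_std * eps) / noise_std, xi])
    samples[k], *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)

print("posterior mean estimate:", samples.mean(axis=0)[:5])
```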
2. Regularization and Variable Transformation
RLRTO can incorporate a variety of regularizers depending on problem structure and modeling goals:
- Quadratic penalties (ℓ₂): Promote smoothness, enable piecewise affine solution paths in linear programs (LPs), and yield explicit, quantitative convergence rates to the true minimizer as regularization vanishes (2408.04088).
- ℓ₁-type penalties: Include total variation (TV) and Besov priors, promoting sparsity or piecewise-constant structures in the inferred parameters (1607.01904, 2506.16888).
- Generalized nonsmooth penalties: Used in composite problems, possibly under stochastic linear operators (2305.01055).
For non-Gaussian priors (e.g., ℓ₁ or Besov), RLRTO employs deterministic or approximate variable transformations to “Gaussianize” the prior, mapping the resulting posterior into a form amenable to RTO sampling. For example, in the one-dimensional Laplace prior case the transformation is

$$u = g(v) = F_{\mathrm{Lap}}^{-1}\big(\Phi(v)\big), \qquad v \sim \mathcal{N}(0, 1),$$

where $\Phi$ is the standard Gaussian cdf and $F_{\mathrm{Lap}}$ is the cdf of the Laplace prior. In the multivariate case, a componentwise transformation is coupled via the prior’s structure (e.g., via a difference or wavelet operator), allowing the use of standard Gaussian reference measures in the RTO step (1607.01904, 2506.16888).
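A minimal sketch of this one-dimensional transformation, assuming a unit-scale Laplace prior and using SciPy's standard distribution objects (the function name is illustrative):

```python
# Sketch of the one-dimensional "Gaussianization" transform for a Laplace prior:
# u = F_Lap^{-1}(Phi(v)) maps a standard Gaussian variable v to a Laplace-
# distributed variable u. The unit scale and function name are assumptions.
import numpy as np
from scipy.stats import norm, laplace

def laplace_from_gaussian(v: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Map v ~ N(0, 1) to u ~ Laplace(0, scale) via u = F_Lap^{-1}(Phi(v))."""
    return laplace.ppf(norm.cdf(v), loc=0.0, scale=scale)

rng = np.random.default_rng(0)
v = rng.standard_normal(100_000)
u = laplace_from_gaussian(v)
# Transformed samples should show Laplace statistics (variance = 2 * scale^2).
print(np.var(u))   # approximately 2.0 for the unit scale
```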
3. Algorithms and Computational Techniques
RLRTO converts inference or sampling into the solution of regularized, randomized optimization problems, typically approached as follows:
a. Stochastic Regularized Optimization:
Problems are formulated as

$$x(\varepsilon) = \arg\min_{x} \; J(x; \varepsilon), \qquad \varepsilon \sim \mathcal{N}(0, I),$$

with $J$ capturing both data fidelity and regularization structure, and $\varepsilon$ the per-sample Gaussian perturbation. For quadratically regularized LPs and inverse problems, efficient factorizations (QR or SVD) yield direct or iterative solutions (2408.04088, 2506.16888).
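A hedged sketch of the factorization-reuse idea: the stacked operator is factorized once with an SVD, and each randomized right-hand side is then solved by cheap matrix-vector products. Names, sizes, and the unit noise level are assumptions.

```python
# Sketch of reusing a single SVD factorization across many randomized
# right-hand sides for a quadratically regularized problem (unit noise level
# assumed; names and sizes are illustrative).
import numpy as np

rng = np.random.default_rng(1)
m, n, lam = 40, 15, 0.3
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# Stacked operator for min_x ||A x - b||^2 + ||sqrt(lam) x - c||^2.
A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n)])
U, s, Vt = np.linalg.svd(A_aug, full_matrices=False)   # factor once

def solve(b_aug):
    """Least-squares solve via the precomputed SVD (matrix-vector work only)."""
    return Vt.T @ ((U.T @ b_aug) / s)

samples = np.array([
    solve(np.concatenate([y + rng.standard_normal(m),   # randomized data term
                          rng.standard_normal(n)]))     # randomized prior term
    for _ in range(500)
])
print(samples.shape)
```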
b. Metropolis–Hastings Correction:
The proposal distributions generated by RLRTO admit analytical or numerically computable densities, which allows exact or approximate independence Metropolis–Hastings acceptance corrections to ensure sampling from the true posterior (1607.01904, 2506.16888, 1903.00870).
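A minimal sketch of the correction step, assuming hypothetical callables `log_target` (unnormalized log posterior) and `log_proposal` (log density of an RLRTO draw); in the exactly linear-Gaussian, quadratically regularized case the proposal coincides with the posterior and every draw is accepted.

```python
# Generic independence Metropolis-Hastings correction applied to precomputed
# RLRTO proposals. `log_target` and `log_proposal` are assumed callables.
import numpy as np

def independence_mh(proposals, log_target, log_proposal, rng):
    """Accept/reject independent proposals so the chain targets the posterior."""
    chain = [proposals[0]]
    log_w_curr = log_target(proposals[0]) - log_proposal(proposals[0])
    for x_new in proposals[1:]:
        log_w_new = log_target(x_new) - log_proposal(x_new)
        # Accept with probability min(1, w_new / w_curr), using log weights.
        if np.log(rng.uniform()) < log_w_new - log_w_curr:
            chain.append(x_new)
            log_w_curr = log_w_new
        else:
            chain.append(chain[-1])
    return np.array(chain)
```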
c. Linear and Subspace Methods:
For problems where the forward model or constraints can be linearized or projected onto a lower-dimensional subspace (e.g., via SVD), subspace acceleration greatly reduces computational complexity. The projection splits the parameter space into “data-informed” and orthogonal subspaces, focusing optimization on the informative directions (1903.00870).
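A hedged sketch of the subspace split for a linear forward operator, keeping the leading right singular vectors as the data-informed directions; the truncation rule is an illustrative assumption.

```python
# Sketch of a subspace split: the leading right singular vectors span the
# "data-informed" directions, and the orthogonal complement is prior-dominated.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((100, 400))            # wide, ill-posed forward operator

U, s, Vt = np.linalg.svd(A, full_matrices=False)
rank = int(np.sum(s > 0.01 * s[0]))            # truncation rule (assumed, not prescribed)
V_r = Vt[:rank].T                              # data-informed basis (400 x rank)

def split(x):
    """Decompose x into data-informed and complement components."""
    x_informed = V_r @ (V_r.T @ x)
    return x_informed, x - x_informed

x = rng.standard_normal(400)
x_inf, x_perp = split(x)
# The randomized optimization need only run over the informed coordinates;
# the complement can be handled directly through the prior.
print(rank, np.allclose(x, x_inf + x_perp))
```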
d. Randomized Sketching:
In operator regression, randomized range-finding/sketching techniques are used to approximate high- or infinite-dimensional operators via low-rank projections, substantially improving scalability while preserving empirical risk guarantees (2312.17348).
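A minimal sketch of a randomized range finder of the Halko–Martinsson–Tropp type, the standard building block behind such sketching; sizes and the sketch dimension are illustrative.

```python
# Sketch of a randomized range finder used to compress a large operator before
# regularized regression. Sizes and the sketch dimension are illustrative.
import numpy as np

rng = np.random.default_rng(3)
G = rng.standard_normal((2000, 30)) @ rng.standard_normal((30, 1500))  # low-rank operator

k, oversample = 30, 10
Omega = rng.standard_normal((G.shape[1], k + oversample))   # random test matrix
Y = G @ Omega                                               # sketch the range
Q, _ = np.linalg.qr(Y)                                      # orthonormal range basis
B = Q.T @ G                                                 # small projected operator

# Low-rank approximation G ~ Q @ B; downstream regularized regression can be
# carried out on the much smaller B instead of G.
print(np.linalg.norm(G - Q @ B) / np.linalg.norm(G))        # small relative error
```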
e. Augmented Lagrangian and Alternating Schemes:
In stochastic composite problems, RLRTO techniques can be combined with augmented Lagrangian or ADMM variants, alternating updates between primal variables, penalty parameters, and dual variables, while using online-sampled operators and ensuring almost sure convergence to critical points (2305.01055).
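As a rough illustration (not the algorithm of the cited work), the sketch below alternates an x-update, a soft-thresholding z-update, and a dual update for a lasso-type composite problem, redrawing a noisy operator at every iteration to mimic online sampling. The penalty parameter is kept fixed for brevity, whereas the cited schemes also adapt it and include convergence safeguards not reproduced here.

```python
# Sketch of an alternating augmented-Lagrangian (ADMM-style) scheme for
# min 0.5||A x - y||^2 + alpha ||z||_1  s.t. x = z, with a freshly sampled
# noisy operator each iteration. Step sizes and noise level are assumptions.
import numpy as np

rng = np.random.default_rng(4)
m, n, alpha, rho = 80, 30, 0.1, 1.0
A_mean = rng.standard_normal((m, n))
y = A_mean @ (rng.standard_normal(n) * (rng.random(n) < 0.2)) + 0.05 * rng.standard_normal(m)

x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)      # primal, split, scaled dual
for k in range(200):
    A_k = A_mean + 0.01 * rng.standard_normal((m, n))  # online-sampled operator
    # x-update: quadratic subproblem with the current sampled operator.
    x = np.linalg.solve(A_k.T @ A_k + rho * np.eye(n), A_k.T @ y + rho * (z - u))
    # z-update: soft-thresholding (prox of alpha ||.||_1).
    v = x + u
    z = np.sign(v) * np.maximum(np.abs(v) - alpha / rho, 0.0)
    # Dual update (penalty parameter rho held fixed in this sketch).
    u = u + x - z
print("nonzeros in z:", int(np.count_nonzero(z)))
```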
4. Applications and Use Cases
RLRTO has been successfully applied in a spectrum of scientific and engineering contexts:
a. High-Dimensional Bayesian Inverse Problems
- Total Variation/Besov Regularization:
Sampling posteriors for signals or fields with discontinuities or blocky features, such as deconvolution, inpainting, or PDE inversion (1607.01904, 2506.16888).
- Discretization Invariance:
With Besov priors, RLRTO-based uncertainty quantification remains stable under grid refinement (2506.16888).
b. Large-Scale Optimization and Learning
- Quadratically Regularized LPs/Optimal Transport:
RLRTO provides sparse solutions and explicit control over approximation error and parameter selection, crucial in large-scale transport or allocation problems (2408.04088).
- Reduced Rank Regression:
Randomized sketching yields scalable, regularized estimation of low-rank operators for structured prediction and dynamical system identification (2312.17348).
c. Experimental Design and Variable Selection
- Efficient Design for Lasso and Regularized Models:
Nearly orthogonal Latin hypercube designs (NOLHDs) improve variable selection accuracy and stability in high-dimensional regression using RLRTO frameworks (2104.01673).
- Regret-Minimizing Experimental Design:
Explicit regret bounds guide data allocation for “estimate-then-optimize” scenarios, tightly coupling statistical estimation and decision-making (2210.15576).
d. Machine Learning and Policy Search
- Entropy-Regularized Linear Quadratic Control:
RLRTO is embedded in policy learning for control problems, balancing exploration and exploitation, and enabling fast (even super-linear) policy convergence and transferability (2311.14168).
e. Constrained Gaussian Process Modeling
- Monotonicity-Constrained GPs:
Sampling from GPs with inequality constraints is performed efficiently by solving randomized, constrained quadratic optimization problems—each yielding independent samples—thus outperforming traditional MCMC approaches in scalability and sample independence (2507.06677).
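A hedged sketch of the general idea (not the specific algorithm of the cited work): each sample is obtained by drawing an unconstrained Gaussian vector and solving a small inequality-constrained quadratic program; the grid, kernel, and the use of SciPy's SLSQP solver are illustrative assumptions.

```python
# Illustrative sketch: draw an unconstrained GP vector and project it onto a
# monotonicity constraint set by solving a small inequality-constrained QP.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
t = np.linspace(0.0, 1.0, 20)
K = np.exp(-0.5 * (t[:, None] - t[None, :])**2 / 0.1**2) + 1e-8 * np.eye(t.size)
K_inv = np.linalg.inv(K)

# Monotonicity: finite differences of f must be nonnegative.
D = np.diff(np.eye(t.size), axis=0)

def constrained_sample():
    xi = np.linalg.cholesky(K) @ rng.standard_normal(t.size)   # unconstrained GP draw
    obj = lambda f: 0.5 * (f - xi) @ K_inv @ (f - xi)          # Mahalanobis projection
    jac = lambda f: K_inv @ (f - xi)
    cons = {"type": "ineq", "fun": lambda f: D @ f, "jac": lambda f: D}
    res = minimize(obj, xi, jac=jac, constraints=[cons], method="SLSQP")
    return res.x

sample = constrained_sample()
print(np.all(np.diff(sample) >= -1e-8))   # monotone up to solver tolerance
```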
5. Theoretical Guarantees and Quantitative Results
RLRTO approaches admit rigorous analysis and theoretical guarantees, especially in linear or convex regimes:
- Quantitative Convergence:
Explicit formulas describe exact thresholds for regularization parameters above which the regularized solution matches the minimal-norm solution; suboptimality bounds are available for all regularization levels (2408.04088).
- Sampling Consistency:
With appropriate variable transformation and correction, RLRTO proposals target the correct posterior; acceptance rates and effective sample sizes scale independently of dimension under subspace acceleration (1607.01904, 1903.00870).
- Complexity and Convergence:
For randomized and noisy problems, expected convergence rates to stationarity are established, and the methods offer robustness to model error, stochasticity, and high dimensionality (1807.02176).
- Almost Sure Criticality:
Alternating RLRTO-type augmented Lagrangian methods guarantee that all cluster points satisfy first-order optimality conditions with probability one, even under operator sampling noise (2305.01055).
6. Implementation Considerations and Limitations
- Computational Scaling:
Subspace acceleration and sketching are essential for practical deployment in high dimensions; otherwise, per-sample cost may be cubic in the number of variables (e.g., due to determinant computations) (1903.00870, 2312.17348).
- Parallelization:
RLRTO methods allow for parallel or even embarrassingly parallel sampling, since the randomized optimization problems defining each sample are statistically independent (2507.06677); see the sketch after this list.
- Parameter Tuning:
Hyperparameter choices (regularization strength, prior parameters, sketch dimensions) significantly impact both accuracy and computational efficiency; grid search or Bayesian optimization is often required (2312.17348, 2506.16888).
- Model Structure:
Performance and guarantees are most complete in linear or near-linear problems. For highly nonlinear or nonconvex settings, additional approximations or tailored transformations may be necessary, with some trade-offs in accuracy or efficiency (1607.01904, 2305.01055).
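A minimal sketch of the embarrassingly parallel pattern mentioned under Parallelization above, assuming a hypothetical per-sample solver `rlrto_sample`; each process solves its own independently randomized problem.

```python
# Sketch of embarrassingly parallel RLRTO sampling: each worker solves one
# independently randomized, regularized least-squares problem (toy stand-in).
from concurrent.futures import ProcessPoolExecutor
import numpy as np

def rlrto_sample(seed: int) -> np.ndarray:
    """Solve one randomized, quadratically regularized problem (illustrative)."""
    rng = np.random.default_rng(seed)
    A = np.array([[1.0, 0.5], [0.2, 1.0], [0.3, 0.7]])
    y = np.array([1.0, 2.0, 1.5])
    eps = rng.standard_normal(y.shape)          # randomize the right-hand side
    lam = 0.1                                   # regularization strength
    # Normal-equations solve of min_x ||A x - (y + eps)||^2 + lam ||x||^2.
    return np.linalg.solve(A.T @ A + lam * np.eye(2), A.T @ (y + eps))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        samples = np.array(list(pool.map(rlrto_sample, range(100))))
    print(samples.mean(axis=0))
```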
7. Extensions and Future Directions
- Broader Classes of Priors:
The RLRTO strategy can be extended to arbitrary non-Gaussian or nonconvex priors, provided an invertible (or approximately invertible) transformation to a reference measure exists and is tractable (1607.01904).
- Active Experimental Design:
Explicit regret or information-theoretic criteria enable active data collection strategies for RLRTO frameworks, as in regret-minimizing experiment design (2210.15576).
- Operator Learning and Nonparametric Inference:
Randomized sketching-based RLRTO opens avenues for efficient operator learning in infinite-dimensional spaces and for modern applications in neuroscience, control, and dynamical systems (2312.17348).
- Constraint Handling in Surrogate Modeling:
Embedding inequality or structure constraints via RLRTO optimization enables the construction of physically interpretable surrogates for differential equations and scientific models (2507.06677).
RLRTO serves as a foundational approach uniting randomized optimization and regularization for scalable, robust, and uncertainty-aware inference and learning. Its role is especially prominent when classical sampling or learning methods are infeasible due to high dimensionality, structural priors, or complex constraints, and when error quantification or efficient parallelization is essential.