Meshfree Stochastic Optimization
- Meshfree stochastic optimization is a framework that uses scattered node interpolation techniques like MLS and RBF to solve high-dimensional stochastic problems.
- The approach overcomes the curse of dimensionality by eliminating fixed grids and employing adaptive, projection-based and trust-region algorithms in stochastic control.
- It provides robust convergence guarantees and noise tolerance through probabilistic model-accuracy criteria, with applications in finance, machine learning, and scientific computing.
Meshfree stochastic optimization refers to algorithmic frameworks for solving stochastic optimization and optimal control problems where spatial discretization is achieved without a fixed mesh, typically utilizing scattered data interpolation techniques such as Moving Least Squares (MLS) and Radial Basis Functions (RBF). These approaches are designed to overcome the curse of dimensionality inherent in traditional grid-based spatial discretizations, enabling tractable solutions in moderate to high-dimensional stochastic settings. They have been applied both to stochastic optimal control—often formulated through forward-backward stochastic differential equations (FBSDEs)—and to general stochastic optimization via model-based, trust-region algorithms that tolerate noise and heavy-tailed outliers.
1. Mathematical Formulation of Meshfree Stochastic Control
Consider a stochastic control problem on a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, \mathbb{P})$ carrying an $m$-dimensional Brownian motion $W_t$. The controlled $d$-dimensional state process $X_t$ follows the SDE
$$dX_t = b(t, X_t, u_t)\,dt + \sigma(t, X_t, u_t)\,dW_t, \qquad X_0 = x_0,$$
where the control $u_t$ takes values in a closed, convex set $U$. The cost functional is
$$J(u) = \mathbb{E}\left[\int_0^T f(t, X_t, u_t)\,dt + g(X_T)\right],$$
where $f$ and $g$ are sufficiently smooth, with at most linear growth. The objective is to minimize $J(u)$ over all admissible, square-integrable controls $u$.
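As a concrete, purely illustrative instance of this formulation, the sketch below estimates the cost functional for a one-dimensional controlled SDE with quadratic running and terminal costs, using an Euler-Maruyama time discretization and Monte Carlo averaging. The drift $b = u$, the costs $f = u^2 + x^2$ and $g = x^2$, and all parameter values are assumptions made here, not taken from the cited work.

```python
import numpy as np

# Hypothetical 1-D instance: dX_t = u(t, X_t) dt + sigma dW_t,
# J(u) = E[ int_0^T (u^2 + X^2) dt + X_T^2 ]
rng = np.random.default_rng(4)

def cost(u_fn, x0=1.0, sigma=0.3, T=1.0, n_steps=50, n_paths=20_000):
    """Monte Carlo estimate of J(u) under an Euler-Maruyama discretization."""
    dt = T / n_steps
    x = np.full(n_paths, x0)
    running = np.zeros(n_paths)
    for k in range(n_steps):
        u = u_fn(k * dt, x)
        running += (u ** 2 + x ** 2) * dt              # running cost f = u^2 + x^2
        x += u * dt + sigma * np.sqrt(dt) * rng.normal(size=n_paths)
    return np.mean(running + x ** 2)                   # terminal cost g = x^2

# Compare the zero control with a stabilizing feedback control u = -x
print(cost(lambda t, x: 0.0 * x), cost(lambda t, x: -x))
```

The stabilizing feedback yields a visibly lower estimated cost than the zero control, which is the kind of comparison the optimization machinery below automates.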
First-order optimality conditions, derived from the stochastic maximum principle, yield a projected-gradient fixed-point equation
$$u = P_U\!\big(u - \rho\,\nabla J(u)\big),$$
where $P_U$ is the projection onto $U$, and the stochastic gradient $\nabla J(u)$ is computable via the adapted adjoint solution $(Y_t, Z_t)$ of a coupled BSDE system (Sun et al., 2021).
2. Meshfree Spatial Discretization Techniques
In high-dimensional state spaces, traditional tensor-product grid discretizations become computationally infeasible due to exponential scaling. Meshfree stochastic optimization circumvents this limitation by representing all spatially dependent quantities on a scattered node set $\{x_i\}_{i=1}^N$, employing meshfree interpolation schemes.
2.1 Moving Least Squares (MLS)
For a target function $f$ and a query point $x$, the MLS approximation is defined through the minimizer over local polynomials,
$$p_x^* = \arg\min_{p \in \Pi_m} \sum_{i=1}^N w(\|x - x_i\|)\,\big(p(x_i) - f(x_i)\big)^2,$$
where $\Pi_m$ is the space of polynomials of degree at most $m$ and $w$ is a compactly supported weight; the approximation is $f(x) \approx p_x^*(x)$. This yields high-order local approximation with error $O(h^{m+1})$, where $h$ is the fill distance of the node set (Sun et al., 2021).
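A minimal one-dimensional MLS evaluation can be sketched as follows; the Wendland-type weight, the support radius, and the test function are illustrative choices rather than those of the cited work.

```python
import numpy as np

def mls_eval(x_query, nodes, values, radius, degree=1):
    """Evaluate a 1-D Moving Least Squares approximation at x_query.

    Fits a local polynomial of the given degree by weighted least squares,
    using a compactly supported Wendland-type weight."""
    r = np.abs(nodes - x_query) / radius
    w = np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)  # C^2 Wendland weight
    mask = w > 0
    if mask.sum() < degree + 1:
        raise ValueError("not enough nodes inside the weight support")
    # Polynomial basis centred at the query point for good conditioning
    A = np.vander(nodes[mask] - x_query, degree + 1, increasing=True)
    W = np.sqrt(w[mask])
    coef, *_ = np.linalg.lstsq(A * W[:, None], values[mask] * W, rcond=None)
    return coef[0]  # the local polynomial evaluated at x_query is its constant term

# Scattered nodes sampling f(x) = sin(x)
rng = np.random.default_rng(0)
nodes = np.sort(rng.uniform(0.0, np.pi, 40))
vals = np.sin(nodes)
approx = mls_eval(1.3, nodes, vals, radius=0.5)
print(abs(approx - np.sin(1.3)))  # small local approximation error
```

Centering the basis at the query point means the constant coefficient is directly the MLS value, avoiding a separate polynomial evaluation.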
2.2 Radial Basis Functions (RBF)
Given a strictly conditionally positive-definite kernel $\phi$, the RBF interpolant takes the form
$$s(x) = \sum_{j=1}^N \alpha_j\,\phi(\|x - x_j\|) + \sum_{k=1}^Q \gamma_k\, p_k(x),$$
with coefficients $\alpha_j, \gamma_k$ determined by solving the linear system enforcing the interpolation conditions $s(x_i) = f(x_i)$, $i = 1, \dots, N$, together with the moment conditions $\sum_j \alpha_j\, p_k(x_j) = 0$, where $\{p_k\}_{k=1}^Q$ forms a basis for the polynomials of the order required by the conditional positive definiteness of $\phi$. For thin-plate splines, error estimates of order $O(h^{\beta})$, with $\beta$ typically $2$, are established under standard smoothness conditions (Sun et al., 2021).
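The augmented interpolation system above can be assembled directly. The sketch below fits a two-dimensional thin-plate spline with a linear polynomial tail; the node locations, test function, and problem sizes are arbitrary illustrations.

```python
import numpy as np

def tps(r):
    # Thin-plate spline kernel phi(r) = r^2 log r, with phi(0) = 0
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(r > 0, r * r * np.log(r), 0.0)

def rbf_fit(nodes, values):
    """Solve the saddle-point system: interpolation + moment conditions."""
    n = len(nodes)
    r = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=-1)
    Phi = tps(r)
    P = np.hstack([np.ones((n, 1)), nodes])           # linear tail basis {1, x, y}
    A = np.block([[Phi, P], [P.T, np.zeros((3, 3))]])
    sol = np.linalg.solve(A, np.concatenate([values, np.zeros(3)]))
    return sol[:n], sol[n:]                           # kernel and polynomial coefficients

def rbf_eval(x, nodes, alpha, gamma):
    r = np.linalg.norm(nodes - x, axis=-1)
    return tps(r) @ alpha + gamma[0] + gamma[1:] @ x

rng = np.random.default_rng(1)
nodes = rng.uniform(-1, 1, size=(60, 2))
f = lambda p: np.exp(-p[..., 0] ** 2 - p[..., 1] ** 2)
alpha, gamma = rbf_fit(nodes, f(nodes))
x = np.array([0.2, -0.3])
print(abs(rbf_eval(x, nodes, alpha, gamma) - f(x)))  # small interpolation error
```

The zero block in the system matrix encodes the moment conditions, which make the thin-plate spline problem well posed on scattered (non-degenerate) nodes.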
3. Algorithmic Frameworks
3.1 Gradient Projection Solver for BSDEs
The meshfree stochastic optimization framework discretizes time and uses projection-based gradient updates
$$u^{\ell+1} = P_{U_h}\!\big(u^{\ell} - \rho\,\nabla J(u^{\ell})\big),$$
where $U_h$ is the discretized control space. At each time slice and each meshfree spatial node, the coupled forward SDE and backward BSDE are temporally discretized (e.g., by an Euler scheme) and the required conditional expectations are evaluated via Gaussian quadrature with meshfree interpolation of the BSDE adjoint variables (Sun et al., 2021).
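The projected-gradient update itself can be illustrated on a toy one-dimensional problem, with a box constraint standing in for $U$ and a noisy Monte Carlo gradient standing in for the BSDE-based gradient; this is a sketch of the update rule only, not the full solver.

```python
import numpy as np

# Toy instance: minimize J(u) = E[(u - Z)^2] with Z ~ N(0.8, 0.25),
# subject to u in U = [-0.5, 0.5]. The unconstrained minimizer is
# E[Z] = 0.8, so the constrained optimum is the boundary point 0.5.
rng = np.random.default_rng(2)

def project(u, lo=-0.5, hi=0.5):
    # Euclidean projection onto the closed convex set U = [lo, hi]
    return np.clip(u, lo, hi)

def stoch_grad(u, batch=256):
    # Unbiased Monte Carlo estimate of J'(u) = 2 (u - E[Z])
    z = rng.normal(0.8, 0.5, size=batch)
    return np.mean(2.0 * (u - z))

u, rho = 0.0, 0.1
for _ in range(200):
    u = project(u - rho * stoch_grad(u))  # projected-gradient fixed-point iteration
print(u)  # converges to the projected optimum 0.5
```

Because the iteration is a fixed point of the projected update, the iterate sticks to the constraint boundary once the stochastic gradient consistently points outside $U$.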
3.2 Meshfree Trust-Region Methods (STORM)
For unconstrained stochastic optimization, meshfree model-based trust-region methods such as STORM construct random quadratic surrogate models
$$m_k(s) = f_k + g_k^{\top} s + \tfrac{1}{2}\, s^{\top} H_k\, s$$
over randomly sampled points in a trust-region neighborhood $B(x_k, \delta_k)$, ensuring with high probability that the models are "fully linear" (i.e., both function and gradient approximations are sufficiently accurate over the region). Iterates are accepted or rejected based on noisy function-decrease estimates. Probabilistic conditions on model accuracy (probability $\alpha$) and estimate fidelity (probability $\beta$) are imposed; sample sizes are dynamically adapted to maintain high-fidelity model construction as the trust-region radius $\delta_k$ shrinks (Chen et al., 2015).
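A stripped-down STORM-style loop might look as follows. The regression-based quadratic model, box sampling (a simple stand-in for $B(x_k, \delta_k)$), Cauchy-point step (in place of an exact trust-region subproblem solve), test function, and all thresholds are illustrative assumptions, not the cited algorithm verbatim.

```python
import numpy as np

rng = np.random.default_rng(3)

def f_noisy(x, sigma=0.01):
    # Smooth test function with minimizer (1, -0.5), plus additive noise
    return (x[0] - 1) ** 2 + 2 * (x[1] + 0.5) ** 2 + sigma * rng.normal()

def build_model(x, delta, n_samples=30):
    """Regress a quadratic model m(s) = c + g's + 0.5 s'Hs from noisy
    samples drawn in a box around x (stand-in for the trust region)."""
    S = rng.uniform(-delta, delta, size=(n_samples, 2))
    y = np.array([f_noisy(x + s) for s in S])
    A = np.column_stack([np.ones(n_samples), S[:, 0], S[:, 1],
                         S[:, 0] ** 2, S[:, 0] * S[:, 1], S[:, 1] ** 2])
    c = np.linalg.lstsq(A, y, rcond=None)[0]
    g = c[1:3]
    H = np.array([[2 * c[3], c[4]], [c[4], 2 * c[5]]])
    return g, H

def cauchy_step(g, H, delta):
    # Minimize the model along -g within the trust-region radius
    t = delta / np.linalg.norm(g)
    gHg = g @ H @ g
    if gHg > 0:
        t = min(t, (g @ g) / gHg)
    return -t * g

x, delta = np.array([2.0, 2.0]), 0.5
for _ in range(60):
    g, H = build_model(x, delta)
    if np.linalg.norm(g) < 1e-8:
        break
    s = cauchy_step(g, H, delta)
    pred = -(g @ s + 0.5 * s @ H @ s)                    # predicted model decrease
    est_old = np.mean([f_noisy(x) for _ in range(20)])   # averaged noisy estimates
    est_new = np.mean([f_noisy(x + s) for _ in range(20)])
    if (est_old - est_new) / max(pred, 1e-12) > 0.1:     # sufficient decrease: accept
        x, delta = x + s, min(2 * delta, 1.0)
    else:                                                # reject: shrink trust region
        delta *= 0.5
print(x)  # approaches the minimizer (1, -0.5)
```

Shrinking the radius on rejection is what forces the random models to become accurate relative to the remaining optimality gap, mirroring the probabilistic accuracy conditions described above.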
4. Theoretical Convergence and Error Analysis
Rigorous convergence of meshfree stochastic optimization frameworks is established under standard regularity and contraction conditions:
- For meshfree stochastic control: if the time step $\Delta t$ and the fill distance $h$ are refined in a balanced way, solutions of the fully-discrete meshfree BSDE system converge with an error that is first order in $\Delta t$ up to the meshfree interpolation error, and the projected-gradient scheme yields control iterates converging to the optimal control at the same rate (Sun et al., 2021).
- For STORM: if the model probability $\alpha$ and estimate probability $\beta$ are sufficiently large and other regularity conditions hold, then almost sure convergence to first-order stationary points is achieved, i.e., $\lim_{k \to \infty} \|\nabla f(x_k)\| = 0$ with probability 1 (Chen et al., 2015).
The convergence proofs combine error propagation for discretized BSDEs, stochastic approximation, and submartingale arguments. Key estimates involve telescoping error bounds for the BSDEs, quadrature error, Euler-scheme discretization error, and meshfree interpolation error, ensuring first-order convergence when the temporal and spatial step sizes are balanced.
5. Numerical Performance in High Dimensions
Meshfree stochastic optimization methods have been validated on stochastic control and unconstrained minimization test problems in moderate dimensionality. Key outcomes include:
- For stochastic control problems with known closed-form solutions, both MLS and RBF spatial discretizations achieve first-order accuracy in the control variable (the control error scales linearly with the time step $\Delta t$).
- RBF methods with scattered nodes outperform tensor-product polynomial interpolation on full grids as the dimension increases, reaching lower error with significantly reduced runtime; in the highest dimensions tested, meshfree RBF is roughly three times faster and remains accurate where tensor grids become infeasible (Sun et al., 2021).
- For STORM and related meshfree trust-region methods, empirical studies on synthetic and real-world datasets (e.g., logistic regression over MNIST, a9a, covtype) demonstrate superior reliability and convergence rate compared to sample-average and adaptive stochastic gradient methods, even under highly biased (failure) or heavy-tailed noise. Robustness is maintained up to failure probabilities of 0.2 to 0.3 per evaluation (Chen et al., 2015).
6. Robustness to Noise and Outliers
Meshfree stochastic optimization approaches explicitly account for noise present in function evaluations through probabilistic model and estimate quality criteria:
- Under unbiased noise: Sample-averaging at meshfree points ensures model accuracy with high probability; prescribed sample sizes grow inversely with trust-region radius or grid spacing.
- Under biased "failure" noise (where an evaluation may return a large outlier with fixed probability): iterative re-sampling and independent model construction guarantee model quality with probability at least $\alpha$ for the meshfree point set. Global convergence can still be proven for sufficiently low outlier rates (Chen et al., 2015).
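The effect of failure noise, and why robust aggregation over independent samples helps, can be illustrated with a toy experiment. The median aggregate used here is a simple stand-in for the independent re-sampling and model reconstruction of STORM-type methods, not the exact mechanism from the cited work.

```python
import numpy as np

rng = np.random.default_rng(5)

def noisy_eval(true_value, p_fail=0.2, outlier=100.0, sigma=0.1, size=1):
    """Function evaluation subject to 'failure' noise: with probability
    p_fail it returns a large outlier instead of a noisy value."""
    vals = true_value + sigma * rng.normal(size=size)
    fails = rng.random(size) < p_fail
    return np.where(fails, outlier, vals)

samples = noisy_eval(3.0, size=900)
print(samples.mean())      # plain averaging is badly biased by the outliers
print(np.median(samples))  # a robust aggregate recovers a value near 3.0
```

With a 20% failure rate the sample mean is pulled far from the true value, while a median-style aggregate (breakdown point 50%) remains accurate, which is the qualitative reason low-enough outlier rates leave convergence intact.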
This stochastic robustness is a distinguishing feature relative to conventional stochastic-gradient and deterministic meshfree methods.
7. Significance and Future Directions
Meshfree stochastic optimization methods provide a rigorous framework for high-dimensional, derivative-free, and noisy optimization problems that are intractable by conventional mesh-based discretizations. The combination of meshfree spatial approximation (MLS, RBF), adaptive time-stepping, and probabilistic trust-region or projection-based optimization is effective up to moderate state space dimensionality and extends naturally to scattered data and data-driven settings. The theoretical guarantees under minimal noise assumptions and practical scalability suggest substantial applicability across stochastic control, finance, scientific computing, and machine learning (Sun et al., 2021; Chen et al., 2015). A plausible implication is that further methodological advances in meshfree interpolation and scalable random model selection could push the practical limits of stochastic optimization even further, particularly in combination with advanced sampling and quadrature schemes.