Meshfree Stochastic Optimization
- Meshfree stochastic optimization is a framework that uses scattered node interpolation techniques like MLS and RBF to solve high-dimensional stochastic problems.
- The approach overcomes the curse of dimensionality by eliminating fixed grids and employing adaptive, projection-based and trust-region algorithms in stochastic control.
- It provides robust convergence guarantees and noise tolerance through probabilistic model-accuracy criteria, with applications in finance, machine learning, and scientific computing.
Meshfree stochastic optimization refers to algorithmic frameworks for solving stochastic optimization and optimal control problems where spatial discretization is achieved without a fixed mesh, typically utilizing scattered data interpolation techniques such as Moving Least Squares (MLS) and Radial Basis Functions (RBF). These approaches are designed to overcome the curse of dimensionality inherent in traditional grid-based spatial discretizations, enabling tractable solutions in moderate to high-dimensional stochastic settings. They have been applied both to stochastic optimal control—often formulated through forward-backward stochastic differential equations (FBSDEs)—and to general stochastic optimization via model-based, trust-region algorithms that tolerate noise and heavy-tailed outliers.
1. Mathematical Formulation of Meshfree Stochastic Control
Consider a stochastic control problem on a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, \mathbb{P})$ carrying an $m$-dimensional Brownian motion $W_t$. The controlled $d$-dimensional state process $X_t$ follows the SDE
$$dX_t = b(t, X_t, u_t)\,dt + \sigma(t, X_t, u_t)\,dW_t, \qquad X_0 = x_0,$$
where the control $u_t$ takes values in a closed, convex set $U$. The cost functional is
$$J(u) = \mathbb{E}\left[\int_0^T f(t, X_t, u_t)\,dt + g(X_T)\right],$$
where $f$ and $g$ are sufficiently smooth, with at most linear growth. The objective is to minimize $J(u)$ over all admissible, square-integrable controls $u$.
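As a concrete, purely illustrative instance of this formulation, the sketch below estimates the cost functional for a one-dimensional controlled SDE with quadratic running and terminal costs, using an Euler-Maruyama time discretization and Monte Carlo averaging. The drift $b = u$, the costs $f = u^2 + x^2$ and $g = x^2$, and all parameter values are assumptions made here, not taken from the cited work.

```python
import numpy as np

# Hypothetical 1-D instance: dX_t = u(t, X_t) dt + sigma dW_t,
# J(u) = E[ int_0^T (u^2 + X^2) dt + X_T^2 ]
rng = np.random.default_rng(4)

def cost(u_fn, x0=1.0, sigma=0.3, T=1.0, n_steps=50, n_paths=20_000):
    """Monte Carlo estimate of J(u) under an Euler-Maruyama discretization."""
    dt = T / n_steps
    x = np.full(n_paths, x0)
    running = np.zeros(n_paths)
    for k in range(n_steps):
        u = u_fn(k * dt, x)
        running += (u ** 2 + x ** 2) * dt              # running cost f = u^2 + x^2
        x += u * dt + sigma * np.sqrt(dt) * rng.normal(size=n_paths)
    return np.mean(running + x ** 2)                   # terminal cost g = x^2

# Compare the zero control with a stabilizing feedback control u = -x
print(cost(lambda t, x: 0.0 * x), cost(lambda t, x: -x))
```

The stabilizing feedback yields a visibly lower estimated cost than the zero control, which is the kind of comparison the optimization machinery below automates.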
First-order optimality conditions, derived from the stochastic maximum principle, yield a projected-gradient fixed-point equation
$$u = P_U\!\big(u - \rho\,\nabla J(u)\big),$$
where $P_U$ is the projection onto $U$, and the stochastic gradient $\nabla J(u)$ is computable via the adapted adjoint solution $(Y_t, Z_t)$ of a coupled BSDE system (Sun et al., 2021).
2. Meshfree Spatial Discretization Techniques
In high-dimensional state spaces, traditional tensor-product grid discretizations become computationally infeasible due to exponential scaling. Meshfree stochastic optimization circumvents this limitation by representing all spatially dependent quantities on a scattered node set $\{x_i\}_{i=1}^N$, employing meshfree interpolation schemes.
2.1 Moving Least Squares (MLS)
For a target function $f$ and a query point $x$, the MLS approximation is defined through the minimizer over local polynomials,
$$p_x^* = \arg\min_{p \in \Pi_m} \sum_{i=1}^N w(\|x - x_i\|)\,\big(p(x_i) - f(x_i)\big)^2,$$
where $\Pi_m$ is the space of polynomials of degree at most $m$ and $w$ is a compactly supported weight; the approximation is $f(x) \approx p_x^*(x)$. This yields high-order local approximation with error $O(h^{m+1})$, where $h$ is the fill distance of the node set (Sun et al., 2021).
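A minimal one-dimensional MLS evaluation can be sketched as follows; the Wendland-type weight, the support radius, and the test function are illustrative choices rather than those of the cited work.

```python
import numpy as np

def mls_eval(x_query, nodes, values, radius, degree=1):
    """Evaluate a 1-D Moving Least Squares approximation at x_query.

    Fits a local polynomial of the given degree by weighted least squares,
    using a compactly supported Wendland-type weight."""
    r = np.abs(nodes - x_query) / radius
    w = np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)  # C^2 Wendland weight
    mask = w > 0
    if mask.sum() < degree + 1:
        raise ValueError("not enough nodes inside the weight support")
    # Polynomial basis centred at the query point for good conditioning
    A = np.vander(nodes[mask] - x_query, degree + 1, increasing=True)
    W = np.sqrt(w[mask])
    coef, *_ = np.linalg.lstsq(A * W[:, None], values[mask] * W, rcond=None)
    return coef[0]  # the local polynomial evaluated at x_query is its constant term

# Scattered nodes sampling f(x) = sin(x)
rng = np.random.default_rng(0)
nodes = np.sort(rng.uniform(0.0, np.pi, 40))
vals = np.sin(nodes)
approx = mls_eval(1.3, nodes, vals, radius=0.5)
print(abs(approx - np.sin(1.3)))  # small local approximation error
```

Centering the basis at the query point means the constant coefficient is directly the MLS value, avoiding a separate polynomial evaluation.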
2.2 Radial Basis Functions (RBF)
Given a strictly conditionally positive-definite kernel $\phi$, the RBF interpolant takes the form
$$s(x) = \sum_{j=1}^N \alpha_j\,\phi(\|x - x_j\|) + \sum_{k=1}^Q \gamma_k\, p_k(x),$$
with coefficients $\alpha_j, \gamma_k$ determined by solving the linear system enforcing the interpolation conditions $s(x_i) = f(x_i)$, $i = 1, \dots, N$, together with the moment conditions $\sum_j \alpha_j\, p_k(x_j) = 0$, where $\{p_k\}_{k=1}^Q$ forms a basis for the polynomials of the order required by the conditional positive definiteness of $\phi$. For thin-plate splines, error estimates of order $O(h^{\beta})$, with $\beta$ typically $2$, are established under standard smoothness conditions (Sun et al., 2021).
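The augmented interpolation system above can be assembled directly. The sketch below fits a two-dimensional thin-plate spline with a linear polynomial tail; the node locations, test function, and problem sizes are arbitrary illustrations.

```python
import numpy as np

def tps(r):
    # Thin-plate spline kernel phi(r) = r^2 log r, with phi(0) = 0
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(r > 0, r * r * np.log(r), 0.0)

def rbf_fit(nodes, values):
    """Solve the saddle-point system: interpolation + moment conditions."""
    n = len(nodes)
    r = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=-1)
    Phi = tps(r)
    P = np.hstack([np.ones((n, 1)), nodes])           # linear tail basis {1, x, y}
    A = np.block([[Phi, P], [P.T, np.zeros((3, 3))]])
    sol = np.linalg.solve(A, np.concatenate([values, np.zeros(3)]))
    return sol[:n], sol[n:]                           # kernel and polynomial coefficients

def rbf_eval(x, nodes, alpha, gamma):
    r = np.linalg.norm(nodes - x, axis=-1)
    return tps(r) @ alpha + gamma[0] + gamma[1:] @ x

rng = np.random.default_rng(1)
nodes = rng.uniform(-1, 1, size=(60, 2))
f = lambda p: np.exp(-p[..., 0] ** 2 - p[..., 1] ** 2)
alpha, gamma = rbf_fit(nodes, f(nodes))
x = np.array([0.2, -0.3])
print(abs(rbf_eval(x, nodes, alpha, gamma) - f(x)))  # small interpolation error
```

The zero block in the system matrix encodes the moment conditions, which make the thin-plate spline problem well posed on scattered (non-degenerate) nodes.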
3. Algorithmic Frameworks
3.1 Gradient Projection Solver for BSDEs
The meshfree stochastic optimization framework discretizes time and uses projection-based gradient updates
$$u^{\ell+1} = P_{U_h}\!\big(u^{\ell} - \rho\,\nabla J(u^{\ell})\big),$$
where $U_h$ is the discretized control space. At each time slice and each meshfree spatial node, the coupled forward SDE and backward BSDE are temporally discretized (e.g., by an Euler scheme) and the required conditional expectations are evaluated via Gaussian quadrature with meshfree interpolation of the BSDE adjoint variables (Sun et al., 2021).
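The projected-gradient update itself can be illustrated on a toy one-dimensional problem, with a box constraint standing in for $U$ and a noisy Monte Carlo gradient standing in for the BSDE-based gradient; this is a sketch of the update rule only, not the full solver.

```python
import numpy as np

# Toy instance: minimize J(u) = E[(u - Z)^2] with Z ~ N(0.8, 0.25),
# subject to u in U = [-0.5, 0.5]. The unconstrained minimizer is
# E[Z] = 0.8, so the constrained optimum is the boundary point 0.5.
rng = np.random.default_rng(2)

def project(u, lo=-0.5, hi=0.5):
    # Euclidean projection onto the closed convex set U = [lo, hi]
    return np.clip(u, lo, hi)

def stoch_grad(u, batch=256):
    # Unbiased Monte Carlo estimate of J'(u) = 2 (u - E[Z])
    z = rng.normal(0.8, 0.5, size=batch)
    return np.mean(2.0 * (u - z))

u, rho = 0.0, 0.1
for _ in range(200):
    u = project(u - rho * stoch_grad(u))  # projected-gradient fixed-point iteration
print(u)  # converges to the projected optimum 0.5
```

Because the iteration is a fixed point of the projected update, the iterate sticks to the constraint boundary once the stochastic gradient consistently points outside $U$.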
3.2 Meshfree Trust-Region Methods (STORM)
For unconstrained stochastic optimization, meshfree model-based trust-region methods such as STORM construct random quadratic surrogate models
$$m_k(s) = f_k + g_k^{\top} s + \tfrac{1}{2}\, s^{\top} H_k\, s$$
over randomly sampled points in a trust-region neighborhood $B(x_k, \delta_k)$, ensuring with high probability that the models are "fully linear" (i.e., both function and gradient approximations are sufficiently accurate over the region). Iterates are accepted or rejected based on noisy function-decrease estimates. Probabilistic conditions on model accuracy (probability $\alpha$) and estimate fidelity (probability $\beta$) are imposed; sample sizes are dynamically adapted to maintain high-fidelity model construction as the trust-region radius $\delta_k$ shrinks (Chen et al., 2015).
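A stripped-down STORM-style loop might look as follows. The regression-based quadratic model, box sampling (a simple stand-in for $B(x_k, \delta_k)$), Cauchy-point step (in place of an exact trust-region subproblem solve), test function, and all thresholds are illustrative assumptions, not the cited algorithm verbatim.

```python
import numpy as np

rng = np.random.default_rng(3)

def f_noisy(x, sigma=0.01):
    # Smooth test function with minimizer (1, -0.5), plus additive noise
    return (x[0] - 1) ** 2 + 2 * (x[1] + 0.5) ** 2 + sigma * rng.normal()

def build_model(x, delta, n_samples=30):
    """Regress a quadratic model m(s) = c + g's + 0.5 s'Hs from noisy
    samples drawn in a box around x (stand-in for the trust region)."""
    S = rng.uniform(-delta, delta, size=(n_samples, 2))
    y = np.array([f_noisy(x + s) for s in S])
    A = np.column_stack([np.ones(n_samples), S[:, 0], S[:, 1],
                         S[:, 0] ** 2, S[:, 0] * S[:, 1], S[:, 1] ** 2])
    c = np.linalg.lstsq(A, y, rcond=None)[0]
    g = c[1:3]
    H = np.array([[2 * c[3], c[4]], [c[4], 2 * c[5]]])
    return g, H

def cauchy_step(g, H, delta):
    # Minimize the model along -g within the trust-region radius
    t = delta / np.linalg.norm(g)
    gHg = g @ H @ g
    if gHg > 0:
        t = min(t, (g @ g) / gHg)
    return -t * g

x, delta = np.array([2.0, 2.0]), 0.5
for _ in range(60):
    g, H = build_model(x, delta)
    if np.linalg.norm(g) < 1e-8:
        break
    s = cauchy_step(g, H, delta)
    pred = -(g @ s + 0.5 * s @ H @ s)                    # predicted model decrease
    est_old = np.mean([f_noisy(x) for _ in range(20)])   # averaged noisy estimates
    est_new = np.mean([f_noisy(x + s) for _ in range(20)])
    if (est_old - est_new) / max(pred, 1e-12) > 0.1:     # sufficient decrease: accept
        x, delta = x + s, min(2 * delta, 1.0)
    else:                                                # reject: shrink trust region
        delta *= 0.5
print(x)  # approaches the minimizer (1, -0.5)
```

Shrinking the radius on rejection is what forces the random models to become accurate relative to the remaining optimality gap, mirroring the probabilistic accuracy conditions described above.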
4. Theoretical Convergence and Error Analysis
Rigorous convergence of meshfree stochastic optimization frameworks is established under standard regularity and contraction conditions:
- For meshfree stochastic control: if the time step $\Delta t$ and the fill distance $h$ are refined in a balanced way, solutions of the fully-discrete meshfree BSDE system converge with an error that is first order in $\Delta t$ up to the meshfree interpolation error, and the projected-gradient scheme yields control iterates converging to the optimal control at the same rate (Sun et al., 2021).
- For STORM: if the model probability $\alpha$ and estimate probability $\beta$ are sufficiently large and other regularity conditions hold, then almost sure convergence to first-order stationary points is achieved, i.e., $\lim_{k \to \infty} \|\nabla f(x_k)\| = 0$ with probability 1 (Chen et al., 2015).
The convergence proofs combine error propagation for discretized BSDEs, stochastic approximation, and submartingale arguments. Key estimates involve telescoping error bounds for the BSDEs, quadrature error, Euler-scheme discretization error, and meshfree interpolation error, ensuring first-order convergence when the temporal and spatial step sizes are balanced.
5. Numerical Performance in High Dimensions
Meshfree stochastic optimization methods have been validated on stochastic control and unconstrained minimization test problems in moderate dimensionality. Key outcomes include:
- For stochastic control problems with known closed-form solutions, both MLS and RBF spatial discretizations achieve first-order accuracy in the control variable (the control error scales linearly with the time step $\Delta t$).
- RBF methods with scattered nodes outperform tensor-product polynomial interpolation on full grids as the dimension increases, reaching lower error with significantly reduced runtime; in the highest dimensions tested, meshfree RBF is roughly three times faster and remains accurate where tensor grids become infeasible (Sun et al., 2021).
- For STORM and related meshfree trust-region methods, empirical studies on synthetic and real-world datasets (e.g., logistic regression over MNIST, a9a, covtype) demonstrate superior reliability and convergence rate compared to sample-average and adaptive stochastic gradient methods, even under highly biased (failure) or heavy-tailed noise. Robustness is maintained up to failure probabilities of 0.2 to 0.3 per evaluation (Chen et al., 2015).
6. Robustness to Noise and Outliers
Meshfree stochastic optimization approaches explicitly account for noise present in function evaluations through probabilistic model and estimate quality criteria:
- Under unbiased noise: Sample-averaging at meshfree points ensures model accuracy with high probability; prescribed sample sizes grow inversely with trust-region radius or grid spacing.
- Under biased "failure" noise (where an evaluation may return a large outlier with fixed probability): iterative re-sampling and independent model construction guarantee model quality with probability at least $\alpha$ for the meshfree point set. Global convergence can still be proven for sufficiently low outlier rates (Chen et al., 2015).
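The effect of failure noise, and why robust aggregation over independent samples helps, can be illustrated with a toy experiment. The median aggregate used here is a simple stand-in for the independent re-sampling and model reconstruction of STORM-type methods, not the exact mechanism from the cited work.

```python
import numpy as np

rng = np.random.default_rng(5)

def noisy_eval(true_value, p_fail=0.2, outlier=100.0, sigma=0.1, size=1):
    """Function evaluation subject to 'failure' noise: with probability
    p_fail it returns a large outlier instead of a noisy value."""
    vals = true_value + sigma * rng.normal(size=size)
    fails = rng.random(size) < p_fail
    return np.where(fails, outlier, vals)

samples = noisy_eval(3.0, size=900)
print(samples.mean())      # plain averaging is badly biased by the outliers
print(np.median(samples))  # a robust aggregate recovers a value near 3.0
```

With a 20% failure rate the sample mean is pulled far from the true value, while a median-style aggregate (breakdown point 50%) remains accurate, which is the qualitative reason low-enough outlier rates leave convergence intact.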
This stochastic robustness is a distinguishing feature relative to conventional stochastic-gradient and deterministic meshfree methods.
7. Significance and Future Directions
Meshfree stochastic optimization methods provide a rigorous framework for high-dimensional, derivative-free, and noisy optimization problems that are intractable by conventional mesh-based discretizations. The combination of meshfree spatial approximation (MLS, RBF), adaptive time-stepping, and probabilistic trust-region or projection-based optimization is effective up to moderate state space dimensionality and extends naturally to scattered data and data-driven settings. The theoretical guarantees under minimal noise assumptions and practical scalability suggest substantial applicability across stochastic control, finance, scientific computing, and machine learning (Sun et al., 2021; Chen et al., 2015). A plausible implication is that further methodological advances in meshfree interpolation and scalable random model selection could push the practical limits of stochastic optimization even further, particularly in combination with advanced sampling and quadrature schemes.