- The paper presents a novel method, the Path Integral Optimiser, that performs global optimization via a path-integral formulation of neural Schrödinger-Föllmer diffusion.
- It recasts optimization as a sampling problem: minimizing the Kullback-Leibler divergence between the diffusion's path distribution and a Boltzmann target, simulated via Euler-Maruyama discretization.
- Empirical results show promising optimization performance in high dimensions, while scalability remains a challenge for very large parameter spaces.
Path Integral Optimiser: Global Optimisation via Neural Schrödinger-Föllmer Diffusion
The paper introduces a novel approach to global optimization using diffusion processes inspired by quantum mechanics, specifically the Schrödinger-Föllmer diffusion process. This method, termed the Path Integral Optimizer (PIO), leverages neural networks to approximate diffusion in high-dimensional spaces, aiming to improve the efficiency and efficacy of optimization across complex domains.
Theoretical Framework and Motivation
Optimization in machine learning often involves navigating high-dimensional, non-convex objective landscapes. Conventional methods such as Stochastic Gradient Descent (SGD) and its variants (e.g., Adam, Adagrad) are limited by their reliance on first-order gradient information and their difficulty exploring the full parameter space. Diffusion models, which have demonstrated success in sampling from structured distributions (e.g., denoising diffusion models in image generation), offer theoretical advantages and strong sampling efficiency. This paper proposes applying such diffusion models to optimization tasks.
Methodology
The Path Integral Optimiser is built upon Zhang et al.'s Path Integral Sampler and employs a Boltzmann distribution to frame optimization as a Schrödinger bridge sampling problem. The optimization problem thus becomes one of minimizing the Kullback-Leibler divergence between an initial and target distribution, executed through a neural approximation using Fourier MLPs. The optimizer's theoretical bounds and empirical performance are evaluated.
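In outline (notation ours, not necessarily the paper's): for an objective $f$ and temperature $\gamma > 0$, the Boltzmann transformation turns minimisation of $f$ into sampling from

$$\mu_\gamma(x) \;\propto\; \exp\!\big(-f(x)/\gamma\big),$$

which concentrates on the global minimisers of $f$ as $\gamma \to 0$. The Schrödinger bridge view then trains a neural drift $u_\theta$ by minimising the KL divergence between the path measure $\mathbb{Q}^{u_\theta}$ induced by the controlled diffusion and the target path measure $\mathbb{Q}^\star$ whose terminal distribution is $\mu_\gamma$:

$$\theta^\star \;=\; \arg\min_\theta \, \operatorname{KL}\!\big(\mathbb{Q}^{u_\theta} \,\|\, \mathbb{Q}^\star\big).$$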
Key components of the approach include:
- Neural Schrödinger-Föllmer Diffusion: The process is defined as a drift minimization problem using an identity-variance Itô process, with the drift term approximated by a neural network.
- Path Integral Sampler: The sampler employs Euler-Maruyama discretization and leverages neural networks to approximate the drift component, enabling efficient sampling.
- Boltzmann Transformation: The optimization task is reformulated via Boltzmann transformation to align sampling goals with optimization objectives.
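The simulation loop behind these components can be sketched as follows. This is a minimal illustration, not the paper's implementation: the closed-form toy drift stands in for the trained Fourier-MLP drift approximation, and all names are ours.

```python
import numpy as np

def euler_maruyama(drift, x0, n_steps=100, T=1.0, seed=None):
    """Simulate the identity-variance Ito process dX_t = u(X_t, t) dt + dW_t
    with the Euler-Maruyama scheme: X_{k+1} = X_k + u(X_k, t_k) dt + sqrt(dt) eps.
    In PIO, `drift` would be the neural (Fourier-MLP) approximation u_theta."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    for k in range(n_steps):
        t = k * dt
        x = x + drift(x, t) * dt + np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

# Toy drift pulling trajectories toward the global minimiser (the origin)
# of f(x) = ||x||^2 / 2; a trained network would replace this closed form.
drift = lambda x, t: -x

# Run an ensemble of independent trajectories from a common start.
samples = np.stack([euler_maruyama(drift, np.zeros(2), seed=s) for s in range(256)])
```

The terminal ensemble concentrates near the minimiser; in the full method, the training objective above shapes the drift so that this concentration targets the Boltzmann distribution of the actual objective.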
Results and Implications
The paper presents theoretical guarantees underpinning the optimizer's performance, showing that with suitably small hyperparameters the process converges to global minimizers.
Empirically, PIO exhibits promising optimization performance across tasks with up to 1,247 dimensions, although it struggles with significantly larger parameter spaces, such as those encountered in models with over 15,000 parameters. This indicates potential avenues for scalability improvements, such as enhancing the depth of neural approximation networks and utilizing ensemble methods.
Conclusion and Future Directions
The Path Integral Optimiser represents a noteworthy contribution to the optimization domain within machine learning, marrying the diffusion sampling process with optimization tasks. While current results are constrained by high-dimensional scalability challenges, theoretical guarantees provide a foundation for future refinements. Notable avenues for future exploration include scaling the neural drift approximation, ensemble trajectory strategies, and improved parallelization for performance optimization. This research opens up new possibilities for employing quantum-inspired diffusion processes within the field of machine learning optimization.