
Energy-Weighted Flow Matching: Unlocking Continuous Normalizing Flows for Efficient and Scalable Boltzmann Sampling (2509.03726v1)

Published 3 Sep 2025 in stat.ML and cs.LG

Abstract: Sampling from unnormalized target distributions, e.g. Boltzmann distributions $\mu_{\text{target}}(x) \propto \exp(-E(x)/T)$, is fundamental to many scientific applications yet computationally challenging due to complex, high-dimensional energy landscapes. Existing approaches applying modern generative models to Boltzmann distributions either require large datasets of samples drawn from the target distribution or, when using only energy evaluations for training, cannot efficiently leverage the expressivity of advanced architectures like continuous normalizing flows that have shown promise for molecular sampling. To address these shortcomings, we introduce Energy-Weighted Flow Matching (EWFM), a novel training objective enabling continuous normalizing flows to model Boltzmann distributions using only energy function evaluations. Our objective reformulates conditional flow matching via importance sampling, allowing training with samples from arbitrary proposal distributions. Based on this objective, we develop two algorithms: iterative EWFM (iEWFM), which progressively refines proposals through iterative training, and annealed EWFM (aEWFM), which additionally incorporates temperature annealing for challenging energy landscapes. On benchmark systems, including challenging 55-particle Lennard-Jones clusters, our algorithms demonstrate sample quality competitive with state-of-the-art energy-only methods while requiring up to three orders of magnitude fewer energy evaluations.


Summary

  • The paper introduces EWFM, which trains continuous normalizing flows to sample from unnormalized Boltzmann distributions using only energy function evaluations.
  • It proposes iterative (iEWFM) and annealed (aEWFM) strategies to refine proposals, reduce variance, and enhance sample quality.
  • Empirical results show competitive negative log-likelihoods and significantly fewer energy evaluations compared to state-of-the-art methods.

Energy-Weighted Flow Matching for Scalable Boltzmann Sampling

Introduction

The paper presents Energy-Weighted Flow Matching (EWFM), a framework for training continuous normalizing flows (CNFs) to sample from unnormalized target distributions, specifically Boltzmann distributions, using only energy function evaluations. This addresses a central challenge in scientific computing: generating independent samples from high-dimensional, multi-modal equilibrium distributions where direct sampling is infeasible and trajectory-based methods (e.g., MCMC, MD) suffer from poor mixing due to energy barriers. EWFM enables the use of expressive CNF architectures for Boltzmann sampling without requiring target samples, overcoming limitations of previous approaches that either need large datasets or are restricted to less expressive models.
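As background, a CNF generates samples by integrating its learned vector field from a simple base distribution to the model distribution. A minimal sketch, assuming Euler integration and a standard Gaussian base (both illustrative choices, not specified by the paper):

```python
import numpy as np

def cnf_sample(u_theta, n, d, n_steps=100, seed=0):
    """Sample from a CNF by Euler-integrating the vector field u_theta
    from t=0 (standard Gaussian base) to t=1 (model distribution)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, d))        # base samples
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = np.full(n, i * dt)
        x = x + dt * u_theta(x, t)     # Euler step along the flow
    return x
```

In practice an adaptive ODE solver would replace the fixed-step Euler loop, and computing densities additionally requires integrating the divergence of the vector field, which is the cost discussed later.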

Energy-Weighted Flow Matching Objective

The core innovation is the reformulation of the conditional flow matching (CFM) loss, which typically requires samples from the target distribution, into an importance-weighted objective that can be estimated using samples from an arbitrary proposal distribution. The EWFM objective leverages the known unnormalized Boltzmann density $\exp(-E(x)/T)$ and corrects for the mismatch between the proposal and target via importance weights:

$$\mathcal{L}_{\text{EWFM}}(\theta; \mu_{\text{prop}}) = \mathbb{E}_{t, X_t, X_1}\left[ \frac{w(X_1)}{\mathbb{E}_{X_1' \sim \mu_{\text{prop}}}[w(X_1')]} \, \| u_t^\theta(X_t) - u_t(X_t \mid X_1) \|^2 \right]$$

where $w(x_1) = \exp(-E(x_1)/T)/\mu_{\text{prop}}(x_1)$ and $u_t^\theta$ is the parameterized vector field of the CNF.

Figure 1: Comparison of CFM (left, requiring target samples) and EWFM (right, using proposal samples reweighted by Boltzmann importance weights).

This formulation is mathematically equivalent to the original CFM loss under mild support conditions, ensuring that minimization yields the same optimum as if target samples were available.
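In practice the objective is estimated over a minibatch with self-normalized importance weights. A sketch, assuming the linear (optimal-transport) conditional path $x_t = (1-t)x_0 + t x_1$ with conditional target velocity $x_1 - x_0$ (a common CFM choice; the paper's exact path is an assumption here):

```python
import numpy as np

def ewfm_loss(energy, T, proposal_logpdf, x1, t, x0, u_theta):
    """Self-normalized importance-weighted CFM loss (sketch).

    x1: proposal samples (n, d); x0: base samples (n, d); t: times in (0, 1).
    Assumes the linear path x_t = (1-t) x0 + t x1, whose conditional
    target velocity is u_t(x_t | x1) = x1 - x0.
    """
    # log importance weights: unnormalized Boltzmann log-density minus proposal log-density
    log_w = -energy(x1) / T - proposal_logpdf(x1)
    w = np.exp(log_w - log_w.max())   # stabilize before exponentiating
    w = w / w.sum()                   # self-normalize (normalizing constants cancel)
    xt = (1 - t)[:, None] * x0 + t[:, None] * x1
    target = x1 - x0
    sq_err = np.sum((u_theta(xt, t) - target) ** 2, axis=1)
    return np.sum(w * sq_err)
```

When the proposal already matches the target, the weights become uniform and the estimator reduces to the ordinary CFM minibatch loss, which is the equivalence the section describes.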

Iterative and Annealed EWFM Algorithms

Iterative EWFM (iEWFM)

Direct estimation of the EWFM objective can suffer from high variance in importance weights if the proposal is far from the target. iEWFM mitigates this by iteratively refining the proposal: the current model is used as the proposal for the next training step, progressively improving overlap with the target and stabilizing gradient estimates. The algorithm employs a sample buffer to amortize the cost of CNF density evaluations and energy computations.

Figure 2: iEWFM progressively refines the proposal distribution, reducing importance weight variance and improving coverage of the target.
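The refinement loop can be illustrated with a deliberately simplified stand-in: a 1D Gaussian proposal refit from self-normalized importance weights in place of CNF training (purely illustrative; the actual algorithm retrains the CNF and maintains a sample buffer):

```python
import numpy as np

def iewfm_toy(energy, T=1.0, n_iters=4, n=4000, seed=0):
    """Toy iEWFM loop: each round samples from the current (Gaussian)
    proposal, reweights by the Boltzmann density, and refits the proposal
    from weighted moments -- mimicking bootstrapped proposal refinement."""
    rng = np.random.default_rng(seed)
    mu, sigma = 0.0, 2.0                       # crude initial proposal
    for _ in range(n_iters):
        x = rng.normal(mu, sigma, size=n)      # sample current proposal
        log_q = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)
        log_w = -energy(x) / T - log_q         # log importance weights
        w = np.exp(log_w - log_w.max())
        w /= w.sum()                           # self-normalize
        mu = np.sum(w * x)                     # weighted refit (stand-in for training)
        sigma = np.sqrt(np.sum(w * (x - mu) ** 2))
    return mu, sigma
```

Each round, the refit proposal has greater overlap with the target, so the next round's weights have lower variance; this is the same mechanism that stabilizes the gradient estimates in the full algorithm.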

Annealed EWFM (aEWFM)

For highly complex energy landscapes, the initial proposal may be too poor for effective bootstrapping. aEWFM introduces temperature annealing: training begins at a high temperature (flatter landscape), gradually cooling to the target temperature. This increases the overlap between proposal and target in early stages, further stabilizing training.
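A simple geometric cooling schedule illustrates the idea (the specific schedule is an assumption; the paper only requires starting hot and cooling toward the target temperature):

```python
import numpy as np

def annealing_schedule(t_init, t_target, n_steps):
    """Geometric interpolation from an initial high temperature down to the
    target temperature; each stage's trained model seeds the next stage's
    proposal, so overlap stays high as the landscape sharpens."""
    return t_init * (t_target / t_init) ** np.linspace(0.0, 1.0, n_steps)
```

Higher temperatures flatten $\exp(-E(x)/T)$, so early proposals need only cover the broad structure of the landscape before the modes sharpen.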

Empirical Evaluation

The framework is evaluated on standard Boltzmann sampling benchmarks: GMM-40 (2D, 40-component mixture), DW-4 (8D double-well), LJ-13 (39D Lennard-Jones), and LJ-55 (165D Lennard-Jones). EWFM variants are compared to state-of-the-art energy-only methods: FAB and iDEM.

Key findings:

  • Sample Quality: iEWFM and aEWFM achieve competitive or superior negative log-likelihood (NLL) and Wasserstein distance ($\mathcal{W}_2$) compared to iDEM and FAB, especially on the high-dimensional LJ-13 and LJ-55 systems.
  • Energy Evaluation Efficiency: EWFM variants require up to three orders of magnitude fewer energy evaluations than iDEM, and are comparable to FAB, which is critical for expensive energy functions.
  • Scalability: aEWFM demonstrates robust performance on the challenging LJ-55 system, indicating scalability to very high dimensions.

Figure 3: Qualitative sample quality across benchmarks. iEWFM and aEWFM accurately capture mixture components and energy distributions, with aEWFM performing well even on LJ-55.

Theoretical and Practical Implications

The EWFM framework enables the use of CNFs for Boltzmann sampling in settings where only energy evaluations are available, unlocking the expressivity of advanced architectures for scientific applications. The iterative proposal refinement is theoretically motivated by variance reduction in self-normalized importance sampling, and the annealing strategy is justified by improved overlap in probability mass.

Figure 4: Boltzmann sampling problem illustrated: high-energy barriers separate low-energy regions, making direct sampling challenging.

The main trade-off is the computational cost of CNF density evaluations, which can dominate wall-clock time despite buffer amortization. This is offset by the dramatic reduction in energy evaluations, making the approach attractive for systems where energy computation is the bottleneck.

Limitations and Future Directions

  • Computational Bottleneck: CNF density evaluation remains expensive; mixture model proposals or more efficient density approximations could alleviate this.
  • Intermediate Complexity: On DW-4, iEWFM and aEWFM underperform compared to iDEM and FAB, possibly due to bias in gradient estimates from model proposals.
  • Evaluation Metrics: Current NLL evaluation relies on training a secondary CNF; alternative metrics not requiring model retraining would be preferable.

Future work should include systematic comparison with concurrent methods (e.g., TA-BG, Adjoint Sampling), evaluation on larger molecular systems, and investigation of hybrid approaches incorporating small amounts of target data. Methodological variations such as fine-tuning versus retraining across annealing steps warrant further study.

Conclusion

Energy-Weighted Flow Matching provides a principled, efficient, and scalable approach for training CNFs as Boltzmann generators using only energy evaluations. The iterative and annealed algorithms achieve competitive sample quality with dramatically reduced energy evaluation requirements, particularly on high-dimensional systems. The framework advances the practical applicability of generative modeling for scientific sampling tasks, with several promising directions for further optimization and extension.
