Energy-Guided Diffusion

Updated 19 May 2026

Energy-guided diffusion is a technique that steers generative trajectories using an energy function to encode constraints, rewards, or domain preferences.
Contrastive energy prediction (CEP) learns the intermediate guidance needed during reverse diffusion and is exact in the limit of unlimited data and model capacity.
Practical applications span offline reinforcement learning, image synthesis, and molecular design, with careful tuning required to manage computational overhead and stability.

Energy-guided diffusion refers to a family of sampling, optimization, and policy-generation techniques for diffusion-based generative models in which sample generation is explicitly guided by an energy function. This energy function can encode arbitrary prior knowledge, reward, constraint, or domain-specific preference, steering the generative trajectories toward samples with desired properties. The concept arose from the problem of sampling from distributions of the form $p(x) \propto q(x) \exp(-\beta E(x))$, where $q(x)$ is the distribution of a pretrained diffusion model and $E(x)$ is a (possibly unnormalized) energy. A rigorous theoretical and practical foundation for energy-guided diffusion was established by Lu et al. through the development of contrastive energy prediction (CEP) for exact estimation and sampling in offline RL and related domains (Lu et al., 2023).

1. Mathematical Formulation and Intermediate Guidance

Given an initial data distribution $q_0(x)$ on $\mathbb{R}^d$ and a user-defined energy function $E^0(x) \geq 0$, the energy-guided target is:

$p_0(x) \propto q_0(x) \exp(-\beta E^0(x)), \quad \beta \geq 0$
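As a sanity check on what the tilt does, the sketch below reweights samples from a toy $q_0 = \mathcal{N}(0, 1)$ by $\exp(-\beta E^0(x))$ with the quadratic energy $E^0(x) = (x - a)^2$. For this pair the tilted target is again Gaussian, so the importance-weighted moments can be compared with a closed form. The function name `tilted_moments` and the values of `beta` and `a` are illustrative assumptions, not taken from Lu et al.

```python
import numpy as np

def tilted_moments(samples, energy, beta):
    """Self-normalized importance-sampling estimate of the mean and variance of p_0."""
    log_w = -beta * energy(samples)
    log_w -= log_w.max()              # stabilise before exponentiating
    w = np.exp(log_w)
    w /= w.sum()
    mean = np.sum(w * samples)
    var = np.sum(w * (samples - mean) ** 2)
    return mean, var

rng = np.random.default_rng(0)
beta, a = 1.0, 2.0                    # toy inverse temperature and energy centre
x = rng.standard_normal(200_000)      # samples from q_0 = N(0, 1)
est_mean, est_var = tilted_moments(x, lambda z: (z - a) ** 2, beta)

# Closed form for N(0, 1) tilted by exp(-beta * (x - a)^2)
exact_mean = 2 * beta * a / (1 + 2 * beta)
exact_var = 1 / (1 + 2 * beta)
print(est_mean, exact_mean, est_var, exact_var)
```

Reweighting like this only works when samples from $q_0$ already cover the high-weight region, which is one reason to guide the reverse diffusion instead.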

In practical settings, only $q_0$ is tractable (via pretrained diffusion), and the aim is to sample from $p_0$. A standard approach is to modify the reverse-diffusion score $\nabla_{x_t} \log q_t(x_t)$ by adding an energy-guidance term to approximate the ideal but intractable $\nabla_{x_t} \log p_t(x_t)$. The key theoretical result (Lu et al., 2023) is the explicit form for the intermediate marginals and guidance:

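$p_t(x_t) \propto q_t(x_t) \exp(-\mathcal{E}_t(x_t)), \quad \mathcal{E}_t(x_t) = -\log \mathbb{E}_{q(x_0 \mid x_t)}\left[\exp(-\beta E^0(x_0))\right] \quad (t > 0)$

$\nabla_{x_t} \log p_t(x_t) = \nabla_{x_t} \log q_t(x_t) - \nabla_{x_t} \mathcal{E}_t(x_t)$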

The central technical challenge is that the intermediate energy $\mathcal{E}_t(x_t)$ and its gradient are intractable for arbitrary $E^0$.
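To make the intermediate energy concrete, the sketch below uses the posterior-expectation form $-\log \mathbb{E}_{q(x_0 \mid x_t)}[\exp(-\beta E^0(x_0))]$ from Lu et al. (2023) in one of the few cases where it is tractable: $q_0 = \mathcal{N}(0, 1)$, a Gaussian perturbation $x_t = \alpha x_0 + \sigma \epsilon$, and the quadratic energy $E^0(x) = (x - a)^2$. It compares a Monte Carlo estimate against the closed form. All function names and parameter values are assumptions made for illustration.

```python
import numpy as np

def intermediate_energy_mc(x_t, alpha, sigma, beta, a, n=200_000, seed=0):
    """Monte Carlo estimate of -log E_{q(x0|xt)}[exp(-beta * (x0 - a)^2)]."""
    rng = np.random.default_rng(seed)
    s2 = alpha**2 + sigma**2
    post_mean = alpha * x_t / s2          # posterior mean of x0 given x_t
    post_std = np.sqrt(sigma**2 / s2)     # posterior std of x0 given x_t
    x0 = post_mean + post_std * rng.standard_normal(n)
    return -np.log(np.mean(np.exp(-beta * (x0 - a) ** 2)))

def intermediate_energy_exact(x_t, alpha, sigma, beta, a):
    """Closed form, valid only for this Gaussian prior and quadratic energy."""
    s2 = alpha**2 + sigma**2
    m = alpha * x_t / s2
    v = sigma**2 / s2
    return 0.5 * np.log1p(2 * beta * v) + beta * (m - a) ** 2 / (1 + 2 * beta * v)

mc = intermediate_energy_mc(0.5, alpha=0.8, sigma=0.6, beta=1.0, a=2.0)
exact = intermediate_energy_exact(0.5, alpha=0.8, sigma=0.6, beta=1.0, a=2.0)
print(mc, exact)
```

For a general data distribution the posterior $q(x_0 \mid x_t)$ cannot be sampled this way, which is the gap a learned energy model is meant to fill.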

2. Contrastive Energy Prediction (CEP): Exact Learning of Guidance

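Contrastive energy prediction (CEP) provides a practical objective for learning the time-dependent energy model $f_\phi(x_t, t)$. For each time $t$, one draws $K$ clean samples $x_0^{(1)}, \dots, x_0^{(K)} \sim q_0$ and corresponding noisy versions $x_t^{(i)} = \alpha_t x_0^{(i)} + \sigma_t \epsilon^{(i)}$. Soft labels $\exp(-\beta E^0(x_0^{(i)}))$ are assigned, and $f_\phi$ is trained to match these under a softmax using an InfoNCE-style loss:

$\min_\phi \; \mathbb{E}_{t,\, x_0^{(1:K)},\, \epsilon^{(1:K)}} \left[ -\sum_{i=1}^{K} \exp(-\beta E^0(x_0^{(i)})) \log \frac{\exp(-f_\phi(x_t^{(i)}, t))}{\sum_{j=1}^{K} \exp(-f_\phi(x_t^{(j)}, t))} \right]$

Theorem 2 of Lu et al. demonstrates that, with unlimited data and model capacity, the minimizer satisfies $\nabla_{x_t} f_{\phi^*}(x_t, t) = \nabla_{x_t} \mathcal{E}_t(x_t)$, yielding the exact intermediate guidance up to an additive constant (Lu et al., 2023).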
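A minimal NumPy sketch of a soft-label contrastive loss of this kind, for a single batch of $K$ samples at one fixed time. The labels are weighted by $\exp(-\beta E^0)$ and the prediction is a softmax over the negated model outputs. The names `cep_loss` and `log_softmax` and the toy numbers are assumptions for illustration. A real implementation would average over times and batches and backpropagate through an energy network.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max()                        # shift for numerical stability
    return z - np.log(np.sum(np.exp(z)))

def cep_loss(f_vals, energies, beta):
    """Soft-label contrastive loss for one batch of K noisy samples.

    f_vals[i]   : energy-model output for the i-th noisy sample
    energies[i] : energy of the matching clean sample
    """
    labels = np.exp(-beta * np.asarray(energies, dtype=float))
    log_pred = log_softmax(-np.asarray(f_vals, dtype=float))
    return -np.sum(labels * log_pred)

energies = np.array([0.1, 0.5, 2.0, 3.0])
beta = 1.0
loss_matched = cep_loss(beta * energies, energies, beta)
loss_shifted = cep_loss(beta * energies + 7.0, energies, beta)   # constant offset
loss_uniform = cep_loss(np.zeros(4), energies, beta)
print(loss_matched, loss_shifted, loss_uniform)
```

Within a batch the loss is a cross-entropy, so it is smallest when the softmax of the negated outputs matches the normalized labels, and it is unchanged by adding a constant to every output. That invariance is why the energy is only identified up to an additive constant.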

3. Algorithmic Implementation and Pseudocode

The full energy-guided diffusion sampling procedure is as follows (for ODE or SDE solvers):

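Pretrain the diffusion score model $s_\theta(x_t, t) \approx \nabla_{x_t} \log q_t(x_t)$.
Train the energy model $f_\phi(x_t, t)$ using the CEP objective.
Guided Sampling: For each reverse diffusion step,
- Compute $s_\theta(x_t, t)$.
- Compute $\nabla_{x_t} f_\phi(x_t, t)$, with guidance scale $w$.
- Combine scores: $\tilde{s}(x_t, t) = s_\theta(x_t, t) - w \nabla_{x_t} f_\phi(x_t, t)$.
- Update via solver: $x_{t-\Delta t} \leftarrow \mathrm{Solver}(x_t, \tilde{s}(x_t, t), t)$.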

This procedure supports any sufficiently expressive diffusion architecture and can be specialized to both continuous and discrete data (e.g., image pixels, RL actions). Guidance scale tuning and efficient minibatch construction (precomputing a support set for RL) are recommended to control computational overhead and stability (Lu et al., 2023).
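The guided sampling loop can be sketched on a one-dimensional toy problem in which both the score and the energy gradient are known analytically, so no networks are needed. The assumptions here are a variance-preserving forward process with constant noise rate `B`, $q_0 = \mathcal{N}(0, 1)$, the quadratic energy $(x - A)^2$, and an Euler discretization of the probability-flow ODE. In practice `score_q` and `energy_grad` would be replaced by the pretrained score model and the gradient of the trained energy model. All names are hypothetical.

```python
import numpy as np

BETA, A = 1.0, 2.0      # inverse temperature and energy centre (toy choices)
B = 10.0                # constant noise rate of the VP forward process (assumption)

def alpha_sigma(t):
    alpha = np.exp(-0.5 * B * t)
    return alpha, np.sqrt(1.0 - alpha**2)

def score_q(x, t):
    # q_0 = N(0, 1) stays N(0, 1) under this variance-preserving process.
    return -x

def energy_grad(x, t):
    # Analytic gradient of the intermediate energy for E^0(x) = (x - A)^2;
    # stands in for the gradient of a trained energy model.
    alpha, sigma = alpha_sigma(t)
    m, v = alpha * x, sigma**2
    return 2.0 * BETA * (m - A) * alpha / (1.0 + 2.0 * BETA * v)

def guided_sample(n=20_000, steps=1000, guidance_scale=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)            # start from the terminal prior
    dt = 1.0 / steps
    for k in range(steps, 0, -1):
        t = k * dt
        s = score_q(x, t) - guidance_scale * energy_grad(x, t)   # combined score
        # Euler step of the probability-flow ODE dx/dt = -0.5*B*(x + s), run backwards.
        x = x + 0.5 * B * dt * (x + s)
    return x

samples = guided_sample()
print(samples.mean(), samples.var())
```

For this toy setup the tilted target is $\mathcal{N}(4/3, 1/3)$, so with `guidance_scale=1.0` the sample mean and variance should land close to those values. Other scales change the sampled distribution and no longer correspond to the exact target.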

4. Theoretical Guarantees and Extensions

The CEP framework provides an exact convergence guarantee to the true intermediate energy guidance, in contrast to approximate alternatives such as MSE-based objectives or direct resampling. The key analytical result (Theorems 1 and 2 of Lu et al., 2023) holds under the assumption of unlimited data and model capacity and relies on the softmax normalization in the objective. This foundation has been generalized and connected to constrained RL via analytic approximations and Taylor/MGF expansion in subsequent work, notably AEPO, which provides closed-form intermediate energy expressions for conditional Gaussian diffusion and an analytic flow for the energy-guided path (Hu et al., 3 May 2025).

5. Applications and Empirical Validation

Offline Reinforcement Learning

CEP and its variants achieve state-of-the-art empirical results on D4RL locomotion and AntMaze tasks, consistently outperforming CQL, BCQ, IQL, Diffuser, and Diffusion-QL in both standard and challenging settings. For example, QGPO achieves an average AntMaze score of 78.3 compared to the next best ~74.2. Removing CEP guidance or falling back to approximate alternatives leads to a marked performance drop (Lu et al., 2023).

Image Synthesis

CEP-guided diffusion matches FID, precision, and recall of standard classifier guidance on class-conditional ImageNet synthesis; it also enables continuous energy control such as global color adjustment without distorting perceptual realism (Lu et al., 2023).

Other Domains

Extensions and applications of energy-guided diffusion in other domains include:

Personalized image editing with multi-scale text and image energies (Jiang et al., 6 Mar 2025).
Specific binding molecule generation in SBDD via contrastively trained SE(3)-equivariant energy networks (Gao et al., 2024).
Structure-aware protein ensemble generation under physical force landscape or MM potentials, combining data-driven and physics-based forces (Wang et al., 2024).
Training-free, plug-and-play conditional sampling from pretrained models using externally defined energies (Yu et al., 2023).

6. Limitations, Computational Considerations, and Broader Implications

CEP introduces O(K) overhead in minibatch energy computation, which can be managed by in-support batch construction for RL. Numerical instabilities can arise at high β due to sharply peaked energy labels, necessitating careful normalization and batch sizing (Lu et al., 2023). While exact in the infinite data limit, practical performance depends on capacity, data coverage, and suitable guidance scale tuning.
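The instability at high $\beta$ can be illustrated with batch-normalized soft labels: when every $\exp(-\beta E)$ underflows to zero, a naive normalization divides zero by zero. Subtracting the largest logit before exponentiating avoids this. The sketch below assumes labels are normalized within a batch of $K$ samples. The function names and numbers are illustrative and not taken from Lu et al.

```python
import numpy as np

def soft_labels_naive(energies, beta):
    w = np.exp(-beta * np.asarray(energies, dtype=float))
    return w / w.sum()                    # 0/0 when every weight underflows

def soft_labels_stable(energies, beta):
    logits = -beta * np.asarray(energies, dtype=float)
    logits -= logits.max()                # largest logit becomes 0
    w = np.exp(logits)
    return w / w.sum()

energies = np.array([1000.0, 1001.0, 1002.0])
with np.errstate(invalid="ignore"):       # silence the 0/0 warning for the demo
    naive = soft_labels_naive(energies, beta=10.0)
stable = soft_labels_stable(energies, beta=10.0)
print(naive, stable)
```

Even the stable version puts almost all mass on the lowest-energy sample here, so a larger batch or a smaller $\beta$ may still be needed for a useful training signal.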

The generality of the framework enables extension to arbitrary continuous and discrete energy functions, beyond classifier guidance, at the cost of increased computation. Use in generative modeling, such as controllable image synthesis and action distribution stratification, raises ethical and safety considerations concerning malicious or biased content and requires appropriate safeguards.

In summary, energy-guided diffusion via exact intermediate guidance (CEP and analytic variants) establishes a rigorous, principled mechanism for sampling and optimizing under general energy tilts in diffusion generative models, with validated practical impact across reinforcement learning, computer vision, and scientific ML domains (Lu et al., 2023; Hu et al., 3 May 2025).