
Rectified Guidance in Generative Models

Updated 29 December 2025
  • Rectified Guidance (ReCFG) is a family of theoretically principled, geometry-aware conditioning methods that correct expectation shifts and stability issues in generative models.
  • It introduces techniques like coefficient rectification, Jacobian correction, and anchored guidance to reparameterize guidance vectors for improved marginal consistency.
  • Empirical results show that ReCFG enhances sample fidelity, prompt adherence, and identity preservation with minimal computational overhead compared to traditional guidance schemes.

Rectified Guidance (ReCFG) refers to a family of theoretically principled, geometry-aware conditioning methods for diffusive and flow-based generative models. These approaches resolve fundamental theoretical inconsistencies and stability issues in conventional guidance techniques—such as Classifier Guidance (CG) and Classifier-Free Guidance (CFG)—by either reparameterizing guidance coefficients, correcting vector fields via Jacobian information, or re-anchoring conditional trajectories to learned transport manifolds. ReCFG has been developed in several recent instantiations for diffusion, rectified flow, and flow matching models, each with distinct mathematical foundations and algorithmic frameworks (Sun et al., 23 May 2024, Xia et al., 24 Oct 2024, Gao et al., 31 Jan 2025, Saini et al., 9 Oct 2025).

1. Theoretical Foundations of Rectified Guidance

Naive guidance schemes, including standard CFG and CG, often introduce theory-practice discrepancies. In conditional diffusion models, CFG computes the reverse-time vector field at each step as a weighted linear combination of conditional and unconditional predictions:

s_{t,\gamma}(x_t|c) = \gamma\, s_t(x_t|c) + (1-\gamma)\, s_t(x_t)

where $s_t(x_t|c) = \nabla_{x_t}\log q_t(x_t|c)$ and $s_t(x_t) = \nabla_{x_t}\log q_t(x_t)$. However, if $\gamma \neq 1$, this form induces a nonzero mean in the effective score, violating the reciprocal principle of score-based sampling and causing an expectation shift in the output distribution, which impairs sample fidelity as guidance strength increases (Xia et al., 24 Oct 2024).
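The expectation shift can be seen in a toy one-dimensional setting. The sketch below uses hypothetical linear stand-ins for the conditional and unconditional scores of unit-variance Gaussians (the names `s_cond`, `s_uncond`, and the means are illustrative assumptions, not part of any cited method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the score networks (hypothetical Gaussian setup):
# for a unit-variance Gaussian q_t, the score is linear in x_t, and
# samples from the conditional marginal make s_t(x_t | c) zero-mean.
mu_c, mu_u = 1.0, 0.0               # conditional / unconditional means
s_cond = lambda x: -(x - mu_c)      # s_t(x_t | c)
s_uncond = lambda x: -(x - mu_u)    # s_t(x_t)

x = rng.normal(mu_c, 1.0, size=100_000)  # x_t ~ q_t(. | c)

gamma = 3.0
s_cfg = gamma * s_cond(x) + (1 - gamma) * s_uncond(x)

# The conditional score alone is zero-mean over q_t(. | c) ...
print(round(s_cond(x).mean(), 3))   # ~0.0
# ... but the CFG combination picks up a bias (gamma - 1)(mu_c - mu_u)
print(round(s_cfg.mean(), 3))       # ~2.0
```

With $\gamma = 3$ the effective score carries a constant bias of $(\gamma - 1)(\mu_c - \mu_u) = 2$, which is exactly the expectation shift the rectified coefficients are designed to remove.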

Worse, standard guidance operates under a “scaled marginal” objective—applying the same modification across all timesteps—which is theoretically over-constrained and results in a degenerate diffusion process (Gao et al., 31 Jan 2025). ReCFG addresses these mismatches by either correcting the guidance coefficients (to restore the zero-mean property) or redesigning the guidance field using joint-scaling, endpoint Jacobians, or anchored manifold corrections, as detailed below.

2. Rectified Guidance in Diffusion Models

2.1 Coefficient-Rectified CFG

Rectified CFG (also denoted “ReCFG” in (Xia et al., 24 Oct 2024)) eliminates the expectation shift by independently optimizing the weights for the conditional and unconditional scores. The rectified combination is

s_{t,\gamma_1,\gamma_0}(x_t|c) = \gamma_1\, s_t(x_t|c) + \gamma_0\, s_t(x_t)

with constraints:

  • zero-mean: $\mathbb{E}_{q_t(\cdot|c)}[s_{t,\gamma_1,\gamma_0}(x_t|c)] = 0$
  • sharpening: $\gamma_1 > 1$, $\gamma_0 \le 0$, $\gamma_1 + \gamma_0 \ge 1$
  • closed-form solution: for target strength $w = \gamma_1$,

\gamma_0(c,t) = -w\, R(c,t), \quad R(c,t) = \frac{\mathbb{E}[\varepsilon_\theta(x_t,c,t)]}{\mathbb{E}[\varepsilon_\theta(x_t,t)]}

where $\varepsilon_\theta$ is the network prediction. This correction is efficiently implemented: $R(c,t)$ is pre-computed, and inference uses a lookup per timestep and prompt, with no additional runtime overhead beyond the two forward passes per sampling step already required by CFG (Xia et al., 24 Oct 2024).
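The precompute-then-lookup workflow can be sketched as follows. This is a minimal illustration under strong simplifying assumptions: `eps_cond` and `eps_uncond` are hypothetical linear stand-ins for the noise-prediction network, and $R(c,t)$ is estimated by Monte Carlo over samples from the conditional marginal:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins for the noise-prediction network outputs.
def eps_cond(x, t):    # epsilon_theta(x_t, c, t)
    return 0.8 * x + 0.1 * t

def eps_uncond(x, t):  # epsilon_theta(x_t, t)
    return 0.5 * x + 0.1 * t

def precompute_R(t, n=50_000):
    """Monte Carlo estimate of R(c, t) = E[eps_cond] / E[eps_uncond]."""
    x = rng.normal(1.0, 1.0, size=n)   # samples x_t ~ q_t(. | c)
    return eps_cond(x, t).mean() / eps_uncond(x, t).mean()

w = 3.0                                             # target strength gamma_1
R_table = {t: precompute_R(t) for t in range(10)}   # one entry per timestep

def rectified_eps(x, t):
    gamma1, gamma0 = w, -w * R_table[t]  # closed-form gamma_0(c, t) = -w R(c, t)
    return gamma1 * eps_cond(x, t) + gamma0 * eps_uncond(x, t)
```

By construction, averaging `rectified_eps` over the conditional marginal recovers a (near-)zero mean for any target strength `w`, which is the zero-mean constraint above.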

2.2 Rectified Gradient Guidance (REG)

Rectified Gradient Guidance (REG) (Gao et al., 31 Jan 2025) generalizes rectification to a broader family of guidance functions, improving theoretical consistency by rederiving the optimal guidance field from a joint-scaled distribution objective. Let $R_0(x_0, y)$ be the reward applied at the endpoint, and $E_t(x_t, y)$ the expected reward over future trajectories,

E_t(x_t, y) = \int p_\theta(x_0 | x_t, y)\, R_0(x_0, y)\, dx_0

The optimal (but intractable) guidance field is

\bar{\epsilon}^\star_{\theta, t} = \epsilon_{\theta, t} - \sqrt{1-\bar{\alpha}_t}\, \nabla_{x_t} \log E_t(x_t, y)

In practice, $E_t$ is approximated using the diagonal part of the local reward and the Jacobian of the sample-to-endpoint mapping, yielding the REG update:

\bar{\epsilon}^{\rm REG}_{\theta, t} = \epsilon_{\theta, t} - \sqrt{1-\bar{\alpha}_t}\, \nabla_{x_t}\log R_t(x_t, y) \odot \left[ 1 - \sqrt{1-\bar{\alpha}_t}\, \frac{\partial}{\partial x_t}\left(\mathbf{1}^T \epsilon_{\theta, t}\right) \right]

This approach consistently reduces the discrepancy between practical and theoretically optimal guidance, at a modest overhead cost (one extra backward pass per sampling step) (Gao et al., 31 Jan 2025).
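The REG update can be sketched numerically. In a real model the gradient and the $\mathbf{1}^T\epsilon_{\theta,t}$ Jacobian term would come from autodiff backward passes; the sketch below substitutes hypothetical placeholders (`eps_theta`, `log_reward`) and finite differences, purely to make the structure of the correction concrete:

```python
import numpy as np

# Hypothetical stand-ins: a noise predictor and a per-sample log-reward.
def eps_theta(x):
    return np.tanh(x)                     # placeholder epsilon_theta(x_t)

def log_reward(x):
    return -0.5 * np.sum((x - 1.0) ** 2)  # placeholder log R_t(x_t, y)

def grad_fd(f, x, h=1e-5):
    """Elementwise central finite-difference gradient of a scalar-valued f."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def reg_update(x, abar):
    root = np.sqrt(1.0 - abar)
    grad_logR = grad_fd(log_reward, x)
    # d/dx_i of sum_j eps_j(x): the "1^T eps" Jacobian term in REG
    jac_term = grad_fd(lambda z: eps_theta(z).sum(), x)
    return eps_theta(x) - root * grad_logR * (1.0 - root * jac_term)

x_t = np.array([0.2, -0.4, 1.3])
print(reg_update(x_t, abar=0.5))
```

The bracketed factor rescales the reward gradient elementwise, which is what distinguishes REG from plain classifier guidance; in practice it is obtained with one extra backward pass rather than finite differences.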

3. Rectified Guidance for Rectified Flows and Flow Matching

3.1 Anchored Classifier Guidance for Rectified Flows

Rectified Classifier Guidance (also denoted “ReCFG” in (Sun et al., 23 May 2024)) is designed specifically for flow-based ODE generative models employing rectified flows. The method transforms test-time classifier guidance into a fixed-point problem on the clean endpoint $z_1$, regularized by anchoring the guided trajectory to a reference (unguided) rectified flow:

\hat{z}_1 = z_1 + s \cdot J \cdot \nabla_{z_1} \log p(c | z_1), \quad J := \nabla_{z_0} z_1

where $J$ is the linearized transport Jacobian and $p(c | z_1)$ is an off-the-shelf image discriminator. Under mild Lipschitz assumptions, this fixed-point map is a contraction for sufficiently small $s$, ensuring convergence and stability. Piecewise updating and local linearization are used in practice (Sun et al., 23 May 2024).
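A one-dimensional sketch shows why the anchored map converges. Here the transport Jacobian `J`, the discriminator mode `z_star`, and the Gaussian form of `grad_log_p` are all illustrative assumptions standing in for a frozen rectified flow and an off-the-shelf classifier:

```python
# Hypothetical 1-D stand-ins: a frozen transport Jacobian J and an
# off-the-shelf discriminator log p(c | z_1) with a single mode at z_star.
J = 0.9                      # linearized d z_1 / d z_0 (scalar here)
z_star = 2.0

def grad_log_p(z1):          # grad_{z_1} log p(c | z_1) for a Gaussian mode
    return -(z1 - z_star)

def anchored_guidance(z1_ref, s, n_iters=100):
    """Fixed-point iteration z1 <- z1_ref + s * J * grad_log_p(z1).

    Anchoring to the unguided endpoint z1_ref keeps the iterate near the
    reference trajectory; for |s * J| < 1 the map is a contraction.
    """
    z1 = z1_ref
    for _ in range(n_iters):
        z1 = z1_ref + s * J * grad_log_p(z1)
    return z1

z1_hat = anchored_guidance(z1_ref=0.0, s=0.5, n_iters=100)
print(round(z1_hat, 4))      # ~0.6207, the unique fixed point
```

With $sJ = 0.45 < 1$ the iterates converge geometrically to the fixed point $z_1 = sJ\,z_\star / (1 + sJ)$, pulled partway from the unguided endpoint toward the discriminator's mode, mirroring the stability claim above.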

This stabilization enables training-free identity personalization in rectified flows, avoiding the need for noise-aware classifiers and maintaining endpoint fidelity over diverse personalization prompts and discriminators.

3.2 Geometry-Aware Rectified-CFG++ for Flow Matching

Rectified-CFG++ (Saini et al., 9 Oct 2025) extends rectified guidance to general flow-matching architectures. Each ODE solver step consists of a predictor–corrector pair:

  • Predictor: take a half-interval step along the conditional velocity $v_\theta(x_t, t, y)$.
  • Corrector: interpolate between the conditional and unconditional velocities at the intermediate point using a scheduled weight $\alpha(t)$:

\hat{v}_t = v^c_t + \alpha(t)\, \left( v^c_{t-\frac{\Delta t}{2}} - v^u_{t-\frac{\Delta t}{2}} \right)

Theoretical guarantees include marginal consistency (the marginal flow is unbiased with respect to the data manifold) and bounded deviation from the data manifold, proportional to $\max_t \alpha(t)$ and the velocity difference. This approach ensures stability across strong guidance scales and suppresses off-manifold drift, which is especially acute in deterministic rectified flow models when naïvely applying CFG (Saini et al., 9 Oct 2025).
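One predictor–corrector step can be sketched as below. The velocity fields, the linear schedule $\alpha(t) = 0.7t$, and the integration direction ($t$ from 0 to 1) are illustrative assumptions; the actual schedule and solver conventions follow Saini et al. (9 Oct 2025):

```python
# Hypothetical toy velocity fields for a flow-matching model (the real
# v_theta(x, t, y) would be a network; here simple linear fields).
def v_cond(x, t):
    return 2.0 - x               # conditional velocity toward a class mode

def v_uncond(x, t):
    return 0.0 - x               # unconditional velocity toward the prior mean

def alpha(t):
    return 0.7 * t               # scheduled guidance weight alpha(t) (assumed)

def rectified_cfg_pp_step(x, t, dt):
    """One predictor-corrector step of the Rectified-CFG++ scheme (sketch)."""
    # Predictor: half-interval step along the conditional velocity only
    x_mid = x + 0.5 * dt * v_cond(x, t)
    t_mid = t + 0.5 * dt
    # Corrector: guided velocity built from the intermediate point
    v_hat = v_cond(x, t) + alpha(t) * (v_cond(x_mid, t_mid) - v_uncond(x_mid, t_mid))
    return x + dt * v_hat

x, t, dt = 0.1, 0.0, 0.05
for _ in range(20):              # integrate t from 0 to 1
    x = rectified_cfg_pp_step(x, t, dt)
    t += dt
print(round(x, 3))
```

Because the guidance difference is evaluated at the predictor's intermediate point rather than at $x_t$ itself, the correction stays aligned with the local transport direction, which is the geometric intuition behind the bounded-deviation guarantee.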

4. Comparative Empirical Evaluation

Empirical evaluations consistently demonstrate that Rectified Guidance methods yield improvements over standard guidance in image fidelity, prompt adherence, and stability, often with minimal inference overhead:

| Model | Method | FID ↓ | CLIP Score ↑ | Comment |
|---|---|---|---|---|
| EDM2 (ImageNet) | CFG | 5.59 | — | Baseline (w = 3.0) |
| EDM2 (ImageNet) | ReCFG | 4.84 | — | Rectified coefficients (Xia et al., 24 Oct 2024) |
| Stable Diffusion 3 | CFG | 156.6 | 0.209 | 5 steps, w = 7.5 |
| Stable Diffusion 3 | ReCFG | 140.9 | 0.229 | +9.6% CLIP, −10% FID (Xia et al., 24 Oct 2024) |
| SD 3.5 / Flux | CFG | 20.29 / 37.86 | 0.3506 / 0.3351 | Baseline (ω = 3.0) |
| SD 3.5 / Flux | Rectified-CFG++ | 20.22 / 32.23 | 0.3497 / 0.3493 | 5–15% FID gain (Saini et al., 9 Oct 2025) |

In rectified flows, anchored classifier guidance achieves superior identity preservation on CelebA-HQ (identity ≃ 0.593 vs. 0.581 for state-of-the-art baselines), with robust handling of multiple identities and styles (Sun et al., 23 May 2024). REG shows similar consistent gains (ΔFID ≈ 0.3) across diverse class- and text-conditional models (Gao et al., 31 Jan 2025).

5. Practical Implementation and Limitations

Rectified Guidance methods are designed for drop-in adoption in existing workflows. Coefficient rectification (ReCFG (Xia et al., 24 Oct 2024)) requires precomputing a lookup table of means for each prompt and timestep. REG (Gao et al., 31 Jan 2025) introduces a backprop-based correction factor, adding one backward pass per sampling step at inference. Anchored guidance (Sun et al., 23 May 2024) and predictor-corrector schemes (Saini et al., 9 Oct 2025) are compatible with any off-the-shelf discriminators or velocity field predictors, and bring negligible to moderate computational overhead.

Practical issues may arise in open-vocabulary text models for coefficient lookup, memory/compute in very large networks (REG’s extra backward), and out-of-manifold drift in non-rectified flows (alleviated by geometry-aware correction in Rectified-CFG++). All methods assume sufficiently accurate underlying unconditional and conditional predictors, and that transport trajectories can be locally linearized or reasonably approximated.

6. Significance, Variants, and Extensions

Rectified Guidance reconciles conditional generative modeling with foundational probability flows, yielding:

  • Theoretical alignment with time-reversible diffusion and flow matching.
  • Closed-form correction for expectation shift and marginal consistency.
  • Tunable, precomputed, or data-driven guidance schedules and weights.
  • Empirical robustness to strong guidance strengths, out-of-manifold deviation, and prompt-drift artifacts.

Variants include joint-scaling objectives, Jacobian-corrected vector fields, fixed-point-anchored personalization, and multi-modal predictor–corrector frameworks. These methodologies generalize to continuous normalizing flows and are extensible to spatio-temporal and multimodal domains. A plausible implication is increased reliability and sample quality for high-fidelity conditional generation tasks—including personalized and compositional image synthesis, video, and audio generation (Sun et al., 23 May 2024, Xia et al., 24 Oct 2024, Gao et al., 31 Jan 2025, Saini et al., 9 Oct 2025).
