
SoFlow: Direct Generative ODE Modeling

Updated 18 December 2025
  • SoFlow is a generative modeling framework that directly learns the closed-form solution of the velocity ODE underlying diffusion-based models, enabling efficient one- or few-step data generation.
  • It utilizes a novel parameterization with complementary Flow Matching and Solution Consistency losses and leverages Diffusion Transformers in VAE latent space for superior performance.
  • The method significantly reduces GPU memory usage and training time while achieving competitive FID scores compared to traditional multi-step diffusion and GAN approaches.

Solution Flow Models (SoFlow) constitute a generative modeling framework that directly learns the closed-form solution of the velocity ordinary differential equation (ODE) underlying diffusion-based models, enabling efficient one-step or few-step data sample generation. By explicitly modeling the mapping from a latent prior to data in a single network pass, SoFlow overcomes the inefficiency of traditional multi-step denoising approaches. The approach is characterized by a novel parameterization of the generative ODE’s solution, a pair of complementary loss functions—Flow Matching and Solution Consistency—and an architecture leveraging Diffusion Transformers (DiT) in VAE latent space, achieving state-of-the-art performance among one-step generative models on ImageNet 256×256 (Luo et al., 17 Dec 2025).

1. Mathematical Structure of the SoFlow Framework

SoFlow starts from a continuous interpolant ("noising process") bridging data $x_0 \sim p_{\rm data}$ and a tractable prior $x_1 \sim \mathcal{N}(0, I)$:

$$x_t = \alpha_t\,x_0 + \beta_t\,x_1, \qquad t \in [0, 1]$$

with $\alpha_0 = 1$, $\beta_0 = 0$, $\alpha_1 = 0$, $\beta_1 = 1$, and $\alpha, \beta \in C^1$. This yields a marginal velocity field

$$v(x_t, t) = \mathbb{E}_{p(x_0, x_1 \mid x_t)}\left[\alpha_t' x_0 + \beta_t' x_1\right]$$

defining the generative ODE,

$$\frac{dX(t)}{dt} = v\big(X(t), t\big), \qquad X(1) = x_1 \sim \mathcal{N}(0, I)$$

to be solved backward in time. Rather than numerically integrating this ODE, SoFlow directly learns its solution function:

$$f(x_t, t, s) = X(s) \quad \text{where } X(\cdot) \text{ solves } \frac{dX}{du} = v(X, u), \quad X(t) = x_t,$$

satisfying

$$f(x_t, t, t) = x_t, \qquad \partial_s f(x_t, t, s) = v\big(f(x_t, t, s), s\big)$$

for $0 \leq s \leq t \leq 1$. Thus $f(\cdot, \cdot, \cdot)$ maps any $x_t$ to $x_s$ in closed form, fundamentally distinguishing SoFlow from velocity-based diffusion and flow-matching models (Luo et al., 17 Dec 2025).
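As a concrete check of these boundary conditions, the sketch below (an illustration, not the paper's code) works along a single conditional straight-line path under the linear schedule, where the ODE solution is available in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.normal(size=4)   # data sample
x1 = rng.normal(size=4)   # prior sample

def interp(t):
    # linear schedule: alpha_t = 1 - t, beta_t = t
    return (1 - t) * x0 + t * x1

def cond_solution(x_t, t, s):
    # Along one conditional straight-line path the velocity is
    # alpha'_t x0 + beta'_t x1 = x1 - x0, so the ODE solution is exact:
    # X(s) = x_t + (x1 - x0) * (s - t).
    return x_t + (x1 - x0) * (s - t)

t, s = 0.7, 0.2
x_t = interp(t)
assert np.allclose(cond_solution(x_t, t, t), x_t)        # f(x_t, t, t) = x_t
assert np.allclose(cond_solution(x_t, t, s), interp(s))  # maps x_t to x_s
```

The learned $f_\theta$ must reproduce this mapping for the *marginal* velocity field, where no closed form exists, which is what the losses in the next section enforce.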

2. Loss Function Design: Flow Matching and Solution Consistency

The parametric solution map is

$$f_\theta(x_t, t, s) = a(t, s)\,x_t + b(t, s)\,F_\theta(x_t, t, s)$$

where $a(t, s)$ and $b(t, s)$ are known coefficient functions (e.g., Euler or trigonometric parameterizations), $F_\theta$ is a neural network, and $a(t, t) = 1$, $b(t, t) = 0$. The Flow Matching loss anchors the network's instantaneous velocity:

$$\mathcal{L}_{\rm FM}(\theta) = \mathbb{E}_{t, x_0, x_1}\left[\frac{w_{\rm FM}(t)}{n}\left\|\partial_2 a(t, t)\,x_t + \partial_2 b(t, t)\,F_\theta(x_t, t, t) - (\alpha_t' x_0 + \beta_t' x_1)\right\|^2\right]$$

with $w_{\rm FM}(t)$ an adaptive weight and $t$ sampled from a logit-normal distribution. The correct instantaneous velocity is enforced via analytic differentiation of the solution map at $s = t$.
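Assuming the Euler parameterization means $a(t,s) = 1$, $b(t,s) = s - t$ (a reading consistent with the linear schedule reported in Section 3, not stated explicitly here), the derivatives at $s = t$ are $\partial_2 a = 0$ and $\partial_2 b = 1$, and the loss collapses to an ordinary flow-matching residual. A minimal NumPy sketch, with a hypothetical oracle standing in for the DiT network:

```python
import numpy as np

def fm_loss(F_theta, x0, x1, t, w_fm=1.0):
    """Flow Matching loss under the (assumed) Euler parameterization
    f(x_t, t, s) = x_t + (s - t) * F_theta(x_t, t, s), for which
    d/ds a(t,s)|_{s=t} = 0 and d/ds b(t,s)|_{s=t} = 1, so the loss is
    w_fm / n * || F_theta(x_t, t, t) - (alpha'_t x0 + beta'_t x1) ||^2."""
    n = x0.size
    x_t = (1 - t) * x0 + t * x1   # linear schedule
    target = x1 - x0              # alpha'_t = -1, beta'_t = +1
    resid = F_theta(x_t, t, t) - target
    return w_fm * float(resid @ resid) / n

rng = np.random.default_rng(1)
x0, x1 = rng.normal(size=4), rng.normal(size=4)
oracle = lambda x, t, s: x1 - x0  # hypothetical network that nails the velocity
assert fm_loss(oracle, x0, x1, t=0.3) == 0.0
```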

Classifier-Free Guidance (CFG) is incorporated by interpolating class-conditional and unconditional velocity estimates during loss computation.
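The interpolation is presumably the standard CFG extrapolation; the paper's exact velocity-mix parameterization may differ. A minimal sketch of the common form:

```python
import numpy as np

def cfg_velocity(v_cond, v_uncond, w):
    # Standard classifier-free guidance: extrapolate from the unconditional
    # estimate toward the conditional one with strength w. (Assumed form;
    # the paper's velocity-mix parameter m is not modeled here.)
    return v_uncond + w * (v_cond - v_uncond)

v_c = np.array([1.0, 2.0])
v_u = np.array([0.5, 1.0])
assert np.allclose(cfg_velocity(v_c, v_u, 1.0), v_c)  # w = 1: conditional only
assert np.allclose(cfg_velocity(v_c, v_u, 0.0), v_u)  # w = 0: unconditional
```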

To guarantee correct solution mapping over finite intervals, SoFlow introduces a Solution Consistency loss:

$$\mathcal{L}_{\rm SCM}(\theta) = \mathbb{E}_{t, l, s, x_0, x_1}\left[\frac{w_{\rm SCM}(t, l, s)}{n}\left\|f_\theta(x_t, t, s) - f_\theta^-\big(x_t + (\alpha_t' x_0 + \beta_t' x_1)(l - t),\, l,\, s\big)\right\|^2\right]$$

where $f_\theta^-$ is a stop-gradient copy of the network and $w_{\rm SCM}(t, l, s)$ is an adaptive weight. The final objective is a weighted sum

$$\mathcal{L}(\theta) = \lambda\,\mathcal{L}_{\rm FM}(\theta) + (1 - \lambda)\,\mathcal{L}_{\rm SCM}(\theta)$$

with $\lambda = 0.75$. Notably, $\mathcal{L}_{\rm SCM}$ involves no Jacobian-vector products, yielding superior training efficiency compared to flow-anchored objectives (Luo et al., 17 Dec 2025).
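A minimal NumPy sketch of the consistency term (linear schedule and a hypothetical linear stand-in network; `W_stop` plays the role of the stop-gradient copy $f_\theta^-$) makes the efficiency claim concrete: the loss needs only two forward passes and no Jacobian-vector products.

```python
import numpy as np

def f_theta(x, t, s, W):
    # Assumed Euler parameterization with a hypothetical linear stand-in
    # for the DiT network: f(x, t, s) = x + (s - t) * F(x)
    return x + (s - t) * (W @ x)

def scm_loss(W, W_stop, x0, x1, t, l, s, w_scm=1.0):
    """Solution Consistency: f_theta started at (x_t, t) must agree with the
    stop-gradient copy started at (x_l, l), where x_l is reached from x_t by
    following the conditional velocity alpha'_t x0 + beta'_t x1 = x1 - x0."""
    n = x0.size
    x_t = (1 - t) * x0 + t * x1
    x_l = x_t + (x1 - x0) * (l - t)   # one conditional Euler step t -> l
    resid = f_theta(x_t, t, s, W) - f_theta(x_l, l, s, W_stop)
    return w_scm * float(resid @ resid) / n

rng = np.random.default_rng(2)
x0, x1 = rng.normal(size=3), rng.normal(size=3)
W = rng.normal(size=(3, 3))
# degenerate interval l = t: both branches see the same input, loss vanishes
assert scm_loss(W, W, x0, x1, t=0.6, l=0.6, s=0.1) == 0.0
```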

3. Model Architecture and Implementation Protocols

SoFlow adopts the Diffusion Transformer (DiT) backbone, operating in VAE latent space (32×32×4) for ImageNet 256×256 generation. Model variants include B/2 (131M), M/2 (308M), L/2 (459M), and XL/2 (676M), all with patch size 2×2. Training from scratch uses a batch size of 256 for 240 epochs with the AdamW optimizer (learning rate $1 \times 10^{-4}$, betas $(0.9, 0.99)$), no weight decay or learning-rate decay, and an EMA decay of 0.9999.

Hyperparameters:

  • Time sampling: logit-normal for both losses ($(\mu_{\rm FM}, \sigma_{\rm FM}) = (-0.2, 1.0)$ for $\mathcal{L}_{\rm FM}$; $(\mu_t, \sigma_t) = (0.2, 0.8)$ and $(\mu_s, \sigma_s) = (-1.0, 0.8)$ for $\mathcal{L}_{\rm SCM}$).
  • Noising schedule: linear ($\alpha_t = 1 - t$, $\beta_t = t$) with Euler parameterization.
  • CFG strength $w$ and velocity-mix $m$ are tuned per model size; $w$ decays from 2.5/2.0 to 1.0 for large $t$.
  • CIFAR-10 experiments use a U-Net backbone, RAdam, batch size 1024, 800K steps, and otherwise analogous settings (Luo et al., 17 Dec 2025).
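Logit-normal time sampling simply squashes a Gaussian through a sigmoid, so all sampled times land strictly inside $(0, 1)$. A small sketch with the $\mathcal{L}_{\rm FM}$ parameters listed above:

```python
import numpy as np

def sample_logit_normal(mu, sigma, size, rng):
    # t = sigmoid(z) with z ~ N(mu, sigma^2); values always fall in (0, 1),
    # with the median at sigmoid(mu)
    z = rng.normal(mu, sigma, size)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
t_fm = sample_logit_normal(-0.2, 1.0, 10_000, rng)   # L_FM setting
assert np.all((t_fm > 0.0) & (t_fm < 1.0))
# median should sit near sigmoid(-0.2) ~ 0.45
assert abs(np.median(t_fm) - 1.0 / (1.0 + np.exp(0.2))) < 0.02
```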

4. Empirical Performance and Benchmarks

On ImageNet 256×256, SoFlow sets new FID-50K standards among one-step generative models across all tested DiT model scales:

Method          Params   1-NFE FID
--------------  -------  ---------
MeanFlow B/2    131 M    6.17
SoFlow B/2      131 M    4.85
MeanFlow M/2    308 M    5.01
SoFlow M/2      308 M    3.73
MeanFlow L/2    459 M    3.84
SoFlow L/2      459 M    3.20
MeanFlow XL/2   676 M    3.43
SoFlow XL/2     676 M    2.96

For two function evaluations, SoFlow XL/2 achieves 2.66 FID (vs. 2.93 for MeanFlow XL/2). These results are competitive with multi-step diffusion, autoregressive, and GAN methods at comparable or lower NFE, with SoFlow’s performance realized at significantly reduced inference cost. SoFlow consistently outperforms MeanFlow, the previous strongest one-step baseline (Luo et al., 17 Dec 2025).

5. Sampling, Inference Efficiency, and Practical Implications

SoFlow’s generative sampling proceeds as:

  1. Sample $x_1 \sim \mathcal{N}(0, I)$;
  2. Produce $x_0 = f_\theta(x_1, 1, 0)$ via a single forward pass.

Optional few-step sampling is possible by perturbing $x_0$ and recursively invoking the solution map. Because the Solution Consistency loss eliminates JVPs, training uses 31% less peak GPU memory and runs 23% faster than MeanFlow on H100 GPUs. SoFlow also inherits the computational efficiency of state-of-the-art attention kernels used in the DiT backbone (Luo et al., 17 Dec 2025).
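The procedure above can be sketched as follows. The restart schedule and re-noising rule are illustrative assumptions, and `demo_f` is the exact solution map of a toy ODE $v(x, t) = -x$, not a trained model:

```python
import numpy as np

def sample(f_theta, shape, restart_times=(), rng=None):
    """One-step sampling x0 = f_theta(x1, 1, 0), optionally refined by
    re-noising x0 and re-applying the map (sketch of the few-step variant;
    the actual restart schedule is an assumption)."""
    rng = rng if rng is not None else np.random.default_rng()
    x1 = rng.normal(size=shape)        # x1 ~ N(0, I)
    x0 = f_theta(x1, 1.0, 0.0)         # single forward pass
    for t in restart_times:            # e.g. (0.5,) gives 2 total NFE
        eps = rng.normal(size=shape)
        x_t = (1 - t) * x0 + t * eps   # perturb with the linear schedule
        x0 = f_theta(x_t, t, 0.0)      # map back to data
    return x0

# Toy stand-in: for v(x, t) = -x the exact solution map from time t to s
# is X(s) = x * exp(t - s).
demo_f = lambda x, t, s: x * np.exp(t - s)
out = sample(demo_f, (4,), restart_times=(0.5,), rng=np.random.default_rng(0))
assert out.shape == (4,) and np.all(np.isfinite(out))
```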

6. Limitations, Extensions, and Forward-Looking Directions

SoFlow is currently best suited to scenarios demanding minimal NFE, with a tradeoff remaining at ultra-low NFE budgets compared to deep multi-step diffusion. Several extension directions are highlighted:

  • Optimization of noising/interpolant schedules.
  • Improved weighting schemes or variance reduction within the loss.
  • Application to text-to-image, video, or hybrid few-step regimes.
  • Empirical exploration of higher NFE hybrids (2–4 steps) for further FID improvements.

SoFlow’s formulation offers a platform for rapid progress in efficient generative modeling, with a unified framework that supports precise velocity field learning, CFG integration, and closed-form mapping from prior to data (Luo et al., 17 Dec 2025).


For the distinct "SoFlow" (Semi-dilute-Flow) model of polymer coil compression in Couette flow, see (Dunstan, 2014); that framework concerns polymer physics, not generative learning.
