AYF-LMD: Lagrangian Map Distillation
- The paper introduces AYF-LMD, which leverages two-time neural operators to distill flow maps that approximate continuous probability paths between noise and data distributions.
- The methodology employs a Lagrangian perspective with a specialized distillation loss to enforce map-velocity consistency, enabling rapid convergence and robust few-step sampling.
- Extensive experiments on ImageNet and text-to-image benchmarks demonstrate state-of-the-art performance, outperforming traditional flow matching and consistency models.
Lagrangian Map Distillation (AYF-LMD) is a continuous-time generative modeling distillation framework designed to produce neural flow maps that efficiently and accurately approximate probability path dynamics between noise and data distributions. Developed within the Align Your Flow (AYF) methodology, AYF-LMD leverages the Lagrangian perspective to train two-time neural operators, enabling high sample quality in few sampling steps for both unconditional and conditional tasks, including high-resolution image and text-to-image synthesis (Sabour et al., 17 Jun 2025, Boffi et al., 2024).
1. Flow Maps and Two-Time Operators
AYF-LMD builds upon the formulation of two-time flow maps associated with time-dependent velocity fields. For a dynamical probabilistic process with state $x_t$ at time $t$ evolving according to $\frac{dx_t}{dt} = v(x_t, t)$, the flow map $f(x_t, t, s)$ aims to transport $x_t$ at time $t$ directly to the solution $x_s$ at any time $s$, respecting $f(x_t, t, t) = x_t$. In practice, $f_\theta$ is parameterized as
$$f_\theta(x_t, t, s) = x_t + (s - t)\, F_\theta(x_t, t, s),$$
where $F_\theta$ encodes the learned average velocity over $[t, s]$ (Sabour et al., 17 Jun 2025).
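As a minimal sketch of this parameterization, the snippet below uses a toy one-dimensional ODE $dx/dt = -a x$, whose flow map is known in closed form, in place of a learned network. The names `average_velocity` and `flow_map`, the constant `A`, and the toy dynamics are illustrative assumptions and are not taken from AYF.

```python
import math

A = 1.5  # decay rate of the toy ODE dx/dt = -A * x (assumption for illustration)

def average_velocity(x, t, s):
    """Stand-in for a learned F_theta: exact average velocity of the toy ODE over [t, s]."""
    if s == t:
        return -A * x  # limit s -> t is the instantaneous velocity
    return x * (math.exp(-A * (s - t)) - 1.0) / (s - t)

def flow_map(x, t, s):
    """Two-time map in the form f(x, t, s) = x + (s - t) * F(x, t, s)."""
    return x + (s - t) * average_velocity(x, t, s)

x_t = 2.0
jumped = flow_map(x_t, 1.0, 0.25)           # single jump from t = 1.0 to s = 0.25
exact = x_t * math.exp(-A * (0.25 - 1.0))   # closed-form ODE solution at s = 0.25
identity = flow_map(x_t, 0.6, 0.6)          # boundary condition f(x, t, t) = x
```

Because the factor $(s - t)$ multiplies the network output, the boundary condition holds for any choice of `average_velocity`, which is the main reason for writing the map in this form.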
2. Lagrangian Map Distillation Objective
The Lagrangian Map Distillation (LMD) loss enforces the neural flow map to satisfy the trajectory-level ODE:
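$$\partial_s f_\theta(x_t, t, s) = v\big(f_\theta(x_t, t, s), s\big).$$
The AYF-LMD objective is formalized as:
$$\mathcal{L}_{\text{AYF-LMD}}(\theta) = \mathbb{E}_{x_t, t, s}\Big[\, w(t, s)\, \big\| f_\theta(x_t, t, s) - \mathrm{ODE}_{s' \to s}\big[f_{\theta^-}(x_t, t, s')\big] \big\|_2^2 \,\Big],$$
where $\mathrm{ODE}_{s' \to s}$ denotes a one-step Euler integration from $s'$ to $s$ under the velocity field $v$, $s' = s + \epsilon (t - s)$ for a small $\epsilon > 0$, $\theta^-$ denotes stop-gradient parameters, and the expectation is over the data and time distributions with weighting $w(t, s)$ (Sabour et al., 17 Jun 2025).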
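The discrete objective can be illustrated for a single sample. The sketch below assumes a toy teacher velocity $v(x, s) = -a x$, a hypothetical one-parameter student map, and an intermediate time $s' = s + \epsilon (t - s)$; all function names and constants are illustrative, and the stop-gradient is only indicated in a comment because no automatic differentiation is used here.

```python
import math

A = 1.5  # decay rate of the toy teacher velocity (assumption)

def teacher_velocity(x, s):
    return -A * x

def student_map(theta, x, t, s):
    # Hypothetical student; matches the teacher's exact flow map when theta == A.
    return x * math.exp(-theta * (s - t))

def lmd_residual(theta, x_t, t, s, eps=1e-3):
    """Squared error between the student jump t -> s and the Euler target."""
    s_prime = s + eps * (t - s)                # intermediate time slightly closer to t
    y = student_map(theta, x_t, t, s_prime)    # held constant (stop-gradient) in training
    target = y + (s - s_prime) * teacher_velocity(y, s_prime)  # one Euler step s' -> s
    pred = student_map(theta, x_t, t, s)
    return (pred - target) ** 2

loss_matched = lmd_residual(A, 1.0, 1.0, 0.2)       # student agrees with the teacher
loss_mismatched = lmd_residual(0.5, 1.0, 1.0, 0.2)  # student with the wrong decay rate
```

Even the matched student leaves a small nonzero residual, since a single Euler step carries a discretization error of order $\epsilon^2$; the mismatched student's residual is several orders of magnitude larger.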
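Taking the infinitesimal limit ($\epsilon \to 0$) and differentiating yields the Lagrangian PINN loss (Boffi et al., 2024):
$$\mathcal{L}_{\text{LMD}}(\theta) = \mathbb{E}_{x_t, t, s}\Big[\, w(t, s)\, \big\| \partial_s f_\theta(x_t, t, s) - v\big(f_\theta(x_t, t, s), s\big) \big\|_2^2 \,\Big],$$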
which matches the derivatives of the map with the learned velocity field over all time pairs. The loss admits analytic and empirical advantages: rapid convergence and stability for few-step distillation (Sabour et al., 17 Jun 2025, Boffi et al., 2024).
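The passage from the discrete objective to the derivative-matching form can be made explicit with a first-order expansion. The sketch below assumes the intermediate time is $s' = s + \epsilon (t - s)$, that the Euler step is $\mathrm{ODE}_{s' \to s}[y] = y + (s - s')\, v(y, s')$, and that $f_\theta$ is smooth in its last argument; it tracks only the residual inside the norm, not the stop-gradient structure of the training gradient.

$$
\begin{aligned}
f_\theta(x_t, t, s) - \mathrm{ODE}_{s' \to s}\big[f_\theta(x_t, t, s')\big]
&= f_\theta(x_t, t, s) - f_\theta(x_t, t, s') - (s - s')\, v\big(f_\theta(x_t, t, s'), s'\big) \\
&= (s - s')\Big[\partial_s f_\theta(x_t, t, s) - v\big(f_\theta(x_t, t, s), s\big)\Big] + O(\epsilon^2) \\
&= -\epsilon\,(t - s)\Big[\partial_s f_\theta(x_t, t, s) - v\big(f_\theta(x_t, t, s), s\big)\Big] + O(\epsilon^2).
\end{aligned}
$$

Up to the factor $\epsilon^2 (t - s)^2$, which can be absorbed into the weighting, the squared residual therefore tends to the squared mismatch between the map's time derivative and the velocity evaluated at the map's output.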
3. Unified Framework and Special Cases
AYF-LMD situates itself within the broader two-time map distillation framework, encompassing various fast generative modeling objectives as limiting cases:
- When $s \to t$, the LMD loss reduces to classical flow matching, which matches the instantaneous velocities.
- When applied with $s = 0$, other distillation objectives like continuous-time consistency models are recovered in the Eulerian formulation.
- The general two-time map LMD objective also unifies trajectory distillation and neural-operator regression losses: trajectory-based methods precompute true ODE solutions for distinct time pairs and enforce regression, while flow map matching and progressive distillation are interpreted as variations of learning approximate or composed two-time maps (Sabour et al., 17 Jun 2025, Boffi et al., 2024).
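To see why the limit of coinciding times recovers velocity matching, a short calculation helps. Assuming the parameterization $f_\theta(x_t, t, s) = x_t + (s - t)\, F_\theta(x_t, t, s)$ with $F_\theta$ differentiable in $s$:

$$
\partial_s f_\theta(x_t, t, s) = F_\theta(x_t, t, s) + (s - t)\, \partial_s F_\theta(x_t, t, s) \;\xrightarrow{\; s \to t \;}\; F_\theta(x_t, t, t).
$$

Since $f_\theta(x_t, t, t) = x_t$, the condition $\partial_s f_\theta = v(f_\theta, s)$ evaluated at $s = t$ reads $F_\theta(x_t, t, t) = v(x_t, t)$. On the diagonal the network output must equal the instantaneous velocity, which is the quantity flow matching regresses.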
4. Training Methodology and Practical Implementation
AYF-LMD is realized via modern neural architectures (typically deep U-Nets), with time coordinates $t$ and $s$ sinusoidally embedded and fed throughout the network (Boffi et al., 2024). The map parameterization ensures the boundary condition $f_\theta(x_t, t, t) = x_t$ by design:
$$f_\theta(x_t, t, s) = x_t + (s - t)\, F_\theta(x_t, t, s).$$
The training loop involves:
- Sampling batches of initial states $x_t$ via interpolants between base and data distributions,
- Randomly drawing times $(t, s)$ (uniform or weighted sampling),
- Computing the map $f_\theta(x_t, t, s)$ and its time derivative with autodiff,
- Minimizing the squared error between the neural derivative and the target velocity field as specified above.
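The steps above can be sketched end to end on a toy problem. This is an illustration under strong assumptions, not the AYF training recipe: the teacher is a scalar linear velocity $v(x, s) = -a x$, the student is a hypothetical one-parameter map, central finite differences stand in for autodiff, plain gradient descent stands in for Adam, and all names and hyperparameters are made up for the example.

```python
import math
import random

A = 1.5  # decay rate of the toy teacher velocity v(x, s) = -A * x (assumption)

def teacher_velocity(x, s):
    return -A * x

def student_map(theta, x, t, s):
    # Hypothetical one-parameter student; satisfies f(x, t, t) = x by construction.
    return x * math.exp(-theta * (s - t))

def lagrangian_loss(theta, batch, h=1e-5):
    """Mean squared mismatch between d/ds of the map and the teacher velocity."""
    total = 0.0
    for x_t, t, s in batch:
        # Central finite difference in s (stand-in for autodiff).
        ds_f = (student_map(theta, x_t, t, s + h)
                - student_map(theta, x_t, t, s - h)) / (2.0 * h)
        f = student_map(theta, x_t, t, s)
        total += (ds_f - teacher_velocity(f, s)) ** 2
    return total / len(batch)

def train(steps=500, lr=0.02, batch_size=32, seed=0):
    rng = random.Random(seed)
    theta = 0.2  # deliberately wrong initial decay rate
    for _ in range(steps):
        batch = []
        for _ in range(batch_size):
            x0 = rng.gauss(2.0, 0.5)       # toy "data" sample
            x1 = rng.gauss(0.0, 1.0)       # base (noise) sample
            t = rng.random()               # start time in [0, 1)
            s = rng.uniform(0.0, t)        # target time between 0 and t
            x_t = (1.0 - t) * x0 + t * x1  # linear interpolant state
            batch.append((x_t, t, s))
        # Finite-difference gradient with respect to the single parameter.
        d = 1e-4
        grad = (lagrangian_loss(theta + d, batch)
                - lagrangian_loss(theta - d, batch)) / (2.0 * d)
        theta -= lr * grad
    return theta

theta_hat = train()
```

In this toy setting the loss vanishes only when the student's decay rate equals the teacher's, so `theta_hat` should end up close to `A`. A realistic implementation would replace the finite differences with forward-mode or reverse-mode automatic differentiation through the network.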
Optimization typically uses Adam, with batch size, learning rate, and iteration count chosen according to dataset size and complexity. For teacher-student distillation (e.g., from large velocity models), the trained teacher's vector field is used as the reference $v$ (Sabour et al., 17 Jun 2025, Boffi et al., 2024).
5. Performance and Empirical Behavior
AYF-LMD demonstrates robust empirical performance, yielding state-of-the-art few-step sample quality with substantially reduced inference cost. On benchmarks including ImageNet 64×64 and 512×512, AYF distilled models using AYF-LMD achieve class-conditional FID scores that outperform prior consistency and flow matching approaches at 1–8 sampling steps:
- On ImageNet 64×64, AYF-LMD achieves FID = 1.25 in few-step sampling and further improves with optional adversarial fine-tuning.
- On ImageNet 512×512, using a 280M-parameter model, AYF-LMD achieves FID = 1.87 in few-step sampling (Sabour et al., 17 Jun 2025).
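Sampling with a small number of steps amounts to composing two-time jumps along a coarse time grid. The sketch below assumes noise at $t = 1$, data at $t = 0$, and a uniform grid; `sample` and `toy_flow_map` are hypothetical names, and the closed-form toy map stands in for a distilled network.

```python
import math

A = 1.5  # decay rate of the toy dynamics dx/dt = -A * x (assumption)

def toy_flow_map(x, t, s):
    # Exact flow map of the toy ODE, used in place of a distilled network.
    return x * math.exp(-A * (s - t))

def sample(flow_map, x_noise, num_steps):
    """Compose num_steps jumps of the map from t = 1 down to t = 0."""
    times = [1.0 - i / num_steps for i in range(num_steps + 1)]
    x = x_noise
    for t, s in zip(times[:-1], times[1:]):
        x = flow_map(x, t, s)
    return x

one_step = sample(toy_flow_map, 0.3, 1)
four_step = sample(toy_flow_map, 0.3, 4)
```

For an exact flow map the one-step and four-step results coincide, because composing jumps reproduces the single long jump. A learned map satisfies this only approximately, which is why sample quality can still vary with the number of steps.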
For text-to-image tasks, AYF-LMD distilled models were subjectively preferred by human raters over leading LoRA-based few-step samplers in user studies (Sabour et al., 17 Jun 2025). LMD converges more rapidly and stably than direct Flow Map Matching (FMM) losses, particularly in the few-step regime, as confirmed on CIFAR-10 and ImageNet-32 (Boffi et al., 2024).
6. Extensions: Guidance and Adversarial Fine-tuning
Classical classifier-free guidance is prone to over-shooting at large scales; instead, AYF employs autoguidance, where a lower-quality checkpoint $v^{\text{weak}}$ is linearly mixed with the velocity teacher $v$ via a scalar $\lambda$ sampled uniformly during training. All tangent calculations in distillation are adjusted accordingly:
$$v^{\text{guided}}(x_t, t) = \lambda\, v(x_t, t) + (1 - \lambda)\, v^{\text{weak}}(x_t, t),$$
with $\lambda$ drawn from a fixed range of guidance scales (Sabour et al., 17 Jun 2025).
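The linear mixing described above can be sketched in a few lines. Both velocity functions below are toy stand-ins, and the sampling range `LAM_MIN` to `LAM_MAX` is a hypothetical choice for illustration rather than a value taken from the paper.

```python
import random

def teacher_velocity(x, t):
    return -1.5 * x   # toy stand-in for the main velocity teacher

def weak_velocity(x, t):
    return -1.0 * x   # toy stand-in for the lower-quality checkpoint

def guided_velocity(x, t, lam):
    """Autoguidance-style mix: lam = 1 is the teacher, lam > 1 extrapolates away from the weak model."""
    return lam * teacher_velocity(x, t) + (1.0 - lam) * weak_velocity(x, t)

rng = random.Random(0)
LAM_MIN, LAM_MAX = 1.0, 3.0            # hypothetical guidance-scale range
lam = rng.uniform(LAM_MIN, LAM_MAX)    # one guidance scale per training sample

v_sampled = guided_velocity(2.0, 0.5, lam)
v_plain = guided_velocity(2.0, 0.5, 1.0)
v_extrapolated = guided_velocity(2.0, 0.5, 2.0)
```

During distillation, the guided velocity would replace the plain teacher velocity wherever the objective evaluates the reference field, with the sampled scale also passed to the student as a conditioning input if guidance strength should remain adjustable at inference time.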
After AYF-LMD training, adversarial fine-tuning may be applied. A StyleGAN2 discriminator is used under a relativistic Softplus loss, regularized with $R_1$/$R_2$ gradient penalties, and the total loss combines the LMD objective with adversarial feedback. This post-processing sharpens samples while preserving diversity and yields further improvements in FID across datasets (Sabour et al., 17 Jun 2025).
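One common reading of a relativistic Softplus loss is the paired form in which the discriminator is trained to score a real sample above a generated one. The sketch below assumes that form, omits the gradient penalties, and uses a hypothetical weight `adv_weight` to combine the terms; it is not confirmed to match the exact losses used in AYF.

```python
import math

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def discriminator_loss(d_real, d_fake):
    """Small when the real logit exceeds the fake logit."""
    return softplus(-(d_real - d_fake))

def generator_loss(d_real, d_fake):
    """Small when the fake logit exceeds the real logit."""
    return softplus(-(d_fake - d_real))

def total_student_loss(lmd_loss, d_real, d_fake, adv_weight=0.1):
    # adv_weight is a hypothetical balancing coefficient.
    return lmd_loss + adv_weight * generator_loss(d_real, d_fake)

d_good = discriminator_loss(5.0, -5.0)   # discriminator separates the pair well
d_bad = discriminator_loss(-5.0, 5.0)    # discriminator ranks the pair the wrong way
combined = total_student_loss(0.02, 1.0, -1.0)
```

Because the loss depends only on the difference of logits, the generator is penalized relative to paired real samples rather than against an absolute decision threshold.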
7. Variants and Theoretical Foundations
AYF-LMD extends the Lagrangian flow map distillation paradigm introduced by Boffi et al. for consistency and flow-matching models, unifying these objectives within a single two-time framework. Related developments include:
- Initial/Terminal Velocity Matching (ITVM), which augments LMD with specialized matching terms at initial and terminal times using exponential moving average stabilization, leading to superior few-step performance in both low- and high-dimensional domains (Khungurn et al., 2 May 2025).
- Direct (velocity-free) flow-map training via stochastic interpolants is also possible, enabling self-consistent fitting of two-time maps without a pretrained velocity field. However, empirical results indicate that AYF-LMD with teacher guidance remains preferable for rapid convergence and top-tier few-step quality (Boffi et al., 2024).
The Lagrangian map distillation framework provides a general, theoretically grounded strategy for scalable generative modeling distillation, supporting both data-driven and teacher-driven scenarios, and enabling superior trade-offs in sample quality, diversity, and computational efficiency.