DriftXpress: Faster Drifting Models via Projected RKHS Fields

Published 12 May 2026 in cs.LG and cs.AI | (2605.12183v1)

Abstract: Drifting Models have emerged as a new paradigm for one-step generative modeling, achieving strong image quality without iterative inference. The premise is to replace the iterative denoising process in diffusion models with a single evaluation of a generator. However, this creates a different trade-off: drifting reduces inference cost by moving much of the computation into training. We introduce DriftXpress, an accelerated formulation of drifting models based on projected RKHS fields. DriftXpress approximates the drifting kernel in a low-rank feature space. This preserves the attraction-repulsion structure of the original drifting field while reducing the cost of field evaluation. Across image-generation benchmarks, DriftXpress achieves comparable FID to standard drifting while reducing wall-clock training cost. These results show that the training-inference trade-off of drifting models can be pushed further without giving up their one-step inference advantage.


Authors (4)

Summary

The paper introduces a projected RKHS field formulation using Nyström approximations to accelerate training of drifting models while preserving field fidelity.
It reports speedups of up to 6.68× on benchmarks such as SVHN and CIFAR10 with marginal FID degradation, demonstrating effective scalability.
Empirical results and theoretical guarantees validate that approximating only the attraction term preserves performance, crucial for large-scale multi-class datasets.

DriftXpress: Accelerated Drifting Models via Projected RKHS Fields

Introduction and Context

Drifting models have recently been established as a non-adversarial, one-step generative modeling framework. These models bypass the multi-step iterative inference characteristic of diffusion and flow-based models, achieving image synthesis by transforming random noise into samples in a single generator pass. Their distinctive approach constructs an attraction-repulsion vector field in feature space—a sample is attracted toward the data manifold and repelled from the current generator’s distribution. This paradigm eliminates sampling latencies but shifts the computational bottleneck toward training, where repeated kernel interactions with the dataset dominate costs (2605.12183).

As the training support grows, especially on complex datasets and with large batch sizes, exact kernel-based field estimation in drifting models incurs prohibitive memory and runtime costs. DriftXpress addresses this with a projected reproducing kernel Hilbert space (RKHS) field formulation, which replaces the exhaustive attractive kernel interactions with a precomputed low-rank Nyström approximation evaluated through a set of selected data landmarks.

Methodology: Projected Drifting Fields via Nyström Approximation

DriftXpress reformulates the attraction component of the drifting field as a projection onto a subspace of the RKHS, whose basis is defined by a set of landmark features sampled from the training data. The underlying kernel (Laplace in the reported experiments) enables representation of the original kernel interactions through finite-dimensional embeddings, allowing the computation to be factorized as a query-landmark operation.
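As a minimal sketch of the query-landmark factorization described above, the following builds a Nyström feature map for a Laplace kernel. The function names, the bandwidth `tau`, the eigenvalue floor `eps`, and the toy data are assumptions made for illustration; the paper's actual implementation may differ.

```python
import numpy as np

def laplace_kernel(a, b, tau=1.0):
    """Pairwise Laplace kernel exp(-||a_i - b_j|| / tau) between rows of a and b."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-dist / tau)

def nystrom_features(x, landmarks, tau=1.0, eps=1e-8):
    """phi(x) = k(x, U) K_UU^{-1/2}, so phi(x) @ phi(y).T approximates k(x, y)."""
    evals, evecs = np.linalg.eigh(laplace_kernel(landmarks, landmarks, tau))
    # Symmetric inverse square root of the landmark kernel matrix (eps floors tiny eigenvalues).
    inv_sqrt = (evecs / np.sqrt(np.maximum(evals, eps))) @ evecs.T
    return laplace_kernel(x, landmarks, tau) @ inv_sqrt

# Toy data (assumed): 200 feature vectors, 40 of them used as landmarks.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 8))
landmarks = data[rng.choice(200, size=40, replace=False)]

phi = nystrom_features(data, landmarks)   # shape (200, 40)
approx = phi @ phi.T                      # low-rank kernel approximation
exact = laplace_kernel(data, data)        # full kernel, for comparison only
max_abs_error = np.abs(exact - approx).max()
```

The approximation is exact on the landmarks themselves and degrades for points that the landmark set represents poorly, which is why the choice and number of landmarks matter.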

The field has two components, which are treated differently:

Attraction: Approximated using a Nyström feature map and precomputed summaries of the training data. For each generated sample, only low-rank operations with respect to the landmarks are required.
Repulsion: Remains exact—computed directly among the current generated batch without approximation. Ablation studies confirm that repulsion is significantly more sensitive to approximation error; projecting both attraction and repulsion consistently leads to degenerate performance.
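The split between projected attraction and exact repulsion can be sketched as follows. This assumes a normalized, kernel-weighted-mean form for each field term (the summary does not state the exact normalization), and the names `precompute_summaries`, `projected_attraction`, `exact_field_term`, and `drift_field` are hypothetical.

```python
import numpy as np

def laplace_kernel(a, b, tau=1.0):
    """Pairwise Laplace kernel exp(-||a_i - b_j|| / tau)."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-dist / tau)

def nystrom_features(x, landmarks, tau=1.0, eps=1e-8):
    """phi(x) = k(x, U) K_UU^{-1/2}; a real implementation would cache K_UU^{-1/2}."""
    evals, evecs = np.linalg.eigh(laplace_kernel(landmarks, landmarks, tau))
    inv_sqrt = (evecs / np.sqrt(np.maximum(evals, eps))) @ evecs.T
    return laplace_kernel(x, landmarks, tau) @ inv_sqrt

def precompute_summaries(data, landmarks, tau=1.0):
    """One-time pass over the data: s1 = sum_j phi(y_j) y_j^T, s0 = sum_j phi(y_j)."""
    phi_data = nystrom_features(data, landmarks, tau)
    return phi_data.T @ data, phi_data.sum(axis=0)

def projected_attraction(x, landmarks, s1, s0, tau=1.0, eps=1e-8):
    """Approximate kernel-weighted data mean minus x, using landmark-sized operations only."""
    phi_x = nystrom_features(x, landmarks, tau)
    mass = np.maximum(phi_x @ s0, eps)  # approximate kernel mass, floored for safety
    return (phi_x @ s1) / mass[:, None] - x

def exact_field_term(x, targets, tau=1.0):
    """Exact kernel-weighted mean of targets minus x (assumed form of one field term)."""
    k = laplace_kernel(x, targets, tau)
    return (k @ targets) / k.sum(axis=1, keepdims=True) - x

def drift_field(x, landmarks, s1, s0, tau=1.0):
    """Projected attraction toward the data minus exact repulsion within the batch."""
    return projected_attraction(x, landmarks, s1, s0, tau) - exact_field_term(x, x, tau)

# Toy setup (assumed): 50 data features, 16 generated features, 10 landmarks.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4))
generated = rng.normal(size=(16, 4)) + 2.0
landmarks = data[rng.choice(50, size=10, replace=False)]

s1, s0 = precompute_summaries(data, landmarks)      # done once, before training
field = drift_field(generated, landmarks, s1, s0)   # per step: no pass over the data
```

In this sketch the per-step attraction cost depends on the number of landmarks rather than the dataset size, while the repulsion term still touches every pair in the generated batch.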

To ensure scalability in high-class-count and high-support regimes (e.g., CIFAR100, ImageNet), DriftXpress shards per-class summaries, accumulating partial results sequentially, thereby controlling memory.
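One way to read the sharding idea is sketched below: per-class summaries are processed a few classes at a time, and each shard fills in the outputs for the generated samples whose labels fall inside it. The array layout (`s1` of shape classes × landmarks × dims, `s0` of shape classes × landmarks), the shared feature matrix `phi`, and both function names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def attraction_unsharded(phi, labels, s1, s0):
    """Gather all per-sample summaries at once: fast, but builds an (n, m, d) tensor."""
    num = np.einsum("nm,nmd->nd", phi, s1[labels])
    mass = np.einsum("nm,nm->n", phi, s0[labels])
    return num / mass[:, None]

def attraction_sharded(phi, labels, s1, s0, shard_size):
    """Visit class shards sequentially so only one shard of summaries is live at a time."""
    out = np.zeros((phi.shape[0], s1.shape[-1]))
    for start in range(0, s1.shape[0], shard_size):
        stop = min(start + shard_size, s1.shape[0])
        idx = np.nonzero((labels >= start) & (labels < stop))[0]
        if idx.size == 0:
            continue  # no generated sample belongs to this shard
        local = labels[idx] - start
        s1_shard, s0_shard = s1[start:stop], s0[start:stop]  # in practice: loaded on demand
        num = np.einsum("nm,nmd->nd", phi[idx], s1_shard[local])
        mass = np.einsum("nm,nm->n", phi[idx], s0_shard[local])
        out[idx] = num / mass[:, None]
    return out

# Toy inputs (assumed): 10 classes, 6 landmarks, 3 feature dims, 32 generated samples.
rng = np.random.default_rng(0)
num_classes, m, d, n = 10, 6, 3, 32
s1 = rng.normal(size=(num_classes, m, d))
s0 = rng.uniform(0.5, 1.5, size=(num_classes, m))
phi = rng.uniform(0.1, 1.0, size=(n, m))
labels = rng.integers(0, num_classes, size=n)

full = attraction_unsharded(phi, labels, s1, s0)
sharded = attraction_sharded(phi, labels, s1, s0, shard_size=4)
```

Both paths return the same kernel-weighted targets; the sharded loop trades some speed for a smaller peak memory footprint.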

The method comes with theoretical control: Theorem 3.2 and Corollary 3.3 relate the field approximation error to the kernel residual $\|r_U(x)\|^2$ and the kernel mass, bounding the field distortion introduced by the projection. The vector field therefore stays faithful when the Nyström basis is sufficiently expressive.
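The exact statements of Theorem 3.2 and Corollary 3.3 are not reproduced here. As a hedged illustration of the residual they refer to, the standard Nyström identities for a landmark set $U$ are given below, where $P_U$ (projection onto the span of the landmark kernel functions), $K_{UU}$ (landmark kernel matrix), $k_U(x)$ (vector of kernel values between $x$ and the landmarks), and $\tilde{k}$ (projected kernel) are notation assumed for this sketch.

$$
r_U(x) = k(\cdot, x) - P_U\, k(\cdot, x), \qquad \|r_U(x)\|^2 = k(x, x) - k_U(x)^\top K_{UU}^{-1} k_U(x)
$$

$$
\tilde{k}(x, y) = k_U(x)^\top K_{UU}^{-1} k_U(y), \qquad |k(x, y) - \tilde{k}(x, y)| = |\langle r_U(x), r_U(y) \rangle| \le \|r_U(x)\| \, \|r_U(y)\|
$$

The second line follows from the Cauchy-Schwarz inequality: the pointwise kernel error is small wherever the residuals are small, which is the sense in which an expressive landmark basis keeps the projected attraction close to the exact one.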

Empirical Results

DriftXpress is benchmarked against standard drifting on SVHN, CIFAR10, CIFAR100, and ImageNet using consistent architectures and objectives. The main results:

Throughput: DriftXpress achieves 6.68× (SVHN) and 6.63× (CIFAR10) speedups in images processed per second, with only marginal FID degradation or even mild improvement. On SVHN, best FID is 3.11 vs. 2.94; on CIFAR10, it is 5.52 vs. 5.64. On CIFAR100 and ImageNet (where sharding is used), speedups are 2.95× and 2.64×, respectively.
Wall-Clock Convergence: DriftXpress reaches competitive FID faster in wall-clock time, owing to stable, dataset-wide field estimation and more efficient optimization. Early training snapshots show DriftXpress generators producing recognizable content sooner.
Batch Size Sensitivity: Larger batch sizes confer further runtime advantages; at low batch sizes, DriftXpress exhibits superior FID trajectories due to field smoothness.
Landmark Selection and Ratio: Random per-class selection yields competitive results with minimal computational overhead, aligning with Nyström method theory. Enhanced selection (k-means, facility location) can provide incremental improvements. Increasing landmarks per class improves field fidelity and FID up to a point of diminishing returns (optimal ratios: 0.0256–0.1024 per class).
Memory and Sharding: The unsharded variant is faster but requires substantial VRAM as the number of classes increases. Sharding allows DriftXpress to match standard drifting’s memory requirements and remain feasible at scale in multi-class settings.
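The random per-class landmark selection mentioned in the list above can be sketched as follows. This assumes the reported ratio means the fraction of each class's samples kept as landmarks and that counts are rounded to the nearest integer; `select_landmarks_per_class` is a hypothetical helper, not a function from the paper.

```python
import numpy as np

def select_landmarks_per_class(labels, ratio, rng):
    """Return {class: indices} with about ratio * class_size random landmarks per class."""
    selected = {}
    for c in np.unique(labels):
        members = np.nonzero(labels == c)[0]
        count = max(1, int(round(ratio * members.size)))  # rounding rule is an assumption
        selected[int(c)] = rng.choice(members, size=count, replace=False)
    return selected

# Toy labels (assumed): 3 classes with 1000 samples each.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 1000)
landmark_idx = select_landmarks_per_class(labels, ratio=0.0256, rng=rng)
```

Swapping the `rng.choice` call for k-means centers or a facility-location selection would correspond to the enhanced strategies, at the cost of extra preprocessing.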

Ablations show that approximating the repulsive term leads to dramatic sample collapse, underscoring the necessity of maintaining exact repulsion during training.

Theoretical and Practical Implications

DriftXpress pushes the efficiency frontier of one-step generative models by making the key kernel-based field estimation substantially faster, with reported throughput gains between 2.64× and 6.68×. Practically, this enables:

Training on significantly larger datasets/batch sizes within fixed compute or time budgets.
Feasible scaling to complex domains (e.g., ImageNet) previously constrained by kernel computation costs.
Easy adoption in distributed settings—the expensive data interactions are moved to a one-time preprocessing step, amortized across devices and training steps.

Theoretically, the work sharpens the understanding of non-adversarial, kernel-based generative models, showing how RKHS projections and low-rank approximations can preserve the rich geometry of attraction-repulsion fields without incurring full kernel matrix costs. The design also reinforces the emerging consensus that generative field estimation must treat positive (data) and negative (model) interactions asymmetrically in terms of approximation strategy.

Limitations and Future Directions

DriftXpress’ limitations include:

Asymmetry: Projected attraction but exact repulsion breaks the field’s anti-symmetry, though theory quantifies the induced error. Approximations for the repulsive component that do not cause collapse remain an open area.
Static Encoder Dependence: Cached summaries depend on a fixed feature encoder; any adaptation of this encoder invalidates precomputed summaries.
Landmark Budgets: Large landmark banks, while manageable via sharding, still limit joint memory scaling when many classes and high per-class cardinality coincide.

Future work could address safe approximate repulsion, explore adaptive or hierarchical landmark selection conditioned on generated samples, and further refine memory/computation trade-offs for even larger scales. There is also scope to connect more tightly with recent work on Wasserstein gradient flows, moment matching, and the broader landscape of non-adversarial, kernel-based generative models.

Conclusion

DriftXpress demonstrates that the training-inference trade-off in drifting models—whose generative performance previously came at substantial computational expense—can be shifted further. By leveraging projected RKHS fields through low-rank Nyström approximations, it preserves the geometric richness of the attraction-repulsion mechanism while delivering significant training acceleration and maintaining single-pass inference. The approach brings one-step generative modeling closer to practical scalability for large and complex datasets, and advances the theory and engineering of kernel-based generative methods (2605.12183).
