Achieving smooth, high-fidelity, temporally coherent 3D morphing within SLAT-based generative frameworks

Develop a method that achieves smooth, high-fidelity, and temporally coherent 3D morphing within Structured Latent (SLAT)-based 3D generative models such as Trellis, ensuring plausible deformations and consistent textures throughout the morphing sequence.

Background

The paper critiques prior 3D morphing strategies—matching-based approaches, 2D morphing lifted to 3D, and direct interpolation in generative models—as insufficient for producing semantically coherent and temporally smooth 3D transitions. Trellis, a SLAT-based 3D generator, offers a structured latent representation that is promising for training-free applications, but naive SLAT fusion results in poor transitions.

The authors observe that blending SLAT features at the attention level, rather than at the noise or condition level, can improve plausibility and smoothness. To address this challenge within SLAT-based generative frameworks, they propose Morphing Cross-Attention (MCA) and Temporal-Fused Self-Attention (TFSA), along with an orientation correction strategy.
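To make the idea of attention-level blending concrete, the following is a minimal NumPy sketch. It is not the paper's MCA implementation, whose details are not given in this summary. It assumes one plausible reading: cross-attention is computed separately against the source and target conditions, and the two outputs are mixed with a morphing weight. The names `blended_cross_attention`, `cond_src`, `cond_tgt`, and `alpha`, along with the tensor shapes, are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, k, v):
    # Single-head scaled dot-product attention.
    # q: (n_query, d), k and v: (n_cond, d) -> output: (n_query, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def blended_cross_attention(q, cond_src, cond_tgt, alpha):
    # Hypothetical attention-level blend (an assumption, not the paper's MCA):
    # attend to each condition separately, then mix the two outputs.
    # alpha = 0 reproduces the source branch, alpha = 1 the target branch.
    out_src = cross_attention(q, cond_src, cond_src)
    out_tgt = cross_attention(q, cond_tgt, cond_tgt)
    return (1.0 - alpha) * out_src + alpha * out_tgt

# Toy data: 4 latent query tokens, 6 source and 5 target condition tokens, dim 8.
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
cond_src = rng.standard_normal((6, 8))
cond_tgt = rng.standard_normal((5, 8))

# One intermediate frame of a morphing sequence, halfway between the endpoints.
morph_mid = blended_cross_attention(q, cond_src, cond_tgt, alpha=0.5)
```

Sweeping `alpha` from 0 to 1 would yield one attention output per morphing frame. By contrast, blending at the condition level would interpolate `cond_src` and `cond_tgt` before a single attention call, which requires the two conditions to have matching shapes.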

References

In summary, while the SLAT representation offers compelling opportunities for 3D morphing, achieving smooth, high-fidelity, and temporally coherent 3D morphing within modern SLAT-based generative frameworks remains an open and pressing challenge.

MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing (2601.00204 - Sun et al., 1 Jan 2026) in Section 1 (Introduction), final paragraph