Matten: Video Generation with Mamba-Attention (2405.03025v2)
Published 5 May 2024 in cs.CV
Abstract: In this paper, we introduce Matten, a latent diffusion model with a Mamba-Attention architecture for video generation. At minimal computational cost, Matten employs spatial-temporal attention to model local video content and bidirectional Mamba to model global video content. Our comprehensive experimental evaluation demonstrates that Matten is competitive with current Transformer-based and GAN-based models on standard benchmarks, achieving superior Fréchet Video Distance (FVD) scores and efficiency. We also observe a direct positive correlation between model complexity and video quality, indicating Matten's excellent scalability.
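The global-modeling idea behind the bidirectional Mamba branch can be illustrated with a minimal sketch. This is not the paper's implementation: the real model uses input-dependent (selective) parameters, vector-valued hidden states, and hardware-aware scans, whereas the toy below uses a scalar linear state-space recurrence run forward and backward over a token sequence, so every position aggregates context from both directions. The function names `ssm_scan` and `bidirectional_ssm` are illustrative, not from the paper.

```python
import numpy as np

def ssm_scan(x, a=0.9, b=1.0, c=1.0):
    """Scalar linear SSM: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    A toy stand-in for a Mamba scan; in Mamba, a/b/c are
    input-dependent and the state is a vector per channel.
    """
    h = 0.0
    ys = []
    for xt in x:
        h = a * h + b * xt
        ys.append(c * h)
    return np.array(ys)

def bidirectional_ssm(x, a=0.9, b=1.0, c=1.0):
    """Forward scan plus a scan over the reversed sequence.

    Summing the two passes lets each token attend to both past
    and future context in linear time -- the 'global' modeling
    role the bidirectional Mamba branch plays in Matten.
    """
    fwd = ssm_scan(x, a, b, c)
    bwd = ssm_scan(x[::-1], a, b, c)[::-1]
    return fwd + bwd

# With a=0.9, an impulse at position 0 decays geometrically in the
# forward pass, while the backward pass carries it to position 0 again.
y = bidirectional_ssm(np.array([1.0, 0.0, 0.0]))
```

This linear-time recurrence is what allows state-space branches to cover long spatio-temporal token sequences cheaply, while the attention branch handles local content at full pairwise resolution.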