DreaMoving: A Human Video Generation Framework based on Diffusion Models (2312.05107v2)
Published 8 Dec 2023 in cs.CV
Abstract: In this paper, we present DreaMoving, a diffusion-based controllable video generation framework for producing high-quality customized human videos. Specifically, given a target identity and a posture sequence, DreaMoving can generate a video of the target identity moving or dancing anywhere, driven by the posture sequence. To this end, we propose a Video ControlNet for motion control and a Content Guider for identity preservation. The proposed model is easy to use and can be adapted to most stylized diffusion models to generate diverse results. The project page is available at https://dreamoving.github.io/dreamoving
- Mengyang Feng
- Jinlin Liu
- Kai Yu
- Yuan Yao
- Zheng Hui
- Xiefan Guo
- Xianhui Lin
- Haolan Xue
- Chen Shi
- Xiaowen Li
- Aojie Li
- Miaomiao Cui
- Peiran Ren
- Xuansong Xie
- Xiaoyang Kang
- Biwen Lei