ElastoGen: 4D Generative Elastodynamics (2405.15056v2)
Abstract: We present ElastoGen, a knowledge-driven AI model that generates physically accurate 4D elastodynamics. Unlike deep models that learn from video- or image-based observations, ElastoGen leverages the principles of physics and learns from established mathematical and optimization procedures. The core idea of ElastoGen is converting the differential equation, corresponding to the nonlinear force equilibrium, into a series of iterative local convolution-like operations, which naturally fit deep architectures. We carefully build our network module following this overarching design philosophy. ElastoGen is much more lightweight in terms of both training requirements and network scale than deep generative models. Because of its alignment with actual physical procedures, ElastoGen efficiently generates accurate dynamics for a wide range of hyperelastic materials and can be easily integrated with upstream and downstream deep modules to enable end-to-end 4D generation.
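The abstract's central claim, that a force-equilibrium solve can be recast as repeated local convolution-like operations, can be illustrated on a toy problem. The sketch below is not the ElastoGen network or its solver. It assumes a 1D linear system (I + kL)x = b, where L is the stencil [-1, 2, -1] with zero boundary values, and solves it with Jacobi sweeps in which the neighbor coupling is applied by a fixed 3-tap convolution. The names `jacobi_conv_solve` and `dense_matrix` and the parameters `k` and `iters` are hypothetical and chosen only for illustration.

```python
import numpy as np

def jacobi_conv_solve(b, k=1.0, iters=200):
    """Jacobi sweeps for (I + k*L) x = b, with L the 1D stencil [-1, 2, -1].

    Toy assumption: zero values outside the domain (what mode="same" pads with).
    """
    kernel = np.array([k, 0.0, k])   # neighbor coupling: minus the off-diagonal of A
    diag = 1.0 + 2.0 * k             # constant diagonal entry of A
    x = np.zeros_like(b, dtype=float)
    for _ in range(iters):
        # Each sweep is one local convolution followed by a pointwise update.
        neighbors = np.convolve(x, kernel, mode="same")
        x = (b + neighbors) / diag
    return x

def dense_matrix(n, k=1.0):
    """Dense form of the same operator, used only to check the iteration."""
    A = (1.0 + 2.0 * k) * np.eye(n)
    A -= k * (np.eye(n, k=1) + np.eye(n, k=-1))
    return A

b = np.sin(np.linspace(0.0, np.pi, 16))
x_iter = jacobi_conv_solve(b)
x_ref = np.linalg.solve(dense_matrix(b.size), b)
max_err = float(np.max(np.abs(x_iter - x_ref)))
```

Because the toy matrix is strictly diagonally dominant, the sweeps converge to the direct solution, so `max_err` should be close to machine precision. A nonlinear hyperelastic problem would need more than this fixed linear stencil. The example only shows why such iterations map naturally onto convolution layers.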