D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation (2403.12861v1)
Abstract: Mastering dexterous robotic manipulation of deformable objects is vital for overcoming the limitations of parallel grippers in real-world applications. Current trajectory optimisation approaches often struggle to solve such tasks due to the large search space and the limited task information available from a cost function. In this work, we propose D-Cubed, a novel trajectory optimisation method that uses a latent diffusion model (LDM) trained on a task-agnostic play dataset to solve dexterous deformable object manipulation tasks. D-Cubed uses a VAE to learn a skill-latent space that encodes short-horizon actions in the play dataset, and trains an LDM to compose the skill latents into a skill trajectory that represents a long-horizon action trajectory in the dataset. To optimise a trajectory for a target task, we introduce a novel gradient-free guided sampling method that employs the cross-entropy method within the reverse diffusion process. In particular, at each reverse step D-Cubed samples a small number of noisy skill trajectories using the LDM for exploration and evaluates them in simulation. D-Cubed then selects the trajectory with the lowest cost for the subsequent reverse process. This explores promising solution areas and optimises the sampled trajectories towards the target task throughout the reverse diffusion process. Through empirical evaluation on a public benchmark of dexterous deformable object manipulation tasks, we demonstrate that D-Cubed outperforms traditional trajectory optimisation and competitive baseline approaches by a significant margin. We further demonstrate that trajectories found by D-Cubed readily transfer to a real-world LEAP hand on a folding task.
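The guided sampling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_denoise_step` is a hypothetical stand-in for one reverse step of the trained LDM, `toy_cost` is a hypothetical stand-in for decoding the skill latents and evaluating the resulting actions in simulation, and all parameter names and default values are assumptions chosen for illustration. The sketch only shows the control flow: at every reverse step, draw a few noisy candidates, score them, and continue from the cheapest one.

```python
import numpy as np


def toy_denoise_step(z, t, num_steps, rng):
    """Hypothetical stand-in for one LDM reverse step.

    Shrinks the latents slightly and adds noise whose scale decays
    as t approaches 0, so later steps explore less.
    """
    noise_scale = t / num_steps
    return 0.9 * z + noise_scale * rng.standard_normal(z.shape)


def toy_cost(z, target):
    """Hypothetical stand-in for a simulated rollout cost."""
    return float(np.sum((z - target) ** 2))


def guided_sampling(target, num_steps=50, num_candidates=8,
                    horizon=4, latent_dim=3, seed=0):
    """Gradient-free guided sampling: keep the lowest-cost candidate
    at every reverse diffusion step."""
    rng = np.random.default_rng(seed)
    # Start from pure noise over a (horizon x latent_dim) skill trajectory.
    z = rng.standard_normal((horizon, latent_dim))
    cost = toy_cost(z, target)
    for t in range(num_steps, 0, -1):
        # Sample a small number of noisy candidates from the current state.
        candidates = [toy_denoise_step(z, t, num_steps, rng)
                      for _ in range(num_candidates)]
        # Evaluate each candidate with the (stand-in) simulator cost.
        costs = [toy_cost(c, target) for c in candidates]
        # Continue the reverse process from the cheapest candidate.
        best = int(np.argmin(costs))
        z, cost = candidates[best], costs[best]
    return z, cost


if __name__ == "__main__":
    target = np.ones((4, 3))
    z, cost = guided_sampling(target)
    print("final cost:", cost)
```

Selecting a single best candidate is the simplest reading of the abstract; a fuller cross-entropy variant might keep several elite candidates, which this sketch does not attempt.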