Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation (2403.03890v1)

Published 6 Mar 2024 in cs.RO, cs.AI, cs.CV, and cs.LG

Abstract: This paper introduces Hierarchical Diffusion Policy (HDP), a hierarchical agent for multi-task robotic manipulation. HDP factorises a manipulation policy into a hierarchical structure: a high-level task-planning agent which predicts a distant next-best end-effector pose (NBP), and a low-level goal-conditioned diffusion policy which generates optimal motion trajectories. The factorised policy representation allows HDP to tackle long-horizon task planning while generating fine-grained low-level actions. To generate context-aware motion trajectories while satisfying robot kinematics constraints, we present a novel kinematics-aware goal-conditioned control agent, Robot Kinematics Diffuser (RK-Diffuser). Specifically, RK-Diffuser learns to generate both the end-effector pose and joint position trajectories, and distills the accurate but kinematics-unaware end-effector pose diffuser to the kinematics-aware but less accurate joint position diffuser via differentiable kinematics. Empirically, we show that HDP achieves a significantly higher success rate than the state-of-the-art methods in both simulation and the real world.

Summary

  • The paper introduces a hierarchical diffusion policy that fuses high-level task planning with low-level, kinematics-aware motion generation.
  • It implements a dual-tier architecture featuring a task-planning agent for end-effector pose prediction and a goal-conditioned diffuser for trajectory refinement.
  • Empirical evaluations demonstrate significant success rate improvements in both simulated and real-world robotic manipulation tasks.

Hierarchical Diffusion Policy for Enhanced Robotic Manipulation Skills

Introduction to Hierarchical Diffusion Policy (HDP)

The paper introduces Hierarchical Diffusion Policy (HDP), an approach to multi-task robotic manipulation built on a factorised policy that combines high-level task planning with low-level motion trajectory generation. This two-level structure lets HDP handle long-horizon task planning while still generating the fine-grained, low-level actions that complex manipulation tasks require.
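The factorisation described above can be pictured as a small control loop. The following is a minimal sketch, not the paper's implementation: the class `HierarchicalPolicy`, the two stand-in functions, and the use of a 3-D position as the "pose" are all hypothetical, and simple linear interpolation stands in for the learned diffusion policy.

```python
from dataclasses import dataclass
from typing import Callable, List

import numpy as np


@dataclass
class HierarchicalPolicy:
    """Hypothetical two-level policy: a planner proposes a goal pose,
    a goal-conditioned controller turns it into a trajectory."""

    # observation -> next-best end-effector pose (the goal)
    high_level: Callable[[np.ndarray], np.ndarray]
    # (current pose, goal pose) -> list of waypoints ending at the goal
    low_level: Callable[[np.ndarray, np.ndarray], List[np.ndarray]]

    def act(self, observation: np.ndarray, current_pose: np.ndarray) -> List[np.ndarray]:
        goal = self.high_level(observation)
        return self.low_level(current_pose, goal)


def toy_high_level(observation: np.ndarray) -> np.ndarray:
    # Assumption: the observation is an object position; aim 0.1 above it.
    return observation + np.array([0.0, 0.0, 0.1])


def toy_low_level(start: np.ndarray, goal: np.ndarray, steps: int = 5) -> List[np.ndarray]:
    # Stand-in for the learned diffusion policy: straight-line interpolation.
    return [start + (goal - start) * t for t in np.linspace(0.0, 1.0, steps)]


policy = HierarchicalPolicy(toy_high_level, toy_low_level)
trajectory = policy.act(np.array([0.5, 0.0, 0.2]), np.array([0.0, 0.0, 0.5]))
```

The point of the split is that the high-level call only has to decide where to go next, while the low-level call only has to decide how to get there; in HDP both roles are filled by learned models rather than the hand-written stubs used here.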

Core Components of HDP

HDP consists of two components:

High-Level Task-Planning Agent

The high-level component is a task-planning agent that predicts the next-best end-effector pose (NBP), a distant target pose for the end-effector. This prediction serves as the goal that conditions the subsequent low-level trajectory generation, and it is what lets HDP handle long-horizon task planning.

Low-Level Goal-Conditioned Diffusion Policy

The low-level component of HDP, the Robot Kinematics Diffuser (RK-Diffuser), is a kinematics-aware, goal-conditioned control agent that generates motion trajectories towards the predicted pose. It is designed so that the generated trajectories are both context-aware and consistent with the robot's kinematics constraints. To achieve this, RK-Diffuser learns to generate both end-effector pose trajectories and joint position trajectories. The end-effector pose diffuser is accurate but unaware of kinematics, whereas the joint position diffuser is kinematics-aware but less accurate, so RK-Diffuser distills the former into the latter via differentiable kinematics.
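To make the role of differentiable kinematics concrete, here is a toy sketch of the underlying idea: because forward kinematics is differentiable, an error measured in end-effector space can be pushed back onto joint positions. Everything in it is an assumption for illustration: a hypothetical 2-link planar arm with unit link lengths, an analytic Jacobian, and plain gradient descent on joint waypoints. The paper's RK-Diffuser applies the idea to learned diffusion models, which this sketch does not attempt to reproduce.

```python
import numpy as np

LINK_LENGTHS = (1.0, 1.0)  # hypothetical 2-link planar arm


def forward_kinematics(q: np.ndarray) -> np.ndarray:
    """Map joint angles of shape (T, 2) to end-effector xy positions of shape (T, 2)."""
    l1, l2 = LINK_LENGTHS
    x = l1 * np.cos(q[:, 0]) + l2 * np.cos(q[:, 0] + q[:, 1])
    y = l1 * np.sin(q[:, 0]) + l2 * np.sin(q[:, 0] + q[:, 1])
    return np.stack([x, y], axis=1)


def jacobian(q: np.ndarray) -> np.ndarray:
    """Analytic Jacobian d(xy)/d(q) for every waypoint, shape (T, 2, 2)."""
    l1, l2 = LINK_LENGTHS
    s1, c1 = np.sin(q[:, 0]), np.cos(q[:, 0])
    s12, c12 = np.sin(q[:, 0] + q[:, 1]), np.cos(q[:, 0] + q[:, 1])
    J = np.empty((q.shape[0], 2, 2))
    J[:, 0, 0] = -l1 * s1 - l2 * s12
    J[:, 0, 1] = -l2 * s12
    J[:, 1, 0] = l1 * c1 + l2 * c12
    J[:, 1, 1] = l2 * c12
    return J


def refine_joints(q_init, ee_target, lr=0.2, iters=500):
    """Gradient descent on 0.5 * ||FK(q) - ee_target||^2 with respect to q."""
    q = q_init.copy()
    for _ in range(iters):
        err = forward_kinematics(q) - ee_target            # (T, 2)
        grad = np.einsum("tij,ti->tj", jacobian(q), err)   # J^T err per waypoint
        q -= lr * grad
    return q


# An "accurate" end-effector trajectory and a "less accurate" joint trajectory.
q_true = np.stack([np.linspace(0.2, 1.0, 8), np.linspace(1.2, 0.8, 8)], axis=1)
ee_target = forward_kinematics(q_true)
q_init = q_true + 0.2  # deliberately offset joint guess

err_before = np.abs(forward_kinematics(q_init) - ee_target).max()
q_refined = refine_joints(q_init, ee_target)
err_after = np.abs(forward_kinematics(q_refined) - ee_target).max()
```

After refinement the joint trajectory reproduces the end-effector targets far more closely than the initial guess, while remaining a valid joint-space trajectory by construction. That is the property the distillation step relies on: the output lives in joint space, so it respects the arm's kinematics, but it is supervised by the more accurate end-effector prediction.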

Empirical Validation

HDP was evaluated on both simulated and real-world manipulation tasks. The paper reports that it achieves a significantly higher success rate than state-of-the-art methods in both settings.

Theoretical Implications and Future Directions

HDP shows that a hierarchical policy structure can be combined effectively with diffusion-based trajectory generation. Its results suggest that diffusion models merit further study for robotic control and planning, particularly when paired with explicit kinematic structure.

Conclusion

HDP combines high-level task planning with low-level, kinematics-aware motion generation in a single hierarchical agent. Its reported success across simulated and real-world tasks supports the effectiveness of this factorisation and provides a basis for further work on diffusion-based robotic control and planning.