LDP: A Local Diffusion Planner for Efficient Robot Navigation and Collision Avoidance

Published 2 Jul 2024 in cs.RO and cs.AI (arXiv:2407.01950v1)

Abstract: The conditional diffusion model has been demonstrated as an efficient tool for learning robot policies, owing to its ability to accurately model the conditional distribution of policies. The intricate nature of real-world scenarios, characterized by dynamic obstacles and maze-like structures, makes robot local navigation decision-making a complex conditional distribution problem. Nevertheless, leveraging the diffusion model for robot local navigation is not trivial and encounters several under-explored challenges: (1) Data Urgency: the complex conditional distribution in local navigation requires training data that covers diverse policies across diverse real-world scenarios. (2) Myopic Observation: because perception scenarios vary widely, diffusion decisions based on the robot's local perspective may be suboptimal for completing the entire task, as they often lack foresight; in scenarios requiring detours, the robot may become trapped. To address these issues, our approach begins with a diverse data-generation mechanism that encompasses multiple agents with distinct preferences, using target selection informed by integrated global-local insights. From this diverse training data, we obtain a diffusion agent capable of excellent collision avoidance across varied scenarios. We then augment our Local Diffusion Planner (LDP) by incorporating global observations in a lightweight manner. This enhancement broadens LDP's observational scope, effectively mitigating the risk of becoming trapped in local optima and promoting more robust navigation decisions.


Summary

  • The paper presents a novel local diffusion planner (LDP) that leverages conditional diffusion models to integrate global path guidance with local planning for improved collision avoidance in dynamic environments.
  • The method collects expert demonstrations via Soft Actor-Critic reinforcement learning across diverse scenarios, yielding higher success rates and superior SPL than baseline approaches.
  • Empirical results from both simulation and real-world tests demonstrate LDP’s robust performance in avoiding obstacles and overcoming local minima, paving the way for advanced autonomous navigation systems.

Summary of "LDP: A Local Diffusion Planner for Efficient Robot Navigation and Collision Avoidance" (2407.01950)

Introduction

The complexity inherent in real-world navigation scenarios poses significant challenges for developing robust robot navigation policies. This paper introduces a Local Diffusion Planner (LDP) that leverages conditional diffusion models to address the demand for efficient robot navigation and collision avoidance. The proposed method capitalizes on diffusion models to capture the intricate conditional distribution of robotic policies within dynamic environments fraught with obstacles. The diffusion model's utility lies primarily in tackling two pivotal obstacles, Data Urgency and Myopic Observation, which impede the effective deployment of navigation policies through local observations alone (Figure 1).

Figure 1: The diagram illustrates the execution of our method. Obstacles are denoted by black circles and rectangles, while the trajectories of pedestrians are represented by green circles. The navigation target is marked by a yellow pentagram, and a brown dashed line delineates the global path from the robot's starting point to its target.

Methodology

The LDP approach involves data collection from expert policies across diverse scenarios, using reinforcement learning strategies such as Soft Actor-Critic (SAC). The expert data is gathered across three distinct environmental paradigms: static, dynamic with pedestrian interactions, and maze-like environments. Subsequently, a diffusion agent is engineered, integrating global path information into the local planning framework, thereby broadening the observational scope and improving navigation robustness.
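The data-collection stage lends itself to a simple rollout loop. The sketch below illustrates one plausible way to assemble such a dataset, assuming a Gymnasium-style simulator interface and an already-trained SAC expert; `make_env`, `expert_policy`, and the observation keys are hypothetical names for illustration, not the paper's actual API.

```python
SCENARIOS = ["static", "dynamic_pedestrian", "maze"]

def collect_demonstrations(make_env, expert_policy, episodes_per_scenario=100):
    """Roll out a trained SAC expert in each scenario type and store
    (local observation, global path, action) tuples for diffusion training."""
    dataset = []
    for scenario in SCENARIOS:
        env = make_env(scenario)                      # hypothetical env factory
        for _ in range(episodes_per_scenario):
            obs, info = env.reset()
            episode, done = [], False
            while not done:
                action = expert_policy(obs)           # deterministic SAC mean action
                episode.append((obs["local_map"], obs["global_path"], action))
                obs, reward, terminated, truncated, info = env.step(action)
                done = terminated or truncated
            if info.get("success", False):            # keep successful runs only
                dataset.extend(episode)
    return dataset
```

Filtering to successful episodes is a common choice for imitation-style training; whether LDP filters its demonstrations this way is not stated in the summary.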

The novelty in LDP's architecture lies in its conditional guidance mechanism, whereby the diffusion model uses the global path as a condition in the denoising process, fostering more informed trajectory generation. The network builds on the DDPM paradigm, enhanced with classifier-free guidance to improve the policy's capacity to generalize across scenarios with mixed preferences (Figure 2).

Figure 2: An in-depth depiction of the entire process and the architecture of the local diffusion planner.
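To make the conditional guidance concrete, here is a minimal PyTorch sketch of one classifier-free-guided DDPM denoising step. The names (`eps_model`, `cond`, `null_cond`) are illustrative, and the schedule quantities follow the standard DDPM definitions (Ho et al., 2020) rather than anything LDP-specific; `cond` would bundle the local observation with the global-path condition.

```python
import torch

def cfg_eps(eps_model, x_t, t, cond, null_cond, w=1.5):
    """Classifier-free guidance: blend conditional and unconditional noise
    predictions. `null_cond` is the learned condition-dropped embedding seen
    during training; w > 1 sharpens adherence to the condition."""
    eps_cond = eps_model(x_t, t, cond)
    eps_uncond = eps_model(x_t, t, null_cond)
    return eps_uncond + w * (eps_cond - eps_uncond)

def ddpm_reverse_step(x_t, eps_hat, alpha_t, alpha_bar_t, sigma_t):
    """One DDPM reverse update x_t -> x_{t-1}; alpha_t, alpha_bar_t, and
    sigma_t are scalar tensors from the usual beta schedule."""
    mean = (
        x_t - (1.0 - alpha_t) / torch.sqrt(1.0 - alpha_bar_t) * eps_hat
    ) / torch.sqrt(alpha_t)
    return mean + sigma_t * torch.randn_like(x_t)
```

Running `ddpm_reverse_step` from t = T down to t = 1, with `cfg_eps` supplying the noise estimate at each step, yields a guided action trajectory.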

Experimental Results

Empirical evaluations of LDP demonstrate its superiority over baseline navigation solutions such as LSTM-GMM, IBC, and Decision Transformer (DT). LDP achieves higher success rates and improved SPL (Success weighted by Path Length) across both training scenarios and unseen environments, underscoring the policy's robust generalization capabilities (Figure 3).

Figure 3: Four different simulation scenarios are displayed. The black rectangles and circles are obstacles, the green dots represent pedestrian trajectories, and the blue box on the right shows the robot's local sensor map.
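For reference, SPL is a standard embodied-navigation metric; the snippet below is a direct implementation of its usual definition, not code from the paper.

```python
def spl(successes, shortest_lengths, path_lengths):
    """SPL = (1/N) * sum_i S_i * l_i / max(p_i, l_i), where S_i is the
    success indicator, l_i the shortest-path length to the goal, and
    p_i the length of the path the agent actually traveled."""
    return sum(
        s * (l / max(p, l))
        for s, l, p in zip(successes, shortest_lengths, path_lengths)
    ) / len(successes)

# Example: three episodes, the second one failed.
# spl([1, 0, 1], [10.0, 8.0, 12.0], [11.0, 20.0, 12.0])  ->  ~0.636
```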

In ablation studies, conditioning on the global path significantly improves LDP's decision-making, particularly in maze-like scenarios where the risk of local minima is high. Experimental comparisons also highlight the benefit of training on expert data with mixed preferences, which improves the policy's overall performance (Figure 4).

Figure 4: Global Path Influence: Navigation Success vs. Failure in One Scene.

Practical Implications and Future Work

The real-world deployment of LDP on an Ackermann-steering robot illustrates its practical viability, with promising results in terms of collision avoidance and navigation efficiency (Figure 5).

Figure 5: Schematic diagram of real robots and test scenarios.

Future work includes improving LDP's real-time performance and training on higher-quality datasets. Transitioning to flow-based diffusion models could accelerate sampling, offering substantial improvements for real-world deployment of autonomous navigation systems.

Conclusion

The LDP framework presents a significant advancement in robot navigation by integrating diffusion models with real-time motion planning. LDP not only surpasses traditional approaches in versatility and robustness but also paves the way for future research in complex dynamic environments. Its success in addressing key challenges through the seamless integration of global and local planning insights marks a notable contribution to the domain.
