NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance (2505.08712v2)

Published 13 May 2025 in cs.RO

Abstract: Learning navigation in dynamic open-world environments is an important yet challenging skill for robots. Most previous methods rely on precise localization and mapping or learn from expensive real-world demonstrations. In this paper, we propose the Navigation Diffusion Policy (NavDP), an end-to-end framework that is trained solely in simulation and can transfer zero-shot to different embodiments in diverse real-world environments. The key ingredient of NavDP's network is the combination of diffusion-based trajectory generation and a critic function for trajectory selection, which are conditioned on only local observation tokens encoded from a shared policy transformer. Given the privileged information of the global environment in simulation, we scale up the demonstrations of good quality to train the diffusion policy and formulate the critic value function targets with contrastive negative samples. Our demonstration generation approach achieves about 2,500 trajectories/GPU per day, 20× more efficient than real-world data collection, and results in a large-scale navigation dataset with 363.2 km of trajectories across 1,244 scenes. Trained with this simulation dataset, NavDP achieves state-of-the-art performance and consistently outstanding generalization capability on quadruped, wheeled, and humanoid robots in diverse indoor and outdoor environments. In addition, we present a preliminary attempt at using Gaussian Splatting for in-domain real-to-sim fine-tuning to further bridge the sim-to-real gap. Experiments show that adding such real-to-sim data can improve the success rate by 30% without hurting its generalization capability.

Summary

The paper proposes the Navigation Diffusion Policy (NavDP), an approach to robot navigation in complex environments that is trained entirely on simulated data and transfers zero-shot to real-world environments across different robot embodiments. NavDP combines diffusion-based trajectory generation with a critic function for trajectory selection. Both are conditioned only on local observation tokens encoded by a shared policy transformer.

Key Contributions

End-to-End Training in Simulation: NavDP is trained without real-world trajectory data, avoiding the constraints of earlier methods that require either precise localization and mapping or expensive real-world demonstrations. Training in simulation makes it practical to scale up the amount and diversity of demonstration data.
Efficient Data Generation: The data generation pipeline produces about 2,500 trajectories per GPU per day, which the authors report as 20 times more efficient than real-world data collection. The resulting dataset contains 363.2 km of trajectories across 1,244 scenes.
Two-Stage Inference Framework: A diffusion-based head generates candidate trajectories, and a critic head scores them so that a safe one can be selected. Both heads are conditioned only on local observation tokens. Privileged information about the global environment, which is available in simulation, is used to scale up good-quality demonstrations and to formulate the critic's value targets with contrastive negative samples.
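The generate-then-select pattern of the two-stage inference can be illustrated with a minimal Python sketch. Everything here is a toy stand-in and not the authors' implementation: `sample_candidates` replaces the diffusion head with a random walk over headings, and `toy_critic` scores clearance against known obstacle positions, whereas the paper's critic sees only local observation tokens. All function names and parameters are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]


def sample_candidates(num_candidates: int, horizon: int,
                      rng: random.Random) -> List[Trajectory]:
    """Toy stand-in for the diffusion head (assumption).

    A real diffusion head would denoise trajectories conditioned on
    observation tokens; here each candidate is a random walk over headings.
    """
    candidates = []
    for _ in range(num_candidates):
        x, y = 0.0, 0.0
        heading = rng.uniform(-math.pi / 3, math.pi / 3)
        traj = []
        for _ in range(horizon):
            heading += rng.gauss(0.0, 0.1)  # small heading jitter per step
            x += 0.1 * math.cos(heading)    # 0.1 m step length (assumed)
            y += 0.1 * math.sin(heading)
            traj.append((x, y))
        candidates.append(traj)
    return candidates


def toy_critic(traj: Trajectory, obstacles: Sequence[Point]) -> float:
    """Toy stand-in for the critic head: higher score means safer.

    Uses the minimum distance to known obstacles, which a deployed
    critic would not have access to.
    """
    return min(math.dist(p, o) for p in traj for o in obstacles)


def select_trajectory(candidates: List[Trajectory],
                      critic: Callable[[Trajectory], float]
                      ) -> Tuple[int, Trajectory]:
    """Stage two: score every candidate and return the best one."""
    scores = [critic(t) for t in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, candidates[best]


rng = random.Random(0)
obstacles = [(0.8, 0.0)]  # hypothetical obstacle straight ahead
candidates = sample_candidates(num_candidates=16, horizon=12, rng=rng)
index, best = select_trajectory(candidates, lambda t: toy_critic(t, obstacles))
print(f"selected candidate {index} of {len(candidates)}")
```

Separating generation from scoring means the number of candidates can be changed at inference time without retraining either stand-in, which is one plausible reason to prefer this structure over a single deterministic output.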

Numerical Results and Performance

NavDP achieves state-of-the-art performance and generalizes zero-shot across embodiments, running on quadruped, wheeled, and humanoid robots in diverse indoor and outdoor environments. Adding real-to-sim data to the simulation data improves the success rate by 30% in the targeted real-world scenes without hurting generalization.

Practical and Theoretical Implications

NavDP shows that navigation policies can be trained efficiently without direct real-world interaction while remaining adaptable and safe across robot morphologies. In practice, this lets robots navigate unstructured, dynamic environments without depending on precise localization and mapping.

Theoretically, the critic function provides a learned model of trajectory safety that can compensate for errors in the generated trajectories. The use of Gaussian Splatting for real-to-sim fine-tuning, which the authors describe as a preliminary attempt, suggests a way to narrow the sim-to-real gap with more photorealistic in-domain training and evaluation scenes.
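To make the idea of critic targets built from privileged information and contrastive negatives more concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the paper's formulation: negatives are produced by rotating an expert trajectory about the robot's origin, and the label is a clipped clearance computed from obstacle positions that are known only in simulation. The names `make_negatives`, `critic_target`, and `safe_distance` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]


def rotate(traj: Trajectory, angle: float) -> Trajectory:
    """Rotate a trajectory about the origin (the robot's start pose)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in traj]


def make_negatives(expert: Trajectory, num_negatives: int,
                   max_angle: float, rng: random.Random) -> List[Trajectory]:
    """Perturb the expert path to create contrastive samples (assumed scheme)."""
    return [rotate(expert, rng.uniform(-max_angle, max_angle))
            for _ in range(num_negatives)]


def clearance(traj: Trajectory, obstacles: Sequence[Point]) -> float:
    """Minimum distance from any waypoint to any obstacle point."""
    return min(math.dist(p, o) for p in traj for o in obstacles)


def critic_target(traj: Trajectory, obstacles: Sequence[Point],
                  safe_distance: float = 0.5) -> float:
    """Hypothetical label in [0, 1]: 0 at contact, 1 at >= safe_distance.

    The obstacle list is privileged information: it comes from the
    simulator's global map and is not available to the deployed policy.
    """
    return max(0.0, min(1.0, clearance(traj, obstacles) / safe_distance))


rng = random.Random(0)
expert = [(0.1 * i, 0.0) for i in range(1, 21)]  # straight 2 m path
obstacles = [(1.0, 0.6)]                          # hypothetical obstacle
samples = [expert] + make_negatives(expert, 8, math.pi / 4, rng)
targets = [critic_target(t, obstacles) for t in samples]
print([round(v, 2) for v in targets])
```

A critic regressed onto labels like these would see both safe and unsafe variants of the same scene, which is the intuition behind contrasting the expert trajectory with perturbed negatives.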

Future Directions

The paper points to several directions for future work, including language-instructed navigation to improve human-robot interaction and embodiment encoding to refine collision avoidance. Integrating navigation and locomotion policies could also be investigated for scenarios that require complex three-dimensional paths.

Overall, NavDP is a significant step toward more efficient and adaptable autonomous navigation. Further gains may come from increasing data diversity, tuning the proportion of real-to-sim data, and integrating multimodal training objectives.