- The paper introduces BADGR, a novel self-supervised system that autonomously learns navigational affordances from raw sensor data without human labeling.
- The methodology pairs off-policy data collection via a time-correlated random walk with a learned predictive model that plans actions across diverse terrains.
- Experiments in urban and off-road environments show BADGR outperforming a LIDAR-based geometric baseline, reaching goals while avoiding obstacles and bumpy terrain.
Autonomous Navigation in Real-World Environments Through Self-Supervised Learning: A Study
Introduction
Mobile robot navigation has traditionally leaned heavily on geometric mapping and planning. While effective in many settings, this approach falters in complex real-world scenarios where geometric perception alone does not suffice, for instance when distinguishing traversable tall grass from a genuine obstacle. This paper departs from the prevailing geometry-centric view and introduces BADGR, a system that leverages self-supervised learning for autonomous navigation. BADGR, the Berkeley Autonomous Driving Ground Robot, is an end-to-end learning-based system that navigates real-world terrain by learning physical navigational affordances directly from experience, without relying on human labeling, simulated data, or expensive sensory equipment.
System Overview
BADGR is designed as a self-improving system that gathers, labels, and learns from off-policy data autonomously. It first collects data using a time-correlated random walk policy. This data, comprising raw sensor readings and executed actions, is then enriched with self-supervised labels for navigationally relevant events such as collisions and terrain bumpiness. On this labeled dataset, BADGR trains a predictive model that forecasts future navigational events given the current observation and a proposed action sequence. The system's effectiveness is demonstrated on tasks that require reaching a goal while avoiding collisions and preferring smooth terrain, all learned from real-world experience.
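To make the self-supervised labeling step concrete, the sketch below shows how event labels might be computed retroactively from onboard sensors: collisions from proximity readings and bumpiness from IMU signals, as the paper describes. The function name and threshold values are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

# Illustrative thresholds -- assumed for this sketch, not the paper's values.
COLLISION_DIST = 0.3   # meters: a LIDAR return closer than this counts as a collision
BUMPY_ANG_VEL = 0.5    # rad/s: IMU angular velocity magnitude above this is "bumpy"

def label_events(lidar_ranges, imu_angular_velocity):
    """Retroactively attach self-supervised event labels to one timestep.

    lidar_ranges: 1D array of range readings (m)
    imu_angular_velocity: 3-vector of angular rates (rad/s)
    """
    collided = np.min(lidar_ranges) < COLLISION_DIST
    bumpy = np.linalg.norm(imu_angular_velocity) > BUMPY_ANG_VEL
    return {"collision": bool(collided), "bumpy": bool(bumpy)}
```

Because the labels come entirely from the robot's own sensors, every mile of driving yields training data at no human cost.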
Mobile Robot Platform
A Clearpath Jackal serves as the mobile platform, equipped with a minimal but robust sensor suite: forward-facing cameras, LIDAR, an IMU, GPS, and a compass. Onboard computation is handled by an NVIDIA Jetson TX2, enabling long-duration autonomous operation and data collection across diverse environments.
Data Collection and Labeling
Data collection focuses on off-policy data, so every collected transition can be used for training regardless of the policy that produced it. A time-correlated random walk policy encourages smooth, wide-ranging exploration, while collisions and immobility trigger automatic resets. Event labels such as collision and terrain bumpiness are applied retrospectively in a self-supervised manner from onboard sensor readings.
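A minimal sketch of such an exploration policy is shown below: each steering command is a smoothed perturbation of the previous one, producing temporally correlated trajectories rather than jittery, uninformative motion. The class name, correlation coefficient, and noise scale are illustrative assumptions, not the paper's exact parameters.

```python
import numpy as np

class TimeCorrelatedRandomWalk:
    """Exploration policy: each command is a noisy perturbation of the
    previous one, so the robot follows smooth, wandering trajectories.
    beta and steer_noise are illustrative choices for this sketch."""

    def __init__(self, beta=0.95, steer_noise=0.3, speed=1.0):
        self.beta = beta            # temporal correlation coefficient
        self.steer_noise = steer_noise
        self.speed = speed
        self.steer = 0.0

    def act(self):
        noise = np.random.normal(0.0, self.steer_noise)
        self.steer = self.beta * self.steer + (1 - self.beta) * noise
        return np.array([self.speed, self.steer])  # (linear vel, angular vel)

    def reset(self):
        # Called after a collision or when the robot is stuck: BADGR
        # automatically resets and continues exploring.
        self.steer = 0.0
```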
Predictive Model and Planning
The predictive model, central to BADGR's operation, takes the current observation and a sequence of future actions as input and predicts the resulting navigational events. Trained on the self-supervised dataset, this model lets the robot plan action sequences that maximize a user-defined reward function encoding objectives such as reaching a goal and preferring smooth terrain. Planning uses a zeroth-order stochastic optimizer that iteratively refines sampled action sequences toward higher predicted reward.
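The sketch below illustrates one common form of zeroth-order stochastic planning: sample candidate action sequences around a mean, score them with the learned predictive model, and update the mean by reward-weighted averaging, in the style of MPPI-like optimizers. The hyperparameters and the `model`/`reward_fn` interfaces are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def plan(model, obs, reward_fn, horizon=8, num_samples=1024,
         num_iters=3, gamma=50.0, sigma=0.5):
    """Zeroth-order stochastic planner (MPPI-style sketch).

    model(obs, actions) -> predicted event outcomes for each sampled
    action sequence; reward_fn scores those predictions and returns an
    array of shape (num_samples,). All values here are illustrative.
    """
    mean = np.zeros((horizon, 2))  # (linear vel, angular vel) per step
    for _ in range(num_iters):
        # Sample candidate action sequences around the current mean.
        noise = np.random.normal(0.0, sigma, (num_samples, horizon, 2))
        candidates = mean[None] + noise
        # Predict future events and score each candidate sequence.
        rewards = reward_fn(model(obs, candidates))
        # Reward-weighted refinement of the mean sequence.
        weights = np.exp(gamma * (rewards - rewards.max()))
        weights /= weights.sum()
        mean = (weights[:, None, None] * candidates).sum(axis=0)
    return mean[0]  # first action of the refined plan
```

In a model-predictive-control loop, the robot executes only this first action, observes again, and replans, so prediction errors later in the horizon are continually corrected.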
Experimental Validation
BADGR's capabilities are validated empirically in both urban and off-road settings, where it navigates to goals while avoiding obstacles and hazardous terrain. Compared against a traditional LIDAR-based strategy and naïve baselines, BADGR learns environmental nuances that geometric methods typically misinterpret, such as the traversability of grassy areas. Notably, the system also shows potential for continual self-improvement as it gathers more data, and for generalization to unseen environments.
Conclusion and Future Directions
BADGR demonstrates how self-supervised learning can transcend the limitations of geometry-centric approaches to mobile robot navigation. Its ability to learn from and adapt to real-world complexity marks a step toward autonomous learning and operation across varied terrain. Future work could further reduce human intervention, enable online adaptation, and extend the approach to dynamic environments with other moving agents. With continued refinement, self-supervised systems like BADGR hold promise for broader autonomy in robot navigation.
Acknowledgments
This work was supported by ARL, NSF, DARPA, and Berkeley Deep Drive, and by an NSF Graduate Research Fellowship awarded to Gregory Kahn. Such collective backing underscores the role of interdisciplinary collaboration in advancing autonomous navigation research.