
Mobi-$\pi$: Mobilizing Your Robot Learning Policy (2505.23692v1)

Published 29 May 2025 in cs.RO, cs.CV, and cs.LG

Abstract: Learned visuomotor policies are capable of performing increasingly complex manipulation tasks. However, most of these policies are trained on data collected from limited robot positions and camera viewpoints. This leads to poor generalization to novel robot positions, which limits the use of these policies on mobile platforms, especially for precise tasks like pressing buttons or turning faucets. In this work, we formulate the policy mobilization problem: find a mobile robot base pose in a novel environment that is in distribution with respect to a manipulation policy trained on a limited set of camera viewpoints. Compared to retraining the policy itself to be more robust to unseen robot base pose initializations, policy mobilization decouples navigation from manipulation and thus does not require additional demonstrations. Crucially, this problem formulation complements existing efforts to improve manipulation policy robustness to novel viewpoints and remains compatible with them. To study policy mobilization, we introduce the Mobi-$\pi$ framework, which includes: (1) metrics that quantify the difficulty of mobilizing a given policy, (2) a suite of simulated mobile manipulation tasks based on RoboCasa to evaluate policy mobilization, (3) visualization tools for analysis, and (4) several baseline methods. We also propose a novel approach that bridges navigation and manipulation by optimizing the robot's base pose to align with an in-distribution base pose for a learned policy. Our approach utilizes 3D Gaussian Splatting for novel view synthesis, a score function to evaluate pose suitability, and sampling-based optimization to identify optimal robot poses. We show that our approach outperforms baselines in both simulation and real-world environments, demonstrating its effectiveness for policy mobilization.

Summary

An Analysis of Mobi-$\pi$: Mobilizing Robot Learning Policies

The paper "Mobi-π\pi: Mobilizing Your Robot Learning Policy" introduces a novel framework called Mobi-π\pi aimed at optimizing robot learning policies in mobile settings, which have traditionally been constrained by fixed camera viewpoints during training. This approach is particularly pertinent given the constraints of existing robot learning frameworks that typically assume stationary base and camera configurations, thereby limiting their deployment on mobile platforms for handling tasks demanding precision, such as pressing buttons or turning faucets.

Problem Formulation

The central concept, termed "policy mobilization," addresses the mismatch between fixed training setups and dynamic mobile deployment. The authors decouple navigation from manipulation: rather than retraining the policy, they search for an in-distribution base pose from which the frozen manipulation policy can be executed, requiring no additional demonstrations. This formulation is positioned as complementary to existing viewpoint-robustness techniques, enabling synergy rather than competition.
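
Stated informally (this notation is ours, not the paper's exact formalism): given a manipulation policy $\pi$ trained from a limited set of viewpoints, policy mobilization seeks the base pose whose resulting observation is in-distribution for $\pi$:

$$ b^{*} = \arg\max_{b \in \mathrm{SE}(2)} \; S_{\pi}\big(I(b)\big), $$

where $I(b)$ is the camera observation the robot would receive at base pose $b$ and $S_{\pi}$ scores how in-distribution that observation is for the policy.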

Methodologies

The Mobi-$\pi$ framework is multifaceted, encompassing metrics for assessing mobilization difficulty, a suite of simulated mobile manipulation tasks built on RoboCasa, visualization tools, and baseline methods for comparison. Central to the proposed approach is 3D Gaussian Splatting for novel view synthesis: candidate base poses are evaluated by rendering the view the robot would observe from each pose, scoring how suitable that view is for policy execution, and using sampling-based optimization to search the pose space efficiently.
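
As a rough illustration of how these pieces fit together, the sketch below runs a generic cross-entropy-style search over candidate base poses. This is a minimal sketch, not the paper's implementation: the cross-entropy method stands in for the paper's sampling-based optimizer, and `render_view` / `score_pose` are hypothetical placeholders for the Gaussian-splat renderer and the learned pose-suitability score.

```python
import numpy as np

def optimize_base_pose(render_view, score_pose, n_iters=10, n_samples=64, elite_frac=0.1):
    """Cross-entropy-style search over SE(2) base poses (x, y, yaw).

    render_view(pose) -> image rendered from the 3D Gaussian Splatting scene
    score_pose(image) -> scalar rating how in-distribution the view is for the policy
    Both callables are hypothetical stand-ins for the paper's components.
    """
    mean = np.zeros(3)                   # initial guess: (x, y, yaw) in the scene frame
    std = np.array([1.0, 1.0, np.pi])    # broad initial search over the scene
    n_elite = max(1, int(elite_frac * n_samples))

    for _ in range(n_iters):
        # Sample candidate base poses from the current search distribution.
        poses = mean + std * np.random.randn(n_samples, 3)
        # Render what the robot's camera would see at each candidate pose,
        # then score how in-distribution that view is for the learned policy.
        scores = np.array([score_pose(render_view(p)) for p in poses])
        # Refit the sampling distribution to the highest-scoring poses.
        elites = poses[np.argsort(scores)[-n_elite:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6

    return mean  # best estimate of an in-distribution base pose
```

Because the score function only needs rendered views, this search runs entirely in the reconstructed scene before the robot moves, which is what lets navigation be decoupled from manipulation.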

Results

Experiments demonstrate the approach's utility in both simulated and real-world environments. Results show it outperforms baselines, particularly when policies must be executed from robot base poses unseen during training. The paper reports notable improvements on simulation tasks such as "Turn on Stove" and "Close Drawer," where policy-agnostic navigation methods often fail because they place the robot at base poses unsuitable for policy execution.

Implications and Future Directions

The implications of this work are notable in both theoretical and practical domains. Theoretically, the decoupling of navigation and manipulation introduces a fresh perspective on the training and deployment of learned policies. Practically, it opens avenues for deploying pre-trained manipulation policies across varied mobile platforms without the need for extensive retraining.

Looking ahead, the Mobi-$\pi$ framework could be enhanced with more advanced scene representation techniques or with dynamic environment updates. Extending the problem to multi-task policies or varying base-to-camera configurations could further broaden the framework's applicability in highly dynamic or multi-objective settings.

Conclusion

Mobi-$\pi$, with its structured approach to policy mobilization, represents a significant step toward realizing the full potential of robot learning policies in mobile applications. The paper provides a solid foundation for further research, encouraging tighter integration between navigation and policy execution. Its contributions lie not only in immediate performance gains but also in setting a new direction for mobile robotics research.
