Unsupervised Feature Learning for Manipulation with Contrastive Domain Randomization (2103.11144v2)

Published 20 Mar 2021 in cs.LG, cs.AI, and cs.RO

Abstract: Robotic tasks such as manipulation with visual inputs require image features that capture the physical properties of the scene, e.g., the position and configuration of objects. Recently, it has been suggested to learn such features in an unsupervised manner from simulated, self-supervised, robot interaction; the idea being that high-level physical properties are well captured by modern physical simulators, and their representation from visual inputs may transfer well to the real world. In particular, learning methods based on noise contrastive estimation have shown promising results. To robustify the simulation-to-real transfer, domain randomization (DR) was suggested for learning features that are invariant to irrelevant visual properties such as textures or lighting. In this work, however, we show that a naive application of DR to unsupervised learning based on contrastive estimation does not promote invariance, as the loss function maximizes mutual information between the features and both the relevant and irrelevant visual properties. We propose a simple modification of the contrastive loss to fix this, exploiting the fact that we can control the simulated randomization of visual properties. Our approach learns physical features that are significantly more robust to visual domain variation, as we demonstrate using both rigid and non-rigid objects.

Summary

The paper introduces Contrastive Domain Randomization (CDR), an unsupervised feature learning method that combines contrastive learning with domain randomization to enhance sim-to-real transfer for robot manipulation.
Experiments demonstrated that CDR representations significantly outperform baselines in information retrieval tasks, showing improved learning of domain-invariant features essential for manipulation.
In real-world robot planning tasks, using CDR features led to superior performance and reduced final-state errors compared to baselines, indicating practical utility and robustness for real-world applications.

Unsupervised Feature Learning for Manipulation with Contrastive Domain Randomization

This paper presents an approach to unsupervised feature learning for robotic manipulation, termed Contrastive Domain Randomization (CDR). It combines contrastive learning with domain randomization to learn domain-invariant representations, with the goal of improving how well features learned in simulation transfer to the real world.

Core Contributions

The method extends contrastive learning, a technique popular in self-supervised learning, by incorporating domain randomization, a strategy typically used to make simulation-to-real transfer more robust. The authors show that a naive application of domain randomization within a contrastive framework does not promote invariance to irrelevant visual properties (e.g., textures and lighting), because the contrastive loss maximizes mutual information between the features and both the relevant and the irrelevant properties. To remedy this, they modify the contrastive loss, exploiting the fact that the randomization of visual properties in simulation can be controlled.
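
To make the mutual-information argument concrete, here is a sketch in standard InfoNCE notation. The symbols are assumptions introduced for illustration and are not taken from the paper: $s_t$ is the physical state, $d$ the randomized visual domain, $o_t = \mathrm{render}(s_t, d)$ the observation, $h$ the encoder, and $N$ the number of candidates scored for each positive pair.

$$
\mathcal{L}_{\mathrm{NCE}} = -\,\mathbb{E}\left[\log \frac{\exp\big(h(o_t)^\top h(o_{t+1})\big)}{\sum_{j=1}^{N} \exp\big(h(o_t)^\top h(o^{(j)}_{t+1})\big)}\right],
\qquad
I(o_t; o_{t+1}) \;\ge\; \log N - \mathcal{L}_{\mathrm{NCE}}.
$$

If both observations of a positive pair are rendered with the same $d$, then $I(o_t; o_{t+1})$ includes information about $d$, and minimizing the loss can reward features that encode it. If the two observations are instead rendered with independently sampled domains $d$ and $d'$, the information they share comes only through the physical states $s_t$ and $s_{t+1}$.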

Methodological Insights

Contrastive Domain Randomization (CDR): The core change is to the contrastive loss function. The modified loss still rewards features that capture relevant properties (such as the object positions and orientations that matter for manipulation) while minimizing the effect of irrelevant domain-specific properties. Because the irrelevant visual properties are randomized independently in the past and future observations during training, they carry no information that helps match a past observation to its future, so the model learns to focus on physically relevant features. The authors present this as a more principled and scalable way to learn general physical features that transfer from simulation to reality.
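
The effect of independent randomization can be illustrated with a toy experiment. The sketch below is not the authors' implementation: `sample_pairs` is a hypothetical stand-in for a simulator in which each observation is just a (state, domain) pair of vectors, and the two "encoders" simply read out one part or the other. It uses a generic batch-wise InfoNCE loss and compares pairs that share a visual domain (assumed here to represent naive domain randomization) with pairs whose domains are sampled independently.

```python
import numpy as np


def info_nce_loss(z_past, z_future, temperature=0.1):
    """InfoNCE over a batch: row i of z_future is the positive for row i of
    z_past, and every other row in the batch serves as a negative."""
    z_past = z_past / np.linalg.norm(z_past, axis=1, keepdims=True)
    z_future = z_future / np.linalg.norm(z_future, axis=1, keepdims=True)
    logits = z_past @ z_future.T / temperature            # (B, B) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def sample_pairs(batch_size, independent_domains, rng):
    """Hypothetical toy simulator: an observation is (state, domain)."""
    state = rng.normal(size=(batch_size, 2))              # physical state
    next_state = state + 0.05 * rng.normal(size=(batch_size, 2))
    domain_past = rng.normal(size=(batch_size, 8))        # texture/lighting code
    if independent_domains:
        # Re-sample the visual domain for the future observation.
        domain_future = rng.normal(size=(batch_size, 8))
    else:
        # Past and future share one visual domain.
        domain_future = domain_past
    return (state, domain_past), (next_state, domain_future)


def compare_encoders(batch_size=64, seed=0):
    """Loss of a domain-only and a state-only 'encoder' in both regimes."""
    losses = {}
    for name, independent in [("naive_dr", False), ("independent_dr", True)]:
        rng = np.random.default_rng(seed)
        (s, d), (s_next, d_next) = sample_pairs(batch_size, independent, rng)
        losses[name, "domain_only"] = info_nce_loss(d, d_next)
        losses[name, "state_only"] = info_nce_loss(s, s_next)
    return losses


if __name__ == "__main__":
    for key, value in compare_encoders().items():
        print(key, round(value, 3))
```

In this toy setting, features that encode only the visual domain achieve a low loss when the domain is shared across a pair, but do worse than chance once the domains are sampled independently, while features that encode only the physical state are unaffected by the change.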

Experimental Setup and Results

The method was validated through experiments in both controlled and uncontrolled environments, using both rigid and deformable objects. CDR substantially outperformed the baselines in two settings: information retrieval tasks, which directly test how well the representation captures physical properties, and planning tasks on real robots, where the learned representations were used directly to achieve manipulation objectives.

Key Numerical Results:

CDR showed marked improvement in retrieving simulated scenes that match real-world configurations, with a considerable gap in Intersection over Union (IoU) between CDR and naive domain randomization.
The proposed method also significantly reduced the mean squared error in feature representations, supporting its effectiveness at learning domain-invariant features.
In real-world planning tasks using a robot arm, CDR achieved lower final-state-to-goal-state Euclidean distances compared to the baselines, demonstrating superior transferability and practical utility.
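
The quantities named in these results can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's evaluation code: retrieval is assumed to be a nearest-neighbor lookup in feature space, IoU is assumed to be computed on binary object masks, and all three function names are hypothetical.

```python
import numpy as np


def retrieve_nearest(query_feature, candidate_features):
    """Index of the candidate whose feature is closest (L2) to the query."""
    candidates = np.asarray(candidate_features, dtype=float)
    query = np.asarray(query_feature, dtype=float)
    return int(np.argmin(np.linalg.norm(candidates - query, axis=1)))


def mask_iou(mask_a, mask_b):
    """Intersection over Union of two binary masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # convention chosen here: two empty masks match perfectly
    return float(np.logical_and(a, b).sum() / union)


def final_state_error(final_state, goal_state):
    """Euclidean distance between the reached state and the goal state."""
    final = np.asarray(final_state, dtype=float)
    goal = np.asarray(goal_state, dtype=float)
    return float(np.linalg.norm(final - goal))
```

A retrieval score under these assumptions would encode a real image, look up the nearest simulated image with `retrieve_nearest`, and report `mask_iou` between the object masks of the two scenes. A planning score would report `final_state_error` at the end of each rollout.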

Implications and Future Directions

The work has both practical and theoretical implications. Practically, it supports more robust robot learning systems that can cope with the discrepancies between simulation and the real world. Theoretically, it adds to the discussion of self-supervised learning by showing that domain variation needs to be handled explicitly in the learning objective.

Future work could further examine the interplay between domain adaptation and invariant feature learning. Experimental comparisons in additional, more complex manipulation domains would clarify CDR's capabilities and limitations. Examining how CDR integrates with other neural architectures, and how it scales to larger datasets or more complex environments, may also give deeper insight into how well it generalizes across robotics applications.
