Overview of ACNMP: Skill Transfer and Task Extrapolation through LfD and RL
The paper "ACNMP: Skill Transfer and Task Extrapolation through Learning from Demonstration and Reinforcement Learning via Representation Sharing," by Akbulut et al., presents a framework for transferring and extrapolating robotic skills. The central idea is to combine Learning from Demonstration (LfD) with Reinforcement Learning (RL) in a single architecture, termed Adaptive Conditional Neural Movement Primitives (ACNMP).
The ACNMP framework is built on Conditional Neural Processes (CNPs), whose shared latent representation is used to adapt skills across tasks and across robot morphologies. This latent space supports simultaneous training with Supervised Learning (SL) and RL, so the network preserves the demonstrated skill while adapting to novel environments and constraints.
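The CNP-style conditioning described above can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: the dimensions, the random weights standing in for trained networks, and the function names are all assumptions. The key properties shown are the permutation-invariant aggregation of observations into a latent and the decoding of that latent at arbitrary query times.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not the paper's values).
D_OBS = 3      # each observation: (time, position, task parameter)
D_LATENT = 16  # size of the shared latent representation
D_OUT = 2      # predicted mean and log-variance of the trajectory point

# Random weights stand in for trained encoder/decoder networks.
W_enc = rng.normal(size=(D_OBS, D_LATENT))
W_dec = rng.normal(size=(D_LATENT + 1, D_OUT))

def encode(observations):
    """Embed each observation and average into one latent vector."""
    h = np.tanh(observations @ W_enc)   # per-observation embeddings
    return h.mean(axis=0)               # permutation-invariant aggregation

def decode(latent, t_query):
    """Predict (mean, log-variance) of the trajectory at a query time."""
    inp = np.concatenate([latent, [t_query]])
    return inp @ W_dec

obs = np.array([[0.0, 0.1, 0.5],
                [0.5, 0.4, 0.5],
                [1.0, 0.2, 0.5]])
mu_logvar = decode(encode(obs), t_query=0.75)
print(mu_logvar.shape)  # (2,)
```

Because the encoder averages over observations, the latent does not depend on the order in which context points are supplied, which is what allows conditioning on an arbitrary set of observed trajectory points.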
Key Contributions
- Efficient Extrapolation and Skill Transfer: The ACNMP framework extrapolates skills to new contexts efficiently. Simulation experiments show that it requires significantly fewer samples to adapt to new environments than existing methods, achieving task adaptation with roughly an order of magnitude greater sample efficiency than adaptive Movement Primitive baselines such as Probabilistic Movement Primitives (ProMPs).
- Maintaining Demonstration Fidelity: The dual-training approach leveraging SL and RL ensures the retention of qualitative characteristics of learned demonstrations while exploring new task parameters. The paper provides numerical evidence that the system combines the exploration capabilities of RL with the fidelity of LfD, preserving the shape of demonstrations.
- Cross-Morphology Skill Transfer: ACNMP can transfer skills between robots with differing morphologies. A common representation space, learned from both robots' trajectories, lets one robot learn a new task from another robot's execution, with the shared latent space providing the trajectory generalization.
- Real-World Applicability: The framework's real-world applicability has been validated through various robotic experiments, including tasks involving obstacle avoidance and object manipulation. These practical demonstrations underline ACNMP's capability to integrate seamlessly with existing robotic platforms for complex tasks like pouring and pick-and-place operations.
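The dual SL+RL training highlighted in the contributions above can be summarized as one weighted objective: a supervised term that keeps generated trajectories close to the demonstrations, plus an RL term that rewards successful adaptation. The specific loss forms and the weighting hyperparameter `beta` below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def supervised_loss(pred_traj, demo_traj):
    """Mean squared error keeping the policy close to the demonstrations."""
    return np.mean((pred_traj - demo_traj) ** 2)

def rl_loss(log_probs, rewards):
    """REINFORCE-style surrogate: favor trajectory samples with high reward."""
    return -np.mean(log_probs * rewards)

def combined_loss(pred_traj, demo_traj, log_probs, rewards, beta=0.1):
    """Weighted sum; beta (an assumed hyperparameter) trades RL exploration
    against demonstration fidelity."""
    return supervised_loss(pred_traj, demo_traj) + beta * rl_loss(log_probs, rewards)

pred = np.array([0.1, 0.5, 0.9])   # predicted trajectory points
demo = np.array([0.0, 0.5, 1.0])   # demonstrated trajectory points
logp = np.array([-1.2, -0.8, -1.0])  # log-probabilities of sampled actions
rew = np.array([1.0, 0.5, 0.0])      # rewards for those samples
total = combined_loss(pred, demo, logp, rew)
print(total)
```

Minimizing such a combined objective is what lets the exploration pressure of the RL term move the policy toward new task parameters while the supervised term preserves the shape of the demonstrated motion.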
Experimental Validation
Simulation experiments tested ACNMP's performance on extrapolation tasks such as obstacle avoidance and object pushing. Its adaptation was compared against ProMPs, with ACNMP requiring significantly fewer trials for task success in both interpolation and extrapolation scenarios.
Transfer learning capabilities were analyzed in a task involving two differently built robot arms, showing that ACNMP could achieve task adaptation through inter-robot latent space alignment without explicit state matching. This approach requires fewer assumptions and exhibits faster learning convergence than existing transfer methodologies.
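The inter-robot latent-space alignment can be illustrated with a shared encoder feeding two robot-specific decoders; both arms condition on the same latent, so no explicit state matching between their joint spaces is needed. Everything below is a hedged sketch: the joint counts (6-DOF vs. 7-DOF), the random stand-in weights, and the function names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

D_LATENT = 8

# One shared encoder; one decoder per robot morphology.
W_shared_enc = rng.normal(size=(2, D_LATENT))        # (t, task param) -> embedding
W_dec_robot_a = rng.normal(size=(D_LATENT + 1, 6))   # latent + t -> 6 joint values
W_dec_robot_b = rng.normal(size=(D_LATENT + 1, 7))   # latent + t -> 7 joint values

def shared_latent(context):
    """Average task-level embeddings into one morphology-agnostic latent."""
    return np.tanh(context @ W_shared_enc).mean(axis=0)

def decode(latent, t, W_dec):
    """Map the shared latent and a query time to robot-specific joints."""
    return np.concatenate([latent, [t]]) @ W_dec

context = np.array([[0.0, 0.3],
                    [1.0, 0.3]])  # observed task points (time, task parameter)
z = shared_latent(context)

# Both robots generate their own trajectories from the same latent.
q_a = decode(z, 0.5, W_dec_robot_a)
q_b = decode(z, 0.5, W_dec_robot_b)
print(q_a.shape, q_b.shape)  # (6,) (7,)
```

Because the task is summarized once in the shared latent, a demonstration executed by one arm can condition trajectory generation for the other, which is the mechanism behind the transfer result described above.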
Implications and Future Directions
The ACNMP framework advances autonomous robotic learning, particularly in tasks requiring dynamic adaptation and skill transfer in complex environments. Its sample efficiency makes it attractive for real-world applications where trial-and-error learning is cost-prohibitive.
Future research may focus on further refining the balance between exploration and skill preservation, possibly integrating more advanced RL techniques or exploring alternative neural architectures to enhance robustness further. Additionally, expanding ACNMP's capabilities to handle tasks involving intricate interactions and environmental changes would be a logical progression.
In conclusion, ACNMP sets a promising direction for skill transfer and task adaptation in robotics, leveraging the strengths of both LfD and RL to create a versatile and efficient learning framework.