Skill-based Multi-objective Reinforcement Learning of Industrial Robot Tasks with Planning and Knowledge Integration (2203.10033v1)

Published 18 Mar 2022 in cs.RO and cs.LG

Abstract: In modern industrial settings with small batch sizes it should be easy to set up a robot system for a new task. Strategies exist, e.g. the use of skills, but when it comes to handling forces and torques, these systems often fall short. We introduce an approach that provides a combination of task-level planning with targeted learning of scenario-specific parameters for skill-based systems. We propose the following pipeline: (1) the user provides a task goal in the planning language PDDL, (2) a plan (i.e., a sequence of skills) is generated and the learnable parameters of the skills are automatically identified. An operator then chooses (3) reward functions and hyperparameters for the learning process. Two aspects of our methodology are critical: (a) learning is tightly integrated with a knowledge framework to support symbolic planning and to provide priors for learning, (b) using multi-objective optimization. This can help to balance key performance indicators (KPIs) such as safety and task performance since they can often affect each other. We adopt a multi-objective Bayesian optimization approach and learn entirely in simulation. We demonstrate the efficacy and versatility of our approach by learning skill parameters for two different contact-rich tasks. We show their successful execution on a real 7-DOF KUKA-iiwa manipulator and outperform the manual parameterization by human robot operators.


Overview of Skill-based Multi-objective Reinforcement Learning for Industrial Robot Tasks

This paper introduces an approach that integrates symbolic planning with reinforcement learning for industrial robot tasks. It targets modern industrial settings with small batch sizes, where a robot system must be easy to set up for each new task. Skill-based systems already help with this, but they often fall short when a task requires handling forces and torques. The paper proposes closing that gap by combining task-level planning with targeted learning of scenario-specific skill parameters.

Core Contributions

Hybrid Learning and Planning Pipeline: The pipeline begins with the user providing a task goal in the Planning Domain Definition Language (PDDL). The system then generates a plan, i.e., a sequence of skills, and automatically identifies the learnable parameters of those skills. Finally, an operator chooses the reward functions and hyperparameters for the learning process.
Multi-objective Optimization Framework: The methodology uses multi-objective optimization to balance key performance indicators (KPIs) such as safety and task performance, which can often affect each other. The authors adopt a multi-objective Bayesian optimization approach and learn entirely in simulation.
Knowledge Integration: Learning is tightly integrated with a knowledge framework. The framework supports symbolic planning, so that skill-based plans can be generated for the task goal, and it provides priors for learning the skill parameters.
Empirical Validation: Skill parameters are learned entirely in simulation for two contact-rich tasks, a pushing task and a peg-in-hole task. The learned parameters are then executed on a real 7-DOF KUKA iiwa manipulator, where they outperform manual parameterization by human robot operators.
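
The step in which learnable parameters are identified from a generated plan can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names (`Param`, `Skill`), the function `learnable_search_space`, the skill names, and all parameter values and bounds are hypothetical and chosen only for illustration. The sketch assumes a plan is an ordered list of skills and that a parameter counts as learnable when it carries search bounds.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Param:
    """A skill parameter; `bounds` is set only for learnable parameters (assumption)."""
    value: float
    bounds: Optional[Tuple[float, float]] = None  # None means the value is fixed

    @property
    def learnable(self) -> bool:
        return self.bounds is not None


@dataclass
class Skill:
    """A named skill with its parameters (hypothetical structure)."""
    name: str
    params: Dict[str, Param] = field(default_factory=dict)


def learnable_search_space(plan: List[Skill]) -> Dict[str, Tuple[float, float]]:
    """Collect the bounds of every learnable parameter in the plan.

    Keys have the form 'step:skill.param' so that the same skill used
    twice in a plan yields separate entries.
    """
    space: Dict[str, Tuple[float, float]] = {}
    for step, skill in enumerate(plan):
        for pname, param in skill.params.items():
            if param.learnable:
                space[f"{step}:{skill.name}.{pname}"] = param.bounds
    return space


# Hypothetical two-step plan: a fixed approach motion, then an insertion
# skill with two parameters left open for learning.
plan = [
    Skill("go_to", {"speed": Param(0.1)}),
    Skill("peg_insertion", {
        "downward_force": Param(5.0, bounds=(1.0, 20.0)),
        "search_radius": Param(0.01, bounds=(0.0, 0.05)),
    }),
]

space = learnable_search_space(plan)
for key, bounds in space.items():
    print(key, bounds)
```

The resulting dictionary is the kind of search space an optimizer would receive, alongside the reward functions and hyperparameters the operator chooses.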

Numerical Results and Claims

On the contact-rich tasks tested, the learned parameters outperformed parameters set manually by human robot operators. The multi-objective optimization achieved higher success rates and lower interaction forces, improving on the manual baseline in both task performance and safety.
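
When several objectives such as task performance and safety are optimized together, candidates are commonly compared by Pareto dominance rather than by a single score. The sketch below shows that general idea in Python; it is not taken from the paper. The function names are hypothetical and the candidate values are made up for illustration, not results from the paper. Each candidate is a pair of (task reward, negated peak force), so that larger is better in both objectives.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` in every objective
    and strictly better in at least one (all objectives maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points not dominated by any other point, in input order."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Illustrative candidates only: (task reward, -peak contact force in N).
candidates: List[Point] = [
    (0.90, -12.0),
    (0.80, -6.0),
    (0.70, -8.0),   # worse than (0.80, -6.0) in both objectives
    (0.95, -20.0),
    (0.60, -5.0),
]

front = pareto_front(candidates)
for reward, neg_force in front:
    print(f"reward={reward:.2f}, peak force={-neg_force:.1f} N")
```

An operator could then pick one point from the front according to how much task performance they are willing to trade for lower contact forces.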

Implications and Future Directions

This work contributes a framework for industrial robotics that combines deductive (planning-based) and inductive (learning-based) paradigms to improve task autonomy. In practice, this could make robot systems easier to adapt to new tasks and safer during contact-rich operations.

Future research could extend the approach to more complex, multi-agent environments and investigate multi-fidelity learning strategies to narrow the simulation-to-reality gap. Using parameter priors within the optimization algorithm could also make policy search more efficient and the approach more practical across varied industrial contexts.


Authors (5)

Matthias Mayr
Faseeh Ahmad
Konstantinos Chatzilygeroudis
Luigi Nardi
Volker Krueger
