
Logic-Skill Programming: An Optimization-based Approach to Sequential Skill Planning

Published 7 May 2024 in cs.RO (arXiv:2405.04082v3)

Abstract: Recent advances in robot skill learning have unlocked the potential to construct task-agnostic skill libraries, facilitating the seamless sequencing of multiple simple manipulation primitives (aka. skills) to tackle significantly more complex tasks. Nevertheless, determining the optimal sequence for independently learned skills remains an open problem, particularly when the objective is given solely in terms of the final geometric configuration rather than a symbolic goal. To address this challenge, we propose Logic-Skill Programming (LSP), an optimization-based approach that sequences independently learned skills to solve long-horizon tasks. We formulate a first-order extension of a mathematical program to optimize the overall cumulative reward of all skills within a plan, abstracted by the sum of value functions. To solve such programs, we leverage the use of tensor train factorization to construct the value function space, and rely on alternations between symbolic search and skill value optimization to find the appropriate skill skeleton and optimal subgoal sequence. Experimental results indicate that the obtained value functions provide a superior approximation of cumulative rewards compared to state-of-the-art reinforcement learning methods. Furthermore, we validate LSP in three manipulation domains, encompassing both prehensile and non-prehensile primitives. The results demonstrate its capability to identify the optimal solution over the full logic and geometric path. The real-robot experiments showcase the effectiveness of our approach to cope with contact uncertainty and external disturbances in the real world.


Summary

  • The paper introduces LSP, a novel approach that combines symbolic search with skill value optimization to sequence robot manipulation skills.
  • It employs Monte Carlo Tree Search and Tensor Train-based Approximate Dynamic Programming to efficiently optimize pathways to geometric goals.
  • Extensive experiments in non-prehensile, partly-prehensile, and prehensile domains demonstrate LSP’s superior cumulative rewards and robustness compared to baselines.

Logic-Skill Programming: An Optimization-based Approach to Sequential Skill Planning

This paper addresses the difficulty of sequencing independently learned robot skills to execute long, intricate manipulation tasks when the objective is given only as a final geometric configuration. The proposed method, Logic-Skill Programming (LSP), casts sequential skill planning as an optimization problem and does not rely on predefined symbolic goals.

Sequential Skill Planning Problem

In robotics, composing a sequence of skills (such as pushing, pulling, or pivoting) to accomplish a complex task remains a challenging problem. Traditional methods typically generate plans toward symbolic goals, which is ill-suited to tasks defined by reaching a specific geometric state. LSP removes the need for symbolic goal specification and determines optimal skill sequences by integrating symbolic planning with numerical optimization.

Figure 1: Overview of the proposed approach: given the evaluation function $\Psi$ of the final configuration, along with the initial symbolic state $s_0$ and geometric state $\overline{\bm{x}}_0$, the objective of LSP is to find a solution that accomplishes the task with minimal control cost.

Methodology: Logic-Skill Programming

LSP involves an alternating structure between symbolic logic-based searches and skill value function optimizations:

Symbolic Search

This phase uses the Planning Domain Definition Language (PDDL) to represent task domains and Monte Carlo Tree Search (MCTS) to traverse candidate skill sequences. MCTS expands symbolic transitions according to skill preconditions and effects, and balances exploration against exploitation via Upper Confidence Bound selection. Crucially, this allows diverse exploration of sequence lengths and configurations without an explicit symbolic target goal.
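To make the selection rule concrete, below is a minimal sketch of UCB1-based skill selection over a toy symbolic domain. The skill names, preconditions, and effects are invented for illustration and are not the paper's actual domain definitions:

```python
import math

# Hypothetical skill library: skill -> (symbolic precondition, symbolic effect).
SKILLS = {
    "push":  ("free",      "near_edge"),
    "pivot": ("near_edge", "upright"),
    "grasp": ("upright",   "held"),
}

def applicable(state):
    """Skills whose symbolic precondition matches the current state."""
    return [s for s, (pre, _) in SKILLS.items() if pre == state]

def ucb1(total_reward, visits, parent_visits, c=1.4):
    """UCB1 score: exploit mean reward, explore rarely tried children."""
    if visits == 0:
        return float("inf")  # force each child to be expanded at least once
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_skill(state, stats, parent_visits):
    """Pick the next skill to expand from `state` by the UCB1 rule.
    `stats` maps skill -> (total_reward, visits) from earlier rollouts."""
    return max(applicable(state),
               key=lambda s: ucb1(*stats.get(s, (0.0, 0)), parent_visits))
```

In a full MCTS loop, such selections alternate with rollouts and backpropagation of accumulated skill values; only the selection step is shown here.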

Skill Value Optimization

The optimization level checks whether candidate sequences satisfy path and switch constraints, reducing the problem to maximizing cumulative reward. To approximate value functions accurately over the entire state space, the paper uses Tensor Train-based Approximate Dynamic Programming. Leveraging Tensor Train (TT) decomposition, this stage sidesteps the complexity of mixed-integer programming and applies the Cross-Entropy Method (CEM) to obtain the optimal subgoal sequence for the task.
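As an illustration of the Cross-Entropy Method used at this stage, the sketch below maximizes a stand-in one-dimensional value function; the Gaussian sampler, population sizes, and the value function itself are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def value_fn(x):
    """Stand-in for a learned skill value function; its peak at x = 0.7
    plays the role of the most rewarding subgoal (purely illustrative)."""
    return -(x - 0.7) ** 2

def cem(value_fn, mu=0.0, sigma=1.0, pop=200, n_elite=20, iters=30, seed=0):
    """Cross-Entropy Method: sample candidates, keep the top scorers,
    refit the sampling distribution to them, and repeat."""
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        samples = rng.normal(mu, sigma, pop)
        elite = samples[np.argsort(value_fn(samples))[-n_elite:]]
        mu, sigma = elite.mean(), elite.std() + 1e-6
    return mu

subgoal = cem(value_fn)  # converges near the maximizer x = 0.7
```

The same sample-score-refit loop extends to vector-valued subgoals by fitting a multivariate Gaussian over the elite set.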

Experimental Validation

The robustness of LSP is validated through extensive tests in simulation domains encompassing:

  • Non-Prehensile Manipulation (NPM): Involves manipulation of non-graspable objects via contacts and planar primitives.
  • Partly-Prehensile Manipulation (PPM): Encompasses mixed strategy for objects graspable in limited orientations.
  • Prehensile Manipulation (PM): Objects requiring direct grasp strategies, tackling beyond-reach tasks.

The experiments show how LSP plans skill trajectories efficiently, optimizing paths for maximal cumulative reward while reaching the final configuration across diverse manipulation scenarios.

Figure 2: Non-Prehensile Manipulation domain.
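The skill value functions behind these plans are stored in tensor-train format, which is what keeps evaluating them over large state spaces tractable. A minimal sketch of evaluating a TT-format tensor at a grid index, with hand-built rank-1 cores as a toy example (not the paper's code):

```python
import numpy as np

def tt_eval(cores, idx):
    """Evaluate a tensor stored in tensor-train format at a multi-index.
    Each core has shape (r_{k-1}, n_k, r_k); the value is the product of the
    matrix slices selected by the index, so d small cores stand in for the
    O(n^d) entries of the full tensor."""
    out = cores[0][:, idx[0], :]
    for core, i in zip(cores[1:], idx[1:]):
        out = out @ core[:, i, :]
    return out.item()  # boundary ranks r_0 = r_d = 1 give a 1x1 result

# Toy rank-1 example: f(i, j) = a[i] * b[j].
a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0])
cores = [a.reshape(1, 3, 1), b.reshape(1, 2, 1)]
print(tt_eval(cores, (2, 1)))  # 3.0 * 5.0 = 15.0
```

Higher TT ranks capture interactions between dimensions; cross-approximation algorithms build such cores from function samples rather than from the full tensor.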

Comparison with Baselines

Comparative analyses against frameworks such as STAP show that, although STAP is faster, it depends on symbolic goal descriptions. LSP, by contrast, optimizes over the full logic-geometric path and yields solutions with higher cumulative reward, an advantage in scenarios marked by environmental uncertainty and complex dynamics.

Figure 3: Non-Prehensile Manipulation.

Real-World Applications

Real-world robot experiments confirm the reliability of LSP in handling contact-rich tasks under real-time constraints using a pre-trained skill library. The robot reactively completed task sequences despite external disturbances, demonstrating LSP's practical robustness.

Conclusion

Logic-Skill Programming (LSP) advances sequential skill planning by replacing dependence on symbolic goals with direct optimization of cumulative reward. Its alternation between symbolic search and skill value optimization provides both versatility and optimization depth for adaptive problem-solving in robotics. Future directions include scaling the tensor-train components and integrating the planner with symbolic learning for better real-world adaptation.
