EXPLORER: Exploration-guided Reasoning for Textual Reinforcement Learning (2403.10692v1)

Published 15 Mar 2024 in cs.CL, cs.AI, and cs.LO

Abstract: Text-based games (TBGs) have emerged as an important collection of NLP tasks, requiring reinforcement learning (RL) agents to combine natural language understanding with reasoning. A key challenge for agents attempting to solve such tasks is to generalize across multiple games and demonstrate good performance on both seen and unseen objects. Purely deep-RL-based approaches may perform well on seen objects; however, they fail to showcase the same performance on unseen objects. Commonsense-infused deep-RL agents may work better on unseen data; unfortunately, their policies are often not interpretable or easily transferable. To tackle these issues, in this paper, we present EXPLORER, an exploration-guided reasoning agent for textual reinforcement learning. EXPLORER is neurosymbolic in nature, as it relies on a neural module for exploration and a symbolic module for exploitation. It can also learn generalized symbolic policies and perform well over unseen data. Our experiments show that EXPLORER outperforms the baseline agents on Text-World Cooking (TW-Cooking) and Text-World Commonsense (TWC) games.

Authors (6)
  1. Kinjal Basu (49 papers)
  2. Keerthiram Murugesan (38 papers)
  3. Subhajit Chaudhury (40 papers)
  4. Murray Campbell (27 papers)
  5. Kartik Talamadupula (38 papers)
  6. Tim Klinger (23 papers)
Citations (2)

Summary

  • The paper proposes a novel neuro-symbolic architecture that combines deep learning with symbolic reasoning to navigate complex text-based games.
  • The paper leverages rule generalization using commonsense knowledge from WordNet to enhance performance on unseen objects and out-of-distribution tasks.
  • The paper demonstrates superior performance compared to state-of-the-art RL agents, offering improved interpretability and adaptive policy learning.

Neuro-Symbolic Approach for Textual Reinforcement Learning in EXPLORER

Introduction

Text-based games (TBGs) pose a distinctive challenge at the intersection of NLP and reinforcement learning (RL), requiring an agent to comprehend natural language inputs and act on that understanding. These games serve as an attractive testbed for RL agents because they demand both language understanding and reasoning. However, the performance of current agents on TBGs varies, especially when they encounter unseen objects or concepts, which limits their practical application. In the paper titled "EXPLORER: Exploration-guided Reasoning for Textual Reinforcement Learning", the authors present a novel neuro-symbolic agent, EXPLORER, that integrates neural networks with symbolic reasoning to achieve superior performance in TBGs while keeping the learned policies interpretable.

Methodology

Neuro-Symbolic Architecture

EXPLORER employs a hybrid approach that leverages both deep learning and symbolic reasoning. The neural component focuses on exploring the textual environment, collecting action-state-reward pairs, and identifying useful entities and actions. In contrast, the symbolic component, grounded in logic and commonsense reasoning, handles exploitation: it learns rules in an Answer Set Programming (ASP) framework from the observations gathered by the neural component.
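
To make the division of labor concrete, here is a minimal Python sketch of this kind of control flow. It is an illustration under our own assumptions, not the authors' implementation: the function names (`symbolic_action`, `neural_action`, `select_action`) and the epsilon-style fallback are hypothetical stand-ins for the paper's exploration/exploitation mechanism.

```python
import random

# Minimal sketch of EXPLORER-style control flow (names are illustrative,
# not the authors' code): a neural policy explores, and a symbolic policy
# built from learned rules is preferred whenever one of its rules fires.

def symbolic_action(state, rules):
    """Return an action if any learned rule matches the state, else None."""
    for condition, action in rules:
        if condition(state):
            return action
    return None

def neural_action(state, admissible):
    """Stand-in for the neural exploration module: sample an action."""
    return random.choice(admissible)

def select_action(state, admissible, rules, epsilon=0.1):
    # Exploit the symbolic policy when it covers the state; otherwise
    # (or with small probability epsilon) fall back to neural exploration.
    action = symbolic_action(state, rules)
    if action is not None and action in admissible and random.random() > epsilon:
        return action
    return neural_action(state, admissible)

# Toy usage: one hand-written rule standing in for a learned ASP rule.
rules = [(lambda s: "apple" in s["observation"], "put apple in fridge")]
state = {"observation": "you see an apple on the table"}
print(select_action(state, ["take apple", "put apple in fridge"], rules))
```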

Symbolic Policy Learning

EXPLORER learns symbolic policies iteratively, using Inductive Logic Programming (ILP) to derive logical rules from the action-reward pairs gathered during gameplay. These rules offer a clear interpretability advantage, expressing the rationale behind each decision in a human-readable format. The system also learns exceptions to these rules to support non-monotonic reasoning, further enriching its decision-making process.
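
The sketch below illustrates the flavor of a default rule with a learned exception. The ASP-style clause in the comments and the predicate names (`apple`, `rotten`, `ab`) are invented for illustration and do not come from the paper; the Python stand-in merely evaluates the default-unless-exception pattern that gives the policy its non-monotonic character.

```python
# Illustrative sketch of the kind of default-with-exception rule the
# symbolic module might learn. The ASP-style clause below is our own
# rendering, not taken from the paper:
#
#   insert(X, fridge) :- apple(X), not ab(X).  % default rule
#   ab(X) :- rotten(X).                        % learned exception
#
# A minimal Python stand-in that evaluates the same pattern:

def rule_fires(state, default_pred, exception_pred):
    """The default conclusion holds unless an exception is observed."""
    return default_pred(state) and not exception_pred(state)

state = {"entity": "apple", "attributes": ["rotten"]}
is_apple = lambda s: s["entity"] == "apple"
is_rotten = lambda s: "rotten" in s["attributes"]

# The exception defeats the default conclusion, which is what makes the
# learned policy non-monotonic: new evidence can retract an action.
print(rule_fires(state, is_apple, is_rotten))  # False: exception applies
```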

Rule Generalization and Generalized Rule Learner

A vital contribution of EXPLORER is its rule generalization capability, enabling it to handle unseen objects effectively by leveraging commonsense knowledge from WordNet. Through dynamic rule generalization, based on information gain and hypernym-hyponym relationships, EXPLORER extends its learning beyond specific instances to general concepts, significantly improving its performance on out-of-distribution (OOD) test sets.
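
As a rough illustration of hypernym-based lifting, the following sketch walks the WordNet hypernym chain for a concrete object using NLTK (assuming `nltk` is installed and the WordNet corpus has been downloaded via `nltk.download('wordnet')`). Selecting the right level of generality in EXPLORER involves an information-gain criterion; this snippet only enumerates the candidate concepts such a criterion would choose among.

```python
# Minimal sketch of hypernym lookup with WordNet via NLTK; the
# generalization step here is a simplification of the paper's
# information-gain-based rule lifting.
from nltk.corpus import wordnet as wn

def hypernym_chain(word):
    """Return hypernym lemma names along the first noun sense's chain."""
    synset = wn.synsets(word, pos=wn.NOUN)[0]
    chain = []
    while synset.hypernyms():
        synset = synset.hypernyms()[0]
        chain.append(synset.lemmas()[0].name())
    return chain

# A rule learned for a specific object ("apple") can be lifted to a more
# general concept along this chain (e.g., a fruit- or food-level synset),
# so it also covers unseen objects such as "pear" at test time.
print(hypernym_chain("apple"))
```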

Experimental Evaluation

EXPLORER was evaluated on two benchmark suites, TW-Cooking and Text-World Commonsense (TWC), chosen to test its performance across scenarios requiring a broad range of language understanding and reasoning capabilities. The experiments demonstrate EXPLORER's superiority in both seen and unseen environments, a result the authors attribute to the neuro-symbolic integration and the rule generalization mechanism, particularly its use of hypernyms for policy lifting.

The performance of EXPLORER was compared against state-of-the-art (SOTA) neural and rule-based RL agents as well as existing neuro-symbolic models. In all tested scenarios, EXPLORER outperformed the baselines, showcasing the efficacy of its exploration-guided reasoning approach. This is especially notable in OOD scenarios, where EXPLORER's ability to generalize proved most beneficial.

Discussion and Future Work

The results from EXPLORER open promising avenues for future research on neuro-symbolic integration within RL and NLP. The neuro-symbolic architecture not only improves performance across both familiar and novel environments but also offers a robust framework for developing interpretable and adaptable RL agents. Further research could explore optimizing symbolic rule learning, improving the efficiency of the ASP solver, and extending the rule generalization algorithm to incorporate broader commonsense knowledge bases.

Additionally, addressing the computational overhead introduced by the symbolic reasoning component and exploring dynamic strategies for balancing the neural and symbolic components could further refine EXPLORER's capabilities.

Conclusion

EXPLORER represents a significant step forward in the development of intelligent, adaptable, and interpretable RL agents for text-based games. By integrating neural exploration with symbolic exploitation and employing commonsense knowledge for rule generalization, EXPLORER sets a new standard for performance and versatility in TBGs. Its achievements underscore the potential of neuro-symbolic approaches in advancing artificial intelligence research, highlighting the importance of interpretability and generalizability in complex decision-making tasks.
