Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft

Published 7 Dec 2021 in cs.LG, cs.AI, and cs.HC | (2112.03482v2)

Abstract: Real-world tasks of interest are generally poorly defined by human-readable descriptions and have no pre-defined reward signals unless they are defined by a human designer. Conversely, data-driven algorithms are often designed to solve a specific, narrowly defined task with performance metrics that drive the agent's learning. In this work, we present the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge: Learning from Human Feedback in Minecraft, which challenged participants to use human data to solve four tasks defined only by a natural language description and no reward function. Our approach uses the available human demonstration data to train an imitation learning policy for navigation and additional human feedback to train an image classifier. These modules, combined with an estimated odometry map, become a powerful state-machine designed to utilize human knowledge in a natural hierarchical paradigm. We compare this hybrid intelligence approach to both end-to-end machine learning and pure engineered solutions, which are then judged by human evaluators. Codebase is available at https://github.com/viniciusguigo/kairos_minerl_basalt.

Summary

The paper demonstrates a hybrid approach that integrates a state classifier and imitation learning module with knowledge engineering for effective hierarchical task solving.
The machine learning modules leverage human feedback and demonstrations to enhance state recognition and navigation in a complex Minecraft environment.
The knowledge engineering components—including a state-machine and odometry estimation—organize task execution and improve spatial decision-making.

Analysis of Hybrid Intelligence in the MineRL BASALT Competition

The paper addresses the MineRL BASALT competition tasks by combining machine learning (ML) with knowledge engineering, a paradigm known as hybrid intelligence. The authors detail a system architecture built from two ML modules and three knowledge engineering modules.

The ML component consists of two modules. The first, a state classifier, is trained on additional human feedback to recognize relevant states in the environment, which gives the system contextual awareness of the game world. The second uses imitation learning to learn navigation subtasks from the human demonstration dataset provided by the competition. This split lets the system handle the navigation demands of each task separately from state recognition.
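As a rough illustration of the imitation learning idea, the sketch below clones demonstrated behavior in its simplest form: it returns the action a human took in the most similar recorded observation. This is a nearest-neighbor stand-in chosen for brevity, not the paper's navigation policy, which is trained on the competition's demonstration data. The feature layout, the action names, and the `clone_policy` helper are all hypothetical.

```python
import math

# Hypothetical toy demonstrations: (observation features, action the human took).
# The features are invented for illustration:
# (distance_to_obstacle, target_bearing_in_degrees).
DEMONSTRATIONS = [
    ((5.0, 0.0), "forward"),
    ((5.0, 40.0), "turn_right"),
    ((5.0, -40.0), "turn_left"),
    ((0.5, 0.0), "jump"),
]


def clone_policy(demonstrations):
    """Return a policy that copies the action of the closest demonstrated observation."""
    def policy(observation):
        # Euclidean distance in feature space picks the most similar demonstration.
        nearest = min(demonstrations, key=lambda demo: math.dist(demo[0], observation))
        return nearest[1]
    return policy


policy = clone_policy(DEMONSTRATIONS)
print(policy((4.0, 35.0)))  # closest demonstration is (5.0, 40.0)
```

A learned model would generalize across observations instead of looking them up, but the supervised framing is the same: observations in, demonstrated actions out.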

Three knowledge engineering modules complement the ML modules and structure task execution. First, a state-machine combines the relevant states recognized by the state classifier with task knowledge. It arranges the subtasks into a hierarchy and decides which subtask to execute at each time step. Second, for subtasks that could not be learned directly from data, the researchers built engineered submodules that encode hand-crafted rules and logic. Third, an odometry estimation module supplies spatial information about the agent's position and the locations of relevant states, which the engineered subtasks depend on.
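The interplay between the odometry estimate and the state-machine can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: odometry is approximated by dead reckoning over the agent's own movement commands, and the state labels, subtask names, priority order, and step threshold are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Odometry:
    """Dead-reckoning position estimate integrated from the agent's own actions."""
    x: float = 0.0
    z: float = 0.0
    yaw_deg: float = 0.0
    landmarks: dict = field(default_factory=dict)

    def update(self, forward: float, turn_deg: float) -> None:
        # Apply the turn first, then move along the new heading.
        self.yaw_deg = (self.yaw_deg + turn_deg) % 360.0
        yaw = math.radians(self.yaw_deg)
        self.x += forward * math.sin(yaw)
        self.z += forward * math.cos(yaw)

    def mark(self, label: str) -> None:
        # Remember where a relevant state was first detected.
        self.landmarks.setdefault(label, (self.x, self.z))


def select_subtask(detected: set, odom: Odometry, steps_left: int) -> str:
    """Pick the subtask for this time step (the priority order is hypothetical)."""
    if steps_left < 100:
        return "end_episode"
    if "target_location" in detected or "target_location" in odom.landmarks:
        return "engineered_build"
    if "danger_ahead" in detected:
        return "engineered_avoid"
    return "learned_navigation"


odom = Odometry()
odom.update(forward=1.0, turn_deg=90.0)
first = select_subtask(set(), odom, steps_left=1000)

# The classifier reports a relevant state, so its location is stored on the map.
odom.mark("target_location")
second = select_subtask(set(), odom, steps_left=1000)
print(first, second)
```

Keeping the selection logic in one function makes the hierarchy explicit: the learned navigation policy is the default, and engineered subtasks take over only when the classifier or the stored map indicates they apply.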

The design combines the adaptability of ML with the precision of knowledge engineering. This hybrid approach can solve tasks that would be difficult for a purely ML-based system, particularly given the absence of a reward function and the complex state spaces of the competition environment.

The implications are both practical and theoretical. Practically, the method shows that hybrid intelligence systems can be effective in dynamic, multi-faceted environments: the solution won first place and was awarded most human-like agent in the competition. Theoretically, it highlights the synergy between the flexibility of machine learning and the structured decision-making of knowledge engineering.

Looking ahead, more capable state classifiers and higher-fidelity knowledge engineering modules could yield further improvements. Applying the same hybrid architecture to other domains with complex decision-making requirements is another open direction. The paper offers a reference point for future work on task-oriented AI systems that integrate multiple computational paradigms.
