
Towards Interactive Reinforcement Learning with Intrinsic Feedback (2112.01575v3)

Published 2 Dec 2021 in cs.AI, cs.HC, and cs.LG

Abstract: Reinforcement learning (RL) and brain-computer interfaces (BCI) have experienced significant growth over the past decade. With rising interest in human-in-the-loop (HITL), incorporating human input with RL algorithms has given rise to the sub-field of interactive RL. Adjacently, the field of BCI has long been interested in extracting informative brain signals from neural activity for use in human-computer interactions. A key link between these fields lies in the interpretation of neural activity as feedback such that interactive RL approaches can be employed. We denote this new and emerging medium of feedback as intrinsic feedback. Despite intrinsic feedback's ability to be conveyed automatically and even unconsciously, proper exploration surrounding this key link has largely gone unaddressed by both communities. Thus, to help facilitate a deeper understanding and a more effective utilization, we provide a tutorial-style review covering the motivations, approaches, and open problems of intrinsic feedback and its foundational concepts.


Summary

  • The paper's main contribution is integrating intrinsic neural feedback into interactive RL so that action errors can be assessed automatically via error-related potentials.
  • It proposes a methodology that fuses biological signal processing with reinforcement learning to accelerate human-guided training.
  • The study highlights practical challenges in interpreting non-stationary neural signals and opens avenues for advanced brain-computer interfaces.

Introduction

Reinforcement Learning (RL) and Brain-Computer Interfaces (BCI) are both at the forefront of technological development and have individually seen tremendous growth. When combined, these two fields hold the promise of creating advanced human-in-the-loop systems in which humans can directly interact with and guide learning algorithms. A sub-domain of RL, interactive Reinforcement Learning (interactive RL), allows an agent to adapt its behavior not only to the rewards and penalties of its environment but also to suggestions, guidance, or feedback from a human user.

The Emergence of Intrinsic Feedback

A compelling development in interactive RL is the concept of "intrinsic feedback": feedback generated automatically from biological signals. The term is broad and includes, for instance, neural activity measured with electroencephalography (EEG). Intrinsic feedback is typically captured without the user performing any explicit action or giving direct verbal or manual feedback, and may even arise unconsciously. The paper under discussion argues that such intrinsic feedback links BCI and interactive RL, opening a new frontier for human-computer interaction in which humans may provide feedback without deliberate effort.

Reinforcement Learning and Human Involvement

At its core, RL is concerned with how an agent can learn a set of actions in an environment to maximize a numerical reward. Traditionally, the agent explores its environment through trial and error. Human input can significantly augment this process by providing an information-rich signal to guide the agent's learning. This is the essence of interactive RL, where human inputs such as demonstrations or advice can quickly align the agent's actions with human expectations, improving both learning speed and alignment with desired outcomes. The paper explores a taxonomy of human inputs, discussing both demonstrations (learning by observing human actions) and advice (suggestions or assessments provided by the human).
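
To make the interactive RL loop concrete, here is a minimal sketch, assuming a toy chain environment and a simulated human evaluative signal; the environment, feedback rule, blending weight, and all other parameters are illustrative assumptions, not details from the paper:

```python
import numpy as np

# Minimal sketch of interactive RL: tabular Q-learning on a toy chain MDP,
# where a (simulated) human evaluative signal is blended with the environment
# reward. All names and parameters are illustrative, not from the paper.

N_STATES, N_ACTIONS = 5, 2           # chain MDP: action 0 = left, action 1 = right
ALPHA, GAMMA, BETA = 0.1, 0.95, 0.5  # BETA weights human feedback vs. env reward

Q = np.zeros((N_STATES, N_ACTIONS))
rng = np.random.default_rng(0)

def env_step(s, a):
    """Deterministic chain: reward 1.0 only when the rightmost state is reached."""
    s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, reward, s_next == N_STATES - 1

def human_feedback(s, a):
    """Stand-in for explicit human advice: approve moving right, disapprove left."""
    return 1.0 if a == 1 else -1.0

for episode in range(200):
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection.
        a = rng.integers(N_ACTIONS) if rng.random() < 0.1 else int(Q[s].argmax())
        s_next, r_env, done = env_step(s, a)
        # Blend the environment reward with the evaluative human signal.
        r = r_env + BETA * human_feedback(s, a)
        Q[s, a] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.round(2))  # the learned values favor moving right toward the goal
```

In this sketch the human signal is simply added to the reward; reward shaping is only one of several ways interactive RL methods incorporate human input, alongside policy shaping and learning from demonstrations.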

Intrinsic Feedback in Interactive RL

Integrating intrinsic feedback into interactive RL brings its own set of challenges. The paper reviews foundational concepts for learning from feedback and methods for integrating human feedback into RL algorithms. It highlights traditional mediums of explicit and implicit feedback before turning its focus to intrinsic feedback. Intrinsic feedback, particularly error-related potentials (ErrPs), offers a unique angle: such signals intrinsically convey an evaluation of an action's "goodness" or "badness" through brain activity alone. This makes ErrPs a potentially powerful tool for signaling when an RL agent makes a mistake.
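
As a hedged illustration of how an ErrP-style signal might enter the learning loop, the sketch below converts a detector's error probability into an evaluative reward. The detector here is a placeholder for a trained EEG classifier, and the confidence-weighted reward mapping is an assumption for illustration, not the paper's prescribed scheme:

```python
import numpy as np

# Illustrative mapping from a (simulated) ErrP detector to an RL reward.
# In practice the detector would be a classifier trained on EEG epochs recorded
# right after each agent action; here it is a simple placeholder.

rng = np.random.default_rng(1)

def errp_probability(eeg_epoch):
    """Placeholder ErrP classifier: probability that the epoch contains an
    error-related potential (a real system would use a trained model)."""
    return float(np.clip(eeg_epoch.mean() + 0.5, 0.0, 1.0))

def intrinsic_reward(eeg_epoch, threshold=0.6):
    """Confident error detection -> negative reward scaled by confidence;
    otherwise a small positive reward for presumably correct behavior."""
    p_err = errp_probability(eeg_epoch)
    return -p_err if p_err >= threshold else 0.1

# Example: one single-channel epoch (64 samples) recorded after an action.
epoch = rng.normal(loc=0.2, scale=0.1, size=64)
print(intrinsic_reward(epoch))  # negative here, i.e. the action is penalized
```

A reward produced this way could replace or augment the human_feedback term in the earlier Q-learning sketch, which is precisely the link between BCI and interactive RL that the paper highlights.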

Application Potential and Challenges Ahead

The paper discusses anticipated benefits, such as utilizing rich information from the brain and the automatic nature of intrinsic feedback. However, it also points out the obstacles that remain, such as identifying and interpreting the complex neural signatures underpinning this feedback. Furthermore, the challenges of implementing intrinsic feedback, ranging from the technical (such as making use of non-stationary neural signals) to the conceptual (like ensuring agent alignment with human intentions), are flagged as areas requiring deeper exploration.
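
Purely as a sketch of one way such non-stationarity might be handled (an assumption for illustration, not a method proposed by the paper), an ErrP detector can be recalibrated periodically from a sliding window of recent labeled epochs so that drifting signal statistics are tracked:

```python
import numpy as np
from collections import deque

# Illustrative handling of non-stationary neural signals: periodically
# recalibrate a toy ErrP detector (nearest-class-mean on a single feature)
# from a sliding window of the most recent labeled epochs.

WINDOW = 200                       # number of recent labeled epochs retained
recent_feats = deque(maxlen=WINDOW)
recent_labels = deque(maxlen=WINDOW)

def extract_feature(epoch):
    """Toy feature: mean amplitude of the epoch."""
    return float(np.mean(epoch))

def recalibrate():
    """Re-estimate the error and correct class means from the sliding window."""
    feats, labels = np.array(recent_feats), np.array(recent_labels)
    mu_err = feats[labels == 1].mean() if (labels == 1).any() else 1.0
    mu_ok = feats[labels == 0].mean() if (labels == 0).any() else 0.0
    return mu_err, mu_ok

def classify(epoch, mu_err, mu_ok):
    """Label the epoch 'error' (1) if its feature is closer to the error mean."""
    f = extract_feature(epoch)
    return int(abs(f - mu_err) < abs(f - mu_ok))

# Simulated epoch stream whose statistics drift slowly over time.
rng = np.random.default_rng(2)
mu_err, mu_ok = 1.0, 0.0
correct = 0
for t in range(1000):
    drift = 0.002 * t
    true_label = int(rng.random() < 0.3)
    epoch = rng.normal(loc=(1.0 if true_label else 0.0) + drift, scale=0.3, size=64)
    correct += int(classify(epoch, mu_err, mu_ok) == true_label)
    recent_feats.append(extract_feature(epoch))   # labels would come from calibration trials
    recent_labels.append(true_label)
    if t > 0 and t % 50 == 0:
        mu_err, mu_ok = recalibrate()             # track the drift with recent data

print(f"online accuracy under drift: {correct / 1000:.2f}")
```

Without the periodic recalibration step, the fixed initial class means would degrade as the drift accumulates; tracking the window keeps even this simple detector usable.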

Conclusion

The integration of intrinsic feedback into RL may mark a pivotal advancement in the field, creating systems that improve through real-time human brain signal analysis. This synergy could enhance the naturalness and efficiency of human-agent interactions and pave the way for innovations in how we command and communicate with autonomous agents. However, the field is nascent, and much work lies ahead to realize its full potential and address the many challenges that stand in the way.