
Social Interpretable Reinforcement Learning (2401.15480v2)

Published 27 Jan 2024 in cs.LG, cs.AI, and cs.MA

Abstract: Reinforcement Learning (RL) bears the promise of being a game-changer in many applications. However, since most of the literature in the field is currently focused on opaque models, the use of RL in high-stakes scenarios, where interpretability is crucial, is still limited. Recently, some approaches to interpretable RL, e.g., based on Decision Trees, have been proposed, but one of the main limitations of these techniques is their training cost. To overcome this limitation, we propose a new method, called Social Interpretable RL (SIRL), that can substantially reduce the number of episodes needed for training. Our method mimics a social learning process, where each agent in a group learns to solve a given task based both on its own individual experience and on the experience acquired together with its peers. Our approach is divided into two phases. (1) In the collaborative phase, all the agents in the population interact with a shared instance of the environment, where each agent observes the state and independently proposes an action. Then, voting is performed to choose the action that will actually be deployed in the environment. (2) In the individual phase, each agent refines its individual performance by interacting with its own instance of the environment. This mechanism makes the agents experience a larger number of episodes with little impact on the computational cost of the process. Our results (on 6 widely-known RL benchmarks) show that SIRL not only reduces the computational cost by a factor varying from a minimum of 43% to a maximum of 76%, but also increases the convergence speed and, often, improves the quality of the solutions.
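
The following is a minimal sketch of the two-phase training loop described in the abstract, not the authors' implementation. It assumes a simplified Gym-style environment interface (reset() returns a state; step() returns a (state, reward, done) triple) and a hypothetical Agent type exposing act() and update(); majority voting is used as one plausible instantiation of the voting step.

```python
# Illustrative sketch of SIRL's two training phases.
# Agent, make_env, and the environment interface are assumptions, not the paper's API.
from collections import Counter

def majority_vote(actions):
    """Pick the action proposed most often by the population (ties broken arbitrarily)."""
    return Counter(actions).most_common(1)[0][0]

def collaborative_phase(agents, shared_env, n_episodes):
    """All agents interact with one shared environment; the voted action is deployed,
    and every agent learns from the shared transition."""
    for _ in range(n_episodes):
        state, done = shared_env.reset(), False
        while not done:
            proposals = [agent.act(state) for agent in agents]
            action = majority_vote(proposals)
            next_state, reward, done = shared_env.step(action)
            for agent in agents:
                agent.update(state, action, reward, next_state, done)
            state = next_state

def individual_phase(agents, make_env, n_episodes):
    """Each agent refines its policy on its own environment instance."""
    for agent in agents:
        env = make_env()
        for _ in range(n_episodes):
            state, done = env.reset(), False
            while not done:
                action = agent.act(state)
                next_state, reward, done = env.step(action)
                agent.update(state, action, reward, next_state, done)
                state = next_state
```

Under these assumptions, the cost saving comes from the collaborative phase: one shared rollout yields a learning signal for every agent in the population, so each agent effectively experiences more episodes than the number of environment interactions actually executed.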

Citations (2)
