
MAVIPER: Learning Decision Tree Policies for Interpretable Multi-Agent Reinforcement Learning (2205.12449v2)

Published 25 May 2022 in cs.LG and cs.MA

Abstract: Many recent breakthroughs in multi-agent reinforcement learning (MARL) require the use of deep neural networks, which are challenging for human experts to interpret and understand. On the other hand, existing work on interpretable reinforcement learning (RL) has shown promise in extracting more interpretable decision tree-based policies from neural networks, but only in the single-agent setting. To fill this gap, we propose the first set of algorithms that extract interpretable decision-tree policies from neural networks trained with MARL. The first algorithm, IVIPER, extends VIPER, a recent method for single-agent interpretable RL, to the multi-agent setting. We demonstrate that IVIPER learns high-quality decision-tree policies for each agent. To better capture coordination between agents, we propose a novel centralized decision-tree training algorithm, MAVIPER. MAVIPER jointly grows the trees of each agent by predicting the behavior of the other agents using their anticipated trees, and uses resampling to focus on states that are critical for its interactions with other agents. We show that both algorithms generally outperform the baselines and that MAVIPER-trained agents achieve better-coordinated performance than IVIPER-trained agents on three different multi-agent particle-world environments.
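The abstract describes a VIPER-style extraction loop: an interpretable decision-tree policy is repeatedly refit on a growing dataset of states relabelled by the neural-network expert (DAgger-style data aggregation). The sketch below illustrates that loop in the single-agent case only, with the neural expert stubbed as a fixed rule and the "tree" reduced to a depth-1 stump; all function names and the expert rule are illustrative assumptions, not the paper's implementation.

```python
import random

random.seed(0)

def expert_policy(state):
    # Stand-in for the trained neural-network expert (an assumption
    # for illustration): a simple axis-aligned decision rule.
    return 1 if state[0] > 0.5 else 0

def rollout(n=200):
    # Gather states and relabel them with the expert's actions,
    # as in one round of DAgger-style data aggregation.
    states = [(random.random(), random.random()) for _ in range(n)]
    return [(s, expert_policy(s)) for s in states]

def fit_stump(dataset):
    # Depth-1 "decision tree": pick the threshold on feature 0 that
    # best reproduces the expert's labels on the aggregated dataset.
    best_t, best_acc = 0.0, -1.0
    for i in range(21):
        t = i / 20
        acc = sum((1 if s[0] > t else 0) == a for s, a in dataset) / len(dataset)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Extraction loop: refit the interpretable policy, then aggregate
# fresh expert-labelled data, for a few rounds.
dataset = rollout()
for _ in range(3):
    threshold = fit_stump(dataset)
    dataset += rollout()

accuracy = sum((1 if s[0] > threshold else 0) == a
               for s, a in dataset) / len(dataset)
```

IVIPER runs such a loop independently per agent, while MAVIPER additionally grows each agent's tree jointly with predictions of the other agents' anticipated trees; neither of those multi-agent mechanisms is shown here.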

Authors (7)
  1. Stephanie Milani (23 papers)
  2. Zhicheng Zhang (76 papers)
  3. Nicholay Topin (17 papers)
  4. Zheyuan Ryan Shi (12 papers)
  5. Charles Kamhoua (24 papers)
  6. Evangelos E. Papalexakis (49 papers)
  7. Fei Fang (103 papers)
Citations (12)
