Alternative compressed environments yielding improved regret bounds for Compressed-MAIDS
Construct compressed environments for Information-Directed Sampling in two-player zero-sum Markov games that satisfy either the expectation-based distortion constraint E[d_{Φ_A, Φ_B}(E, \tilde{E})] ≤ ε or the almost-sure distortion constraint P(d_{Φ_A, Φ_B}(E, \tilde{E}) > ε) = 0. Then show that these constructions yield Bayesian regret bounds strictly better than those of the current hard-compression, partition-based approach.
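To make the difference between the two constraints concrete, here is a minimal sketch for a finite distribution over (E, \tilde{E}) pairs. It assumes the distortion values have already been computed. It does not implement d_{Φ_A, Φ_B} from the paper. The function names and the numbers are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def satisfies_expected_distortion(
    probs: Sequence[float], distortions: Sequence[float], eps: float
) -> bool:
    """Check E[d] <= eps for a finite distribution (probs[i] is the weight of distortions[i])."""
    expected = sum(p * d for p, d in zip(probs, distortions))
    return expected <= eps


def satisfies_almost_sure_distortion(
    probs: Sequence[float], distortions: Sequence[float], eps: float
) -> bool:
    """Check P(d > eps) = 0: every outcome with positive probability has d <= eps."""
    return all(d <= eps for p, d in zip(probs, distortions) if p > 0)


# Hypothetical placeholder values, not taken from the paper.
probs = [0.5, 0.25, 0.25]
distortions = [0.0, 0.25, 0.75]
eps = 0.5

expected_ok = satisfies_expected_distortion(probs, distortions, eps)
almost_sure_ok = satisfies_almost_sure_distortion(probs, distortions, eps)
```

In this toy case the expectation-based check passes while the almost-sure check fails, because one outcome with positive probability exceeds ε. The reverse cannot happen: if the distortion is at most ε with probability one, its expectation is also at most ε, so the almost-sure constraint is the stricter of the two.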
References
While MARL is more complicated, we still conjecture that it is possible to construct alternative compressed environments satisfying [either of the two distortion constraints stated above] that can lead to better regret bounds. This opens up an intriguing avenue for future research.
— Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning
(2404.19292 - Zhang et al., 30 Apr 2024) in Section 6: Learning Compressed Environments — Advantages of Compressed-MAIDS paragraph