Modify PBVI or other sampling-based methods to operate with particle-based interactive beliefs in I-POMDPs
Determine how to modify the point-based value iteration (PBVI) algorithm or other sampling-based POMDP solvers so they can operate on a particle-based representation of interactive beliefs within interactive partially observable Markov decision processes (I-POMDPs), enabling tractable planning with interactive belief particles.
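To make "particle-based representation of a belief" concrete, the following is a minimal sketch of a bootstrap particle-filter belief update for an ordinary single-agent POMDP. It is not a solution to the problem stated above and it does not represent an interactive belief: in an I-POMDP each particle would also have to carry a model of the other agent (including that agent's own belief), and this nesting is what the sketch leaves out. The two-state toy model, the 0.85 observation accuracy, and every function name here are assumptions chosen for illustration, not taken from the cited paper.

```python
import random
from collections import Counter

# Hypothetical two-state toy model (not from the paper): the hidden state is
# "left" or "right", and a "listen" action returns a noisy reading of it.
STATES = ("left", "right")
OBS_ACCURACY = 0.85  # assumed probability that the reading matches the state


def transition(state, action, rng):
    """Sample the next state; in this toy model listening never changes it."""
    return state


def obs_likelihood(obs, state, action):
    """Return P(obs | state, action) for the toy model."""
    return OBS_ACCURACY if obs == state else 1.0 - OBS_ACCURACY


def update_belief(particles, action, obs, rng):
    """One bootstrap particle-filter step: propagate, weight, resample."""
    propagated = [transition(s, action, rng) for s in particles]
    weights = [obs_likelihood(obs, s, action) for s in propagated]
    if sum(weights) == 0.0:
        raise ValueError("observation has zero likelihood under all particles")
    # Resample with replacement in proportion to the observation likelihoods.
    return rng.choices(propagated, weights=weights, k=len(particles))


def estimate(particles):
    """Approximate the belief as the empirical frequency of each state."""
    counts = Counter(particles)
    return {s: counts[s] / len(particles) for s in STATES}


rng = random.Random(0)
belief = [rng.choice(STATES) for _ in range(2000)]  # roughly uniform prior
for _ in range(2):
    belief = update_belief(belief, "listen", "left", rng)
print(estimate(belief))
```

The belief here is an unweighted bag of sampled states rather than a probability vector. Standard PBVI backs up alpha-vectors at belief points given as such vectors, so the sketch also hints at why a particle representation does not plug in directly.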
References
Unfortunately, I-PF fails to address the curse of history and it is not clear how PBVI or other sampling-based algorithm can be modified to work with a particle representation of interactive beliefs, whereas I-PBVI suffers from the curse of dimensionality because its dimension of interactive belief grows exponentially with the length of planning horizon of the other agent (Section~\ref{sect:iap}).
— Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents
(1304.5159 - Hoang et al., 2013) in Section 1 Introduction