Pareto–Nash equilibria in individual-reward settings with unknown utilities
Establish general algorithms and theoretical guarantees for identifying the Pareto–Nash set of joint policies in multi-objective multi-agent decision-making models with individual reward functions and unknown utility functions, including (but not limited to) multi-objective normal-form games, multi-objective stochastic games, and multi-objective partially observable stochastic games. Specifically, characterise the conditions under which Pareto–Nash equilibria exist, and provide methods to compute undominated joint policies when agents’ scalarisation functions are unknown and no symmetry or other structural assumptions are imposed.
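To make the target object concrete, the following is a minimal sketch of the simplest case only: a brute-force enumeration of pure-strategy Pareto–Nash equilibria in a small multi-objective normal-form game with individual vector-valued rewards. It assumes the commonly used definition in which a joint action is stable if no agent has a unilateral deviation whose payoff vector Pareto-dominates its current one, which is the natural comparison when utility functions are only known to be monotonically increasing. The function names `dominates` and `pure_pareto_nash` and the payoff values are hypothetical and chosen for illustration. The sketch does not address mixed strategies, stochastic games, or partial observability, which is where the open problem lies.

```python
from itertools import product


def dominates(u, v):
    """Return True if vector u Pareto-dominates v:
    at least as good in every objective and strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pure_pareto_nash(payoffs, n_actions):
    """Enumerate pure-strategy Pareto-Nash equilibria by brute force.

    payoffs[i][joint] is agent i's payoff vector for the joint action `joint`
    (a tuple with one action index per agent). n_actions[i] is the number of
    actions available to agent i.
    """
    equilibria = []
    for joint in product(*(range(n) for n in n_actions)):
        stable = True
        for i, n in enumerate(n_actions):
            current = payoffs[i][joint]
            for a in range(n):
                if a == joint[i]:
                    continue
                # Unilateral deviation: only agent i changes its action.
                deviation = joint[:i] + (a,) + joint[i + 1:]
                if dominates(payoffs[i][deviation], current):
                    stable = False
                    break
            if not stable:
                break
        if stable:
            equilibria.append(joint)
    return equilibria


# Hypothetical 2-agent, 2-action, 2-objective game with individual rewards.
payoffs = [
    {(0, 0): (2, 1), (0, 1): (0, 0), (1, 0): (1, 2), (1, 1): (1, 1)},  # agent 0
    {(0, 0): (1, 1), (0, 1): (2, 2), (1, 0): (0, 3), (1, 1): (3, 0)},  # agent 1
]

equilibria = pure_pareto_nash(payoffs, (2, 2))
print(equilibria)  # [(1, 0), (1, 1)]
```

In this toy game, the joint actions (1, 0) and (1, 1) survive because every unilateral deviation trades one objective against the other, so no agent can be certain of an improvement without knowing its utility function. The enumeration is exponential in the number of agents and is meant only to illustrate the definition, not to suggest a scalable method.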
References
We note that there is little work so far on the individual reward setting with unknown utility functions, so this more general setting remains an important open challenge in MOMARL.