Preference elicitation and alignment among strategic agents in MOMARL
Develop mechanisms and learning frameworks for multi-objective multi-agent reinforcement learning (MOMARL) in which agents concurrently learn their users' utility functions and optimal policies despite misaligned incentives and strategic behaviour, including hiding or misrepresenting preferences. Specifically, design negotiation, communication, or social-contract protocols, and accompanying algorithms, that elicit and align preferences across agents and stakeholders in individual-utility settings, ensuring robust performance even when agents stand to gain by not sharing their preferences openly.
Overcoming the difficulties posed by the misalignment of preferences, and by the fact that it may no longer be in an agent's best interest to share its preferences openly (on the contrary, it may even pay to actively hide this information), remains very much an open challenge.
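The incentive problem can be made concrete with a toy one-shot example. The sketch below is illustrative only, not from the source: the payoff matrix, the true weight vectors, and the utilitarian-welfare centre (PAYOFFS, TRUE_WEIGHTS, chosen_action) are all assumptions. Two agents hold private linear-scalarisation weights over two objectives and report weights to a centre that selects the joint action maximising reported welfare; a brute-force search then shows that an agent can raise its true utility by unilaterally misreporting.

```python
import numpy as np

# Hypothetical joint actions; PAYOFFS[action][agent] is that agent's
# vector of returns, one entry per objective.
PAYOFFS = np.array([
    [[8.0, 0.0], [0.0, 2.0]],   # action 0: strongly favours agent 0 on objective 0
    [[0.0, 2.0], [8.0, 0.0]],   # action 1: the mirror image, favouring agent 1
    [[3.0, 3.0], [3.0, 3.0]],   # action 2: a balanced compromise
])

# Privately held linear-scalarisation weights (illustrative assumption).
TRUE_WEIGHTS = np.array([
    [0.5, 0.5],
    [0.5, 0.5],
])

def utility(agent, action, weights):
    """Linear scalarisation: dot product of weights and the payoff vector."""
    return float(weights @ PAYOFFS[action, agent])

def chosen_action(reports):
    """The centre picks the action maximising *reported* utilitarian welfare."""
    welfare = [utility(0, a, reports[0]) + utility(1, a, reports[1])
               for a in range(len(PAYOFFS))]
    return int(np.argmax(welfare))

# Baseline: both agents report truthfully; the compromise action wins.
a_true = chosen_action(TRUE_WEIGHTS)
u_true = utility(0, a_true, TRUE_WEIGHTS[0])

# Agent 0 searches a grid of unilateral misreports, each scored under its
# TRUE weights, while agent 1 keeps reporting truthfully.
best_u, best_w = u_true, TRUE_WEIGHTS[0].copy()
for w in np.linspace(0.0, 1.0, 11):
    reports = TRUE_WEIGHTS.copy()
    reports[0] = [w, 1.0 - w]
    u = utility(0, chosen_action(reports), TRUE_WEIGHTS[0])
    if u > best_u:
        best_u, best_w = u, reports[0].copy()

print(f"truthful: action {a_true}, agent-0 utility {u_true:.2f}")
print(f"best misreport {best_w}: agent-0 utility {best_u:.2f}")
```

Under truthful reports the centre selects the compromise action (agent-0 utility 3.0), while an exaggerated report such as [0.7, 0.3] steers it to the action favouring agent 0 (true utility 4.0). This is precisely the strategic-misrepresentation failure mode that an elicitation or negotiation protocol for this setting would need to rule out or price in, for instance by making truthful reporting incentive-compatible.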