LLM Cooperation Without Harmful Collusion
Develop methodologies for generating and deploying large language model (LLM) agents that culturally evolve cooperative behaviors when such cooperation benefits human society, while ensuring that these agents refuse to collude against human norms, laws, or interests during multi-agent interactions across successive generations.
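The evaluation setting implied here (as studied in the referenced paper) is a donor game played across generations: agents interact pairwise, the highest-scoring agents survive, and the next generation inherits perturbed copies of the survivors' strategies. The sketch below is a minimal, non-LLM toy model of that generational loop; the payoff constants, fixed-fraction strategies, and mutation scheme are illustrative assumptions, not the paper's exact protocol.

```python
import random

ENDOWMENT = 10.0   # per-round resources for each agent (assumed value)
MULTIPLIER = 2.0   # recipient's benefit per unit donated (assumed value)

def play_round(frac_a, frac_b):
    """Mutual donor game: each agent donates a fraction of its endowment;
    the counterpart receives the donation scaled by MULTIPLIER."""
    give_a, give_b = frac_a * ENDOWMENT, frac_b * ENDOWMENT
    payoff_a = ENDOWMENT - give_a + MULTIPLIER * give_b
    payoff_b = ENDOWMENT - give_b + MULTIPLIER * give_a
    return payoff_a, payoff_b

def evolve(pop, generations=20, rng=None):
    """Pair agents at random, score them, keep the top half, and refill
    the next generation with mutated copies of the survivors."""
    rng = rng or random.Random(0)
    for _ in range(generations):
        rng.shuffle(pop)
        scored = []
        for i in range(0, len(pop), 2):
            pa, pb = play_round(pop[i], pop[i + 1])
            scored += [(pa, pop[i]), (pb, pop[i + 1])]
        scored.sort(reverse=True)
        survivors = [s for _, s in scored[: len(pop) // 2]]
        # Gaussian mutation, clamped to valid donation fractions.
        children = [min(1.0, max(0.0, s + rng.gauss(0, 0.05)))
                    for s in survivors]
        pop = survivors + children
    return pop

# Seed the population with half non-donors and half full donors.
pop = evolve([0.0] * 8 + [1.0] * 8)
```

Because this toy omits reputation and memory, within-round donation is strictly costly to the donor, so selection tends to erode cooperation; the open question above concerns mechanisms (e.g., reputation across generations) that sustain cooperation when it is socially beneficial without enabling collusion against human interests.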
References
Therefore, we end by highlighting a crucial open question: how can we generate LLM agents which are capable of evolving cooperation when it is beneficial to human society, but which refuse to collude against the norms, laws or interests of humans?
— Cultural Evolution of Cooperation among LLM Agents (Vallinder et al., arXiv:2412.10270, 13 Dec 2024), Discussion (final paragraph)