Enhancing Human Experience in Human-Agent Collaboration: A Human-Centered Modeling Approach Based on Positive Human Gain (2401.16444v1)
Abstract: Existing game AI research mainly focuses on enhancing agents' abilities to win games, but this does not inherently give humans a better experience when they collaborate with these agents. For example, agents may dominate the collaboration and exhibit unintended or detrimental behaviors, leading to poor experiences for their human partners. In other words, most game AI agents are modeled in a "self-centered" manner. In this paper, we propose a "human-centered" modeling scheme for collaborative agents that aims to enhance the experience of humans. Specifically, we model the experience of humans as the goals they expect to achieve during the task. Agents should learn to increase the extent to which humans achieve these goals while maintaining their own original abilities (e.g., winning games). To achieve this, we propose the Reinforcement Learning from Human Gain (RLHG) approach. RLHG introduces a "baseline", which corresponds to the extent to which humans primitively achieve their goals, and encourages agents to learn behaviors that effectively help humans achieve their goals more fully. We evaluate the RLHG agent in the popular Multi-player Online Battle Arena (MOBA) game Honor of Kings by conducting real-world human-agent tests. Both objective performance and subjective preference results show that the RLHG agent provides participants with a better gaming experience.
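The baseline idea in the abstract can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not spell out. It assumes that the human's goal achievement and the baseline are available as per-step numbers, that "positive" gain means clipping the difference at zero, and that the gain is added to the agent's task reward with a weight. The names `positive_human_gain`, `shaped_rewards`, and `gain_weight` are hypothetical.

```python
from typing import List, Sequence


def positive_human_gain(
    goal_values: Sequence[float], baseline_values: Sequence[float]
) -> List[float]:
    """Per-step gain: amount by which the human's goal achievement
    exceeds the baseline, clipped at zero (an assumed reading of
    "positive human gain")."""
    if len(goal_values) != len(baseline_values):
        raise ValueError("goal_values and baseline_values must align")
    return [max(0.0, g - b) for g, b in zip(goal_values, baseline_values)]


def shaped_rewards(
    task_rewards: Sequence[float],
    goal_values: Sequence[float],
    baseline_values: Sequence[float],
    gain_weight: float = 0.5,  # hypothetical trade-off weight
) -> List[float]:
    """Combine the agent's original task reward with the weighted gain,
    so the agent keeps its own objective while being rewarded only for
    lifting the human above the baseline."""
    if len(task_rewards) != len(goal_values):
        raise ValueError("task_rewards and goal_values must align")
    gains = positive_human_gain(goal_values, baseline_values)
    return [r + gain_weight * h for r, h in zip(task_rewards, gains)]


rewards = shaped_rewards(
    task_rewards=[1.0, 0.0, 1.0],
    goal_values=[3.0, 1.0, 2.0],
    baseline_values=[2.0, 2.0, 2.0],
)
print(rewards)
```

Under these assumptions, only the first step adds a bonus, because it is the only step where the human does better than the baseline. Steps at or below the baseline leave the task reward unchanged rather than penalizing the agent.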
- Yiming Gao
- Feiyu Liu
- Liang Wang
- Zhenjie Lian
- Dehua Zheng
- Weixuan Wang
- Wenjin Yang
- Siqin Li
- Xianliang Wang
- Wenhui Chen
- Jing Dai
- Qiang Fu
- Wei Yang
- Lanxiao Huang
- Wei Liu