Deep W-Networks: Solving Multi-Objective Optimisation Problems With Deep Reinforcement Learning (2211.04813v2)
Abstract: In this paper, we build on advances introduced by the Deep Q-Networks (DQN) approach to extend the multi-objective tabular Reinforcement Learning (RL) algorithm W-learning to large state spaces. The W-learning algorithm can naturally resolve the competition between multiple single-objective policies in multi-objective environments. However, the tabular version does not scale well to environments with large state spaces. To address this issue, we replace the underlying Q-tables with DQNs and propose adding W-Networks as a replacement for the tabular weight (W) representations. We evaluate the resulting Deep W-Networks (DWN) approach on two widely accepted multi-objective RL benchmarks: deep sea treasure and multi-objective mountain car. We show that DWN resolves the competition between multiple policies while outperforming a DQN baseline. Additionally, we demonstrate that the proposed algorithm can find the Pareto front in both tested environments.
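To make the competition between policies concrete, the following is a minimal tabular sketch of W-learning-style action selection and the W update, assuming the standard formulation from Humphrys (1995). It is not the paper's DWN implementation: the function names, array shapes, and the `alpha` and `gamma` defaults are illustrative assumptions. In DWN, the Q-values and W-values used here would come from neural networks rather than tables.

```python
import numpy as np


def select_action(q_values, w_values):
    """Pick the action proposed by the policy with the highest W-value.

    q_values: array of shape (n_policies, n_actions) for the current state
              (hypothetical layout, one row per objective's policy).
    w_values: array of shape (n_policies,) for the current state.
    Returns (index of the winning policy, the action it proposes).
    """
    winner = int(np.argmax(w_values))
    action = int(np.argmax(q_values[winner]))
    return winner, action


def w_update(w_i, q_i_preferred, reward_i, max_q_i_next, alpha=0.1, gamma=0.9):
    """W update for a policy i that did NOT win the competition.

    The W-value moves toward the loss that policy i suffered because the
    winner's action was executed instead of its own preferred action:
        Q_i(s, a_i) - (r_i + gamma * max_a Q_i(s', a)).
    alpha and gamma are assumed hyperparameters, not values from the paper.
    """
    loss = q_i_preferred - (reward_i + gamma * max_q_i_next)
    return (1.0 - alpha) * w_i + alpha * loss
```

A policy that would lose a lot when overruled therefore accumulates a high W-value and tends to win the next competition in that state, which is how the algorithm arbitrates between objectives without a hand-tuned scalarisation.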