Grid-Interactive Multi-Zone Building Control Using Reinforcement Learning with Global-Local Policy Search (2010.06718v1)
Abstract: In this paper, we develop a grid-interactive multi-zone building controller based on deep reinforcement learning (RL). The controller is designed to manage building operation under both normal conditions and demand response events while maintaining occupant comfort and energy efficiency. We adopt an RL formulation with a continuous action space and devise a two-stage global-local training framework. In the first stage, a fast global policy search is performed using a gradient-free RL algorithm. In the second stage, the resulting policy is fine-tuned locally using a policy gradient method. In contrast to the state-of-the-art model predictive control (MPC) approach, the proposed RL controller does not require complex computation during real-time operation and can adapt to non-linear building models. We illustrate the controller's performance numerically on a five-zone commercial building.
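The two-stage global-local training idea can be illustrated on a toy problem. The abstract does not name the specific algorithms, so the minimal sketch below substitutes the cross-entropy method for the gradient-free stage and REINFORCE for the policy gradient stage. The linear policy, the one-step quadratic reward, and every name and hyperparameter (TARGET_GAIN, TARGET_BIAS, global_search, local_finetune, and so on) are assumptions made for illustration; none of them come from the paper's building model or implementation.

```python
import random
import statistics

# Hypothetical "ideal" control law for the toy task (not from the paper).
TARGET_GAIN, TARGET_BIAS = 0.5, 0.2


def reward(state, action):
    """Toy reward: negative squared deviation from the ideal action."""
    return -(action - (TARGET_GAIN * state + TARGET_BIAS)) ** 2


def average_return(theta, rng, n=64):
    """Mean reward of the deterministic linear policy a = theta[0]*s + theta[1]."""
    total = 0.0
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        total += reward(s, theta[0] * s + theta[1])
    return total / n


def global_search(rng, iters=20, pop=32, elite=8):
    """Stage 1 (assumed): gradient-free search with the cross-entropy method."""
    mean, std = [0.0, 0.0], [2.0, 2.0]
    for _ in range(iters):
        # Sample a population of candidate parameter vectors.
        cands = [[rng.gauss(mean[i], std[i]) for i in range(2)] for _ in range(pop)]
        # Rank candidates by return, best first, and keep the elite set.
        cands.sort(key=lambda th: average_return(th, rng), reverse=True)
        best = cands[:elite]
        # Refit the sampling distribution to the elite candidates.
        for i in range(2):
            vals = [th[i] for th in best]
            mean[i] = statistics.fmean(vals)
            std[i] = statistics.pstdev(vals) + 1e-3
    return mean


def local_finetune(theta, rng, iters=200, batch=64, sigma=0.1, lr=0.05):
    """Stage 2 (assumed): REINFORCE with a Gaussian policy of fixed std sigma."""
    theta = list(theta)
    for _ in range(iters):
        samples = []
        for _ in range(batch):
            s = rng.uniform(-1.0, 1.0)
            mu = theta[0] * s + theta[1]
            a = rng.gauss(mu, sigma)
            # d log pi / d mu for a Gaussian policy is (a - mu) / sigma^2.
            samples.append((s, (a - mu) / sigma ** 2, reward(s, a)))
        # Batch-mean baseline to reduce gradient variance.
        baseline = statistics.fmean([r for _, _, r in samples])
        grad = [0.0, 0.0]
        for s, score, r in samples:
            adv = r - baseline
            grad[0] += adv * score * s / batch  # d mu / d theta[0] = s
            grad[1] += adv * score / batch      # d mu / d theta[1] = 1
        # Gradient ascent on expected reward.
        theta[0] += lr * grad[0]
        theta[1] += lr * grad[1]
    return theta


rng = random.Random(0)
theta_global = global_search(rng)                 # coarse, gradient-free stage
theta_final = local_finetune(theta_global, rng)   # local policy-gradient stage
print("after global search:", theta_global)
print("after local fine-tuning:", theta_final)
```

In this sketch the first stage only needs return evaluations, so it can start from a wide parameter range without any gradient information, while the second stage starts from the first stage's output and takes small gradient steps. Both stages should end near the assumed target parameters (0.5, 0.2); on a realistic building model the two stages would operate on a neural network policy and a multi-step simulation rather than this one-step toy.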