Multi-Agent Deep Reinforcement Learning for HVAC Control in Commercial Buildings (2006.14156v2)

Published 25 Jun 2020 in eess.SY, cs.LG, and cs.SY

Abstract: In commercial buildings, about 40%-50% of the total electricity consumption is attributed to Heating, Ventilation, and Air Conditioning (HVAC) systems, which places an economic burden on building operators. In this paper, we intend to minimize the energy cost of an HVAC system in a multi-zone commercial building under dynamic pricing with the consideration of random zone occupancy, thermal comfort, and indoor air quality comfort. Due to the existence of unknown thermal dynamics models, parameter uncertainties (e.g., outdoor temperature, electricity price, and number of occupants), spatially and temporally coupled constraints associated with indoor temperature and CO2 concentration, a large discrete solution space, and a non-convex and non-separable objective function, it is very challenging to achieve the above aim. To this end, the above energy cost minimization problem is reformulated as a Markov game. Then, an HVAC control algorithm is proposed to solve the Markov game based on multi-agent deep reinforcement learning with attention mechanism. The proposed algorithm does not require any prior knowledge of uncertain parameters and can operate without knowing building thermal dynamics models. Simulation results based on real-world traces show the effectiveness, robustness and scalability of the proposed algorithm.

Authors (7)
  1. Liang Yu (80 papers)
  2. Yi Sun (146 papers)
  3. Zhanbo Xu (5 papers)
  4. Chao Shen (168 papers)
  5. Dong Yue (9 papers)
  6. Tao Jiang (274 papers)
  7. Xiaohong Guan (62 papers)
Citations (191)

Summary

Multi-Agent Deep Reinforcement Learning for HVAC Control in Commercial Buildings

The paper presents a methodological framework for minimizing the energy cost of Heating, Ventilation, and Air Conditioning (HVAC) systems in multi-zone commercial buildings while accounting for zone occupancy, thermal comfort, and indoor air quality. By proposing a multi-agent deep reinforcement learning (MADRL) approach, the authors tackle the challenges arising from unknown thermal dynamics, parameter uncertainties, a large discrete solution space, and the non-convex, non-separable nature of the optimization objective.

Summary of Key Components

The authors effectively recast the energy cost minimization challenge as a Markov game, establishing a robust foundation for utilizing multi-agent reinforcement learning techniques. Within this framework, each zone of a commercial building is represented by an agent, culminating in a system where both local actions (zone-specific airflow rates) and global actions (AHU damper positions) are coordinated to balance energy expenditures and comfort constraints. This formulation includes the following critical components:

  1. State Representation: The system state captures vital indicators such as zone temperatures, outdoor conditions, energy prices, and occupancy, enabling each agent to make informed decisions based on both local observations and spatial interactions.
  2. Action Space: The discrete control actions encompass zone-specific air supply rates and the damper positions of the air handling unit, both of which are influenced by decisions from multiple agents in the system.
  3. Reward Mechanism: A composite reward function incorporates penalties for energy consumption, thermal deviation, and CO₂ concentration violations, with weight parameters allowing for prioritization between energy savings and comfort.
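To make the reward mechanism concrete, a per-zone reward of this general shape can be sketched as follows. The function name, the linear penalty forms, and the weights `w_temp` and `w_co2` are illustrative assumptions, not the paper's exact formulation:

```python
def zone_reward(energy_cost, temp, temp_min, temp_max, co2, co2_max,
                occupied, w_temp=0.5, w_co2=0.5):
    """Composite reward: negative energy cost minus comfort penalties.

    Illustrative sketch only; the penalty shapes and weights are
    assumptions standing in for the paper's formulation.
    """
    # Thermal-comfort penalty for leaving the comfortable temperature band.
    temp_violation = max(temp_min - temp, 0.0) + max(temp - temp_max, 0.0)
    # Indoor-air-quality penalty for exceeding the CO2 concentration limit.
    co2_violation = max(co2 - co2_max, 0.0)
    # Comfort penalties are only charged when the zone is occupied.
    penalty = occupied * (w_temp * temp_violation + w_co2 * co2_violation)
    return -energy_cost - penalty
```

Tuning `w_temp` and `w_co2` shifts the trade-off between energy savings and comfort, which is the prioritization role the weight parameters play in the paper's reward design.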

Numerical Results and Analysis

The simulation results, which leverage real-world data on electricity pricing, occupancy patterns, and ambient conditions, demonstrate that the proposed algorithm significantly outperforms heuristic-based and rule-based control strategies. Specifically, the MADRL approach achieves an energy cost reduction ranging from 56.50% to 75.25% compared to conventional baselines while maintaining acceptable thermal comfort and air quality levels across varied building environments.

The robustness of the approach is further validated in scenarios with unknown thermal dynamics and non-zero disturbances, confirming its applicability without detailed parametric knowledge of building models. Scalability is demonstrated through experiments with increasing numbers of zones: as the environment grows more complex, the learning algorithm maintains both convergence and decision-making quality.

Implications and Future Directions

This paper makes substantial contributions to HVAC control in smart buildings, offering a flexible, robust solution that adapts to dynamic and large-scale environments without relying on explicit prediction models or offline optimization techniques. The introduction of attention mechanisms in the reinforcement learning paradigm allows for adjustable coordination among agents, which is a key factor in maneuvering the complexities of multi-agent dynamics.
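The general mechanism behind such attention-based coordination can be illustrated with scaled dot-product attention over per-agent embeddings, as used in attention-based multi-agent critics. The projection matrices and function below are hypothetical stand-ins, not the paper's actual network architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_critic_features(own_emb, other_embs, W_q, W_k, W_v):
    """One agent's attention-weighted summary of the other agents.

    Sketch of scaled dot-product attention in a centralized critic;
    W_q, W_k, W_v stand in for learned projection matrices.
    """
    q = W_q @ own_emb                        # query from the agent's own embedding
    keys = np.stack([W_k @ e for e in other_embs])
    vals = np.stack([W_v @ e for e in other_embs])
    scores = keys @ q / np.sqrt(len(q))      # one relevance score per other agent
    weights = softmax(scores)                # attention weights sum to 1
    return weights @ vals                    # weighted combination of values
```

Because the attention weights are computed per query, each zone agent can emphasize the neighbors most relevant to its own thermal coupling, which is what makes the coordination adjustable rather than fixed.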

Looking forward, this work paves the way for future explorations into integrating physical distribution network constraints and enhancing the granularity of real-time demand response strategies. Additional potential lies in extending these concepts to encompass entire microgrids, optimizing not merely at the building level but on broader scales that consider generation sources, storage capabilities, and interactions with the wider energy infrastructure.

This paper provides a robust foundation for researchers by delivering a comprehensive framework coupled with promising empirical results, highlighting significant strides towards intelligent and autonomous HVAC control in commercial buildings.