BuildingsBench: A Large-Scale Dataset of 900K Buildings and Benchmark for Short-Term Load Forecasting (2307.00142v3)
Abstract: Short-term forecasting of residential and commercial building energy consumption is widely used in power systems and continues to grow in importance. Data-driven short-term load forecasting (STLF), although promising, has suffered from a lack of open, large-scale datasets with high building diversity. This has hindered exploring the pretrain-then-fine-tune paradigm for STLF. To help address this, we present BuildingsBench, which consists of: 1) Buildings-900K, a large-scale dataset of 900K simulated buildings representing the U.S. building stock; and 2) an evaluation platform with over 1,900 real residential and commercial buildings from 7 open datasets. BuildingsBench benchmarks two under-explored tasks: zero-shot STLF, where a pretrained model is evaluated on unseen buildings without fine-tuning, and transfer learning, where a pretrained model is fine-tuned on a target building. The main finding of our benchmark analysis is that synthetically pretrained models generalize surprisingly well to real commercial buildings. An exploration of the effect of increasing dataset size and diversity on zero-shot commercial building performance reveals a power-law with diminishing returns. We also show that fine-tuning pretrained models on real commercial and residential buildings improves performance for a majority of target buildings. We hope that BuildingsBench encourages and facilitates future research on generalizable STLF. All datasets and code can be accessed from https://github.com/NREL/BuildingsBench.
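To make the "power-law with diminishing returns" finding concrete, here is a minimal sketch of how such a relationship could be fitted and read. It assumes the error follows error ≈ a · N^(-b), where N is the pretraining dataset size. The function name `fit_power_law` and the numbers below are hypothetical illustrations, not the authors' code or results from the paper.

```python
import math

def fit_power_law(sizes, errors):
    """Fit errors ~ a * sizes**(-b) by ordinary least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # In log space: log(error) = log(a) - b * log(N)
    return math.exp(intercept), -slope

# Synthetic values generated from a known power law (illustrative only,
# NOT measurements from BuildingsBench).
sizes = [1_000, 10_000, 100_000, 1_000_000]
errors = [2.0 * n ** -0.1 for n in sizes]

a, b = fit_power_law(sizes, errors)

# Absolute error reduction gained from each successive 10x increase in data.
gains = [errors[i] - errors[i + 1] for i in range(len(errors) - 1)]

print(f"a = {a:.3f}, b = {b:.3f}")
print("gain per 10x more data:", [round(g, 4) for g in gains])
```

Under this assumed form, every tenfold increase in data multiplies the error by the same factor, 10^(-b), so the absolute improvement shrinks at each step. That shrinking gain is what "diminishing returns" refers to.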