
Data augmentation through multivariate scenario forecasting in Data Centers using Generative Adversarial Networks (2201.06147v2)

Published 12 Jan 2022 in cs.LG and cs.AI

Abstract: The Cloud paradigm is at a critical point in which the existing energy-efficiency techniques are reaching a plateau, while the demand for computing resources at Data Center facilities continues to increase exponentially. The main challenge in achieving a global energy efficiency strategy based on Artificial Intelligence is that we need massive amounts of data to feed the algorithms. This paper proposes a time-series data augmentation methodology based on synthetic scenario forecasting within the Data Center (DC). For this purpose, we implement a powerful generative algorithm: Generative Adversarial Networks (GANs). Specifically, our work combines the disciplines of GAN-based data augmentation and scenario forecasting, filling the gap in the generation of synthetic data in DCs. Furthermore, we propose a methodology to increase the variability and heterogeneity of the generated data by introducing on-demand anomalies without additional effort or expert knowledge. We also suggest the use of Kullback-Leibler Divergence and Mean Squared Error as new metrics in the validation of synthetic time series generation, as they provide a better overall comparison of multivariate data distributions. We validate our approach using real data collected in an operating Data Center, successfully generating synthetic data helpful for prediction and optimization models. Our research will help optimize the energy consumed in Data Centers, although the proposed methodology can be employed in any similar time-series problem.
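The abstract names Kullback-Leibler Divergence and Mean Squared Error as validation metrics but does not say how they are computed. The following is a minimal sketch of one plausible way to compare real and synthetic multivariate samples, assuming a histogram-based KL estimate per feature; the function names, the bin count, and the smoothing constant `eps` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def kl_divergence_hist(real, synth, bins=20, eps=1e-10):
    """Histogram estimate of KL(real || synth) for one variable.

    Assumption: both samples are binned on a shared range, and a small
    eps is added to every bin so that empty bins do not produce log(0).
    """
    real = np.asarray(real, dtype=float).ravel()
    synth = np.asarray(synth, dtype=float).ravel()
    lo = min(real.min(), synth.min())
    hi = max(real.max(), synth.max())
    p, _ = np.histogram(real, bins=bins, range=(lo, hi))
    q, _ = np.histogram(synth, bins=bins, range=(lo, hi))
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


def mse(real, synth):
    """Mean squared error between two arrays of the same shape."""
    real = np.asarray(real, dtype=float)
    synth = np.asarray(synth, dtype=float)
    return float(np.mean((real - synth) ** 2))


def per_feature_kl(real, synth, bins=20):
    """KL estimate for each column of two (n_samples, n_features) arrays."""
    real = np.asarray(real, dtype=float)
    synth = np.asarray(synth, dtype=float)
    return [
        kl_divergence_hist(real[:, j], synth[:, j], bins=bins)
        for j in range(real.shape[1])
    ]


if __name__ == "__main__":
    # Toy data standing in for real and generated sensor readings.
    rng = np.random.default_rng(0)
    real = rng.normal(loc=0.0, scale=1.0, size=(500, 3))
    synth = rng.normal(loc=0.2, scale=1.1, size=(500, 3))
    print("KL per feature:", per_feature_kl(real, synth))
    print("MSE of feature means:", mse(real.mean(axis=0), synth.mean(axis=0)))
```

A KL value near zero suggests the synthetic marginal distribution is close to the real one for that variable. The estimate depends on the bin count, and a per-feature comparison does not capture dependencies between variables, so it is only a rough proxy for the multivariate comparison the abstract describes.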
