Green AI: A Preliminary Empirical Study on Energy Consumption in DL Models Across Different Runtime Infrastructures

Published 21 Feb 2024 in cs.SE and cs.LG | arXiv:2402.13640v2

Abstract: Deep Learning (DL) frameworks such as PyTorch and TensorFlow include runtime infrastructures responsible for executing trained models on target hardware, managing memory, data transfers, and, where applicable, multi-accelerator execution. Additionally, it is common practice to deploy pre-trained models in environments distinct from their native development settings. This has led to interchange formats such as ONNX, together with its runtime infrastructure, ONNX Runtime, which serve as standards usable across diverse DL frameworks and languages. Although these runtime infrastructures have a great impact on inference performance, no previous work has investigated their energy efficiency. In this study, we monitor the energy consumption and inference time of the runtime infrastructures of three well-known DL frameworks as well as ONNX, using three different DL models. To add nuance to our investigation, we also examine the impact of using different execution providers. We find that the performance and energy efficiency of DL are difficult to predict. One framework, MXNet, outperforms both PyTorch and TensorFlow for the computer vision models at batch size 1, owing to efficient GPU usage and consequently low CPU usage. However, at batch size 64, PyTorch and MXNet become practically indistinguishable, while TensorFlow is consistently outperformed. For BERT, PyTorch exhibits the best performance. Converting the models to ONNX yields significant performance improvements in the majority of cases. Finally, in our preliminary investigation of execution providers, we observe that TensorRT always outperforms CUDA.

Summary

  • The paper presents empirical findings comparing energy consumption across DL frameworks during model inference.
  • It details systematic evaluations of ResNet, MobileNet, and BERT across varying batch sizes and execution providers.
  • The study highlights the role of ONNX conversion in improving energy efficiency, with TensorRT consistently outperforming CUDA.

Green AI: Energy Consumption in DL Models Across Runtime Infrastructures

The paper "Green AI: A Preliminary Empirical Study on Energy Consumption in DL Models Across Different Runtime Infrastructures" investigates the energy efficiency of Deep Learning (DL) models during inference across various runtime infrastructures. This study offers valuable insights into the energy consumption patterns of prominent DL frameworks and aims to contribute to the broader discourse on Green AI, emphasizing the need for energy-efficient DL approaches.

Introduction

The paper opens by examining the energy demands of DL frameworks and their substantial financial and environmental impacts. DL models are typically deployed in runtime environments distinct from the settings in which they were developed. Interchange formats like ONNX, with associated runtime infrastructures such as ONNX Runtime, have therefore emerged as standardized solutions for cross-framework deployment. Although frameworks are generally assessed on accuracy and speed, this paper evaluates their energy consumption during inference, a critical aspect of real-world deployment.

Methodology

The authors conducted an empirical study using three DL models—ResNet, MobileNet, and BERT—with different batch sizes, across the runtime infrastructures of three well-known DL frameworks: PyTorch, TensorFlow, and MXNet. They assessed energy efficiency and performance by recording GPU utilization, power usage, inference time, and total energy consumed during model inference. Additionally, they examined ONNX's role in enhancing energy efficiency by converting models from these frameworks and running them with two execution providers: CUDA and TensorRT.
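
As a concrete illustration of this kind of instrumentation, the sketch below shows one way CPU energy and GPU power could be sampled around an inference loop, assuming pyRAPL for RAPL-based CPU readings and NVML (via pynvml) for GPU power; the `run_inference` callable is hypothetical, and the paper's exact measurement setup may differ.

```python
# Sketch: sampling CPU energy (RAPL) and GPU power around an inference loop.
# Assumes pyRAPL and pynvml are installed and a run_inference() callable exists.
import time

import pyRAPL   # CPU package energy via Intel RAPL counters
import pynvml   # GPU power via NVIDIA NVML (same backend as nvidia-smi)

pyRAPL.setup()
pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

def measure(run_inference, n_batches=100):
    meter = pyRAPL.Measurement('inference')
    gpu_power_w = []

    meter.begin()
    start = time.perf_counter()
    for _ in range(n_batches):
        run_inference()  # one batch of inference
        gpu_power_w.append(pynvml.nvmlDeviceGetPowerUsage(gpu) / 1000.0)  # mW -> W
    elapsed = time.perf_counter() - start
    meter.end()

    cpu_energy_j = sum(meter.result.pkg) / 1e6                 # microjoules -> joules
    gpu_energy_j = (sum(gpu_power_w) / len(gpu_power_w)) * elapsed  # avg power x time
    return {"time_s": elapsed, "cpu_j": cpu_energy_j, "gpu_j": gpu_energy_j}
```

Averaging sampled GPU power over the elapsed time is only an approximation of GPU energy, but it is sufficient to contrast frameworks under identical workloads.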

Results

Energy Efficiency Across Frameworks

For MobileNet and ResNet at batch size 1, MXNet consistently used the GPU more efficiently, as reflected in lower CPU energy usage and lower overall energy consumption compared to TensorFlow and PyTorch. For BERT, PyTorch outperformed both MXNet and TensorFlow in energy and time efficiency, although MXNet achieved the highest accuracy. This variability underscores the need for ML engineers to experiment with different frameworks to optimize energy efficiency for their specific DL tasks.

Impact of ONNX Conversion

The conversion to ONNX typically improved performance and reduced energy consumption across different batch sizes for most models. Notably, TensorFlow's inefficiency at batch size 1 was mitigated following conversion. Despite this, for batch size 64, converted models derived from MXNet and PyTorch exhibited increased energy usage and inference time, signifying that optimization through ONNX does not uniformly result in improvements.
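
The sketch below illustrates the kind of conversion involved, assuming a PyTorch ResNet exported with `torch.onnx.export` and executed with ONNX Runtime; the model choice, file name, and input shape are illustrative rather than taken from the paper.

```python
# Sketch: exporting a PyTorch vision model to ONNX and running it with ONNX Runtime.
import numpy as np
import torch
import torchvision
import onnxruntime as ort

model = torchvision.models.resnet50(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)  # batch size 1, as in the study

torch.onnx.export(
    model, dummy, "resnet50.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}},  # allows other batch sizes, e.g. 64
)

session = ort.InferenceSession("resnet50.onnx", providers=["CUDAExecutionProvider"])
outputs = session.run(None, {"input": dummy.numpy()})
print(np.argmax(outputs[0], axis=1))  # predicted class index
```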

Execution Providers: CUDA and TensorRT

TensorRT consistently outperformed CUDA as an execution provider for ONNX, achieving better GPU utilization and lower energy consumption across all tested models. These findings highlight TensorRT's potential as a more energy-efficient option for deploying DL models on GPUs.
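
In ONNX Runtime, the execution provider is selected via an ordered preference list passed to the inference session. The sketch below shows how the same exported model could be run under the TensorRT and CUDA providers for a rough latency comparison, assuming an onnxruntime-gpu build with both providers available; it is not the paper's benchmarking harness, and energy measurement would wrap this loop as shown earlier.

```python
# Sketch: comparing the TensorRT and CUDA execution providers on one ONNX model.
import time
import numpy as np
import onnxruntime as ort

x = np.random.randn(1, 3, 224, 224).astype(np.float32)

for provider in ["TensorrtExecutionProvider", "CUDAExecutionProvider"]:
    if provider not in ort.get_available_providers():
        print(f"{provider} not available in this build")
        continue
    session = ort.InferenceSession("resnet50.onnx", providers=[provider])
    session.run(None, {"input": x})  # warm-up; TensorRT builds its engine here
    start = time.perf_counter()
    for _ in range(100):
        session.run(None, {"input": x})
    print(provider, (time.perf_counter() - start) / 100, "s per inference")
```

Excluding the warm-up run matters particularly for TensorRT, whose one-time engine build would otherwise dominate the measured latency.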

Conclusion

The paper concludes that no single DL framework is consistently optimal across varying models, batch sizes, and runtime configurations. Different frameworks excel under different conditions, underscoring the importance of targeted experimentation by ML developers. While ONNX conversion typically enhances performance and energy efficiency, results can vary significantly depending on the model and configuration.

Future research should extend the examination of execution providers across more DL models and explore language-specific runtime overheads. Additionally, further studies could investigate the effects of runtime infrastructure optimizations on energy consumption in DL processes.

Overall, this paper contributes to a nuanced understanding of the complexities surrounding energy efficiency in DL model deployment, offering actionable insights for the development of Green AI initiatives focused on reducing environmental impact.
