The Power of Training: How Different Neural Network Setups Influence the Energy Demand
Abstract: This work offers a heuristic evaluation of how variations in machine-learning training regimes and learning paradigms affect the energy consumption of computing hardware, especially HPC systems, from a life-cycle-aware perspective. While growing data availability and innovation in high-performance hardware fuel the training of ever more sophisticated models, they also let energy consumption and carbon emissions fade from view. The goal of this work is therefore to raise awareness of the energy impact of common training parameters and processes, from learning rate and batch size to knowledge transfer. Multiple setups with different hyperparameter configurations are evaluated on three different hardware systems. Among other results, we found that, even with the same model and hardware and the same target accuracy, improperly chosen training hyperparameters consume up to five times the energy of the optimal setup. We also examine in detail the energy-saving benefits of learning paradigms such as recycling knowledge through pretraining and sharing knowledge through multitask training.
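The comparison described in the abstract, measuring energy to reach a fixed target accuracy under different hyperparameter settings, can be sketched as follows. All power and runtime figures here are hypothetical placeholders for illustration, not measurements from the paper:

```python
from dataclasses import dataclass

@dataclass
class TrainingRun:
    """One training run, measured until a fixed target accuracy is reached."""
    label: str
    avg_power_w: float         # average device power draw in watts (hypothetical)
    time_to_accuracy_s: float  # wall-clock seconds to target accuracy (hypothetical)

    @property
    def energy_kwh(self) -> float:
        # Energy (kWh) = average power (W) * time (s) / 3.6e6
        return self.avg_power_w * self.time_to_accuracy_s / 3.6e6

# Hypothetical runs: same model, same hardware, same target accuracy,
# differing only in learning rate / batch size tuning.
tuned = TrainingRun("tuned lr/batch size", avg_power_w=300.0, time_to_accuracy_s=3_600)
untuned = TrainingRun("untuned lr/batch size", avg_power_w=310.0, time_to_accuracy_s=17_400)

ratio = untuned.energy_kwh / tuned.energy_kwh
print(f"{tuned.label}: {tuned.energy_kwh:.3f} kWh")
print(f"{untuned.label}: {untuned.energy_kwh:.3f} kWh")
print(f"energy ratio (untuned / tuned): {ratio:.1f}x")
```

In practice the power term would come from a tracker such as CodeCarbon or Carbontracker rather than a constant; the point of the sketch is only that comparing setups by energy-to-target-accuracy, not energy-per-epoch, is what exposes the multiple-fold gap between tuned and untuned configurations.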