Training with Mixed-Precision Floating-Point Assignments (2301.13464v2)

Published 31 Jan 2023 in cs.LG

Abstract: When training deep neural networks, keeping all tensors in high precision (e.g., 32-bit or even 16-bit floats) is often wasteful. However, keeping all tensors in low precision (e.g., 8-bit floats) can lead to unacceptable accuracy loss. Hence, it is important to use a precision assignment -- a mapping from all tensors (arising in training) to precision levels (high or low) -- that keeps most of the tensors in low precision and leads to sufficiently accurate models. We provide a technique that explores this memory-accuracy tradeoff by generating precision assignments for convolutional neural networks that (i) use less memory and (ii) lead to more accurate convolutional networks at the same time, compared to the precision assignments considered by prior work in low-precision floating-point training. We evaluate our technique on image classification tasks by training convolutional networks on CIFAR-10, CIFAR-100, and ImageNet. Our method typically provides > 2x memory reduction over a baseline precision assignment while preserving training accuracy, and gives further reductions by trading off accuracy. Compared to other baselines which sometimes cause training to diverge, our method provides similar or better memory reduction while avoiding divergence.
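The abstract defines a precision assignment as a mapping from every tensor arising in training to a precision level (high or low). As a rough illustration of that idea only, and not the paper's actual technique, the sketch below maps each parameter tensor of a toy convolutional network to a precision level and estimates the resulting parameter memory against an all-high-precision baseline. Since stock PyTorch exposes no 8-bit floating-point training dtype, bfloat16 stands in for the paper's low-precision format, and the keep-the-first-layer-high rule is a hypothetical placeholder rather than the assignment the authors generate.

```python
import torch
import torch.nn as nn

# Illustration of a "precision assignment": a mapping from parameter tensors
# to precision levels. bfloat16 is a stand-in for the paper's 8-bit floats,
# and the assignment rule below is a made-up example, not the paper's method.
HIGH, LOW = torch.float32, torch.bfloat16

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 10),
)

# Precision assignment: tensor name -> precision level.
# Hypothetical rule: keep the first conv layer in high precision, rest in low.
assignment = {
    name: (HIGH if name.startswith("0.") else LOW)
    for name, _ in model.named_parameters()
}

# Estimate parameter memory under the assignment (fp32 = 4 bytes/element,
# bf16 = 2 bytes/element) versus the all-high-precision baseline.
bytes_per = {HIGH: 4, LOW: 2}
assigned = sum(p.numel() * bytes_per[assignment[n]]
               for n, p in model.named_parameters())
baseline = sum(p.numel() * 4 for p in model.parameters())
print(f"parameter memory: {assigned} bytes (assigned) vs {baseline} bytes (all high precision)")
```

The memory-accuracy tradeoff the abstract describes amounts to searching over such mappings: pushing more tensors to low precision shrinks the memory figure above, at the risk of degrading or destabilizing training.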
