SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference (2308.04753v2)

Published 9 Aug 2023 in cs.CV

Abstract: Deep neural networks (DNNs) demonstrate outstanding performance across most computer vision tasks. Some critical applications, such as autonomous driving or medical imaging, also require investigation into their behavior and the reasons behind the decisions they make. In this vein, DNN attribution consists of studying the relationship between a DNN's predictions and its inputs. Attribution methods have been adapted to highlight the most relevant weights or neurons in a DNN, allowing a more efficient selection of which weights or neurons can be pruned. However, a limitation of these approaches is that weights are typically compared only within each layer, while some layers may be more critical than others. In this work, we propose to investigate DNN layer importance, i.e. to estimate the sensitivity of the accuracy w.r.t. perturbations applied at the layer level. To do so, we propose a novel dataset to evaluate our method as well as future work. We benchmark a number of criteria and draw conclusions regarding how to assess DNN layer importance and, consequently, how to budget layers for increased DNN efficiency (with applications to DNN pruning and quantization), as well as robustness to hardware failure (e.g. bit swaps).
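
The layer-level sensitivity described in the abstract can be approximated in practice by perturbing one layer at a time and measuring the resulting accuracy drop. The sketch below illustrates this idea for a PyTorch model; it is a minimal illustration, not the paper's benchmarked criteria, and the Gaussian weight noise, the `noise_std` value, and the `eval_fn` accuracy callback are assumptions made for the example.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def layer_sensitivity(model: nn.Module, eval_fn, noise_std: float = 0.01):
    """Rank layers by the accuracy drop caused by a small weight perturbation.

    Illustrative sketch only: `eval_fn(model)` is assumed to return accuracy
    on a held-out set, and Gaussian weight noise is just one possible
    layer-level perturbation.
    """
    baseline = eval_fn(model)
    scores = {}
    for name, module in model.named_modules():
        if not isinstance(module, (nn.Conv2d, nn.Linear)):
            continue
        original = module.weight.detach().clone()
        # Perturb only this layer, scaled to its own weight magnitude.
        module.weight.add_(noise_std * original.std() * torch.randn_like(original))
        scores[name] = baseline - eval_fn(model)  # larger drop = more sensitive layer
        module.weight.copy_(original)             # restore the layer before moving on
    # Most sensitive layers first.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```

Such a ranking could then inform per-layer budgets, e.g. reserving higher bit-widths or lower pruning ratios for the most sensitive layers in a mixed-precision quantization or structured pruning setup.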

