The Simpler The Better: An Entropy-Based Importance Metric To Reduce Neural Networks' Depth

Published 27 Apr 2024 in cs.LG (arXiv:2404.18949v2)

Abstract: While deep neural networks are highly effective at solving complex tasks, large pre-trained models are commonly employed even for considerably simpler downstream tasks that do not require a large model's complexity. Motivated by the ever-growing environmental impact of AI, we propose an efficiency strategy that leverages the prior knowledge transferred by large models. We propose a simple yet effective method relying on an Entropy-bASed Importance mEtRic (EASIER) to reduce the depth of over-parametrized deep neural networks, which alleviates their computational burden. We assess the effectiveness of our method on traditional image classification setups. Our code is available at https://github.com/VGCQ/EASIER.
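The abstract does not spell out how the entropy-based importance is computed, so the snippet below is only a minimal sketch of one plausible reading: score each ReLU layer by the entropy of its units' on/off activation states over a validation set, and treat the lowest-entropy layer as the candidate to linearize or remove before finetuning. The hook-based scoring, the binary-state convention, and the removal step are assumptions for illustration; refer to the official repository (https://github.com/VGCQ/EASIER) for the authors' actual implementation.

```python
# Minimal sketch (assumed, not the authors' released code): score ReLU layers
# by the entropy of their activation states, then pick the least "informative"
# layer as a candidate for depth reduction.
import torch
import torch.nn as nn

def relu_state_entropy(model: nn.Module, loader, device="cpu"):
    """Return {layer_name: score}, where score is the mean binary entropy of
    each ReLU unit being 'on' (output > 0) over the given data loader."""
    on_frequency = {}   # running sum of per-batch P(unit > 0), per ReLU layer
    hooks, n_batches = [], 0

    def make_hook(name):
        def hook(_module, _inputs, output):
            p = (output > 0).float().mean(dim=0)   # per-unit on-rate in this batch
            on_frequency[name] = on_frequency.get(name, 0) + p
        return hook

    # Attach a forward hook to every ReLU module (models using F.relu directly
    # would need a different instrumentation strategy).
    for name, module in model.named_modules():
        if isinstance(module, nn.ReLU):
            hooks.append(module.register_forward_hook(make_hook(name)))

    model.eval().to(device)
    with torch.no_grad():
        for x, _ in loader:
            model(x.to(device))
            n_batches += 1

    for h in hooks:
        h.remove()

    scores = {}
    for name, freq in on_frequency.items():
        p = (freq / n_batches).clamp(1e-6, 1 - 1e-6)        # avoid log(0)
        h = -(p * p.log2() + (1 - p) * (1 - p).log2())      # binary entropy per unit
        scores[name] = h.mean().item()                      # layer score = mean over units
    return scores

# Usage sketch: a near-zero-entropy layer behaves almost deterministically
# (always on or always off), so it is the natural candidate to linearize or
# remove, followed by finetuning and repeating the procedure.
# scores = relu_state_entropy(model, val_loader)
# layer_to_simplify = min(scores, key=scores.get)
```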
