The Simpler The Better: An Entropy-Based Importance Metric To Reduce Neural Networks' Depth
Abstract: While deep neural networks are highly effective at solving complex tasks, large pre-trained models are commonly employed even for considerably simpler downstream tasks that do not require their full complexity. Motivated by awareness of the ever-growing environmental impact of AI, we propose an efficiency strategy that leverages the prior knowledge transferred by large models. We propose a simple yet effective method relying on an Entropy-bASed Importance mEtRic (EASIER) to reduce the depth of over-parametrized deep neural networks, alleviating their computational burden. We assess the effectiveness of our method on traditional image classification setups. Our code is available at https://github.com/VGCQ/EASIER.
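To make the idea concrete, below is a minimal PyTorch sketch of one plausible form of such an entropy-based layer score: a rectifier layer is rated by the average entropy of its neurons' on/off states over a calibration set, so that layers whose neurons fire near-deterministically (and hence behave near-linearly) are flagged as candidates for removal. The function name `activation_state_entropy`, the tensor shapes, and the removal step described in the comments are illustrative assumptions, not the paper's verbatim implementation.

```python
import torch

def activation_state_entropy(pre_activations: torch.Tensor, eps: float = 1e-12) -> float:
    """Score a ReLU layer by the mean binary entropy of its neurons' states.

    `pre_activations` has shape (num_samples, num_neurons): the inputs to a
    ReLU layer, collected over a calibration set. A neuron that is almost
    always 'on' (or 'off') acts near-linearly, so a layer whose neurons all
    have low state entropy contributes little nonlinearity to the network.
    """
    # Empirical probability that each neuron is active (output > 0).
    p_on = (pre_activations > 0).float().mean(dim=0)
    # Clamp away from {0, 1} to avoid log(0).
    p_on = p_on.clamp(eps, 1 - eps)
    # Binary entropy per neuron, in bits.
    entropy = -(p_on * p_on.log2() + (1 - p_on) * (1 - p_on).log2())
    # Average over neurons gives the layer's importance score.
    return entropy.mean().item()

# Hypothetical usage: collect pre-activations for every rectifier layer on a
# small calibration set, score each layer with the function above, replace
# the lowest-entropy layer with torch.nn.Identity(), and fine-tune.
```

Under this reading, a depth-reduction loop would rank all rectifier layers with such a score, linearize or drop the lowest-entropy one, briefly fine-tune, and repeat until accuracy degrades beyond a chosen tolerance.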