Compressive Feature Selection for Remote Visual Multi-Task Inference
Abstract: Deep models produce many features in each internal layer. A key problem in applications such as feature compression for remote inference is determining how important each feature is for the task(s) performed by the model. The problem is especially challenging in multi-task inference, where the same feature may carry different importance for different tasks. In this paper, we examine how effective the mutual information (MI) between a feature and a model's task output is as a measure of the feature's importance for that task. Experiments involving hard selection and soft selection (unequal compression) based on MI are carried out to compare the MI-based method with alternative approaches. A multi-objective analysis is provided to offer further insight.
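To make the idea of MI-based importance concrete, the following is a minimal sketch of how features might be ranked by their estimated MI with a task output and then hard-selected. It rests on assumptions not stated in the abstract: each feature is reduced to one scalar per input (for example, a channel mean), the task output is a discrete label, and MI is estimated with a simple plug-in histogram estimator. The function names `mutual_information`, `rank_features_by_mi`, and `hard_select` are hypothetical and are not taken from the paper.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in histogram estimate of I(X;Y) in bits.

    x: continuous scalar feature values, one per sample (assumption).
    y: discrete task outputs, one per sample (assumption).
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y).ravel()
    # Quantize x into equal-width bins; indices fall in 0..bins-1.
    edges = np.histogram_bin_edges(x, bins=bins)
    xq = np.digitize(x, edges[1:-1])
    # Map arbitrary labels to 0..C-1.
    _, yq = np.unique(y, return_inverse=True)
    # Empirical joint distribution of (quantized feature, label).
    joint = np.zeros((bins, yq.max() + 1))
    np.add.at(joint, (xq, yq), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of the feature
    py = joint.sum(axis=0, keepdims=True)  # marginal of the label
    nz = joint > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))

def rank_features_by_mi(features, labels, bins=8):
    """Return feature indices sorted by decreasing MI, plus the MI scores.

    features: array of shape (num_samples, num_features).
    """
    features = np.asarray(features, dtype=float)
    scores = np.array([
        mutual_information(features[:, c], labels, bins)
        for c in range(features.shape[1])
    ])
    order = np.argsort(-scores, kind="stable")
    return order, scores

def hard_select(order, k):
    """Hard selection: keep the k highest-ranked features, drop the rest."""
    return np.sort(order[:k])

# Synthetic illustration (made-up data, not from the paper):
# feature 0 tracks the label closely, feature 1 is pure noise.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=2000)
informative = labels + 0.1 * rng.standard_normal(2000)
noise = rng.standard_normal(2000)
features = np.stack([informative, noise], axis=1)

order, scores = rank_features_by_mi(features, labels)
kept = hard_select(order, k=1)
```

In a multi-task setting, the same ranking could be computed once per task, which is where a feature may score high for one task and low for another. Soft selection would map the scores to per-feature compression levels instead of a keep/drop decision; that mapping is not sketched here because the abstract does not describe it.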