
A Structurally Regularized CNN Architecture via Adaptive Subband Decomposition (2306.16604v1)

Published 29 Jun 2023 in eess.IV

Abstract: We propose a generalized convolutional neural network (CNN) architecture that first decomposes the input signal into subbands with an adaptive filter bank and then uses convolutional layers to extract features from each subband independently. Fully connected layers finally combine the extracted features to perform classification. Because each subband CNN sees only part of the input spectrum rather than the full signal, the decomposition acts as a structural regularizer. The architecture is fully compatible with the end-to-end learning mechanism of typical CNNs and learns the subband decomposition from the training data. We show that it has attractive properties, such as robustness to quantization noise in the input and in the weights and biases, compared to regular full-band CNN architectures. Importantly, it significantly reduces computational cost while maintaining state-of-the-art classification accuracy. Experiments on image classification using the MNIST, CIFAR-10/100, Caltech-101, and ImageNet-2012 datasets show that the architecture can surpass state-of-the-art accuracy. On ImageNet-2012, we achieved top-5 and top-1 validation accuracies of 86.91% and 69.73%, respectively. Notably, with just a single-layer subband decomposition, the architecture reduces inference computation by over 90% and per-iteration back-propagation computation by approximately 75%. With a two-layer subband decomposition, the computational gains are even greater, with accuracy comparable to the single-layer case.
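To make the data flow concrete, here is a minimal PyTorch sketch of the pipeline the abstract describes: a learnable single-level decomposition feeding independent per-subband CNNs whose features are merged by fully connected layers. The number of subbands, all layer sizes, and the use of a single strided convolution as the learnable analysis filter bank are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch only: layer sizes, subband count, and the strided-conv
# "filter bank" are assumptions, not the paper's exact design.
import torch
import torch.nn as nn

class SubbandCNN(nn.Module):
    def __init__(self, num_classes=10, in_ch=3, num_subbands=4):
        super().__init__()
        # Learnable single-level "analysis filter bank": one strided conv
        # whose output channels are split into num_subbands groups, so the
        # decomposition is trained end-to-end with the rest of the network.
        self.analysis = nn.Conv2d(in_ch, num_subbands * in_ch,
                                  kernel_size=4, stride=2, padding=1)
        self.num_subbands = num_subbands
        # One small CNN per subband; each branch sees only its own subband,
        # never the full input spectrum (the structural regularization).
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4),
            )
            for _ in range(num_subbands)
        ])
        # Fully connected layers combine the per-subband features.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(num_subbands * 32 * 4 * 4, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        # Split the filter-bank output into one tensor per subband.
        subbands = torch.chunk(self.analysis(x), self.num_subbands, dim=1)
        feats = [branch(s) for branch, s in zip(self.branches, subbands)]
        return self.classifier(torch.cat(feats, dim=1))

model = SubbandCNN()
logits = model(torch.randn(2, 3, 32, 32))  # CIFAR-10-sized dummy batch
print(logits.shape)                        # torch.Size([2, 10])
```

Because each branch only ever receives its own subband, its weights cannot adapt to the full-band signal; that restriction is the structural regularization the abstract refers to, and it is also the source of the computational savings, since each branch operates on a downsampled, lower-bandwidth input.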
