WaveNets: Wavelet Channel Attention Networks (2211.02695v2)

Published 4 Nov 2022 in cs.CV and cs.AI

Abstract: Channel attention reigns supreme as an effective technique in the field of computer vision. However, the channel attention proposed by SENet suffers from information loss in feature learning caused by the use of Global Average Pooling (GAP) to represent each channel as a scalar. Thus, designing an effective channel attention mechanism requires a way to enhance feature preservation when modeling channel inter-dependencies. In this work, we use wavelet transform compression as a solution to the channel representation problem. We first test the wavelet transform as an auto-encoder model equipped with a conventional channel attention module. Next, we test the wavelet transform as a standalone channel compression method. We prove that global average pooling is equivalent to the recursive approximate Haar wavelet transform. With this proof, we generalize channel attention using wavelet compression and name the result WaveNet. Our method can be embedded within existing channel attention methods with a couple of lines of code. We evaluate the proposed method on the ImageNet dataset for the image classification task. Our method outperforms the baseline SENet and achieves state-of-the-art results. Our code implementation is publicly available at https://github.com/hady1011/WaveNet-C.
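The claimed equivalence between GAP and the recursive approximate Haar wavelet transform can be checked numerically. The sketch below, which is not taken from the paper's code, uses the averaging-normalized Haar low-pass filter [1/2, 1/2]; the orthonormal filter [1/√2, 1/√2] would differ only by a constant scale factor of √N. The function name `recursive_haar_approx` is illustrative.

```python
import numpy as np

def recursive_haar_approx(x):
    # Repeatedly apply the averaging-normalized Haar low-pass
    # filter [1/2, 1/2]: each level halves the signal length by
    # averaging adjacent pairs, until one coefficient remains.
    while x.size > 1:
        x = 0.5 * (x[0::2] + x[1::2])
    return x[0]

rng = np.random.default_rng(0)
# A channel's feature map flattened to a power-of-two length.
feat = rng.standard_normal(64)

gap = feat.mean()                    # Global Average Pooling
haar = recursive_haar_approx(feat)   # recursive Haar approximation
print(np.isclose(gap, haar))         # the two agree numerically
```

For non-power-of-two spatial sizes the recursion needs padding or truncation, which is why the equivalence is stated for the approximate (averaging) form of the transform.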

References (55)
  1. “A review on the attention mechanism of deep learning,” Neurocomputing, vol. 452, pp. 48–62, 2021.
  2. J. Sun, J. Jiang, and Y. Liu, “An introductory survey on attention mechanisms in computer vision problems,” in 2020 6th International Conference on Big Data and Information Analytics (BigDIA), 2020, pp. 295–300.
  3. H. Salman and J. Zhan, “Similarity metric for millions of unlabeled face images,” in 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), 2020, pp. 1033–1040.
  4. M. Guo, T. Xu, J. Liu, Z. Liu, P. Jiang, T. Mu, S. Zhang, R. R. Martin, M. Cheng, and S. Hu, “Attention mechanisms in computer vision: A survey,” CoRR, vol. abs/2111.07624, 2021. [Online]. Available: https://arxiv.org/abs/2111.07624
  5. H. Salman and J. Zhan, “Semi-supervised learning and feature fusion for multi-view data clustering,” in 2020 IEEE International Conference on Big Data (Big Data), 2020, pp. 645–650.
  6. J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in IEEE Conf. Comput. Vis. Pattern Recog., 2018, pp. 7132–7141.
  7. Z. Yang, L. Zhu, Y. Wu, and Y. Yang, “Gated channel transformation for visual recognition,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 11791–11800.
  8. Q. Wang, B. Wu, P. Zhu, P. Li, W. Zuo, and Q. Hu, “Eca-net: Efficient channel attention for deep convolutional neural networks,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 11531–11539.
  9. H. Chen, X. He, L. Qing, S. Xiong, and T. Q. Nguyen, “Dpw-sdnet: Dual pixel-wavelet domain deep cnns for soft decoding of jpeg-compressed images,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, pp. 824–82409.
  10. Q. Li, L. Shen, S. Guo, and Z. Lai, “Wavecnet: Wavelet integrated cnns to suppress aliasing effect for noise-robust image classification,” IEEE Transactions on Image Processing, vol. 30, pp. 7074–7089, 2021.
  11. M. Fu, H. Liu, Y. Yu, J. Chen, and K. Wang, “Dw-gan: A discrete wavelet transform gan for nonhomogeneous dehazing,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021, pp. 203–212.
  12. P. Liu, H. Zhang, K. Zhang, L. Lin, and W. Zuo, “Multi-level wavelet-cnn for image restoration,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2018.
  13. S. Kushlev and R. P. Mironov, “Analysis for watermark in medical image using watermarking with wavelet transform and dct,” in 2020 55th International Scientific Conference on Information, Communication and Energy Systems and Technologies (ICEST), 2020, pp. 185–188.
  14. M. J. Shensa et al., “The discrete wavelet transform: wedding the a trous and mallat algorithms,” IEEE Transactions on Signal Processing, vol. 40, no. 10, pp. 2464–2482, 1992.
  15. G. Othman and D. Q. Zeebaree, “The applications of discrete wavelet transform in image processing: A review,” Journal of Soft Computing and Data Mining, vol. 1, no. 2, p. 31–43, Dec. 2020. [Online]. Available: https://publisher.uthm.edu.my/ojs/index.php/jscdmarticle/view/7215
  16. C. Tian, Y. Xu, Z. Li, W. Zuo, L. Fei, and H. Liu, “Attention-guided cnn for image denoising,” Neural Networks, vol. 124, pp. 117–129, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0893608019304241
  17. Y. Li, J. Zeng, S. Shan, and X. Chen, “Occlusion aware facial expression recognition using cnn with attention mechanism,” IEEE Transactions on Image Processing, vol. 28, no. 5, pp. 2439–2450, 2019.
  18. L. Li, M. Xu, X. Wang, L. Jiang, and H. Liu, “Attention based glaucoma detection: A large-scale database and cnn model,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
  19. M. Wu, D. Huang, Y. Guo, and Y. Wang, “Distraction-aware feature learning for human attribute recognition via coarse-to-fine attention mechanism,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 07, pp. 12394–12401, Apr. 2020. [Online]. Available: https://ojs.aaai.org/index.php/AAAI/article/view/6925
  20. B. Li, Z. Liu, S. Gao, J.-N. Hwang, J. Sun, and Z. Wang, “Cspa-dn: Channel and spatial attention dense network for fusing pet and mri images,” in 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 8188–8195.
  21. Z. Wang, L. Liu, and F. Li, “Taan: Task-aware attention network for few-shot classification,” in 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 9130–9136.
  22. S.-B. Chen, Q.-S. Wei, W.-Z. Wang, J. Tang, B. Luo, and Z.-Y. Wang, “Remote sensing scene classification via multi-branch local attention network,” IEEE Transactions on Image Processing, vol. 31, pp. 99–109, 2022.
  23. H. Song, Y. Song, and Y. Zhang, “Sca net: Sparse channel attention module for action recognition,” in 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 1189–1196.
  24. Y. Ding, Z. Ma, S. Wen, J. Xie, D. Chang, Z. Si, M. Wu, and H. Ling, “Ap-cnn: Weakly supervised attention pyramid convolutional neural network for fine-grained visual classification,” IEEE Transactions on Image Processing, vol. 30, pp. 2826–2836, 2021.
  25. X. Cun and C.-M. Pun, “Improving the harmony of the composite image by spatial-separated attention module,” IEEE Transactions on Image Processing, vol. 29, pp. 4759–4771, 2020.
  26. S. Li, B. Xie, Q. Lin, C. H. Liu, G. Huang, and G. Wang, “Generalized domain conditioned adaptation network,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–1, 2021.
  27. X. Xue, S.-i. Kamata, and D. Luo, “Skin lesion classification using weakly-supervised fine-grained method,” in 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 9083–9090.
  28. R. K. Srivastava, K. Greff, and J. Schmidhuber, “Highway networks,” arXiv preprint arXiv:1505.00387, 2015.
  29. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in IEEE Conf. Comput. Vis. Pattern Recog., 2016, pp. 770–778.
  30. J. Fu, J. Liu, H. Tian, Y. Li, Y. Bao, Z. Fang, and H. Lu, “Dual attention network for scene segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 3146–3154.
  31. X. Wang, R. Girshick, A. Gupta, and K. He, “Non-local neural networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 7794–7803.
  32. Y. Cao, J. Xu, S. Lin, F. Wei, and H. Hu, “Gcnet: Non-local networks meet squeeze-excitation networks and beyond,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, Oct 2019.
  33. H. Peng, X. Chen, and J. Zhao, “Residual pixel attention network for spectral reconstruction from rgb images,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2020.
  34. W. Li, X. Zhu, and S. Gong, “Harmonious attention network for person re-identification,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 2285–2294.
  35. S. Woo, J. Park, J.-Y. Lee, and I. So Kweon, “Cbam: Convolutional block attention module,” in Eur. Conf. Comput. Vis., 2018, pp. 3–19.
  36. Q. Wang, B. Wu, P. Zhu, P. Li, W. Zuo, and Q. Hu, “Eca-net: Efficient channel attention for deep convolutional neural networks,” in IEEE Conf. Comput. Vis. Pattern Recog., 2020, pp. 11534–11542.
  37. N. Vosco, A. Shenkler, and M. Grobman, “Tiled squeeze-and-excite: Channel attention with local spatial context,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, October 2021, pp. 345–353.
  38. Z. Qin, P. Zhang, F. Wu, and X. Li, “Fcanet: Frequency channel attention networks,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2021, pp. 783–792.
  39. W. Ma, Z. Pan, J. Guo, and B. Lei, “Achieving super-resolution remote sensing images via the wavelet transform combined with the recursive res-net,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 6, pp. 3512–3527, 2019.
  40. B. Lowe, H. Salman, and J. Zhan, “Ghm wavelet transform for deep image super resolution,” 2022. [Online]. Available: https://arxiv.org/abs/2204.07862
  41. Q. Li, L. Shen, S. Guo, and Z. Lai, “Wavelet integrated cnns for noise-robust image classification,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 7243–7252.
  42. D. D. N. D. Silva, H. W. M. K. Vithanage, K. S. D. Fernando, and I. T. S. Piyatilake, “Multi-path learnable wavelet neural network for image classification,” in Twelfth International Conference on Machine Vision (ICMV 2019), W. Osten and D. P. Nikolaev, Eds., vol. 11433, International Society for Optics and Photonics. SPIE, 2020, pp. 459–467. [Online]. Available: https://doi.org/10.1117/12.2556535
  43. Y. Yu, F. Zhan, S. Lu, J. Pan, F. Ma, X. Xie, and C. Miao, “Wavefill: A wavelet-based generation network for image inpainting,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021.
  44. L. Liu, J. Liu, S. Yuan, G. Slabaugh, A. Leonardis, W. Zhou, and Q. Tian, “Wavelet-based dual-branch network for image demoiréing,” in European Conference on Computer Vision. Springer, 2020, pp. 86–102.
  45. L. Dai, X. Liu, C. Li, and J. Chen, “Awnet: Attentive wavelet network for image isp,” in ECCV Workshops, 2020.
  46. Y.-J. Choi, Y.-W. Lee, and B.-G. Kim, “Wavelet attention embedding networks for video super-resolution,” in 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 7314–7320.
  47. P. Aghdaie, B. Chaudhary, S. Soleymani, J. Dawson, and N. M. Nasrabadi, “Attention aware wavelet-based detection of morphed face images,” in 2021 IEEE International Joint Conference on Biometrics (IJCB), 2021, pp. 1–8.
  48. X. Zhou, Y. Wang, Q. Zhu, J. Mao, C. Xiao, X. Lu, and H. Zhang, “A surface defect detection framework for glass bottle bottom using visual attention model and wavelet transform,” IEEE Transactions on Industrial Informatics, vol. 16, no. 4, pp. 2189–2201, 2020.
  49. H.-H. Yang, C.-H. H. Yang, and Y.-C. F. Wang, “Wavelet channel attention module with a fusion network for single image deraining,” in 2020 IEEE International Conference on Image Processing (ICIP), 2020, pp. 883–887.
  50. H. Lee, H.-E. Kim, and H. Nam, “Srm: A style-based recalibration module for convolutional neural networks,” in Int. Conf. Comput. Vis., 2019, pp. 1854–1862.
  51. O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, “Imagenet large scale visual recognition challenge,” Int. J. Comput. Vis., pp. 211–252, 2015.
  52. T. He, Z. Zhang, H. Zhang, Z. Zhang, J. Xie, and M. Li, “Bag of tricks for image classification with convolutional neural networks,” in IEEE Conf. Comput. Vis. Pattern Recog., 2019, pp. 558–567.
  53. A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga et al., “Pytorch: An imperative style, high-performance deep learning library,” in Adv. Neural Inform. Process. Syst., 2019, pp. 8026–8037.
  54. Y. He, X. Zhang, and J. Sun, “Channel pruning for accelerating very deep neural networks,” in Int. Conf. Comput. Vis., 2017, pp. 1389–1397.
  55. Z. Zhuang, M. Tan, B. Zhuang, J. Liu, Y. Guo, Q. Wu, J. Huang, and J. Zhu, “Discrimination-aware channel pruning for deep neural networks,” in Adv. Neural Inform. Process. Syst., 2018, pp. 875–886.
Authors (4)
  1. Hadi Salman (27 papers)
  2. Caleb Parks (4 papers)
  3. Shi Yin Hong (3 papers)
  4. Justin Zhan (9 papers)
Citations (1)