Learning Enriched Features via Selective State Spaces Model for Efficient Image Deblurring (2403.20106v2)
Abstract: Image deblurring aims to restore a high-quality image from its blurred counterpart. The emergence of CNNs and Transformers has enabled significant progress; however, these methods often face a dilemma between eliminating long-range degradation perturbations and maintaining computational efficiency. While the selective state space model (SSM) shows promise in modeling long-range dependencies with linear complexity, it also encounters challenges such as local pixel forgetting and channel redundancy. To address these issues, we propose an efficient image deblurring network that leverages a selective state space model to aggregate enriched and accurate features. Specifically, we introduce an aggregate local and global information block (ALGBlock) designed to effectively capture and integrate both local invariant properties and non-local information. The ALGBlock comprises two primary modules: a module for capturing local and global features (CLGF) and a feature aggregation module (FA). The CLGF module consists of two branches: the global branch captures long-range dependencies via a selective state space model, while the local branch employs simplified channel attention to model local connectivity, thereby reducing local pixel forgetting and channel redundancy. In addition, we design the FA module to accentuate the local branch by recalibrating its weight when aggregating the two branches for restoration. Experimental results demonstrate that the proposed method outperforms state-of-the-art approaches on widely used benchmarks.
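To make the design concrete, the following is a minimal PyTorch sketch of how a block with this structure could be wired together. The class names mirror the abstract (CLGF, FA, ALGBlock), but the specific operators (a depthwise convolution in the local branch, a sigmoid gate in FA, and a residual connection) are illustrative assumptions rather than the authors' reference implementation, and the selective SSM is treated as a caller-supplied module (e.g., a Mamba layer) rather than implemented here.

```python
import torch
import torch.nn as nn


class SimplifiedChannelAttention(nn.Module):
    """Local branch: depthwise convolution followed by simplified channel
    attention (pool -> 1x1 conv -> rescale), to model local connectivity."""

    def __init__(self, channels: int):
        super().__init__()
        self.dwconv = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pool = nn.AdaptiveAvgPool2d(1)            # per-channel global descriptor
        self.proj = nn.Conv2d(channels, channels, 1)   # channel mixing

    def forward(self, x):
        x = self.dwconv(x)
        return x * self.proj(self.pool(x))             # recalibrate channels


class CLGF(nn.Module):
    """Capture Local and Global Features: a global SSM branch and a local
    simplified-channel-attention branch, computed in parallel."""

    def __init__(self, channels: int, ssm: nn.Module):
        super().__init__()
        self.ssm = ssm                                 # any (B, L, C) -> (B, L, C) module
        self.local = SimplifiedChannelAttention(channels)

    def forward(self, x):
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)             # (B, H*W, C) sequence for the SSM
        global_feat = self.ssm(seq).transpose(1, 2).reshape(b, c, h, w)
        local_feat = self.local(x)
        return local_feat, global_feat


class FA(nn.Module):
    """Feature Aggregation: fuse the branches with a learned gate that can
    accentuate the local features during aggregation."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())

    def forward(self, local_feat, global_feat):
        w = self.gate(torch.cat([local_feat, global_feat], dim=1))
        return w * local_feat + (1.0 - w) * global_feat


class ALGBlock(nn.Module):
    """Aggregate Local and Global information block = CLGF + FA + residual."""

    def __init__(self, channels: int, ssm: nn.Module):
        super().__init__()
        self.clgf = CLGF(channels, ssm)
        self.fa = FA(channels)

    def forward(self, x):
        local_feat, global_feat = self.clgf(x)
        return x + self.fa(local_feat, global_feat)


if __name__ == "__main__":
    # nn.Identity() stands in for a real selective SSM (e.g. a Mamba layer).
    block = ALGBlock(channels=32, ssm=nn.Identity())
    print(block(torch.randn(1, 32, 64, 64)).shape)     # torch.Size([1, 32, 64, 64])
```

Gating the fusion this way lets the network learn, per position and per channel, how strongly to weight the local branch against the global one, which matches the abstract's description of recalibrating the local branch's weight during aggregation.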
Authors: Hu Gao, Depeng Dang