CascadedGaze: Efficiency in Global Context Extraction for Image Restoration (2401.15235v2)

Published 26 Jan 2024 in eess.IV, cs.CV, and cs.LG

Abstract: Image restoration tasks traditionally rely on convolutional neural networks (CNNs). However, given the local nature of the convolution operator, CNNs struggle to capture global information. Attention mechanisms in Transformers promise to circumvent this problem, but at the cost of heavy computational overhead, and many recent studies in image restoration have focused on balancing performance and computational cost via Transformer variants. In this paper, we present the CascadedGaze Network (CGNet), an encoder-decoder architecture that employs a Global Context Extractor (GCE), a novel and efficient module for capturing global information in image restoration. The GCE module cascades small kernels across convolutional layers to learn global dependencies without requiring self-attention. Extensive experiments show that our computationally efficient approach performs competitively with a range of state-of-the-art methods on synthetic image denoising and single image deblurring, and pushes the performance boundary further on real image denoising.
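
To make the mechanism the abstract describes more concrete, here is a minimal PyTorch sketch of the underlying idea: growing a global receptive field by cascading small-kernel convolutions instead of using self-attention. The module name, kernel sizes, strides, and the final sigmoid gating are illustrative assumptions, not the paper's exact GCE design.

```python
# Illustrative sketch (not the paper's exact GCE): cascading small-kernel
# strided depthwise convolutions enlarges the receptive field cheaply; the
# pooled result is a global descriptor used to modulate the input features.
import torch
import torch.nn as nn

class GlobalContextSketch(nn.Module):
    def __init__(self, channels: int, num_stages: int = 3):
        super().__init__()
        # Each stage is a small (3x3) depthwise conv with stride 2; stacking
        # them progressively widens the effective receptive field.
        self.stages = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, stride=2,
                      padding=1, groups=channels)
            for _ in range(num_stages)
        ])
        self.act = nn.GELU()
        # Pool the coarsest feature map into a per-channel global summary.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ctx = x
        for stage in self.stages:
            ctx = self.act(stage(ctx))   # shrink spatially, widen context
        ctx = self.proj(self.pool(ctx))  # (B, C, 1, 1) global descriptor
        return x * torch.sigmoid(ctx)    # inject global context via gating

if __name__ == "__main__":
    feats = torch.randn(1, 64, 128, 128)
    out = GlobalContextSketch(64)(feats)
    print(out.shape)  # torch.Size([1, 64, 128, 128])
```

The design point this sketch captures is that depthwise small-kernel convolutions cost far less than quadratic self-attention, yet a few cascaded strided stages still summarize context from the whole image before it is fed back into the full-resolution features.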

Authors (6)
  1. Amirhosein Ghasemabadi (3 papers)
  2. Mohammad Salameh (20 papers)
  3. Muhammad Kamran Janjua (12 papers)
  4. Chunhua Zhou (4 papers)
  5. Fengyu Sun (15 papers)
  6. Di Niu (67 papers)