DRCT: Saving Image Super-resolution away from Information Bottleneck (2404.00722v5)

Published 31 Mar 2024 in cs.CV and cs.AI

Abstract: In recent years, Vision Transformer-based approaches for low-level vision tasks have achieved widespread success. Unlike CNN-based models, Transformers are more adept at capturing long-range dependencies, enabling images to be reconstructed using non-local information. In super-resolution, Swin-Transformer-based models have become mainstream thanks to their global spatial modeling and their shifted-window attention mechanism, which exchanges information between windows. Many researchers have improved performance by enlarging receptive fields or designing carefully engineered networks, with commendable results. However, we observe a general phenomenon: feature-map intensity is abruptly suppressed to small values toward the end of the network. This points to an information bottleneck and a loss of spatial information that implicitly limits the model's potential. To address this, we propose the Dense-residual-connected Transformer (DRCT), which mitigates the loss of spatial information and stabilizes the information flow through dense-residual connections between layers, thereby unleashing the model's potential and keeping it away from the information bottleneck. Experimental results show that our approach surpasses state-of-the-art methods on benchmark datasets and performs commendably in the NTIRE-2024 Image Super-Resolution (x4) Challenge. Our source code is available at https://github.com/ming053l/DRCT
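
As a rough illustration of the dense-residual connections the abstract describes, the sketch below wires each layer in a group to the concatenated outputs of all earlier layers and adds a residual back to the block input, so shallow spatial information can still reach the end of the network. This is a minimal sketch under stated assumptions: the class name DenseResidualGroup, the growth parameter, and the plain convolution standing in for the paper's Swin-Transformer layers are illustrative choices, not the authors' implementation (see the linked repository for that).

    # Dense-residual sketch (PyTorch): every layer sees all earlier feature
    # maps; a 1x1 fusion plus a residual add returns to the input width.
    import torch
    import torch.nn as nn

    class DenseResidualGroup(nn.Module):
        def __init__(self, channels: int, growth: int, num_layers: int = 5):
            super().__init__()
            self.layers = nn.ModuleList(
                nn.Sequential(
                    # Stand-in for a Swin-Transformer layer; a plain conv keeps
                    # the sketch self-contained and runnable.
                    nn.Conv2d(channels + i * growth, growth, 3, padding=1),
                    nn.GELU(),
                )
                for i in range(num_layers)
            )
            # Fuse the densely concatenated features back to `channels`.
            self.fuse = nn.Conv2d(channels + num_layers * growth, channels, 1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            features = [x]
            for layer in self.layers:
                # Dense connection: concatenate everything seen so far.
                features.append(layer(torch.cat(features, dim=1)))
            return x + self.fuse(torch.cat(features, dim=1))

    if __name__ == "__main__":
        x = torch.randn(1, 64, 48, 48)
        print(DenseResidualGroup(64, growth=32)(x).shape)  # -> (1, 64, 48, 48)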
