CFAT: Unleashing Triangular Windows for Image Super-resolution (2403.16143v1)
Abstract: Transformer-based models have revolutionized the field of image super-resolution (SR) by harnessing their inherent ability to capture complex contextual features. The overlapping rectangular shifted-window technique is now common practice in transformer-based super-resolution models, where it improves the quality and robustness of image upscaling. However, it suffers from distortion at window boundaries and offers a limited number of unique shifting modes. To overcome these weaknesses, we propose a non-overlapping triangular window technique that works synchronously with the rectangular one to mitigate boundary-level distortion and give the model access to more unique shifting modes. In this paper, we propose a Composite Fusion Attention Transformer (CFAT) that combines triangular-rectangular window-based local attention with channel-based global attention for image super-resolution. As a result, CFAT enables attention mechanisms to be activated on more image pixels and captures long-range, multi-scale features to improve SR performance. Extensive experimental results and an ablation study demonstrate the effectiveness of CFAT in the SR domain. Our proposed model shows a significant 0.7 dB performance improvement over other state-of-the-art SR architectures.
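The abstract does not spell out the geometry of the triangular windows. As a minimal sketch, assuming a square window is split along its two diagonals into four triangles, the partition can be checked for the non-overlapping property as below. The function name `triangular_masks`, the window size of 8, and the tie-breaking rule on the diagonals are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def triangular_masks(size):
    """Hypothetical helper: split a size x size window into four
    non-overlapping triangular regions (top, right, bottom, left).

    Assumption: the window is cut along its two diagonals; pixels lying
    exactly on a diagonal are assigned to one side by the >= / <= tests.
    """
    rows, cols = np.indices((size, size))
    above_main = cols >= rows              # on or above the main diagonal
    above_anti = rows + cols <= size - 1   # on or above the anti-diagonal
    top = above_main & above_anti
    right = above_main & ~above_anti
    bottom = ~above_main & ~above_anti
    left = ~above_main & above_anti
    return np.stack([top, right, bottom, left])

masks = triangular_masks(8)

# Every pixel should belong to exactly one triangle (non-overlapping, full cover).
coverage = masks.sum(axis=0)

# Label map (0..3) and a boolean mask that would restrict local attention to
# token pairs lying in the same triangle.
labels = masks.argmax(axis=0).reshape(-1)
same_triangle = labels[:, None] == labels[None, :]
```

In this sketch, `same_triangle` has shape `(64, 64)` and could serve as an attention mask over the 64 tokens of one window; how CFAT actually forms and shifts its triangular windows is described in the paper itself, not here.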