The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report (2404.10343v2)
Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge on efficient single-image super-resolution (ESR), covering the proposed solutions and their results. The task is to super-resolve an input image by a magnification factor of ×4, given pairs of low-resolution and corresponding high-resolution images. The primary objective is to develop networks that optimize runtime, parameter count, and FLOPs while maintaining a peak signal-to-noise ratio (PSNR) of approximately 26.90 dB on the DIV2K_LSDIR_valid dataset and 26.99 dB on the DIV2K_LSDIR_test dataset. The challenge has four tracks: the main track (overall performance), sub-track 1 (runtime), sub-track 2 (FLOPs), and sub-track 3 (parameters). The main track considers all three metrics (i.e., runtime, FLOPs, and parameter count), and its ranking is based on a weighted sum of the scores of the three sub-tracks. Sub-track 1 evaluates the practical runtime of the submissions, sub-track 2 the number of FLOPs, and sub-track 3 the number of parameters; in each sub-track, a score computed from the corresponding metric determines the ranking. RLFN is set as the baseline for efficiency measurement. The challenge had 262 registered participants, and 34 teams made valid submissions, which together gauge the state of the art in efficient single-image super-resolution. To facilitate reproducibility and enable other researchers to build upon these findings, the code and pre-trained models of the validated solutions are publicly available at https://github.com/Amazingren/NTIRE2024_ESR/.
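The two quantities the abstract relies on, a PSNR target and a weighted sum of sub-track scores, can be illustrated with a minimal sketch. It uses the standard PSNR definition, 10·log10(MAX² / MSE). The pixel values, the function names, and the weights are hypothetical: the abstract does not state the challenge's scoring formula, its weights, or the exact PSNR protocol (color space, border handling), so this is not the organizers' evaluation code.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def weighted_score(sub_scores, weights):
    """Weighted sum of per-sub-track scores.

    The weights are an assumption for illustration; the abstract only says
    that the main-track ranking uses a weighted sum of sub-track scores.
    """
    return sum(weights[name] * sub_scores[name] for name in weights)


# Validation-set PSNR target quoted in the abstract.
PSNR_TARGET_VALID_DB = 26.90

# Toy 8-pixel "images" (hypothetical values, not challenge data).
hr = [52, 55, 61, 59, 79, 61, 76, 61]
sr = [50, 57, 60, 62, 75, 63, 77, 58]

quality = psnr(hr, sr)
print(f"PSNR: {quality:.2f} dB, meets target: {quality >= PSNR_TARGET_VALID_DB}")

# Hypothetical sub-track scores and weights.
scores = {"runtime": 1.0, "flops": 2.0, "params": 3.0}
weights = {"runtime": 0.5, "flops": 0.25, "params": 0.25}
print("Main-track score:", weighted_score(scores, weights))
```

In a setup like this, a submission would first have to clear the PSNR target and would only then be ranked on its efficiency scores.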
Authors:
- Bin Ren
- Yawei Li
- Nancy Mehta
- Radu Timofte
- Hongyuan Yu
- Cheng Wan
- Yuxin Hong
- Bingnan Han
- Zhuoyuan Wu
- Yajun Zou
- Yuqing Liu
- Jizhe Li
- Keji He
- Chao Fan
- Heng Zhang
- Xiaolin Zhang
- Xuanwu Yin
- Kunlong Zuo
- Bohao Liao
- Peizhe Xia