Fortifying Fully Convolutional Generative Adversarial Networks for Image Super-Resolution Using Divergence Measures (2404.06294v1)
Abstract: Super-Resolution (SR) is a long-standing image processing problem that aims to improve the quality of a Low-Resolution (LR) sample up to the standard of its High-Resolution (HR) counterpart. We address this by introducing Super-Resolution Generator (SuRGe), a fully-convolutional Generative Adversarial Network (GAN)-based architecture for SR. We show that distinct convolutional features obtained at increasing depths of a GAN generator can be optimally combined by a set of learnable convex weights to improve the quality of generated SR samples. In the process, we employ the Jensen-Shannon and the Gromov-Wasserstein losses between the SR-HR and LR-SR pairs of distributions, respectively, to further aid the generator of SuRGe in better exploiting the available information in an attempt to improve SR. Moreover, we train the discriminator of SuRGe with the Wasserstein loss with gradient penalty, primarily to prevent mode collapse. The proposed SuRGe, as an end-to-end GAN workflow tailor-made for super-resolution, offers improved performance while maintaining low inference time. The efficacy of SuRGe is substantiated by its superior performance compared to 18 state-of-the-art contenders on 10 benchmark datasets.
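Two of the ingredients mentioned in the abstract can be sketched concisely. The snippet below is a minimal NumPy illustration, not the paper's implementation: it shows (a) combining same-shaped feature maps from different depths via learnable convex weights (non-negative, summing to 1, here obtained by a softmax over logits that would be trained in practice), and (b) the Jensen-Shannon divergence between two discrete distributions. All function and variable names are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax; output is non-negative and sums to 1,
    # i.e., a valid set of convex combination weights.
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def convex_combine(features, logits):
    # Fuse feature maps of identical shape from different generator depths
    # with convex weights. In training, `logits` would be learnable parameters.
    w = softmax(logits)
    return sum(wi * f for wi, f in zip(w, features))

def js_divergence(p, q, eps=1e-12):
    # Jensen-Shannon divergence between two discrete distributions:
    # JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2.
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Three mock 2x2 feature maps; zero logits give equal weights (1/3 each),
# so the fused map is the elementwise mean of the three inputs.
feats = [np.full((2, 2), v) for v in (1.0, 2.0, 3.0)]
fused = convex_combine(feats, np.zeros(3))  # every entry equals 2.0
```

Note that JS divergence is symmetric and vanishes when the two distributions coincide, which is why it is a natural fit for matching the generated SR distribution to the HR distribution.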