Toward Real World Stereo Image Super-Resolution via Hybrid Degradation Model and Discriminator for Implied Stereo Image Information (2312.07934v1)
Abstract: Real-world stereo image super-resolution has a significant influence on enhancing the performance of computer vision systems. Although existing methods for single-image super-resolution can be applied to stereo images, they often introduce notable changes to the inherent disparity, degrading the disparity consistency between the original and enhanced stereo pairs. To overcome this limitation, this paper proposes a novel approach that integrates an implicit stereo information discriminator and a hybrid degradation model. This combination ensures effective enhancement while preserving disparity consistency. The proposed method bridges the gap between the complex degradations of the real-world stereo domain and the simpler degradations of the real-world single-image super-resolution domain. Our results demonstrate impressive performance on synthetic and real datasets, enhancing visual perception while maintaining disparity consistency. The complete code is available at the following \href{https://github.com/fzuzyb/SCGLANet}{link}.
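To illustrate the idea of a degradation model that preserves disparity consistency, the sketch below applies a single randomly sampled degradation chain (blur, downsampling, noise) to both views of a stereo pair. This is a minimal, hypothetical Python/NumPy illustration in the spirit of Real-ESRGAN-style synthetic degradation, not the paper's actual hybrid model; all function names and parameter ranges here are assumptions. The key point is that the degradation parameters are sampled once and shared across views, so the left/right disparity relationship is not disturbed by view-dependent corruption.

```python
import numpy as np

def gaussian_kernel(sigma, radius=3):
    """1-D Gaussian kernel, normalized to sum to 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur on a 2-D grayscale image."""
    k = gaussian_kernel(sigma)
    img = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    img = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, img)
    return img

def degrade_stereo_pair(left, right, rng=None):
    """Degrade both views with ONE shared degradation chain.

    Hypothetical sketch: sample blur strength, scale factor, and noise
    level once, then apply identically to the left and right views so
    their disparity consistency is preserved.
    """
    rng = rng if rng is not None else np.random.default_rng()
    sigma_blur = rng.uniform(0.2, 3.0)   # shared blur strength
    scale = rng.choice([2, 4])           # shared downsampling factor
    sigma_noise = rng.uniform(0.0, 0.1)  # shared noise level

    def apply(img):
        img = gaussian_blur(img, sigma_blur)
        img = img[::scale, ::scale]      # naive nearest downsampling
        img = img + rng.normal(0.0, sigma_noise, img.shape)
        return np.clip(img, 0.0, 1.0)

    return apply(left), apply(right)
```

A view-independent sampling of these parameters (e.g. blurring the left view more than the right) would distort the implied disparity, which is precisely what the paper's discriminator is meant to guard against.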
- X. Ji, Y. Cao, Y. Tai, C. Wang, J. Li, and F. Huang, “Real-world super-resolution via kernel estimation and noise injection,” in proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 466–467, 2020.
- K. Zhang, J. Liang, L. Van Gool, and R. Timofte, “Designing a practical degradation model for deep blind image super-resolution,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4791–4800, 2021.
- J. Liang, J. Cao, G. Sun, K. Zhang, L. Van Gool, and R. Timofte, “Swinir: Image restoration using swin transformer,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1833–1844, 2021.
- X. Wang, L. Xie, C. Dong, and Y. Shan, “Real-esrgan: Training real-world blind super-resolution with pure synthetic data,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1905–1914, 2021.
- Z. Luo, Y. Huang, S. Li, L. Wang, and T. Tan, “Learning the degradation distribution for blind image super-resolution,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6063–6072, 2022.
- J. Liang, H. Zeng, and L. Zhang, “Efficient and degradation-adaptive network for real-world image super-resolution,” in Proceedings of the European Conference on Computer Vision, pp. 574–591, 2022.
- C. Mou, Y. Wu, X. Wang, C. Dong, J. Zhang, and Y. Shan, “Metric learning based interactive modulation for real-world super-resolution,” in Proceedings of the European Conference on Computer Vision, pp. 723–740, 2022.
- R. K. Cosner, I. D. J. Rodriguez, T. G. Molnar, W. Ubellacker, Y. Yue, A. D. Ames, and K. L. Bouman, “Self-supervised online learning for safety-critical control using stereo vision,” in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 11487–11493, 2022.
- W. Chuah, R. Tennakoon, R. Hoseinnezhad, D. Suter, and A. Bab-Hadiashar, “Semantic guided long range stereo depth estimation for safer autonomous vehicle applications,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 10, pp. 18916–18926, 2022.
- B. Krajancich, P. Kellnhofer, and G. Wetzstein, “Optimizing depth perception in virtual and augmented reality through gaze-contingent stereo rendering,” ACM Transactions on Graphics, vol. 39, no. 6, pp. 1–10, 2020.
- L. Wang, Y. Wang, Z. Liang, Z. Lin, J. Yang, W. An, and Y. Guo, “Learning parallax attention for stereo image super-resolution,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12250–12259, 2019.
- X. Ying, Y. Wang, L. Wang, W. Sheng, W. An, and Y. Guo, “A stereo attention module for stereo image super-resolution,” IEEE Signal Processing Letters, vol. 27, pp. 496–500, 2020.
- W. Song, S. Choi, S. Jeong, and K. Sohn, “Stereoscopic image super-resolution with stereo consistent feature,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 12031–12038, 2020.
- Y. Wang, X. Ying, L. Wang, J. Yang, W. An, and Y. Guo, “Symmetric parallax attention for stereo image super-resolution,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 766–775, 2021.
- X. Zhu, K. Guo, H. Fang, L. Chen, S. Ren, and B. Hu, “Cross view capture for stereo image super-resolution,” IEEE Transactions on Multimedia, vol. 24, pp. 3074–3086, 2021.
- C. Chen, C. Qing, X. Xu, and P. Dickinson, “Cross parallax attention network for stereo image super-resolution,” IEEE Transactions on Multimedia, vol. 24, pp. 202–216, 2022.
- X. Chu, L. Chen, and W. Yu, “Nafssr: Stereo image super-resolution using nafnet,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 1239–1248, 2022.
- Z. He, Z. Jin, and Y. Zhao, “Srdrl: A blind super-resolution framework with degradation reconstruction loss,” IEEE Transactions on Multimedia, vol. 24, pp. 2877–2889, 2022.
- Y. Chen, C. Shen, X.-S. Wei, L. Liu, and J. Yang, “Adversarial posenet: A structure-aware convolutional network for human pose estimation,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1212–1221, 2017.
- W. Yang, X. Zhang, Y. Tian, W. Wang, J.-H. Xue, and Q. Liao, “Deep learning for single image super-resolution: A brief review,” IEEE Transactions on Multimedia, vol. 21, no. 12, pp. 3106–3121, 2019.
- J. Kim, J. K. Lee, and K. M. Lee, “Accurate image super-resolution using very deep convolutional networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1646–1654, 2016.
- C. Dong, C. C. Loy, and X. Tang, “Accelerating the super-resolution convolutional neural network,” in Proceedings of the European Conference on Computer Vision, pp. 391–407, 2016.
- T. Tong, G. Li, X. Liu, and Q. Gao, “Image super-resolution using dense skip connections,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4799–4807, 2017.
- B. Lim, S. Son, H. Kim, S. Nah, and K. Mu Lee, “Enhanced deep residual networks for single image super-resolution,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 136–144, 2017.
- Y. Zhang, K. Li, K. Li, L. Wang, B. Zhong, and Y. Fu, “Image super-resolution using very deep residual channel attention networks,” in Proceedings of the European Conference on Computer Vision, pp. 286–301, 2018.
- Y. Liu, S. Wang, J. Zhang, S. Wang, S. Ma, and W. Gao, “Iterative network for image super-resolution,” IEEE Transactions on Multimedia, vol. 24, pp. 2259–2272, 2022.
- M. Zhang, Q. Wu, J. Guo, Y. Li, and X. Gao, “Heat transfer-inspired network for image super-resolution reconstruction,” IEEE Transactions on Neural Networks and Learning Systems, 2022.
- I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” Communications of the ACM, vol. 63, no. 11, pp. 139–144, 2020.
- C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al., “Photo-realistic single image super-resolution using a generative adversarial network,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4681–4690, 2017.
- X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, Y. Qiao, and C. Change Loy, “Esrgan: Enhanced super-resolution generative adversarial networks,” in Proceedings of the European Conference on Computer Vision Workshops, pp. 1–8, 2018.
- W. Zhang, Y. Liu, C. Dong, and Y. Qiao, “Ranksrgan: Generative adversarial networks with ranker for image super-resolution,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3096–3105, 2019.
- Y. Yan, C. Liu, C. Chen, X. Sun, L. Jin, X. Peng, and X. Zhou, “Fine-grained attention and feature-sharing generative adversarial networks for single image super-resolution,” IEEE Transactions on Multimedia, vol. 24, pp. 1473–1487, 2021.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, pp. 1–11, 2017.
- A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
- F. Yang, H. Yang, J. Fu, H. Lu, and B. Guo, “Learning texture transformer network for image super-resolution,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5791–5800, 2020.
- Z. Lu, J. Li, H. Liu, C. Huang, L. Zhang, and T. Zeng, “Transformer for single image super-resolution,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 457–466, 2022.
- Y. Yuan, S. Liu, J. Zhang, Y. Zhang, C. Dong, and L. Lin, “Unsupervised image super-resolution using cycle-in-cycle generative adversarial networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 701–710, 2018.
- J. Gu, H. Lu, W. Zuo, and C. Dong, “Blind super-resolution with iterative kernel correction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1604–1613, 2019.
- A. Lugmayr, M. Danelljan, and R. Timofte, “Unsupervised learning for real-world super-resolution,” in Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop, pp. 3408–3416, 2019.
- A. Ignatov, N. Kobyshev, R. Timofte, K. Vanhoey, and L. Van Gool, “Dslr-quality photos on mobile devices with deep convolutional networks,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3277–3285, 2017.
- D. S. Jeon, S.-H. Baek, I. Choi, and M. H. Kim, “Enhancing the spatial resolution of stereo images using a parallax prior,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1721–1730, 2018.
- B. Yan, C. Ma, B. Bare, W. Tan, and S. C. Hoi, “Disparity-aware domain adaptation in stereo image restoration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13179–13187, 2020.
- K. Jin, Z. Wei, A. Yang, S. Guo, M. Gao, X. Zhou, and G. Guo, “Swinipassr: Swin transformer based parallax attention network for stereo image super-resolution,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 920–929, 2022.
- Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022, 2021.
- Y. Wang, L. Wang, J. Yang, W. An, and Y. Guo, “Flickr1024: A large-scale dataset for stereo image super-resolution,” in Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, pp. 1–6, 2019.
- Y. Zhou, Y. Xue, W. Deng, R. Nie, J. Zhang, et al., “Stereo cross global learnable attention module for stereo image super-resolution,” in Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, pp. 1–10, 2023.
- W. Shi, J. Caballero, F. Huszár, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang, “Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1874–1883, 2016.
- J. L. Ba, J. R. Kiros, and G. E. Hinton, “Layer normalization,” arXiv preprint arXiv:1607.06450, 2016.
- A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, V. Vasudevan, et al., “Searching for mobilenetv3,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314–1324, 2019.
- J.-N. Su, M. Gan, G.-Y. Chen, J.-L. Yin, and C. P. Chen, “Global learnable attention for single image super-resolution,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
- T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, “Spectral normalization for generative adversarial networks,” arXiv preprint arXiv:1802.05957, 2018.
- C. Ma, B. Yan, W. Tan, and X. Jiang, “Perception-oriented stereo image super-resolution,” in Proceedings of the 29th ACM International Conference on Multimedia, pp. 2420–2428, 2021.
- K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
- R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The unreasonable effectiveness of deep features as a perceptual metric,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 586–595, 2018.
- D. Scharstein, H. Hirschmüller, Y. Kitajima, G. Krathwohl, N. Nešić, X. Wang, and P. Westling, “High-resolution stereo datasets with subpixel-accurate ground truth,” in Proceedings of the 36th German Conference on Pattern Recognition, Münster, Germany, pp. 31–42, 2014.
- A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the kitti vision benchmark suite,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3354–3361, 2012.
- A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “The kitti vision benchmark suite,” URL http://www.cvlibs.net/datasets/kitti, vol. 2, no. 5, pp. 1–13, 2015.
- Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu, “Residual dense network for image super-resolution,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2472–2481, 2018.
- J. Lei, Z. Zhang, X. Fan, B. Yang, X. Li, Y. Chen, and Q. Huang, “Deep stereoscopic image super-resolution via interaction module,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 8, pp. 3051–3061, 2020.
- Q. Dai, J. Li, Q. Yi, F. Fang, and G. Zhang, “Feedback network for mutually boosted stereo image super-resolution and disparity estimation,” in Proceedings of the 29th ACM International Conference on Multimedia, pp. 1985–1993, 2021.
- L. Wang, Y. Guo, Y. Wang, J. Li, S. Gu, and R. Timofte, “Ntire 2023 challenge on stereo image super-resolution: Methods and results,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 1–26, 2023.
- C. Ma, C.-Y. Yang, X. Yang, and M.-H. Yang, “Learning a no-reference quality metric for single-image super-resolution,” Computer Vision and Image Understanding, vol. 158, pp. 1–16, 2017.
- Y. Blau, R. Mechrez, R. Timofte, T. Michaeli, and L. Zelnik-Manor, “The 2018 pirm challenge on perceptual image super-resolution,” in Proceedings of the European Conference on Computer Vision Workshops, pp. 1–23, 2018.
- L. Lipson, Z. Teed, and J. Deng, “Raft-stereo: Multilevel recurrent field transforms for stereo matching,” in Proceedings of the International Conference on 3D Vision, pp. 218–227, 2021.
- D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
- I. Loshchilov and F. Hutter, “Sgdr: Stochastic gradient descent with warm restarts,” in Proceedings of the 5th International Conference on Learning Representations, pp. 1–16, 2017.
- J. Gu and C. Dong, “Interpreting super-resolution networks with local attribution maps,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9199–9208, 2021.