
Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss (2404.01692v2)

Published 2 Apr 2024 in cs.CV

Abstract: In real-world scenarios, image recognition tasks, such as semantic segmentation and object detection, often pose greater challenges due to the lack of information available within low-resolution (LR) content. Image super-resolution (SR) is one of the promising solutions for addressing the challenges. However, due to the ill-posed property of SR, it is challenging for typical SR methods to restore task-relevant high-frequency contents, which may dilute the advantage of utilizing the SR method. Therefore, in this paper, we propose Super-Resolution for Image Recognition (SR4IR) that effectively guides the generation of SR images beneficial to achieving satisfactory image recognition performance when processing LR images. The critical component of our SR4IR is the task-driven perceptual (TDP) loss that enables the SR network to acquire task-specific knowledge from a network tailored for a specific task. Moreover, we propose a cross-quality patch mix and an alternate training framework that significantly enhances the efficacy of the TDP loss by addressing potential problems when employing the TDP loss. Through extensive experiments, we demonstrate that our SR4IR achieves outstanding task performance by generating SR images useful for a specific image recognition task, including semantic segmentation, object detection, and image classification. The implementation code is available at https://github.com/JaehaKim97/SR4IR.
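The two ideas named in the abstract, a perceptual loss computed from a task network and a patch-level mix of images of different quality, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The abstract does not specify the distance measure, the feature layer, or the mixing rule, so the L1 distance, the 50% mixing probability, and every name below (`feature_l1_loss`, `patch_mix`, `row_means`) are assumptions made for illustration only.

```python
import random
from typing import Callable, List, Sequence

Image = List[List[float]]  # single-channel image stored as rows of pixel values


def feature_l1_loss(
    extract: Callable[[Image], Sequence[float]],
    sr: Image,
    hr: Image,
) -> float:
    """Mean absolute difference between extracted features of SR and HR images.

    `extract` stands in for the feature extractor of a task network;
    using L1 here is an assumption, not something the abstract states.
    """
    f_sr, f_hr = extract(sr), extract(hr)
    if len(f_sr) != len(f_hr):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(f_sr, f_hr)) / len(f_sr)


def patch_mix(a: Image, b: Image, patch: int, rng: random.Random) -> Image:
    """Return a copy of `a` in which each patch-sized tile is replaced
    by the matching tile of `b` with probability 0.5 (assumed rule)."""
    height, width = len(a), len(a[0])
    out = [row[:] for row in a]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            if rng.random() < 0.5:
                for y in range(top, min(top + patch, height)):
                    out[y][left:left + patch] = b[y][left:left + patch]
    return out


def row_means(img: Image) -> List[float]:
    """Toy stand-in for a task network's feature extractor."""
    return [sum(row) / len(row) for row in img]


# Toy data: the "SR" image is the "HR" image with a constant offset.
hr = [[float(x + y) for x in range(4)] for y in range(4)]
sr = [[v + 0.5 for v in row] for row in hr]

loss = feature_l1_loss(row_means, sr, hr)
mixed = patch_mix(hr, sr, patch=2, rng=random.Random(0))
```

In a real training setup the extractor would be a learned network and the loss would be backpropagated into the SR network; the toy extractor here only shows where the feature comparison happens.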

Authors (3)
  1. Jaeha Kim (4 papers)
  2. Junghun Oh (6 papers)
  3. Kyoung Mu Lee (107 papers)
Citations (1)
