
Dual Inverse Degradation Network for Real-World SDRTV-to-HDRTV Conversion (2307.03394v3)

Published 7 Jul 2023 in eess.IV and cs.MM

Abstract: In this study, we address the emerging necessity of converting Standard Dynamic Range Television (SDRTV) content into High Dynamic Range Television (HDRTV), given the limited availability of native HDRTV content. A principal technical challenge in this conversion is the exacerbation of coding artifacts inherent in SDRTV, which detrimentally impacts the quality of the resulting HDRTV. To address this issue, our method introduces a novel approach that conceptualizes the SDRTV-to-HDRTV conversion as a composite task involving dual degradation restoration. This encompasses inverse tone mapping in conjunction with video restoration. We propose the Dual Inversion Downgraded SDRTV to HDRTV Network (DIDNet), which can accurately perform inverse tone mapping while preventing encoding artifacts from being amplified, thereby significantly improving visual quality. DIDNet integrates an intermediate auxiliary loss function to effectively separate the dual degradation restoration tasks, enabling efficient learning of both artifact reduction and inverse tone mapping during end-to-end training. Additionally, DIDNet introduces a spatio-temporal feature alignment module for video frame fusion, which augments texture quality and reduces artifacts. The architecture further includes a dual-modulation convolution mechanism for optimized inverse tone mapping. Recognizing the richer texture and high-frequency information in HDRTV compared to SDRTV, we further introduce a wavelet attention module to enhance frequency features. Our approach demonstrates marked superiority over existing state-of-the-art techniques in terms of quantitative performance and visual quality.
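The wavelet attention idea in the abstract rests on separating an image into frequency subbands so that high-frequency detail can be re-weighted. As an illustrative sketch only (not the paper's actual module), a single-level 2D Haar decomposition splits a frame into one approximation and three detail subbands; the scalar `gain` below is a hypothetical stand-in for the learned attention weights DIDNet would apply:

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar wavelet transform.
    Splits an H x W array (H, W even) into four H/2 x W/2 subbands:
    LL (low-frequency approximation) and LH/HL/HH (high-frequency details)."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def wavelet_attention(x, gain=2.0):
    """Toy 'wavelet attention': amplify the high-frequency subbands by a
    fixed gain, then invert the transform. With gain=1.0 this is a perfect
    reconstruction of the input."""
    ll, lh, hl, hh = haar_dwt2(x)
    lh, hl, hh = gain * lh, gain * hl, gain * hh
    # Inverse single-level Haar transform.
    y = np.zeros_like(x)
    y[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    y[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    y[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    y[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return y
```

In the paper the re-weighting would be learned and feature-wise rather than a single global gain, but the sketch shows why a wavelet split is a natural place to boost the texture and high-frequency content that HDRTV carries and SDRTV attenuates.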

