
LYT-NET: Lightweight YUV Transformer-based Network for Low-light Image Enhancement

Published 26 Jan 2024 in cs.CV and eess.IV (arXiv:2401.15204v6)

Abstract: This letter introduces LYT-Net, a novel lightweight transformer-based model for low-light image enhancement (LLIE). LYT-Net consists of several layers and detachable blocks, including our novel blocks--Channel-Wise Denoiser (CWD) and Multi-Stage Squeeze & Excite Fusion (MSEF)--along with the traditional Transformer block, Multi-Headed Self-Attention (MHSA). In our method we adopt a dual-path approach, treating chrominance channels U and V and luminance channel Y as separate entities to help the model better handle illumination adjustment and corruption restoration. Our comprehensive evaluation on established LLIE datasets demonstrates that, despite its low complexity, our model outperforms recent LLIE methods. The source code and pre-trained models are available at https://github.com/albrateanu/LYT-Net


Summary

  • The paper presents a transformer-based network that leverages the YUV color space for targeted low-light image enhancement.
  • It introduces detachable modules like the Channel-wise Denoiser and Multi-stage Squeeze and Excite Fusion to balance noise reduction and feature extraction.
  • Experimental results show competitive PSNR and SSIM with low computational cost (3.49G FLOPs) and a minimal parameter count (0.045M), enabling efficient deployment.

Overview of LYT-Net: Lightweight YUV Transformer-Based Network for Low-Light Image Enhancement

This paper presents LYT-Net, an approach for enhancing low-light images, a challenging task in computer vision. LYT-Net leverages a transformer-based architecture to process images in the YUV color space, distinguishing itself from traditional Retinex-based models and direct CNN mappings. This strategy exploits the separation of luminance (Y) from chrominance (U and V) to balance light and color enhancement in a fine-grained way. The design prioritizes computational efficiency without compromising enhancement quality, a significant consideration in real-world applications where resource constraints are prevalent.
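
The luminance/chrominance split at the heart of this design can be illustrated with a standard RGB-to-YUV conversion. The sketch below uses the BT.601 coefficients as an assumption; the paper's exact conversion and its dual-path wiring may differ.

```python
import numpy as np

# BT.601 conversion matrix (a common YUV definition; assumed here for
# illustration -- this is not the authors' code).
RGB_TO_YUV = np.array([
    [ 0.299,     0.587,     0.114   ],   # Y: luminance
    [-0.14713,  -0.28886,   0.436   ],   # U: blue-difference chroma
    [ 0.615,    -0.51499,  -0.10001 ],   # V: red-difference chroma
])

def rgb_to_yuv(rgb):
    """Split an HxWx3 RGB image (floats in [0, 1]) into Y and UV planes."""
    yuv = rgb @ RGB_TO_YUV.T
    # Dual-path inputs: luminance channel vs. chrominance channels.
    y, uv = yuv[..., :1], yuv[..., 1:]
    return y, uv

img = np.random.rand(4, 4, 3)
y, uv = rgb_to_yuv(img)
print(y.shape, uv.shape)  # (4, 4, 1) (4, 4, 2)
```

In this scheme, illumination adjustment operates on the Y plane while restoration of color corruption operates on the UV planes, which is the dual-path idea the paper describes.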

Methodological Advances

The architecture of LYT-Net is characterized by several key components:

  • YUV Color Space Utilization: By processing images in the YUV color space, LYT-Net handles luminance and chrominance separately. This separation allows targeted enhancement: the model brightens the luminance channel for improved visibility without distorting color information, producing results that align well with human perception.
  • Transformer-Based Architecture: A Multi-Headed Self-Attention (MHSA) block captures long-range dependencies, which is essential for modeling global image context in low-light conditions. This mechanism lets the model attend to the spatial variability critical for LLIE.
  • Detachable Blocks: LYT-Net employs innovative blocks such as the Channel-wise Denoiser (CWD) and Multi-stage Squeeze and Excite Fusion (MSEF), which integrate convolutional and attention-based operations, balancing feature extraction and noise reduction.
  • Hybrid Loss Function: The training process employs a multifaceted loss function incorporating Smooth L1 loss, Perceptual loss, Histogram loss, PSNR loss, Color loss, and MS-SSIM loss, addressing diverse enhancement criteria while efficiently guiding model convergence.
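
A few of these loss terms are simple enough to sketch directly. The snippet below combines Smooth L1, a histogram term, and a color term with hypothetical weights; the perceptual, PSNR, and MS-SSIM terms are omitted (they need a pretrained network or extra libraries), and the paper's actual formulation and weighting are not reproduced here.

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    # Quadratic near zero, linear for large errors (Huber-style).
    d = np.abs(pred - target)
    return np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta).mean()

def histogram_loss(pred, target, bins=32):
    # Match the global intensity distributions of the two images.
    hp, _ = np.histogram(pred, bins=bins, range=(0.0, 1.0), density=True)
    ht, _ = np.histogram(target, bins=bins, range=(0.0, 1.0), density=True)
    return np.abs(hp - ht).mean()

def color_loss(pred, target):
    # Penalize shifts in the per-channel mean color.
    return np.abs(pred.mean(axis=(0, 1)) - target.mean(axis=(0, 1))).sum()

def hybrid_loss(pred, target, weights=(1.0, 0.05, 0.1)):
    # Hypothetical weights -- the paper tunes its own coefficients.
    w_l1, w_hist, w_col = weights
    return (w_l1 * smooth_l1(pred, target)
            + w_hist * histogram_loss(pred, target)
            + w_col * color_loss(pred, target))
```

Each term pulls the optimization toward a different enhancement criterion: pixel fidelity, global brightness distribution, and color balance.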

Experimental Validation

The empirical evaluation of LYT-Net uses the LOL-v1, LOL-v2-real, and LOL-v2-synthetic datasets, benchmarking its performance against state-of-the-art models. Quantitative assessments show that LYT-Net matches or outperforms other techniques on PSNR and SSIM. Notably, it ranks third-lowest in complexity among the compared methods, with only 3.49G FLOPs and 0.045M parameters, underscoring its suitability for deployment where computational efficiency is essential.
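
For reference, the PSNR metric used in these comparisons has a simple closed form. The sketch below is a generic implementation for images scaled to [0, 1], not the authors' evaluation code.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

clean = np.full((8, 8), 0.5)
noisy = clean + 0.1            # uniform error of 0.1 -> MSE = 0.01
print(psnr(noisy, clean))      # 10 * log10(1 / 0.01) = 20.0 dB
```

Higher is better; every halving of the RMS error adds about 6 dB, which is why small PSNR gaps between methods can correspond to visible quality differences.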

Qualitative assessments further corroborate these findings. Visual comparisons reveal that LYT-Net excels in balancing luminance enhancement and color fidelity, delivering superior enhancements without over- or under-exposure, a common issue with contemporary methods.

Implications and Future Directions

LYT-Net’s design leverages the natural separation of luminance and chrominance in the YUV space, aiming to set a new standard for LLIE by demonstrating the potential of lightweight solutions. Its deployment could be transformative in areas like mobile imaging, video surveillance, and autonomous systems, where operating under varying lighting conditions is a practical challenge.

Theoretically, the paper invites further exploration of transformer-based architectures in image enhancement domains. It posits the potential extensibility of such models beyond LLIE, such as in high dynamic range (HDR) imaging and other image restoration tasks.

The authors suggest avenues for future work, including expanding the training and testing datasets to further improve robustness. Additionally, given LYT-Net’s efficiency, integration with sensor hardware could open up real-time applications, broadening its impact across diverse domains.

In conclusion, while LYT-Net proposes a compact and power-efficient framework for resolving LLIE challenges, it also exemplifies the broader applicability of transformer-based models in advancing image processing capabilities in resource-constrained environments.
