Transformer-based Learned Image Compression for Joint Decoding and Denoising (2402.12888v1)
Abstract: This work introduces a Transformer-based image compression system that can switch between standard image reconstruction and denoised reconstruction from a single compressed bitstream. Instead of training separate decoders for these tasks, we incorporate two add-on modules that adapt a pre-trained image decoder from standard image reconstruction to joint decoding and denoising. Our scheme adopts a two-pronged approach. It features a latent refinement module that refines the latent representation of a noisy input image to reconstruct a noise-free image. Additionally, it incorporates an instance-specific prompt generator that adapts the decoding process to further improve upon the latent refinement. Experimental results show that our method achieves denoising quality comparable to training a separate decoder for joint decoding and denoising, at the expense of only a modest increase in the decoder's model size and computational complexity.
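To make the decoder-adaptation idea concrete, the following is a minimal PyTorch sketch of how a frozen, pre-trained decoder could be wrapped with a latent refinement module and an instance-specific prompt generator. The class names (`LatentRefinement`, `PromptGenerator`, `AdaptedDecoder`), the layer choices, the 192-channel latent, and the way the prompt conditions decoding (added to the refined latent) are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch (assumed design, not the paper's actual architecture):
# a frozen, pre-trained decoder is augmented with (1) a latent refinement
# module that cleans up the latent of a noisy input and (2) an
# instance-specific prompt generator that conditions the decoding process.
import torch
import torch.nn as nn


class LatentRefinement(nn.Module):
    """Hypothetical residual refiner applied to the decoded latent y_hat."""
    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, y_hat: torch.Tensor) -> torch.Tensor:
        # Predict a correction so the refined latent targets a noise-free image.
        return y_hat + self.net(y_hat)


class PromptGenerator(nn.Module):
    """Hypothetical instance-specific prompt derived from the noisy latent."""
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv2d(channels, channels, 1),
            nn.GELU(),
            nn.Conv2d(channels, channels, 1),
        )

    def forward(self, y_hat: torch.Tensor) -> torch.Tensor:
        return self.proj(y_hat)


class AdaptedDecoder(nn.Module):
    """Wraps a frozen pre-trained decoder with the two trainable add-ons."""
    def __init__(self, pretrained_decoder: nn.Module, channels: int):
        super().__init__()
        self.decoder = pretrained_decoder
        for p in self.decoder.parameters():   # only the add-ons are trained
            p.requires_grad = False
        self.refine = LatentRefinement(channels)
        self.prompt = PromptGenerator(channels)

    def forward(self, y_hat: torch.Tensor, denoise: bool = True) -> torch.Tensor:
        if not denoise:
            return self.decoder(y_hat)         # standard reconstruction
        y_ref = self.refine(y_hat)             # latent refinement
        # How the prompts enter the decoder is model-specific; adding them to
        # the refined latent here is only a stand-in for that conditioning.
        return self.decoder(y_ref + self.prompt(y_hat))


if __name__ == "__main__":
    # Stand-in for a pre-trained Transformer-based synthesis network.
    base_decoder = nn.Sequential(nn.Conv2d(192, 3, 3, padding=1))
    model = AdaptedDecoder(base_decoder, channels=192)
    out = model(torch.randn(1, 192, 16, 16), denoise=True)  # -> (1, 3, 16, 16)
```

Under this reading, freezing the pre-trained decoder and training only the two add-ons is what keeps the overhead to a modest increase in decoder size and complexity, as the abstract claims; a single bitstream can then be decoded either way by toggling the denoising path.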