Enhancing Perception Quality in Remote Sensing Image Compression via Invertible Neural Network (2405.10518v2)
Abstract: Decoding remote sensing images with high perceptual quality, particularly at low bitrates, remains a significant challenge. To address this problem, we propose an invertible neural network-based remote sensing image compression (INN-RSIC) method. Specifically, we capture the compression distortion introduced by an existing image compression algorithm and encode it as a set of Gaussian-distributed latent variables via an INN. This makes the compression distortion in the decoded image independent of the ground truth. Consequently, by leveraging the inverse mapping of the INN, we can feed the decoded image together with a set of freshly resampled Gaussian-distributed variables into the inverse network, generating enhanced images with better perceptual quality. To learn the compression distortion effectively, the INN is constructed from channel expansion, Haar transformation, and invertible blocks. Additionally, we introduce a quantization module (QM) to mitigate the impact of format conversion, thereby improving the framework's generalization and the perceptual quality of the enhanced images. Extensive experiments demonstrate that INN-RSIC significantly outperforms existing state-of-the-art traditional and deep learning-based image compression methods in terms of perceptual quality.
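The key property the abstract relies on is exact invertibility: a Haar transform plus coupling blocks can be inverted in closed form, so the decoded image and resampled latents can be mapped back through the network without loss. The following is a minimal NumPy sketch of that idea, not the paper's actual architecture: a 2D Haar transform (which also expands one channel into four sub-bands) followed by a single affine coupling block with hypothetical weights `w1`, `w2`, showing that the forward/inverse pair reconstructs the input exactly up to floating-point error.

```python
import numpy as np

def haar_forward(x):
    # 2D Haar transform: split an (H, W) image into four half-resolution
    # sub-bands (LL, LH, HL, HH), stacked along a new channel axis.
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return np.stack([(a + b + c + d) / 2.0,   # LL
                     (a - b + c - d) / 2.0,   # LH
                     (a + b - c - d) / 2.0,   # HL
                     (a - b - c + d) / 2.0])  # HH

def haar_inverse(y):
    # Exact inverse of haar_forward (the transform is orthonormal).
    ll, lh, hl, hh = y
    H, W = ll.shape
    x = np.empty((2 * H, 2 * W))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def coupling_forward(y, w1, w2):
    # Affine coupling: keep half the channels, transform the other half
    # conditioned on them. Invertible by construction, whatever w1/w2 are.
    y1, y2 = y[:2], y[2:]
    s = np.tanh(np.tensordot(w1, y1, axes=1))  # bounded log-scale
    t = np.tensordot(w2, y1, axes=1)           # translation
    return np.concatenate([y1, y2 * np.exp(s) + t])

def coupling_inverse(z, w1, w2):
    z1, z2 = z[:2], z[2:]
    s = np.tanh(np.tensordot(w1, z1, axes=1))
    t = np.tensordot(w2, z1, axes=1)
    return np.concatenate([z1, (z2 - t) * np.exp(-s)])

rng = np.random.default_rng(0)
img = rng.random((8, 8))
w1, w2 = rng.normal(size=(2, 2, 2)) * 0.1
z = coupling_forward(haar_forward(img), w1, w2)     # forward: image -> latents
rec = haar_inverse(coupling_inverse(z, w1, w2))     # inverse: latents -> image
print(np.allclose(rec, img))
```

In the method described above, the forward pass is trained so that the distortion-related channels of `z` match a Gaussian prior; at inference those channels are resampled from that prior before running the inverse pass, which is what produces the perceptually enhanced output.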
Authors: Junhui Li, Xingsong Hou