Exploration of Learned Lifting-Based Transform Structures for Fully Scalable and Accessible Wavelet-Like Image Compression (2402.18761v1)

Published 29 Feb 2024 in eess.IV, cs.CV, and cs.MM

Abstract: This paper provides a comprehensive study of the features and performance of different ways to incorporate neural networks into lifting-based wavelet-like transforms, within the context of fully scalable and accessible image compression. Specifically, we explore different arrangements of lifting steps, as well as various network architectures for learned lifting operators. Moreover, we examine the impact of the number of learned lifting steps, the number of channels, the number of layers, and the support of kernels in each learned lifting operator. To facilitate the study, we investigate two generic training methodologies that are simultaneously appropriate to the wide variety of lifting structures considered. Experimental results suggest that retaining fixed lifting steps from the base wavelet transform is highly beneficial. Moreover, we demonstrate that employing more learned lifting steps and more layers in each learned lifting operator does not contribute strongly to compression performance. However, benefits can be obtained by utilizing more channels in each learned lifting operator. Ultimately, the learned wavelet-like transform proposed in this paper achieves over 25% bit-rate savings compared to JPEG 2000, while retaining compact spatial support.


Summary

  • The paper presents a comprehensive study of learned lifting structures that integrate neural networks into wavelet-like transforms for fully scalable, accessible image compression.
  • It demonstrates that a hybrid structure, augmenting a base wavelet transform with additional learned lifting steps built on the proposal-opacity topology, substantially boosts coding efficiency without excessive computational cost.
  • The study identifies a practical balance between improved performance and increased complexity, offering guidance for future research in scalable image compression.

Exploration of Learned Wavelet-Like Transforms for Enhanced Image Compression

Introducing New Transform Structures

The evolution of learning-based methods has transformed the landscape of image compression. Among these, lifting-based, wavelet-like transforms have emerged as a pivotal area of research due to their inherent multi-resolution support and scalability. This paper offers an in-depth analysis of how neural networks can be integrated into lifting-based transforms for scalable image compression, a significant step toward refining wavelet-based methods with machine learning.

Investigation of Lifting Structures

The research centers on different ways to incorporate learned lifting steps within wavelet-like transforms and their impact on compression efficiency. Among the configurations considered, the paper prominently features:

  • Predict-update and update-predict lifting structures, in which the conventional lifting operators are replaced by learned neural networks.
  • A hybrid lifting structure that augments a base wavelet transform with additional learned lifting steps for aliasing suppression and redundancy reduction.

Comparing these structures yields a clear conclusion: the hybrid structure with two additional learned lifting steps, paired with the proposal-opacity network topology, delivers the best overall performance.
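
To make the lifting mechanism concrete, below is a minimal sketch of one predict-update decomposition level of the LeGall 5/3 wavelet, the reversible base transform of JPEG 2000, in Python with NumPy. In the learned variants studied here, the fixed predict and update filters are the natural substitution points (or, in the hybrid structure, are retained and supplemented with learned steps); the periodic boundary handling via np.roll is an illustrative simplification, not the paper's exact setup.

```python
import numpy as np

def lifting_53_forward(x):
    """One decomposition level of the LeGall 5/3 wavelet via lifting."""
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    # Predict step: estimate each odd sample from its even neighbors.
    d = odd - 0.5 * (even + np.roll(even, -1))
    # Update step: smooth the even samples using the prediction residuals.
    s = even + 0.25 * (d + np.roll(d, 1))
    return s, d  # low-pass (approximation) and high-pass (detail) subbands

def lifting_53_inverse(s, d):
    """Exact inverse: undo the lifting steps in reverse order."""
    even = s - 0.25 * (d + np.roll(d, 1))
    odd = d + 0.5 * (even + np.roll(even, -1))
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x

x = 100.0 * np.sin(np.linspace(0.0, 8.0 * np.pi, 64))
s, d = lifting_53_forward(x)
assert np.allclose(lifting_53_inverse(s, d), x)  # perfect reconstruction
```

Because the inverse simply re-applies each lifting step with the opposite sign, perfect reconstruction holds for any predict or update operator, linear or learned; this structural invertibility is what makes lifting attractive for fully scalable, lossy-to-lossless compression.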

The Proposal-Opacity Network Topology

A distinctive contribution of this paper is the proposal-opacity network topology. Characterized by linear proposals modulated by non-linear opacities, this topology offers a nuanced approach to learned lifting. The paper demonstrates that increasing the diversity of the operator, i.e., the number of channels in each learned lifting operator, can significantly enhance coding efficiency. Notably, within reasonable bounds, adding channels improved compression performance more effectively than increasing the depth of the lifting structures or the spatial support of the kernels.
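
The sketch below illustrates one plausible reading of a proposal-opacity operator in PyTorch: each channel forms a linear proposal for the lifting signal, a non-linear opacity in (0, 1) gates how much that proposal contributes, and the gated proposals are combined into a single output. The channel count, kernel size, and sigmoid gate are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ProposalOpacityOperator(nn.Module):
    """Hypothetical proposal-opacity lifting operator (one layer deep)."""

    def __init__(self, channels: int = 16, kernel_size: int = 5):
        super().__init__()
        pad = kernel_size // 2
        # Linear proposals: one candidate lifting contribution per channel.
        self.proposal = nn.Conv2d(1, channels, kernel_size, padding=pad, bias=False)
        # Opacities: data-dependent gates deciding how much each proposal passes.
        self.opacity = nn.Conv2d(1, channels, kernel_size, padding=pad)
        # Combine the gated proposals into a single lifting signal.
        self.combine = nn.Conv2d(channels, 1, kernel_size=1, bias=False)

    def forward(self, band: torch.Tensor) -> torch.Tensor:
        p = self.proposal(band)                # linear proposals
        a = torch.sigmoid(self.opacity(band))  # non-linear opacities in (0, 1)
        return self.combine(p * a)             # modulated sum

band = torch.randn(1, 1, 64, 64)               # e.g. a detail subband
print(ProposalOpacityOperator()(band).shape)   # torch.Size([1, 1, 64, 64])
```

Keeping the operator only one layer deep preserves compact spatial support: widening it with more channels adds expressiveness without enlarging the region of input samples that influence each output.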

Computational Considerations and Practical Implications

On the computational front, the paper evaluates the trade-offs between the coding efficiency gained from learned lifting operators and the accompanying increase in computational complexity and region of support. The findings advocate a balanced approach: augment a well-performing base wavelet transform with a small number of learned lifting steps to achieve a good blend of compression efficiency, computational load, and support compactness.
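
As a back-of-the-envelope illustration of these trade-offs, the following cost model counts multiply-accumulates (MACs) per output sample for the single-layer proposal-opacity sketch above. The formula is an assumption tied to that sketch, not a figure reported in the paper.

```python
def lifting_operator_cost(channels: int = 16, kernel_size: int = 5):
    """Approximate MACs per output sample for the sketched operator."""
    proposal_macs = channels * kernel_size**2  # linear proposal convolutions
    opacity_macs = channels * kernel_size**2   # opacity gate convolutions
    combine_macs = channels                    # 1x1 combination
    macs = proposal_macs + opacity_macs + combine_macs
    return macs, kernel_size                   # cost and spatial support

for ch in (8, 16, 32):
    macs, support = lifting_operator_cost(channels=ch)
    print(f"{ch:2d} channels: {macs:4d} MACs/sample, {support}x{support} support")
```

Under this model, doubling the channel count roughly doubles the MAC count while leaving the region of support unchanged, consistent with the paper's observation that widening the learned operators is a more economical route to coding gains than deepening them or enlarging their kernels.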

Future Directions

Looking forward, this examination of learned lifting-based transform structures opens several avenues for further research. The strong performance of the proposal-opacity network topology invites its application in other areas of image processing and beyond. The paper also underlines the importance of continuing to probe the balance between network complexity and compression efficiency, a key consideration as the field progresses.

Conclusions

In summary, this paper is a rigorous exploration of neural networks within the framework of lifting-based wavelet-like transforms for scalable image compression. Its findings underscore the potential of learned lifting operators to enhance coding efficiency and highlight the critical role of choosing appropriate network topologies and lifting structures. The insights offered here provide valuable guidance for future work on learned, scalable image compression.