Adaptive Online Learning of Separable Path Graph Transforms for Intra-prediction (2402.16371v1)
Abstract: Current video coding standards, including H.264/AVC, HEVC, and VVC, employ the discrete cosine transform (DCT), the discrete sine transform (DST), and secondary Karhunen-Loève transforms (KLTs) to decorrelate the intra-prediction residuals. However, these transforms can decorrelate poorly when the signal has a non-smooth and non-periodic structure, as occurs in textures with intricate patterns. This paper introduces a novel adaptive separable path graph-based transform (GBT) that can provide better decorrelation than the DCT for intra-predicted texture data. The proposed GBT is learned online with sequential K-means clustering, which groups similar blocks during encoding and decoding so that the GBT for the current block is learned adaptively from previously reconstructed areas with similar characteristics. Signaling overhead is added to the bitstream of each coding block to indicate whether the proposed graph-based transform is used. We assess the performance of this method combined with H.264/AVC intra-coding tools and demonstrate that it can significantly outperform the H.264/AVC DCT for intra-predicted texture data.
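The two building blocks named in the abstract, a separable path graph transform and a sequential K-means update, can be illustrated with a minimal NumPy sketch. This is a sketch under stated assumptions, not the paper's implementation: all function names are hypothetical, the edge weights are taken as given (the abstract does not say how they are estimated from clustered blocks), and the clustering step is the generic running-mean form of sequential K-means.

```python
import numpy as np

def path_laplacian(edge_weights):
    """Combinatorial Laplacian L = D - W of a path graph with the given edge weights."""
    w = np.asarray(edge_weights, dtype=float)
    n = w.size + 1
    L = np.zeros((n, n))
    idx = np.arange(n - 1)
    L[idx, idx + 1] = -w          # off-diagonals hold the negated edge weights
    L[idx + 1, idx] = -w
    L[np.arange(n), np.arange(n)] = -L.sum(axis=1)  # diagonal holds node degrees
    return L

def gbt_basis(edge_weights):
    """Columns are Laplacian eigenvectors, ordered by increasing eigenvalue."""
    _, U = np.linalg.eigh(path_laplacian(edge_weights))
    return U

def separable_gbt(block, U_col, U_row):
    """Apply one path-graph basis along the columns and another along the rows."""
    return U_col.T @ block @ U_row

def inverse_separable_gbt(coeffs, U_col, U_row):
    """Invert separable_gbt (both bases are orthonormal)."""
    return U_col @ coeffs @ U_row.T

def sequential_kmeans_step(centroids, counts, x):
    """Assign x to its nearest centroid and move that centroid toward x, in place.

    Assumption: generic running-mean update; the paper's exact rule may differ.
    """
    k = int(np.argmin(np.sum((centroids - x) ** 2, axis=1)))
    counts[k] += 1
    centroids[k] += (x - centroids[k]) / counts[k]
    return k
```

With uniform edge weights, the basis returned by `gbt_basis` coincides with the DCT-II basis up to the sign of each vector, so the DCT is the special case in which every pair of neighbouring samples is treated as equally correlated. Non-uniform weights are what allow the transform to adapt to a texture.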