Att2CPC: Attention-Guided Lossy Attribute Compression of Point Clouds (2410.17823v1)

Published 23 Oct 2024 in cs.LG, cs.CV, and eess.IV

Abstract: With the great progress of 3D sensing and acquisition technology, the volume of point cloud data has grown dramatically, which urges the development of efficient point cloud compression methods. In this paper, we focus on the task of learned lossy point cloud attribute compression (PCAC). We propose an efficient attention-based method for lossy compression of point cloud attributes built on an autoencoder architecture. Specifically, at the encoding side, we conduct multiple downsampling steps to best exploit local attribute patterns, in which an effective External Cross Attention (ECA) is devised to hierarchically aggregate features by integrating attribute and geometry contexts. At the decoding side, the attributes of the point cloud are progressively reconstructed based on the multi-scale representation and a zero-padding upsampling tactic. To the best of our knowledge, this is the first approach to introduce an attention mechanism to the point-based lossy PCAC task. We verify the compression efficiency of our model on various sequences, including human body frames, sparse objects, and large-scale point cloud scenes. Experiments show that our method achieves an average improvement of 1.15 dB and 2.13 dB in BD-PSNR of the Y channel and YUV channels, respectively, compared with the state-of-the-art point-based method Deep-PCAC. Code is available at https://github.com/I2-Multimedia-Lab/Att2CPC.
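The abstract does not spell out the internals of the External Cross Attention module, but the general idea of cross-attention between geometry and attribute features can be illustrated with a minimal single-head sketch. This is a hedged, hypothetical illustration, not the paper's actual ECA: all function names, shapes, and the choice of geometry embeddings as queries over attribute features are assumptions for demonstration only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(geom_feats, attr_feats):
    """Toy cross-attention: geometry embeddings (queries) aggregate
    attribute embeddings (keys/values) across points.

    geom_feats: (N, d) hypothetical per-point geometry features
    attr_feats: (N, d) hypothetical per-point attribute features
    Returns a (N, d) geometry-guided aggregation of attributes.
    """
    d = geom_feats.shape[1]
    # Scaled dot-product scores between every geometry query
    # and every attribute key, shape (N, N).
    scores = geom_feats @ attr_feats.T / np.sqrt(d)
    # Each row of weights sums to 1; weighted sum over attribute values.
    return softmax(scores, axis=-1) @ attr_feats

rng = np.random.default_rng(0)
g = rng.standard_normal((8, 4))   # 8 points, 4-dim geometry features
a = rng.standard_normal((8, 4))   # 8 points, 4-dim attribute features
fused = cross_attention(g, a)
print(fused.shape)  # → (8, 4)
```

In the actual method, this kind of aggregation would be applied hierarchically at each downsampling stage; consult the linked repository for the real module definitions.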

References (63)
  1. A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the kitti vision benchmark suite,” in 2012 IEEE conference on computer vision and pattern recognition.   IEEE, 2012, pp. 3354–3361.
  2. Z. Pan, A. D. Cheok, H. Yang, J. Zhu, and J. Shi, “Virtual reality and mixed reality for virtual learning environments,” Computers & graphics, vol. 30, no. 1, pp. 20–28, 2006.
  3. R. Mekuria, K. Blom, and P. Cesar, “Design, implementation, and evaluation of a point cloud codec for tele-immersive video,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 4, pp. 828–842, 2016.
  4. D. Graziosi, O. Nakagami, S. Kuma, A. Zaghetto, T. Suzuki, and A. Tabatabai, “An overview of ongoing point cloud compression standardization activities: Video-based (v-pcc) and geometry-based (g-pcc),” APSIPA Transactions on Signal and Information Processing, vol. 9, p. e13, 2020.
  5. S. Schwarz, M. Preda, V. Baroncini, M. Budagavi, P. Cesar, P. A. Chou, R. A. Cohen, M. Krivokuća, S. Lasserre, Z. Li et al., “Emerging mpeg standards for point cloud compression,” IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 9, no. 1, pp. 133–148, 2018.
  6. M. Wien, High Efficiency Video Coding: Coding Tools and Specification.   Springer, 2015.
  7. B. Bross, Y.-K. Wang, Y. Ye, S. Liu, J. Chen, G. J. Sullivan, and J.-R. Ohm, “Overview of the versatile video coding (vvc) standard and its applications,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, pp. 3736–3764, 2021.
  8. G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, “Overview of the high efficiency video coding (hevc) standard,” IEEE Transactions on circuits and systems for video technology, vol. 22, no. 12, pp. 1649–1668, 2012.
  9. R. L. De Queiroz and P. A. Chou, “Compression of 3d point clouds using a region-adaptive hierarchical transform,” IEEE Transactions on Image Processing, vol. 25, no. 8, pp. 3947–3956, 2016.
  10. J. Ballé, V. Laparra, and E. P. Simoncelli, “End-to-end optimized image compression,” arXiv preprint arXiv:1611.01704, 2016.
  11. D. Ding, Z. Ma, D. Chen, Q. Chen, Z. Liu, and F. Zhu, “Advances in video compression system using deep neural network: A review and case studies,” Proceedings of the IEEE, vol. 109, no. 9, pp. 1494–1520, 2021.
  12. M. Lu, Z. Duan, F. Zhu, and Z. Ma, “Deep hierarchical video compression,” arXiv preprint arXiv:2312.07126, 2023.
  13. K. Liu, D. Wu, Y. Wu, Y. Wang, D. Feng, B. Tan, and S. Garg, “Manipulation attacks on learned image compression,” IEEE Transactions on Artificial Intelligence, pp. 1–14, 2023.
  14. J. Zhang, G. Liu, D. Ding, and Z. Ma, “Transformer and upsampling-based point cloud compression,” in Proceedings of the 1st International Workshop on Advances in Point Cloud Compression, Processing and Analysis, 2022, pp. 33–39.
  15. K. You, P. Gao, and Q. Li, “Ipdae: Improved patch-based deep autoencoder for lossy point cloud geometry compression,” in Proceedings of the 1st International Workshop on Advances in Point Cloud Compression, Processing and Analysis, 2022, pp. 1–10.
  16. K. You and P. Gao, “Patch-based deep autoencoder for point cloud geometry compression,” in ACM Multimedia Asia, 2021, pp. 1–7.
  17. Y. He, X. Ren, D. Tang, Y. Zhang, X. Xue, and Y. Fu, “Density-preserving deep point cloud compression,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 2333–2342.
  18. J. Wang, D. Ding, Z. Li, and Z. Ma, “Multiscale point cloud geometry compression,” in 2021 Data Compression Conference (DCC).   IEEE, 2021, pp. 73–82.
  19. J. Wang, D. Ding, Z. Li, X. Feng, C. Cao, and Z. Ma, “Sparse tensor-based multiscale representation for point cloud geometry compression,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
  20. G. Liu, R. Xue, J. Li, D. Ding, and Z. Ma, “Grnet: Geometry restoration for g-pcc compressed point clouds using auxiliary density signaling,” IEEE Transactions on Visualization and Computer Graphics, pp. 1–15, early access.
  21. X. Sheng, L. Li, D. Liu, Z. Xiong, Z. Li, and F. Wu, “Deep-pcac: An end-to-end deep lossy compression framework for point cloud attributes,” IEEE Transactions on Multimedia, vol. 24, pp. 2617–2632, 2021.
  22. J. Wang and Z. Ma, “Sparse tensor-based point cloud attribute compression,” in 2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR).   IEEE, 2022, pp. 59–64.
  23. J. Zhang, T. Chen, D. Ding, and Z. Ma, “Yoga: Yet another geometry-based point cloud compressor,” in Proceedings of the 31st ACM International Conference on Multimedia, 2023, pp. 9070–9081.
  24. T.-P. Lin, M. Yim, J.-C. Chiang, W.-H. Peng, and W.-N. Lie, “Sparse tensor-based point cloud attribute compression using augmented normalizing flows,” in 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2023, pp. 1739–1744.
  25. T. Huang, J. Zhang, J. Chen, Z. Ding, Y. Tai, Z. Zhang, C. Wang, and Y. Liu, “3qnet: 3d point cloud geometry quantization compression network,” ACM Transactions on Graphics (TOG), vol. 41, no. 6, pp. 1–13, 2022.
  26. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in neural information processing systems, vol. 30, 2017.
  27. M.-H. Guo, J.-X. Cai, Z.-N. Liu, T.-J. Mu, R. R. Martin, and S.-M. Hu, “Pct: Point cloud transformer,” Computational Visual Media, vol. 7, pp. 187–199, 2021.
  28. H. Zhao, L. Jiang, J. Jia, P. H. Torr, and V. Koltun, “Point transformer,” in Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 16259–16268.
  29. G. Liu, J. Wang, D. Ding, and Z. Ma, “Pcgformer: Lossy point cloud geometry compression via local self-attention,” in 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP).   IEEE, 2022, pp. 1–5.
  30. S. Li, P. Gao, X. Tan, and M. Wei, “Proxyformer: Proxy alignment assisted point cloud completion with missing part sensitive transformer,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 9466–9475.
  31. X. Wu, Y. Lao, L. Jiang, X. Liu, and H. Zhao, “Point transformer v2: Grouped vector attention and partition-based pooling,” Advances in Neural Information Processing Systems, vol. 35, pp. 33330–33342, 2022.
  32. X. Wu, L. Jiang, P.-S. Wang, Z. Liu, X. Liu, Y. Qiao, W. Ouyang, T. He, and H. Zhao, “Point transformer v3: Simpler, faster, stronger,” arXiv preprint arXiv:2312.10035, 2023.
  33. D. Maturana and S. Scherer, “Voxnet: A 3d convolutional neural network for real-time object recognition,” in 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS).   IEEE, 2015, pp. 922–928.
  34. G. Riegler, A. Osman Ulusoy, and A. Geiger, “Octnet: Learning deep 3d representations at high resolutions,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 3577–3586.
  35. C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “Pointnet: Deep learning on point sets for 3d classification and segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 652–660.
  36. C. R. Qi, L. Yi, H. Su, and L. J. Guibas, “Pointnet++: Deep hierarchical feature learning on point sets in a metric space,” Advances in neural information processing systems, vol. 30, 2017.
  37. M.-H. Guo, T.-X. Xu, J.-J. Liu, Z.-N. Liu, P.-T. Jiang, T.-J. Mu, S.-H. Zhang, R. R. Martin, M.-M. Cheng, and S.-M. Hu, “Attention mechanisms in computer vision: A survey,” Computational visual media, vol. 8, no. 3, pp. 331–368, 2022.
  38. L. Hui, H. Yang, M. Cheng, J. Xie, and J. Yang, “Pyramid point cloud transformer for large-scale place recognition,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 6098–6107.
  39. Y.-X. Zhang, Z.-L. Sun, Z. Zeng, and K.-M. Lam, “Partial point cloud registration with deep local feature,” IEEE Transactions on Artificial Intelligence, vol. 4, no. 5, pp. 1317–1327, 2023.
  40. Y. Shao, Q. Zhang, G. Li, Z. Li, and L. Li, “Hybrid point cloud attribute compression using slice-based layered structure and block-based intra prediction,” in Proceedings of the 26th ACM international conference on Multimedia, 2018, pp. 1199–1207.
  41. C. Zhang, D. Florêncio, and C. Loop, “Point cloud attribute compression with graph transform,” in 2014 IEEE International Conference on Image Processing (ICIP), 2014, pp. 2066–2070.
  42. F. Song, G. Li, W. Gao, and T. H. Li, “Rate-distortion optimized graph for point cloud attribute coding,” IEEE Signal Processing Letters, vol. 29, pp. 922–926, 2022.
  43. G. Fang, Q. Hu, H. Wang, Y. Xu, and Y. Guo, “3dac: Learning attribute compression for point clouds,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 14819–14828.
  44. L. Hou, L. Gao, Y. Xu, Z. Li, X. Xu, and S. Liu, “Learning-based intra-prediction for point cloud attribute transform coding,” in 2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), 2022, pp. 1–6.
  45. D. T. Nguyen and A. Kaup, “Lossless point cloud geometry and attribute compression using a learned conditional probability model,” arXiv preprint arXiv:2303.06519, 2023.
  46. D. T. Nguyen, K. G. Nambiar, and A. Kaup, “Deep probabilistic model for lossless scalable point cloud attribute compression,” in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023, pp. 1–5.
  47. J. Wang, D. Ding, and Z. Ma, “Lossless point cloud attribute compression using cross-scale, cross-group, and cross-color prediction,” 2023 Data Compression Conference (DCC), pp. 228–237, 2023.
  48. L. Wei, S. Wan, F. Yang, and Z. Wang, “Content-adaptive level of detail for lossless point cloud compression,” APSIPA Transactions on Signal and Information Processing, 2022.
  49. C.-M. Feng, Y. Yan, G. Chen, H. Fu, Y. Xu, and L. Shao, “Accelerated multi-modal mr imaging with transformers,” arXiv preprint, 2021.
  50. X. Sheng, L. Li, D. Liu, Z. Xiong, Z. Li, and F. Wu, “Deep-pcac: An end-to-end deep lossy compression framework for point cloud attributes,” IEEE Transactions on Multimedia, vol. 24, pp. 2617–2632, 2022.
  51. X. Sheng, L. Li, D. Liu, and Z. Xiong, “Attribute artifacts removal for geometry-based point cloud compression,” IEEE Transactions on Image Processing, vol. 31, pp. 3399–3413, 2022.
  52. A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su et al., “Shapenet: An information-rich 3d model repository,” arXiv preprint arXiv:1512.03012, 2015.
  53. K.-Y. Chang, K.-H. Lu, and C.-S. Chen, “Aesthetic critiques generation for photos,” in 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 3534–3543.
  54. S. Schwarz, G. Martin-Cocher, D. Flynn, and M. Budagavi, “Common test conditions for point cloud compression,” ISO/IEC JTC1/SC29/WG11 w17766, Ljubljana, Slovenia, Tech. Rep., 2018.
  55. I. Armeni, O. Sener, A. R. Zamir, H. Jiang, I. Brilakis, M. Fischer, and S. Savarese, “3d semantic parsing of large-scale indoor spaces,” in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 2016.
  56. C. Loop, Q. Cai, S. O. Escolano, and P. A. Chou, “Microsoft voxelized upper bodies — a voxelized point cloud dataset,” ISO/IEC JTC1/SC29 Joint WG11/WG1 (MPEG/JPEG) input document m38673/M72012, 2016.
  57. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Y. Bengio and Y. LeCun, Eds., 2015.
  58. D. Tian, H. Ochimizu, C. Feng, R. Cohen, and A. Vetro, “Evaluation metrics for point cloud compression,” ISO/IEC JTC1/SC29/WG11, Tech. Rep. M39316, 2017.
  59. WG 7, “G-pcc codec description v11,” ISO/IEC JTC 1/SC 29/WG 7 N0099, 2021.
  60. J. Ballé, D. Minnen, S. Singh, S. J. Hwang, and N. Johnston, “Variational image compression with a scale hyperprior,” arXiv preprint arXiv:1802.01436, 2018.
  61. D. Minnen, J. Ballé, and G. D. Toderici, “Joint autoregressive and hierarchical priors for learned image compression,” Advances in neural information processing systems, vol. 31, 2018.
  62. F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte, and L. Van Gool, “Conditional probability models for deep image compression,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4394–4402.
  63. Y. He, D. Tang, Y. Zhang, X. Xue, and Y. Fu, “Grad-pu: Arbitrary-scale point cloud upsampling via gradient descent with learned distance functions,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
