Scalable Human-Machine Point Cloud Compression (2402.12532v3)
Abstract: Due to the limited computational capabilities of edge devices, deep learning inference can be quite expensive. One remedy is to compress and transmit point cloud data over the network for server-side processing. Unfortunately, this approach can be sensitive to network factors, including available bitrate. Luckily, the bitrate requirements can be reduced without sacrificing inference accuracy by using a machine task-specialized codec. In this paper, we present a scalable codec for point cloud data that is specialized for the machine task of classification, while also providing a mechanism for human viewing. In the proposed scalable codec, the "base" bitstream supports the machine task, and an "enhancement" bitstream may be used to improve input reconstruction for human viewing. We base our architecture on PointNet++, and test its efficacy on the ModelNet40 dataset. We show significant improvements over prior non-specialized codecs.
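The base/enhancement split described in the abstract can be illustrated with a toy sketch. This is not the paper's learned, PointNet++-based codec: it swaps in a hand-crafted bounding-box feature and uniform quantization purely to show the layering. All function names, constants, and the "tall"/"flat" task are hypothetical and chosen for illustration only.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

BASE_STEP = 0.1    # assumed quantization step for the base layer
ENH_LEVELS = 255   # assumed number of quantization intervals per axis (enhancement)


def encode_base(points: Sequence[Point]) -> List[int]:
    """Base layer: quantized per-axis min/max, a permutation-invariant global feature."""
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    # Floor the minima and ceil the maxima so the decoded box encloses the points.
    return ([math.floor(v / BASE_STEP) for v in mins]
            + [math.ceil(v / BASE_STEP) for v in maxs])


def decode_box(base: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Recover the box origin and per-axis extents from the base symbols."""
    vals = [s * BASE_STEP for s in base]
    lo, hi = vals[:3], vals[3:]
    spans = [max(h - l, BASE_STEP) for l, h in zip(lo, hi)]  # avoid zero extent
    return lo, spans


def classify(base: Sequence[int]) -> str:
    """Toy machine task (stand-in for classification) that reads only the base layer."""
    _, spans = decode_box(base)
    return "tall" if spans[2] > max(spans[0], spans[1]) else "flat"


def encode_enhancement(points: Sequence[Point], base: Sequence[int]) -> List[Tuple[int, ...]]:
    """Enhancement layer: point positions quantized relative to the base-layer box."""
    lo, spans = decode_box(base)
    symbols = []
    for p in points:
        unit = [min(max((p[a] - lo[a]) / spans[a], 0.0), 1.0) for a in range(3)]
        symbols.append(tuple(round(u * ENH_LEVELS) for u in unit))
    return symbols


def decode_full(base: Sequence[int], enh: Sequence[Tuple[int, ...]]) -> List[Point]:
    """Reconstruction for human viewing: needs both base and enhancement layers."""
    lo, spans = decode_box(base)
    return [tuple(lo[a] + q[a] / ENH_LEVELS * spans[a] for a in range(3)) for q in enh]


# Usage: the machine task runs on the base layer alone; viewing decodes both layers.
cloud = [(0.0, 0.0, 0.0), (0.2, 0.1, 1.0), (0.1, 0.3, 2.0), (0.3, 0.2, 1.5)]
base = encode_base(cloud)
enh = encode_enhancement(cloud, base)
label = classify(base)
recon = decode_full(base, enh)
max_err = max(abs(x - y) for p, q in zip(cloud, recon) for x, y in zip(p, q))
print(label, max_err)
```

In this sketch the base layer costs six integers regardless of the number of points, while the enhancement layer grows with the cloud. That mirrors, in a very simplified way, why a receiver that only needs the machine task can skip the enhancement bitstream.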