T3DNet: Compressing Point Cloud Models for Lightweight 3D Recognition (2402.19264v1)

Published 29 Feb 2024 in cs.CV

Abstract: 3D point clouds are widely used in mobile applications, including autonomous driving and 3D sensing on mobile devices. However, existing 3D point cloud models tend to be large and cumbersome, making them hard to deploy on edge devices due to their high memory requirements and non-real-time latency. Little research has addressed how to compress 3D point cloud models into lightweight models. In this paper, we propose a method called T3DNet (Tiny 3D Network with augmEntation and disTillation) to address this issue. We find that a tiny model becomes much easier for a teacher to distill after network augmentation. Instead of gradually reducing parameters through techniques such as pruning or quantization, we pre-define a tiny model and improve its performance through auxiliary supervision from augmented networks and the original model. We evaluate our method on several public datasets, including ModelNet40, ShapeNet, and ScanObjectNN. Our method achieves high compression rates without significant accuracy loss, attaining state-of-the-art performance on all three datasets against existing methods. Notably, our T3DNet is 58 times smaller and 54 times faster than the original model, yet with only a 1.4% accuracy drop on ModelNet40.
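The abstract describes a two-part training signal for the pre-defined tiny model: soft-label distillation from the original (teacher) model, plus auxiliary supervision from augmented networks that share the tiny model's weights, in the spirit of network augmentation. Below is a minimal PyTorch sketch of such a combined objective. The module names, loss weights, and temperature are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def t3dnet_step(tiny_model, augmented, teacher, points, labels,
                alpha=0.5, beta=0.5, temperature=4.0):
    """One training step combining the task loss, distillation from the
    original (frozen) model, and auxiliary supervision from an augmented
    network that shares the tiny model's weights. Hypothetical modules and
    hyperparameters; loss weights alpha/beta are illustrative."""
    logits_tiny = tiny_model(points)       # (B, num_classes)
    logits_aug = augmented(points)         # augmented net shares tiny weights
    with torch.no_grad():
        logits_teacher = teacher(points)   # frozen original model

    # Standard classification loss on the tiny model.
    ce = F.cross_entropy(logits_tiny, labels)

    # Soft-label distillation from the original model (Hinton-style KD).
    kd = F.kl_div(
        F.log_softmax(logits_tiny / temperature, dim=1),
        F.softmax(logits_teacher / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    # Auxiliary supervision: the augmented network is trained on the same
    # task, and its gradients flow into the shared tiny-model weights.
    aux = F.cross_entropy(logits_aug, labels)

    return ce + alpha * kd + beta * aux
```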

