Dataset Distillation of 3D Point Clouds via Distribution Matching (2503.22154v2)
Abstract: Large-scale datasets are usually required to train deep neural networks, but it increases the computational complexity hindering the practical applications. Recently, dataset distillation for images and texts has been attracting a lot of attention, that reduces the original dataset to a synthetic dataset to alleviate the computational burden of training while preserving essential task-relevant information. However, the dataset distillation for 3D point clouds remains largely unexplored, as the point clouds exhibit fundamentally different characteristics from that of images, making the dataset distillation more challenging. In this paper, we propose a distribution matching-based distillation framework for 3D point clouds that jointly optimizes the geometric structures as well as the orientations of the synthetic 3D objects. To address the semantic misalignment caused by unordered indexing of points, we introduce a Semantically Aligned Distribution Matching loss computed on the sorted features in each channel. Moreover, to address the rotation variation, we jointly learn the optimal rotation angles while updating the synthetic dataset to better align with the original feature distribution. Extensive experiments on widely used benchmark datasets demonstrate that the proposed method consistently outperforms existing dataset distillation methods, achieving superior accuracy and strong cross-architecture generalization.