Linear optimal transport subspaces for point set classification (2403.10015v1)
Abstract: Learning from point sets is an essential component of many computer vision and machine learning applications. The native structure of a set, which is unordered and permutation invariant, is challenging to model, particularly for point set classification under spatial deformations. Here we propose a framework for classifying point sets that undergo certain types of spatial deformations, with particular emphasis on datasets featuring affine deformations. Our approach employs the Linear Optimal Transport (LOT) transform to obtain a linear embedding of set-structured data. Using the mathematical properties of the LOT transform, we show that it accommodates variations in point sets by constructing a convex data space, which simplifies point set classification problems. Our method, which employs a nearest-subspace algorithm in the LOT space, is label efficient and non-iterative, and requires no hyper-parameter tuning. It achieves accuracies competitive with state-of-the-art methods across various point set classification tasks. Furthermore, our approach is robust in out-of-distribution scenarios where the training and test distributions differ in deformation magnitude.
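The two-stage pipeline the abstract describes (a LOT embedding followed by nearest-subspace classification) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: every point set has the same number of points as the reference, with uniform mass, so the optimal transport plan is a permutation and can be found with SciPy's `linear_sum_assignment`; each class subspace is simply the span of that class's training embeddings. The names `lot_embed`, `fit_subspaces`, `predict`, and `sample`, and the toy data, are hypothetical and exist only for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

N = 20  # points per set (assumption: all sets have the same size)


def lot_embed(points, reference):
    """Embed `points` relative to `reference` (both of shape (N, d)).

    Assumption: uniform mass 1/N on every point, so the optimal
    transport plan is a one-to-one matching.
    """
    # cost[i, j] = squared distance from reference[i] to points[j]
    diff = reference[:, None, :] - points[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)
    row, col = linear_sum_assignment(cost)
    # Order the points by the reference point each one is matched to.
    matched = np.empty_like(reference)
    matched[row] = points[col]
    # Scale so squared Euclidean distance carries the 1/N mass weighting.
    return (matched / np.sqrt(len(reference))).ravel()


def fit_subspaces(embeddings, labels):
    """Return an orthonormal basis (as rows) for each class's span."""
    bases = {}
    for c in np.unique(labels):
        X = embeddings[labels == c]
        _, s, Vt = np.linalg.svd(X, full_matrices=False)
        bases[c] = Vt[s > 1e-10 * s.max()]  # drop numerically null directions
    return bases


def predict(embedding, bases):
    """Pick the class whose subspace leaves the smallest residual."""
    def residual(B):
        return np.linalg.norm(embedding - B.T @ (B @ embedding))
    return min(bases, key=lambda c: residual(bases[c]))


# Hypothetical toy data: a circle (class 0) and a line segment (class 1),
# each under a random affine deformation.
rng = np.random.default_rng(0)
reference = rng.normal(size=(N, 2))


def sample(label):
    t = rng.uniform(0, 1, size=N)
    if label == 0:
        pts = np.c_[np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)]
    else:
        pts = np.c_[2 * t - 1, np.zeros(N)]
    A = np.eye(2) + 0.2 * rng.normal(size=(2, 2))  # random linear part
    return pts @ A.T + rng.normal(scale=0.5, size=2)  # plus a translation


labels = np.array([0] * 5 + [1] * 5)
train = np.stack([lot_embed(sample(l), reference) for l in labels])
bases = fit_subspaces(train, labels)
print(predict(lot_embed(sample(0), reference), bases))
```

Because the embedding orders each set by its matching to the reference, shuffling the input points leaves the embedding unchanged, which is how the sketch handles permutation invariance. Fitting takes one SVD per class, with no iterative training and no tuned hyper-parameters, in the spirit of the properties the abstract claims.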