Curvature Augmented Manifold Embedding and Learning (2403.14813v1)
Abstract: A new dimensionality reduction (DR) and data visualization method, Curvature-Augmented Manifold Embedding and Learning (CAMEL), is proposed. The key novel contribution is to formulate the DR problem as a mechanistic/physics model, in which the force field among nodes (data points) is used to find an n-dimensional manifold representation of the data set. Unlike many existing attractive-repulsive force-based methods, the proposed method includes a non-pairwise force. A new force field model is introduced and discussed, inspired by the multi-body potential in lattice-particle physics and Riemann curvature in topology, and a curvature-augmented force is included in CAMEL. CAMEL formulations for unsupervised learning, supervised learning, semi-supervised learning/metric learning, and inverse learning are then provided. Next, CAMEL is applied to many benchmark datasets and compared with existing methods such as tSNE, UMAP, TRIMAP, and PacMap. Both visual comparison and metrics-based evaluation are performed, using 14 metrics drawn from the open literature or proposed in this work. Conclusions are drawn and future work is suggested based on the current investigation. Related code and demonstrations are available at https://github.com/ymlasu/CAMEL for interested readers who wish to reproduce the results or explore other applications.
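To make the attractive-repulsive baseline mentioned in the abstract concrete, the following is a minimal sketch of a generic pairwise force-based embedding. It is not the CAMEL force field: the non-pairwise curvature-augmented force is omitted because the abstract does not specify it. The function name `force_embedding`, the spring-like attraction, the `d / (1 + d^2)` repulsion kernel, and all parameter values are illustrative assumptions rather than anything taken from the paper or its repository.

```python
import numpy as np

def force_embedding(X, n_components=2, n_neighbors=5, n_iter=200, lr=0.05, seed=0):
    """Toy pairwise attractive-repulsive embedding (hypothetical, not CAMEL)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # k-nearest neighbours in the original high-dimensional space.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # exclude self-matches
    knn = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Small random initial layout in the low-dimensional space.
    Y = rng.normal(scale=1e-2, size=(n, n_components))

    for _ in range(n_iter):
        # Attraction: linear spring pulling each node toward its neighbours.
        pull = Y[knn] - Y[:, None, :]
        force = pull.sum(axis=1)

        # Repulsion: bounded push away from randomly sampled nodes.
        far = rng.integers(0, n, size=(n, n_neighbors))
        push = Y[:, None, :] - Y[far]
        dist2 = (push ** 2).sum(-1, keepdims=True)
        force += (push / (1.0 + dist2)).sum(axis=1)

        # Explicit step along the net force on every node.
        Y += lr * force
    return Y

# Usage on synthetic data: two Gaussian blobs in 10 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, size=(30, 10)),
               rng.normal(8.0, 1.0, size=(30, 10))])
Y = force_embedding(X)
print(Y.shape)
```

In this sketch every force depends only on a pair of nodes. A non-pairwise term of the kind the abstract describes would instead depend on a node together with its whole neighbourhood.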