Fast hyperboloid decision tree algorithms (2310.13841v2)
Abstract: Hyperbolic geometry is gaining traction in machine learning because it captures hierarchical structure in real-world data effectively. Hyperbolic spaces, in which neighborhoods grow exponentially, offer substantial advantages and consistently deliver state-of-the-art results across diverse applications. However, hyperbolic classifiers are often computationally expensive: methods that rely on Riemannian optimization tend to be slow because operations on Riemannian manifolds are costly. In response, we present hyperDT, a novel extension of decision tree algorithms to hyperbolic space. Crucially, hyperDT avoids computationally intensive Riemannian optimization, numerically unstable exponential and logarithmic maps, and pairwise comparisons between points by using inner products to adapt Euclidean decision tree algorithms to hyperbolic space. Our approach is conceptually straightforward and maintains constant-time decision complexity while mitigating the scalability issues inherent in high-dimensional Euclidean spaces. Building on hyperDT, we introduce hyperRF, a hyperbolic random forest model. Extensive benchmarking across diverse datasets underscores the superior performance of these models, providing a fast, accurate, and user-friendly toolkit for hyperbolic data analysis.
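To make the inner-product idea concrete, the following is a minimal sketch of a constant-time split test, not the hyperDT implementation. It assumes the hyperboloid model with curvature -1 and a decision boundary given by a hyperplane through the origin of the ambient space, parameterized by one spatial dimension and an angle. The function names (`lift_to_hyperboloid`, `minkowski_inner`, `split_side`) and the `dim`/`theta` parameterization are illustrative assumptions rather than names taken from the abstract.

```python
import numpy as np


def lift_to_hyperboloid(spatial):
    """Lift spatial coordinates onto the hyperboloid x0^2 - ||x||^2 = 1.

    Assumes curvature -1; the timelike coordinate x0 is placed at index 0.
    """
    spatial = np.asarray(spatial, dtype=float)
    timelike = np.sqrt(1.0 + np.sum(spatial**2, axis=-1, keepdims=True))
    return np.concatenate([timelike, spatial], axis=-1)


def minkowski_inner(u, v):
    """Minkowski inner product with signature (-, +, ..., +)."""
    return -u[..., 0] * v[..., 0] + np.sum(u[..., 1:] * v[..., 1:], axis=-1)


def split_side(points, dim, theta):
    """Return True where a point lies on the positive side of the split.

    The split is the hyperplane through the ambient origin whose normal has
    only two nonzero entries: -cos(theta) at index 0 and sin(theta) at `dim`.
    Evaluating it is a single sparse dot product, so each decision costs O(1)
    regardless of the ambient dimension. No exponential or logarithmic maps
    and no pairwise distances are involved.
    """
    return np.sin(theta) * points[..., dim] - np.cos(theta) * points[..., 0] > 0


# Hypothetical toy data: three points in a 2-D hyperbolic space.
points = lift_to_hyperboloid([[1.0, 0.0], [0.0, 0.0], [-1.0, 0.0]])
sides = split_side(points, dim=1, theta=np.pi / 3)
```

In this parameterization the hyperplane only intersects the hyperboloid when `theta` lies strictly between pi/4 and 3*pi/4, so a tree learner built on this sketch would restrict candidate angles to that interval.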