Using Enriched Category Theory to Construct the Nearest Neighbour Classification Algorithm (2312.16529v2)
Abstract: This paper is the first to construct and motivate a Machine Learning algorithm solely with Enriched Category Theory, adding to the evidence that Category Theory can provide valuable insights into the construction and explainability of Machine Learning algorithms. It is shown that a series of reasonable assumptions about a dataset leads to the construction of the Nearest Neighbours Algorithm. This construction is produced as an extension of the original dataset using profunctors in the category of Lawvere metric spaces, leading to a definition of an Enriched Nearest Neighbours Algorithm, which, consequently, also produces an enriched form of the Voronoi diagram. Further investigation of the generalisations this construction induces demonstrates how the $k$ Nearest Neighbours Algorithm may also be produced. Moreover, it is shown how the new construction allows metrics on the classification labels to inform the outputs of the Enriched Nearest Neighbours Algorithm, enabling soft classification boundaries and dependent classifications. This paper is intended to be accessible without any knowledge of Category Theory.
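For readers who want a concrete reference point, the following is a minimal sketch of the classical nearest neighbour rule that the abstract says the construction recovers. It is not the paper's enriched definition. The function names (`euclidean`, `label_distances`, `nearest_neighbour`), the Euclidean metric, and the toy data are all assumptions made for illustration. The per-label infimum of distances is only one plausible reading of how a distance-valued relation between query points and labels might be computed.

```python
import math
from typing import Callable, Dict, Hashable, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Ordinary Euclidean distance (an assumed choice of metric)."""
    return math.dist(a, b)


def label_distances(
    query: Point,
    points: Sequence[Point],
    labels: Sequence[Hashable],
    metric: Callable[[Point, Point], float] = euclidean,
) -> Dict[Hashable, float]:
    """For each label, the smallest distance from the query to any
    training point carrying that label."""
    best: Dict[Hashable, float] = {}
    for x, y in zip(points, labels):
        d = metric(query, x)
        if d < best.get(y, math.inf):
            best[y] = d
    return best


def nearest_neighbour(
    query: Point,
    points: Sequence[Point],
    labels: Sequence[Hashable],
    metric: Callable[[Point, Point], float] = euclidean,
) -> Hashable:
    """Classical 1-nearest-neighbour rule: return the label whose
    closest training point is nearest to the query."""
    dists = label_distances(query, points, labels, metric)
    return min(dists, key=dists.get)


# Hypothetical toy dataset: two points labelled "a", one labelled "b".
train_points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
train_labels = ["a", "a", "b"]

prediction_near_a = nearest_neighbour((0.9, 0.2), train_points, train_labels)
prediction_near_b = nearest_neighbour((4.0, 4.0), train_points, train_labels)
```

The set of queries assigned to each label by `nearest_neighbour` is that label's region in the ordinary Voronoi partition of the training points. Keeping the full `label_distances` dictionary, rather than only its minimiser, retains a distance to every label, which is the kind of information a softer classification boundary would need.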