Clustering matrices through optimal permutations (2110.12776v2)
Abstract: Matrices are two-dimensional data structures that allow one to organize information conceptually. For example, adjacency matrices are useful to store the links of a network; correlation matrices are a simple way to arrange gene co-expression data or correlations of neuronal activities. Clustering matrix values into geometric patterns that are easy to interpret helps us understand and explain the functional and structural organization of the system components described by the matrix entries. Here we introduce a theoretical framework to cluster a matrix into a desired pattern by performing a similarity transformation obtained by solving a minimization problem named the optimal permutation problem. On the computational side, we present a fast clustering algorithm that can be applied to any type of matrix, including non-normal and singular matrices. We apply our algorithm to the neuronal correlation matrix and the synaptic adjacency matrix of the Caenorhabditis elegans nervous system by performing different types of clustering, including block-diagonal, nested, banded, and triangular patterns. Some of these clustering patterns are biologically significant: they separate matrix entries into groups that match the experimentally known classification of C. elegans neurons into four broad categories, namely interneurons, motor neurons, sensory neurons, and polymodal neurons.
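To make the idea of clustering by permutation concrete, here is a toy sketch. It rests on assumptions that are not taken from the abstract: the squared-difference cost, the target pattern `T`, and the helper names `permute`, `cost`, and `best_permutation` are all hypothetical, and the exhaustive search is a stand-in for illustration only, not the fast algorithm the paper presents. The sketch reorders the rows and columns of a small matrix `A` together (the similarity transformation P A P^T for a permutation matrix P) so that it matches a block-diagonal pattern as closely as possible.

```python
from itertools import permutations


def permute(A, p):
    """Reorder rows and columns together: B[i][j] = A[p[i]][p[j]]."""
    n = len(A)
    return [[A[p[i]][p[j]] for j in range(n)] for i in range(n)]


def cost(A, T, p):
    """Assumed cost: squared entrywise distance between permuted A and pattern T."""
    B = permute(A, p)
    n = len(A)
    return sum((B[i][j] - T[i][j]) ** 2 for i in range(n) for j in range(n))


def best_permutation(A, T):
    """Exhaustive search over all n! orderings; feasible only for tiny n."""
    return min(permutations(range(len(A))), key=lambda p: cost(A, T, p))


# Toy matrix with two hidden groups, {0, 2} and {1, 3}, interleaved.
A = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

# Desired block-diagonal pattern with two 2x2 blocks.
T = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

p = best_permutation(A, T)
B = permute(A, p)
for row in B:
    print(row)
```

In this toy case the search finds an ordering that places nodes 0 and 2 next to each other, so the permuted matrix coincides with the block-diagonal pattern. Because the transformation only relabels rows and columns, it leaves the eigenvalues of the matrix unchanged. Other target patterns (nested, banded, triangular) could be tried by swapping in a different `T`, again under the same assumed cost.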