Boolean Matrix Multiplication for Highly Clustered Data on the Congested Clique (2405.16103v1)
Abstract: We present a protocol for the Boolean matrix product of two $n\times n$ Boolean matrices on the congested clique, designed for the situation when the rows of the first matrix or the columns of the second matrix are highly clustered in the space $\{0,1\}^n$. With high probability (w.h.p.), it uses $\tilde{O}\left(\sqrt{\frac{M}{n}+1}\right)$ rounds on the congested clique with $n$ nodes, where $M$ is the minimum of the cost of a minimum spanning tree (MST) of the rows of the first input matrix and the cost of an MST of the columns of the second input matrix in the Hamming space $\{0,1\}^n$. A key step in our protocol is the computation of an approximate minimum spanning tree of a set of $n$ points in the space $\{0,1\}^n$. We provide a protocol for this problem (of interest in its own right) based on a known randomized technique of dimension reduction in Hamming spaces. W.h.p., it constructs an $O(1)$-factor approximation of an MST of $n$ points in the Hamming space $\{0,1\}^n$ using $O(\log^3 n)$ rounds on the congested clique with $n$ nodes.
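To make the role of $M$ concrete, the following is a minimal sequential sketch in Python. It is not the congested clique protocol from the paper. It assumes an exact MST computed with Prim's algorithm (the paper uses an approximate MST), and the helper names `hamming`, `mst_parents`, `clustering_cost`, and `clustered_bool_product` are hypothetical, introduced only for illustration. The idea shown is that a row of the product can be obtained from the row of its MST parent by touching only the positions where the two rows of the first matrix differ, so the total update work scales with the MST cost.

```python
from typing import List, Sequence, Tuple


def hamming(u: Sequence[int], v: Sequence[int]) -> int:
    """Hamming distance between two 0/1 vectors of equal length."""
    return sum(a != b for a, b in zip(u, v))


def mst_parents(points: List[Sequence[int]]) -> Tuple[List[int], List[int], int]:
    """Prim's algorithm on the complete graph with Hamming edge weights.

    Returns (order, parent, cost): the order in which points join the tree,
    the parent of each point (-1 for the root), and the total MST cost.
    """
    n = len(points)
    if n == 0:
        return [], [], 0
    in_tree = [False] * n
    best = [float("inf")] * n
    parent = [-1] * n
    best[0] = 0
    order, cost = [], 0
    for _ in range(n):
        # Pick the cheapest point not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        order.append(u)
        cost += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = hamming(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return order, parent, cost


def clustering_cost(A: List[List[int]], B: List[List[int]]) -> int:
    """M: the smaller of the MST cost of A's rows and of B's columns."""
    cols_of_B = [list(c) for c in zip(*B)]
    return min(mst_parents(A)[2], mst_parents(cols_of_B)[2])


def clustered_bool_product(A: List[List[int]], B: List[List[int]]) -> List[List[int]]:
    """Boolean product of A and B, walking an MST of the rows of A.

    For each row we keep witness counts sum_k A[i][k] * B[k][j]; a child row
    is derived from its parent by adding or subtracting the rows of B at the
    positions where the child and parent rows of A differ.
    """
    if not A:
        return []
    n_cols = len(B[0]) if B else 0
    order, parent, _ = mst_parents(A)
    counts: List[List[int]] = [[] for _ in A]
    for u in order:
        p = parent[u]
        if p == -1:
            c = [0] * n_cols  # root row: computed from scratch
            for k, bit in enumerate(A[u]):
                if bit:
                    for j in range(n_cols):
                        c[j] += B[k][j]
        else:
            c = counts[p][:]  # start from the parent's counts
            for k in range(len(A[u])):
                if A[u][k] != A[p][k]:
                    sign = 1 if A[u][k] else -1
                    for j in range(n_cols):
                        c[j] += sign * B[k][j]
        counts[u] = c
    return [[1 if x > 0 else 0 for x in row] for row in counts]
```

In this sketch the root row costs a full pass over the second matrix, while every other row costs work proportional to its Hamming distance from its parent, so highly clustered rows (small MST cost) are cheap. The symmetric variant for clustered columns of the second matrix would apply the same routine to the transposed product.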