
The Input/Output Complexity of Sparse Matrix Multiplication (1403.3551v1)

Published 14 Mar 2014 in cs.DS

Abstract: We consider the problem of multiplying sparse matrices (over a semiring) where the number of non-zero entries is larger than main memory. In the classical paper of Hong and Kung (STOC '81) it was shown that to compute a product of dense $U \times U$ matrices, $\Theta \left(U^3 / (B \sqrt{M}) \right)$ I/Os are necessary and sufficient in the I/O model with internal memory size $M$ and memory block size $B$. In this paper we generalize the upper and lower bounds of Hong and Kung to the sparse case. Our bounds depend on the number $N = \mathtt{nnz}(A)+\mathtt{nnz}(C)$ of nonzero entries in $A$ and $C$, as well as the number $Z = \mathtt{nnz}(AC)$ of nonzero entries in $AC$. We show that $AC$ can be computed using $\tilde{O} \left(\tfrac{N}{B} \min\left(\sqrt{\tfrac{Z}{M}},\tfrac{N}{M}\right) \right)$ I/Os, with high probability. This is tight (up to polylogarithmic factors) when only semiring operations are allowed, even for dense rectangular matrices: We show a lower bound of $\Omega \left(\tfrac{N}{B} \min\left(\sqrt{\tfrac{Z}{M}},\tfrac{N}{M}\right) \right)$ I/Os. While our lower bound uses fairly standard techniques, the upper bound makes use of "compressed matrix multiplication" sketches, which is new in the context of I/O-efficient algorithms, and a new matrix product size estimation technique that avoids the "no cancellation" assumption.
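
To get a feel for how the two bounds in the abstract relate, the following is a minimal sketch that simply evaluates the expressions $\tfrac{N}{B} \min\left(\sqrt{Z/M}, N/M\right)$ and $U^3 / (B \sqrt{M})$ with all constants and polylogarithmic factors dropped. The function names and the sample parameter values are assumptions made for illustration and do not come from the paper; the code does not implement the paper's algorithm, only the arithmetic of its bounds.

```python
import math


def sparse_mm_io_bound(N, Z, M, B):
    """Evaluate (N/B) * min(sqrt(Z/M), N/M).

    N: nnz(A) + nnz(C), Z: nnz(AC), M: internal memory size, B: block size.
    Constants and polylogarithmic factors are ignored (illustrative only).
    """
    if min(N, Z, M, B) <= 0:
        raise ValueError("all parameters must be positive")
    return (N / B) * min(math.sqrt(Z / M), N / M)


def dense_mm_io_bound(U, M, B):
    """Evaluate the Hong-Kung expression U^3 / (B * sqrt(M)) for dense U x U inputs."""
    if min(U, M, B) <= 0:
        raise ValueError("all parameters must be positive")
    return U ** 3 / (B * math.sqrt(M))


if __name__ == "__main__":
    # Hypothetical machine parameters, chosen as powers of two for readability.
    M, B = 2 ** 20, 2 ** 10

    # Dense square case: N = 2 * U^2 input entries, Z = U^2 output entries.
    U = 2 ** 12
    print(dense_mm_io_bound(U, M, B))
    print(sparse_mm_io_bound(2 * U * U, U * U, M, B))

    # A case with few inputs but a large output, where the N/M term is smaller.
    print(sparse_mm_io_bound(2 ** 22, 2 ** 40, M, B))
```

Setting $N = 2U^2$ and $Z = U^2$ in the sparse expression gives $2U^3 / (B\sqrt{M})$ whenever $U \geq \sqrt{M}/2$, so in that regime it matches the dense bound up to a constant factor, which is the sense in which the sparse bounds generalize the dense one.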

Authors (2)
  1. Rasmus Pagh (88 papers)
  2. Morten Stöckel (9 papers)
Citations (24)