Matrix Product Sketching via Coordinated Sampling (2501.17836v1)
Abstract: We revisit the well-studied problem of approximating a matrix product, $\mathbf{A}^T\mathbf{B}$, based on small-space sketches $\mathcal{S}(\mathbf{A})$ and $\mathcal{S}(\mathbf{B})$ of $\mathbf{A} \in \mathbb{R}^{n \times d}$ and $\mathbf{B} \in \mathbb{R}^{n \times m}$. We are interested in the setting where the sketches must be computed independently of each other, except for the use of a shared random seed. We prove that, when $\mathbf{A}$ and $\mathbf{B}$ are sparse, methods based on \emph{coordinated random sampling} can outperform classical linear sketching approaches, like Johnson-Lindenstrauss projection or CountSketch. For example, to obtain Frobenius norm error $\epsilon\|\mathbf{A}\|_F\|\mathbf{B}\|_F$, coordinated sampling requires sketches of size $O(s/\epsilon^2)$ when $\mathbf{A}$ and $\mathbf{B}$ have at most $s \leq d,m$ non-zeros per row. In contrast, linear sketching leads to sketches of size $O(d/\epsilon^2)$ and $O(m/\epsilon^2)$ for $\mathbf{A}$ and $\mathbf{B}$. We empirically evaluate our approach on two applications: 1) distributed linear regression in databases, a problem motivated by tasks like dataset discovery and augmentation, and 2) approximating attention matrices in transformer-based LLMs. In both cases, our sampling algorithms yield an order of magnitude improvement over linear sketching.
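To make the coordinated-sampling idea concrete, below is a minimal illustrative sketch in NumPy, not the paper's exact algorithm. Under the assumed scheme, both parties derive shared uniform random values from a common seed, each independently keeps rows with probability proportional to its own squared row norms, and $\mathbf{A}^T\mathbf{B}$ is estimated from the rows that survive in both sketches, reweighted by their inverse inclusion probabilities. The function names `sketch` and `estimate_product`, the expected sample size `k`, and the squared-row-norm weighting are illustrative assumptions.

```python
import numpy as np

def sketch(M, k, seed):
    """Coordinated sample of ~k rows of M, chosen proportionally to
    squared row norms, using shared uniforms derived from `seed`.
    (Illustrative variant; the paper's method may differ in details.)"""
    n = M.shape[0]
    u = np.random.default_rng(seed).random(n)     # shared uniforms, one per row
    w = (M ** 2).sum(axis=1)                      # squared row norms as weights
    p = np.minimum(1.0, k * w / w.sum())          # per-row inclusion probabilities
    idx = np.flatnonzero(u <= p)                  # coordinated inclusion decision
    return idx, M[idx], p[idx]

def estimate_product(skA, skB):
    """Unbiased estimate of A^T B from two coordinated sketches."""
    idxA, rowsA, pA = skA
    idxB, rowsB, pB = skB
    # A row survives in both sketches iff its shared uniform is below
    # both inclusion probabilities, i.e. with probability min(pA, pB).
    _, ia, ib = np.intersect1d(idxA, idxB, return_indices=True)
    scale = 1.0 / np.minimum(pA[ia], pB[ib])      # inverse-probability weights
    return (rowsA[ia] * scale[:, None]).T @ rowsB[ib]

# Usage: sketches are built independently but with the same seed.
rng = np.random.default_rng(1)
A, B = rng.standard_normal((1000, 20)), rng.standard_normal((1000, 30))
est = estimate_product(sketch(A, 200, seed=42), sketch(B, 200, seed=42))
err = np.linalg.norm(est - A.T @ B) / (np.linalg.norm(A) * np.linalg.norm(B))
print(f"relative Frobenius error: {err:.3f}")
```

The shared seed is what makes the scheme "coordinated": because both sketches consult the same uniforms, rows that are heavy in both matrices are likely to survive in both samples, which is exactly what the estimator needs.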