
Strong Scaling of Matrix Multiplication Algorithms and Memory-Independent Communication Lower Bounds (1202.3177v1)

Published 14 Feb 2012 in cs.DS, cs.CC, cs.DC, cs.NA, math.CO, and math.NA

Abstract: A parallel algorithm has perfect strong scaling if its running time on P processors is linear in 1/P, including all communication costs. Distributed-memory parallel algorithms for matrix multiplication with perfect strong scaling have only recently been found. One is based on classical matrix multiplication (Solomonik and Demmel, 2011), and one is based on Strassen's fast matrix multiplication (Ballard, Demmel, Holtz, Lipshitz, and Schwartz, 2012). Both algorithms scale perfectly, but only up to some number of processors where the inter-processor communication no longer scales. We obtain a memory-independent communication cost lower bound on classical and Strassen-based distributed-memory matrix multiplication algorithms. These bounds imply that no classical or Strassen-based parallel matrix multiplication algorithm can strongly scale perfectly beyond the ranges already attained by the two parallel algorithms mentioned above. The memory-independent bounds and the strong scaling bounds generalize to other algorithms.
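The interplay the abstract describes can be illustrated numerically. The following is a rough sketch, not the paper's own code. It assumes the commonly stated asymptotic forms of the two per-processor communication lower bounds for n x n matrices on P processors with local memory M: a memory-dependent bound of order n^w / (P * M^(w/2 - 1)) and a memory-independent bound of order n^2 / P^(2/w), where w = 3 for classical and w = log2(7) for Strassen-based algorithms. Constant factors are dropped, and all function and variable names are hypothetical.

```python
import math

# Exponents of the arithmetic cost (assumed: 3 for classical, log2(7) for Strassen).
OMEGA_CLASSICAL = 3.0
OMEGA_STRASSEN = math.log2(7)


def memory_dependent_bound(n, P, M, omega):
    """Asymptotic form n^omega / (P * M^(omega/2 - 1)), constants omitted."""
    return n**omega / (P * M ** (omega / 2 - 1))


def memory_independent_bound(n, P, omega):
    """Asymptotic form n^2 / P^(2/omega), constants omitted."""
    return n**2 / P ** (2 / omega)


def communication_lower_bound(n, P, M, omega):
    """Both bounds hold, so the larger of the two is the binding one."""
    return max(
        memory_dependent_bound(n, P, M, omega),
        memory_independent_bound(n, P, omega),
    )


def strong_scaling_limit(n, M, omega):
    """Processor count at which the two bounds cross: P = n^omega / M^(omega/2).

    Below this P the memory-dependent bound dominates and shrinks like 1/P,
    so perfect strong scaling is possible. Above it the memory-independent
    bound dominates and shrinks more slowly than 1/P.
    """
    return n**omega / M ** (omega / 2)


if __name__ == "__main__":
    n, M = 4096, 2**20  # illustrative sizes only, measured in words
    for name, omega in [("classical", OMEGA_CLASSICAL), ("Strassen", OMEGA_STRASSEN)]:
        p_max = strong_scaling_limit(n, M, omega)
        print(f"{name}: perfect strong scaling limited to about P = {p_max:.1f}")
```

With these illustrative values, the classical crossover lands at about 64 processors. Past that point, doubling P reduces the binding bound by a factor of only 2^(2/3) rather than 2, which is the sense in which communication "no longer scales".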

Authors (5)
  1. Grey Ballard (36 papers)
  2. James Demmel (54 papers)
  3. Olga Holtz (16 papers)
  4. Benjamin Lipshitz (7 papers)
  5. Oded Schwartz (14 papers)
Citations (63)