Communication lower bounds for nested bilinear algorithms via rank expansion of Kronecker products (2107.09834v4)
Abstract: We develop lower bounds on communication in the memory hierarchy or between processors for nested bilinear algorithms, such as Strassen's algorithm for matrix multiplication. We build on a previous framework that establishes communication lower bounds using the rank expansion, that is, the minimum rank of any fixed-size subset of columns of a matrix, for each of the three matrices encoding a bilinear algorithm. This framework provides lower bounds for a class of dependency directed acyclic graphs (DAGs) corresponding to the execution of a given bilinear algorithm, in contrast to other approaches that yield bounds for specific DAGs. However, our lower bounds apply only to executions that do not compute the same DAG node multiple times. Two bilinear algorithms can be nested by taking Kronecker products between their encoding matrices. Our main result is a lower bound on the rank expansion of a matrix constructed by a Kronecker product, derived from lower bounds on the rank expansion of the Kronecker product's operands. We apply the rank expansion lower bounds to obtain novel communication lower bounds for nested Toom-Cook convolution, Strassen's algorithm, and fast algorithms for contraction of partially symmetric tensors.
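The two central notions in the abstract, nesting by Kronecker product and the rank expansion, can be illustrated on a small case. The sketch below is not taken from the paper: it assumes Karatsuba's three-multiplication scheme (the smallest Toom-Cook instance) as the bilinear algorithm, written as c = C((Aᵀa) ⊙ (Bᵀb)), and the helper names `kron`, `rank`, `rank_expansion`, and `apply_bilinear` are hypothetical. The rank expansion is computed by brute force over column subsets, which is only feasible for tiny matrices.

```python
from fractions import Fraction
from itertools import combinations

def kron(P, Q):
    """Kronecker product of two matrices given as lists of rows."""
    return [[p * q for p in prow for q in qrow] for prow in P for qrow in Q]

def rank(M):
    """Exact rank via Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in M]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][c] / rows[rk][c]
            if f != 0:
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def rank_expansion(M, k):
    """Minimum rank over all subsets of k columns of M (brute force)."""
    ncols = len(M[0])
    return min(
        rank([[row[j] for j in cols] for row in M])
        for cols in combinations(range(ncols), k)
    )

def apply_bilinear(A, B, C, a, b):
    """Evaluate c = C((A^T a) * (B^T b)), with * the elementwise product."""
    R = len(A[0])
    ta = [sum(A[i][r] * a[i] for i in range(len(a))) for r in range(R)]
    tb = [sum(B[i][r] * b[i] for i in range(len(b))) for r in range(R)]
    prod = [x * y for x, y in zip(ta, tb)]
    return [sum(C[i][r] * prod[r] for r in range(R)) for i in range(len(C))]

# Assumed example: Karatsuba for the product of two degree-1 polynomials.
A = [[1, 1, 0],
     [0, 1, 1]]
B = A
C = [[1, 0, 0],
     [-1, 1, -1],
     [0, 0, 1]]

# Nesting: Kronecker products of the encoding matrices give a 9-multiplication
# algorithm for 2D convolution of two 2x2 arrays (flattened row-major).
A2, B2, C2 = kron(A, A), kron(B, B), kron(C, C)

def conv2d_direct(x, y):
    """Reference 2D linear convolution of two row-major 2x2 arrays."""
    z = [0] * 9
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    z[3 * (i + k) + (j + l)] += x[2 * i + j] * y[2 * k + l]
    return z

if __name__ == "__main__":
    x, y = [1, 2, 3, 4], [5, 6, 7, 8]
    print(apply_bilinear(A2, B2, C2, x, y) == conv2d_direct(x, y))
    for k in range(1, 10):
        print(k, rank_expansion(A2, k))
```

Comparing `rank_expansion(A2, k)` with `rank_expansion(A, k)` for small `k` shows, on this toy case, the relationship between the rank expansion of a Kronecker product and that of its operands, which is the quantity the main result bounds from below.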
- Caleb Ju
- Yifan Zhang
- Edgar Solomonik