
Cache-oblivious Matrix Multiplication for Exact Factorisation (1705.04807v1)

Published 11 May 2017 in cs.NA

Abstract: We present a cache-oblivious adaptation of matrix multiplication to be incorporated in the parallel TU decomposition for rectangular matrices over finite fields, based on the Morton-hybrid space-filling curve representation. To realise this, we introduce the concepts of alignment and containment of sub-matrices under the Morton-hybrid layout. We redesign the decompositions within the recursive matrix multiplication (MM) so that the base case avoids all jumps in address space, at the expense of extra recursive MM calls. We show that the resulting cache-oblivious adaptation has low span, and our experiments show that its sequential evaluation order yields orders-of-magnitude improvement in run-time, despite the recursion overhead.
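
To make the layout mentioned in the abstract concrete, here is a minimal sketch of one common reading of a Morton-hybrid layout: square tiles are ordered along the Morton (Z-order) curve, and elements inside each tile are stored row-major. The function names, the `block` parameter, and the choice of putting row bits in the odd positions are assumptions made for illustration; they are not taken from the paper, whose exact conventions may differ.

```python
def interleave_bits(row: int, col: int, bits: int = 16) -> int:
    """Morton (Z-order) index of (row, col).

    Assumed convention: column bits occupy even positions and
    row bits occupy odd positions of the result.
    """
    z = 0
    for i in range(bits):
        z |= ((row >> i) & 1) << (2 * i + 1)
        z |= ((col >> i) & 1) << (2 * i)
    return z


def morton_hybrid_offset(row: int, col: int, block: int) -> int:
    """Linear offset of element (row, col) in a hybrid layout.

    Tiles of size block x block are ordered by Morton index;
    inside a tile, elements are stored row-major. This is a
    hypothetical helper, not the paper's implementation.
    """
    tile = interleave_bits(row // block, col // block)
    return tile * block * block + (row % block) * block + (col % block)


if __name__ == "__main__":
    # Print the offsets of an 8 x 8 matrix stored with 4 x 4 tiles.
    for r in range(8):
        print([morton_hybrid_offset(r, c, 4) for c in range(8)])
```

Under these assumptions every tile occupies one contiguous run of `block * block` addresses, so a base case that works on a single aligned tile never jumps in address space.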
