Numeric Canonicalization Techniques
- Numeric canonicalization is the process of converting mathematical objects into unique minimal representatives using specific equivalence relations and lexicographic ordering.
- It applies to structures like matrices over finite fields and batched Einstein summations, enabling efficient equivalence testing, optimization, and reuse of computation.
- The methodology includes recursive row/column sorting and graph-based canonical labeling, which have been shown to yield significant speedups in high-performance numerical computing.
Numeric canonicalization is the process of mapping mathematical objects, such as matrices with integer entries modulo $p$ or sets of indexed tensor operations (e.g., batched Einstein summations), to unique, minimal representatives under a specified equivalence relation. This is typically achieved via a total ordering (often lexicographic) that allows algorithmic identification, comparison, and optimization across equivalence classes. Numeric canonicalization serves as a foundation for equivalence testing, code transformation, and reuse of computational work in symbolic, algebraic, and high-performance numerical computing.
1. Equivalence Relations and Canonical Representatives
The core of numeric canonicalization is defining an equivalence relation under which objects are considered "the same" for computational or structural purposes. For $n \times m$ matrices with entries in $\mathbb{Z}_p$, the relevant equivalence is simultaneous row and column permutation: $A \sim B$ if and only if there exist permutation matrices $X \in \mathcal{P}_n$ and $Y \in \mathcal{P}_m$ such that $A = XBY$, where $\mathcal{P}_k$ is the group of $k \times k$ permutation matrices. Canonicalization then selects, from each equivalence class, a unique representative, typically the minimal one under lexicographic order on an encoding of the object (Yordzhev, 2021).
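As a small illustration of the matrix equivalence (the matrices and symbol names below are chosen for this example and are not taken from the cited work), take entries modulo $3$ and consider

$$
A = \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \qquad
Y = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
$$

Multiplying on the right by $Y$ swaps the two columns of $B$:

$$
B\,Y = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix} = A,
$$

so $A$ and $B$ lie in the same equivalence class, with the identity as the row permutation. A canonicalization procedure has to return the same representative for both.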
For batched Einstein summation expressions, equivalence is defined by index renaming, batch and operand permutations, and algebraic symmetries (commutativity and associativity of products). The absence of a canonical form impedes reuse and fusion of computational results; deriving a unique normal form enables systematized optimization (Kulkarni et al., 18 Jan 2026).
2. Matrix Canonicalization Over $\mathbb{Z}_p$: Methods and Characterization
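Given an $n \times m$ matrix $A$ with entries in $\mathbb{Z}_p$, $A$ is uniquely represented by its row-tuple $r(A) = \langle x_1, x_2, \ldots, x_n \rangle$, where $x_i$ is an integer encoding of row $i$ (for example, the integer $0 \le x_i \le p^m - 1$ whose base-$p$ digits are the entries of that row). Lexicographic ordering on row-tuples imposes a total order on matrices. The canonical matrix in the equivalence class of $A$ is the unique representative with minimal row-tuple, i.e., the matrix $C \sim A$ with $r(C) = \min \{\, r(B) : B \sim A \,\}$ (Yordzhev, 2021).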
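The definition can be checked directly on tiny matrices. The following is a minimal brute-force sketch, not the method of the cited paper: it assumes rows are encoded as base-$p$ integers, and the function names `row_tuple` and `canonical_bruteforce` are hypothetical. It enumerates every row and column permutation, so it is only usable for very small dimensions.

```python
from itertools import permutations


def row_tuple(matrix, p):
    """Encode each row as the integer whose base-p digits are its entries.

    Assumption for this sketch: the first entry is the most significant digit.
    """
    encoded = []
    for row in matrix:
        value = 0
        for entry in row:
            if not 0 <= entry < p:
                raise ValueError("entries must lie in {0, ..., p-1}")
            value = value * p + entry
        encoded.append(value)
    return tuple(encoded)


def canonical_bruteforce(matrix, p):
    """Return the equivalent matrix whose row-tuple is lexicographically minimal."""
    n, m = len(matrix), len(matrix[0])
    best = None
    for rows in permutations(range(n)):
        for cols in permutations(range(m)):
            candidate = [[matrix[i][j] for j in cols] for i in rows]
            # Python compares tuples lexicographically, which is the order we need.
            if best is None or row_tuple(candidate, p) < row_tuple(best, p):
                best = candidate
    return best


B = [[1, 0], [0, 2]]
print(row_tuple(B, 3))             # (3, 2)
print(canonical_bruteforce(B, 3))  # [[0, 1], [2, 0]]
```

The cost grows as $n!\,m!$, which is the reason a structural characterization of canonical matrices is useful in practice.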
A matrix is canonical if and only if it satisfies six conditions (Theorem 4.1 of Yordzhev, 2021): (1) nondecreasing row-tuple; (2) first-row size bound; (3) nondecreasing last $s$ columns, where $s$ is the number of nonzero entries in the first row; (4) row-weight monotonicity; (5) local minimality under certain row/column swap operations; (6) recursive canonicity of a lower-dimensional submatrix arising from deleting certain rows and columns.
The canonical form is algorithmically computed by recursively sorting rows and appropriate columns and applying these necessary and sufficient criteria, ensuring uniqueness within each equivalence class.
3. Batched Einstein Summation Canonicalization
In the context of tensor algebra, equivalence of batched Einstein summations is complicated by multiple sources of symmetry: index names, order of operands, factor permutations in products, and batch arrangements. To render equivalence checking tractable and canonicalization practical, each batched einsum is encoded as a vertex-colored, directed graph $G$. This graph captures all operand, index, positional, and batch symmetries using appropriately colored node classes and edges that encode access patterns and operand structure (Kulkarni et al., 18 Jan 2026).
Mature graph-canonicalization algorithms (e.g., nauty, bliss) are applied to $G$, producing a canonical labeling of its nodes. This labeling is then inverted to yield the canonical batched-einsum tuple. Equivalence testing becomes a single comparison: two batched einsums are equivalent exactly when their canonical forms are identical.
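To make the graph encoding concrete, here is a hypothetical sketch for a single (unbatched) einsum given as a NumPy-style subscript string. The node classes, colors, and the function name `einsum_to_colored_graph` are assumptions made for illustration and are not the encoding defined in the cited paper. Operand nodes share one color so that reordering operands is a graph symmetry, index nodes share one color so that renaming indices is a symmetry, and axis-slot nodes are colored by position so that transposing an operand is not.

```python
from collections import Counter


def einsum_to_colored_graph(spec):
    """Encode a spec such as 'ij,jk->ik' as (node colors, directed edges).

    Illustrative encoding only; ellipses and whitespace are not handled.
    """
    inputs, output = spec.split("->")
    operands = inputs.split(",")
    colors = {}    # node -> color
    edges = set()  # directed (source, target) pairs

    # One node per distinct index name; the color does not depend on the name.
    for name in sorted(set(inputs.replace(",", "") + output)):
        colors[("index", name)] = "index"

    def add_access(owner, subscripts):
        # One slot node per axis, colored by axis position.
        for axis, name in enumerate(subscripts):
            slot = ("slot", owner, axis)
            colors[slot] = ("axis", axis)
            edges.add((owner, slot))
            edges.add((slot, ("index", name)))

    for k, subscripts in enumerate(operands):
        owner = ("operand", k)
        colors[owner] = "operand"
        add_access(owner, subscripts)

    out = ("output",)
    colors[out] = "output"
    add_access(out, output)
    return colors, edges


colors, edges = einsum_to_colored_graph("ij,jk->ik")
print(len(colors), len(edges))
print(Counter(colors.values()))
```

A graph of this kind would then be handed to an external canonical-labeling tool. Matching color histograms, as printed here, are a necessary condition for equivalence but not a sufficient one.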
4. Algorithmic Procedures and Pseudocode
Matrix Canonicalization
The canonicalization of a matrix $A$ over $\mathbb{Z}_p$ follows a recursive, stepwise algorithm:
- Row Sorting: Sort the rows so that the row-tuple of $A$ is in nondecreasing order.
- Column Sorting: If the last $s$ columns, where $s$ is the number of nonzero entries in the first row, are not in nondecreasing order, reorder them; if this lowers the row-tuple, $A$ is non-canonical.
- Row-Weight Check: Every row below the first must have at least as many nonzero entries as the first row.
- Local Minimality: For each candidate block swap, ensure that it does not yield a lexicographically smaller row-tuple.
- Recursion: Apply the four steps above to the remaining submatrix. Canonicalization is complete when no swap or submatrix operation can further decrease the row-tuple.
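The role of the row-sorting step can be seen in a simplified reference procedure. The sketch below is not the recursive algorithm of the cited paper and does not implement its six conditions. It relies only on the observation that, for a fixed column order, sorting the encoded rows gives the smallest row-tuple reachable by row permutations, so only column orders need to be enumerated. The base-$p$ row encoding and all function names are assumptions for this example.

```python
from itertools import permutations


def encode_row(row, p):
    """Read a row as a base-p integer, first entry most significant."""
    value = 0
    for entry in row:
        value = value * p + entry
    return value


def canonical_row_tuple(matrix, p):
    """Minimal row-tuple in the equivalence class, by exhaustive column search."""
    m = len(matrix[0])
    best = None
    for cols in permutations(range(m)):
        # Row sorting: for this column order, the sorted tuple is the minimum
        # over all row permutations.
        candidate = tuple(
            sorted(encode_row([row[j] for j in cols], p) for row in matrix)
        )
        if best is None or candidate < best:
            best = candidate
    return best


def is_canonical(matrix, p):
    """A matrix is canonical when its own row-tuple is already minimal."""
    current = tuple(encode_row(row, p) for row in matrix)
    return current == canonical_row_tuple(matrix, p)


def equivalent(a, b, p):
    """Equivalence test by comparing canonical row-tuples."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return False
    return canonical_row_tuple(a, p) == canonical_row_tuple(b, p)


print(is_canonical([[0, 1], [2, 0]], 3))                  # True
print(is_canonical([[1, 0], [0, 2]], 3))                  # False
print(equivalent([[0, 1], [2, 0]], [[1, 0], [0, 2]], 3))  # True
```

This still costs $m!$ column orders times a sort, so it serves as a correctness oracle for small cases rather than a replacement for the criteria-based procedure.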
Batched Einsum Canonicalization
The process comprises:
- Graph Construction: Build the colored graph representing the batched einsum’s structural elements and dependencies.
- Canonical Relabeling: Apply a black-box canonicalizer to obtain node relabeling.
- Permutation and Reconstruction: Map the permuted graph back to a batched-einsum form.
- Substitution Map Construction: Derive argument and index name substitutions.
- Return: The canonicalized form and substitution maps.
The reported algorithmic complexity is linear or quadratic in the number of graph nodes, and practical instances complete canonicalization in under 1 ms (Kulkarni et al., 18 Jan 2026).
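The pipeline listed above can be imitated on a small scale without a graph library. The sketch below is a brute-force stand-in for the black-box canonicalizer, restricted to a single einsum written as a NumPy-style subscript string: it tries every operand order, renames indices in order of first appearance, and keeps the lexicographically smallest result. The function name `canonicalize_einsum` and the returned tuple layout are hypothetical, operands are treated as anonymous, and the search is factorial in the number of operands, which is exactly what a real graph canonicalizer avoids.

```python
from itertools import permutations
import string


def _rename(subscripts, table):
    """Rename indices in first-appearance order, extending `table` in place."""
    renamed = []
    for name in subscripts:
        if name not in table:
            table[name] = string.ascii_lowercase[len(table)]
        renamed.append(table[name])
    return "".join(renamed)


def canonicalize_einsum(spec):
    """Return (canonical spec, index substitution map, operand order)."""
    inputs, output = spec.split("->")
    operands = inputs.split(",")
    best = None
    for order in permutations(range(len(operands))):
        table = {}  # user index name -> canonical index name
        new_operands = [_rename(operands[k], table) for k in order]
        new_output = _rename(output, table)
        candidate = ",".join(new_operands) + "->" + new_output
        if best is None or candidate < best[0]:
            best = (candidate, dict(table), order)
    return best


form_a, index_map_a, order_a = canonicalize_einsum("ij,jk->ik")
form_b, index_map_b, order_b = canonicalize_einsum("bc,ab->ac")
print(form_a)            # ab,bc->ac
print(form_a == form_b)  # True
print(index_map_a)       # {'i': 'a', 'j': 'b', 'k': 'c'}
```

The substitution map and operand order returned alongside the canonical string play the role of the substitution maps in the final two steps of the list.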
5. Functional and Optimization Aspects
For batched Einstein summations, arrays can be symbolic, given as functional operands. The canonicalization process remains valid whether the operands are concrete buffers or arbitrary pointwise functions, provided the induced memory-access pattern is unchanged. This allows reuse of performance tuning and code transformation knowledge across broad classes of equivalent tensor operations, as performance is primarily determined by the pattern of data access rather than the precise functional form (Kulkarni et al., 18 Jan 2026).
Transforms and optimizations tuned for the canonical form are "transferred back" to the user's original expression via inverse substitution maps. Thus, high-performance code for one canonical pattern is automatically specialized and applied to all computationally equivalent variants, including those differing by automatic code generation or user-level naming.
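A minimal sketch of the "transfer back" step, assuming the substitution map is a one-to-one dictionary from user index names to canonical index names and that the tuned artifact is simply a loop order over canonical names. Both the data layout and the function names are hypothetical and stand in for whatever transformation record a real system stores.

```python
def invert(substitution):
    """Invert a user -> canonical map, checking that it is one-to-one."""
    inverse = {canon: user for user, canon in substitution.items()}
    if len(inverse) != len(substitution):
        raise ValueError("substitution map must be one-to-one")
    return inverse


def transfer_back(canonical_loop_order, substitution):
    """Express a loop order tuned on the canonical form in the user's index names."""
    inverse = invert(substitution)
    return [inverse[name] for name in canonical_loop_order]


index_map = {"i": "a", "j": "b", "k": "c"}  # user name -> canonical name
tuned_order = ["a", "c", "b"]               # found once for the canonical form
print(transfer_back(tuned_order, index_map))  # ['i', 'k', 'j']
```

Because the map is a bijection, any expression that canonicalizes to the same form can reuse `tuned_order` with its own substitution map.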
6. Applications and Empirical Results
Numeric canonicalization underpins code transformation, optimization, and benchmarking workflows in numerical linear algebra, finite element methods, computational chemistry, and related high-performance fields.
For matrices over $\mathbb{Z}_p$, the canonical form simplifies classification problems, isomorphism testing, and object tabulation.
For batched einsums, canonicalization enables substantial performance improvements:
- DG-FEM macro kernels achieved speedups up to , typically $5$– for polynomial order , due to fusion and reduced memory loads.
- In the TCCG tensor-contraction suite, using canonical forms led to a geometric-mean throughput speedup of (up to for the memory-bound subset) over JAX/cuBLAS implementations.
- Canonicalization overhead is negligible relative to kernel generation times, remaining below $1$ ms in all observed cases (Kulkarni et al., 18 Jan 2026).
7. Representative Examples and Interpretive Remarks
A nontrivial worked instance of matrix canonicalization, for fixed matrix dimensions and modulus, demonstrates that the procedure correctly identifies the unique minimal matrix in its equivalence class, even when the original matrix is in semi-canonical form but not minimal. In the batched einsum context, colored graph encodings and their canonical forms allow detection, fusion, and recognition of equivalent computation patterns at scale.
This suggests that numeric canonicalization is a key enabler for scalable, automated, and architecture-portable computation in domains where structural symmetry, data reuse, and symbolic manipulation are intertwined. The rigorous, algorithmically explicit approaches described provide the basis for both theoretical analysis and practical system design in contemporary numerical computing (Yordzhev, 2021, Kulkarni et al., 18 Jan 2026).