Symmetric Matrix Eigenvalue Decomposition
- Eigenvalue decomposition for symmetric matrices is the orthogonal diagonalization of a real symmetric matrix in terms of real eigenvalues and an orthonormal basis of eigenvectors.
- The method leverages univariate Taylor polynomial arithmetic to compute higher-order derivatives, facilitating precise sensitivity analysis and structured handling of repeated eigenvalues.
- Reverse mode algorithmic differentiation applies structured lifting and adjoint propagation to ensure stable, scalable computations in advanced numerical simulations and optimization.
Eigenvalue decomposition for symmetric matrices is a central concept in numerical linear algebra, theoretical physics, and applied mathematics. For a real symmetric matrix $A \in \mathbb{R}^{n \times n}$, the spectral theorem guarantees that $A$ can be orthogonally diagonalized: there exists an orthogonal matrix $Q$ and a diagonal matrix $\Lambda$ such that $A = Q \Lambda Q^T$. The eigenvalues are real and the eigenvectors can be chosen to form an orthonormal basis. This decomposition is fundamental not only for matrix functions, stability analysis, and dimensionality reduction, but also for the algorithmic differentiation of programs involving matrix factorizations.
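As a concrete illustration (my own sketch, not part of the source material), the decomposition and its defining properties can be verified numerically with NumPy's `eigh`:

```python
# A concrete check of the spectral theorem with NumPy (illustration only):
# A = Q diag(lam) Q^T with Q orthogonal and lam real.
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
A = 0.5 * (B + B.T)                          # symmetrize to get a real symmetric matrix

lam, Q = np.linalg.eigh(A)                   # real eigenvalues, orthonormal eigenvectors

print(np.allclose(Q @ np.diag(lam) @ Q.T, A))    # True:  A = Q Lambda Q^T
print(np.allclose(Q.T @ Q, np.eye(4)))           # True:  Q^T Q = I
```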
1. Implicit Definition and Characterization
The symmetric eigenvalue decomposition can be cast as the solution of an implicit system of matrix equations. For a symmetric $A$, one seeks $Q$ (orthogonal) and $\Lambda$ (diagonal) so that
$$Q^T A Q - \Lambda = 0, \qquad Q^T Q - I = 0, \qquad (P_L + P_U) \circ (Q^T A Q) = 0,$$
where $P_L$ and $P_U$ are strictly lower and upper triangular projectors (with $(P_L)_{ij} = 1$ for $i > j$, $P_U = P_L^T$), and $\circ$ denotes the element-wise (Hadamard) product. The "sifting" condition ensures the off-diagonal entries of $Q^T A Q$ are zero in a structured way. These equations capture the eigenvalue-eigenvector mapping as an implicit function of $A$, providing a foundation for algorithmic differentiation and higher-order analysis.
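A small sketch (my own, with the hypothetical helper `eigh_residuals`) that evaluates the residuals of this implicit system for a computed decomposition:

```python
# Check that a computed (Q, lam) satisfies the implicit system above
# (illustrative sketch, not the paper's code).
import numpy as np

def eigh_residuals(A, Q, lam):
    n = A.shape[0]
    PL = np.tril(np.ones((n, n)), k=-1)      # strictly lower triangular projector
    PU = np.triu(np.ones((n, n)), k=1)       # strictly upper triangular projector
    M = Q.T @ A @ Q
    r_orth = Q.T @ Q - np.eye(n)             # orthogonality residual
    r_diag = M - np.diag(lam)                # diagonalization residual
    r_sift = (PL + PU) * M                   # Hadamard "sifting" of off-diagonal entries
    return r_orth, r_diag, r_sift

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
A = 0.5 * (B + B.T)
lam, Q = np.linalg.eigh(A)
print([np.max(np.abs(r)) for r in eigh_residuals(A, Q, lam)])   # all near machine precision
```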
2. Higher-Order Derivatives: Univariate Taylor Polynomial Arithmetic
To differentiate the eigenvalue decomposition with respect to $A$, higher-order derivatives are computed by "lifting" the implicit equations to operate on truncated Taylor series, known as univariate Taylor polynomial (UTP) arithmetic. One represents the input as
$$A(t) = A_0 + A_1 t + A_2 t^2 + \dots + A_{D-1} t^{D-1},$$
and seeks Taylor polynomial expansions for $Q(t)$ and $\Lambda(t)$ satisfying the implicit constraints up to the truncation order $D$. This is achieved by Newton-Hensel lifting:
- Compute the decomposition at zero order ($A_0 = Q_0 \Lambda_0 Q_0^T$).
- Sequentially solve for corrections at higher orders, refining the Taylor expansions such that all coefficients up to the current order satisfy the system to the desired truncation.
- When eigenvalues are repeated (block-diagonal structure), refinements are performed within each block, ensuring that the expansion respects eigenvalue multiplicities.
This approach enables systematic computation of all higher-order derivatives of the outputs $Q$ and $\Lambda$ with respect to perturbations of $A$.
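For intuition, a first-order instance of this lifting can be written directly with NumPy. The sketch below is my own illustration (the helper `eigh_jvp` is not the paper's Algorithm) and assumes distinct eigenvalues; it checks that the lifted factors satisfy the implicit system to first order.

```python
# First-order lifting for A(t) = A0 + t*A1 with distinct eigenvalues:
#   dLam = I o (Q^T A1 Q),   dQ = Q (F o (Q^T A1 Q)),   F_ij = 1/(lam_j - lam_i).
import numpy as np

def eigh_jvp(A0, A1):
    lam, Q = np.linalg.eigh(A0)
    M = Q.T @ A1 @ Q
    dlam = np.diag(M).copy()                         # eigenvalue first-order coefficients
    denom = lam[None, :] - lam[:, None]              # denom[i, j] = lam_j - lam_i
    F = np.where(np.eye(len(lam), dtype=bool), 0.0,
                 1.0 / np.where(denom == 0.0, 1.0, denom))
    dQ = Q @ (F * M)                                 # eigenvector first-order coefficients
    return lam, Q, dlam, dQ

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4)); A0 = 0.5 * (B + B.T)
C = rng.standard_normal((4, 4)); A1 = 0.5 * (C + C.T)
lam, Q, dlam, dQ = eigh_jvp(A0, A1)

# The lifted factors satisfy the implicit system up to O(t^2):
t = 1e-6
Qt, lamt = Q + t * dQ, lam + t * dlam
print(np.max(np.abs(Qt.T @ Qt - np.eye(4))))                      # ~ t^2
print(np.max(np.abs(np.diag(Qt.T @ (A0 + t * A1) @ Qt) - lamt)))  # ~ t^2
```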
3. Reverse Mode Algorithmic Differentiation and Sensitivity Analysis
In reverse (adjoint) mode, sensitivities with respect to $A$ are "pulled back" from linear forms defined on the outputs $(Q, \Lambda)$. Differentiating the implicit system yields, for distinct eigenvalues,
$$\bar{A} = Q\left(\bar{\Lambda} + F \circ (Q^T \bar{Q})\right) Q^T,$$
where $\bar{\Lambda}$ and $\bar{Q}$ are the output adjoints and $F$ is the off-diagonal matrix with $F_{ij} = (\lambda_j - \lambda_i)^{-1}$ for $i \neq j$ and $F_{ii} = 0$.
This formula, derived from matrix calculus by differentiating the conditions $Q^T Q = I$ and $Q^T A Q = \Lambda$, provides a numerically stable way to propagate adjoints/sensitivities back to $A$, provided the eigenvalues are simple. When eigenvalues are repeated, more intricate block-diagonal arguments and higher-order conditions are enforced.
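A hedged sketch of this pullback (the helper name `eigh_vjp` and the final symmetrization are my own choices, not the paper's Algorithm), with a simple sanity check:

```python
# Reverse-mode pullback for eigh with simple eigenvalues:
#   Abar = Q (diag(lambar) + F o (Q^T Qbar)) Q^T,   F_ij = 1/(lam_j - lam_i).
import numpy as np

def eigh_vjp(A, lambar, Qbar):
    lam, Q = np.linalg.eigh(A)
    n = len(lam)
    denom = lam[None, :] - lam[:, None]               # denom[i, j] = lam_j - lam_i
    F = np.where(np.eye(n, dtype=bool), 0.0,
                 1.0 / np.where(denom == 0.0, 1.0, denom))
    Abar = Q @ (np.diag(lambar) + F * (Q.T @ Qbar)) @ Q.T
    return 0.5 * (Abar + Abar.T)                      # project back onto symmetric matrices

# Sanity check: for f(A) = sum of eigenvalues = trace(A), the adjoint with
# lambar = ones and Qbar = 0 must be the identity matrix.
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4)); A = 0.5 * (B + B.T)
print(np.allclose(eigh_vjp(A, np.ones(4), np.zeros((4, 4))), np.eye(4)))  # True
```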
4. Structured Lifting and Block-Diagonalization
A major algorithmic feature is recognizing and efficiently handling eigenvalue multiplicity. At each order of the Taylor expansion, the solution separates into blocks corresponding to repeated eigenvalues. The algorithm computes small block-diagonal eigenproblems on each active block, "lifts" the solutions (denoted $Q^{(b)}$, $\Lambda^{(b)}$ for block $b$), and assembles them into the global expansion, in which $Q(t)$ is re-orthogonalized so that the global factor remains orthogonal up to the truncation order $D$. This blockwise lifting is essential for structured sparsity and numerical efficiency.
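The following sketch (my own, with the hypothetical helper `resolve_degenerate_block`) shows the core of the blockwise idea at zeroth order: within a degenerate block, the eigenvector basis is fixed by diagonalizing the perturbation projected onto that block.

```python
# Within a block of repeated eigenvalues of A0, solve the small block
# eigenproblem for the projected perturbation A1 and rotate the basis.
import numpy as np

def resolve_degenerate_block(A0, A1, tol=1e-10):
    lam, Q = np.linalg.eigh(A0)
    # group indices of (numerically) equal eigenvalues into blocks
    blocks, start = [], 0
    for i in range(1, len(lam) + 1):
        if i == len(lam) or lam[i] - lam[start] > tol:
            blocks.append(list(range(start, i))); start = i
    for b in blocks:
        if len(b) > 1:                               # active (degenerate) block
            Qb = Q[:, b]
            _, Vb = np.linalg.eigh(Qb.T @ A1 @ Qb)   # small block eigenproblem
            Q[:, b] = Qb @ Vb                        # rotate basis within the block
    return lam, Q

# Example: A0 = I has a triple eigenvalue; the perturbation picks the basis.
A0 = np.eye(3)
A1 = np.diag([1.0, 2.0, 3.0]) + 0.1 * (np.ones((3, 3)) - np.eye(3))
lam, Q = resolve_degenerate_block(A0, A1)
print(np.round(Q.T @ A1 @ Q, 6))                     # (approximately) diagonal
```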
5. Unified Forward and Reverse AD with UTP Arithmetic
Combining UTP arithmetic and matrix calculus, both forward and reverse modes operate natively over truncated Taylor polynomials:
- Each variable (matrix or vector component) is replaced by a Taylor series.
- Basic matrix functions (addition, multiplication, inversion) are extended to Taylor polynomials.
- In forward mode, this propagates high-order derivatives; in reverse mode, it propagates adjoints through the chain rule at the Taylor series level.
- The pullbacks are organized so the chain rule holds on bundles of Taylor coefficients, with explicit computation of higher-order derivatives via coefficient recursion.
This machinery allows all algorithms "containing" symmetric EVP routines (for example, optimal experimental design, index calculation for DAEs) to be differentiated to arbitrary order, within a unified automatic differentiation framework.
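As a toy illustration of what "operating natively over truncated Taylor polynomials" means, the following minimal class (an assumption-laden sketch, not the paper's or any library's implementation) stores matrix-valued Taylor coefficients and lifts matrix multiplication as a truncated Cauchy convolution:

```python
# Minimal UTP arithmetic with matrix coefficients: a value is a list of Taylor
# coefficients [X0, X1, ..., X_{D-1}]; products are truncated convolutions.
import numpy as np

class UTPM:
    def __init__(self, coeffs):
        self.c = [np.asarray(x, dtype=float) for x in coeffs]

    def __add__(self, other):
        return UTPM([a + b for a, b in zip(self.c, other.c)])

    def __matmul__(self, other):
        D = len(self.c)
        return UTPM([sum(self.c[k] @ other.c[d - k] for k in range(d + 1))
                     for d in range(D)])

    def transpose(self):
        return UTPM([x.T for x in self.c])

# Example: lift C(t) = A(t) B(t) to second order; the degree-1 coefficient
# is the convolution C1 = A0 B1 + A1 B0.
A = UTPM([np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]]), np.zeros((2, 2))])
B = UTPM([np.diag([1.0, 2.0]), np.eye(2), np.zeros((2, 2))])
C = A @ B
print(np.allclose(C.c[1], A.c[0] @ B.c[1] + A.c[1] @ B.c[0]))   # True
```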
6. Numerical Examples, Applications, and Special Cases
Applications and examples in the framework include:
- Matrices parameterized as truncated Taylor polynomials, e.g. $A(t) = A_0 + t A_1$, where the higher-order behavior of eigenvalues and eigenvectors is analyzed via the Taylor expansion.
- When $A_0$ has repeated eigenvalues, higher-order terms often reveal splitting (eigenvalue unmixing), and the block-diagonalization framework applies, guaranteeing that at each level of the lifting, the problem is well-posed within active blocks (see the splitting sketch at the end of this section).
- The approach handles both forward (pushforward) and reverse (pullback) differentiation and recovers correct derivatives/sensitivities even in the presence of near-degeneracy of eigenvalues.
Explicit algorithms (notably Algorithm 3 and Algorithm 4 "eigh" in the paper) provide stepwise details for sequential lifting and block-structured orthogonalization, addressing technicalities at all levels of eigenvalue multiplicity.
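A small numerical illustration of the splitting phenomenon (my own example, not taken from the paper): for a double eigenvalue of $A_0$, the first-order rates at which the eigenvalues separate are the eigenvalues of the perturbation restricted to the degenerate block.

```python
# Eigenvalue splitting: A0 has a double eigenvalue 2; under A(t) = A0 + t*A1,
# the first-order splitting rates are the eigenvalues of the 2x2 block of A1.
import numpy as np

A0 = np.diag([2.0, 2.0, 5.0])                       # eigenvalue 2 has multiplicity 2
A1 = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0],
               [0.0, 0.0, 3.0]])

pred = np.linalg.eigvalsh(A1[:2, :2])               # predicted slopes: [-1, +1]

t = 1e-5
num = (np.linalg.eigvalsh(A0 + t * A1)[:2] - 2.0) / t
print(pred, num)                                    # both approximately [-1, 1]
```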
7. Impact and Broader Significance
The combination of UTP arithmetic with matrix calculus for eigenvalue decomposition systems enables:
- Efficient, scalable, and accurate computation of higher-order derivatives for programs involving symmetric EVP.
- The preservation of structural properties (orthogonality, block-diagonality) throughout the differentiation process.
- Applicability to a wide range of domains including optimum experimental design, control theory, numerical simulation with sensitivity requirements, and automatic differentiation frameworks in modern machine learning and computational science.
This framework is a cornerstone for differentiable linear algebraic routines, addressing both theoretical and practical aspects of differentiation through symmetric matrix decompositions, and enabling new computational tools for sensitivity and optimization in large-scale scientific computing.