Factorized Preconditioning Architecture

Updated 9 November 2025
  • Factorized preconditioning architecture is defined as constructing a preconditioner by factorizing it into structured, often sparse, operators (e.g., triangular, block-diagonal) to accelerate convergence in iterative solvers.
  • It transforms the spectral properties of matrices, enabling rapid convergence via efficient sparse operations such as matrix–vector products and triangular solves in large-scale systems.
  • Recent innovations integrate neural and quantum methods with classical strategies, enhancing scalability, robustness, and applicability across PDE solvers, kernel methods, and deep learning optimization.

A factorized preconditioning architecture refers to any preconditioning strategy in which the preconditioner is constructed as a product (factorization) of structured, often sparse, operators—typically triangular, block-diagonal, multilevel, or compositionally layered forms. These architectures underpin a wide array of algorithms for accelerating iterative solutions to large linear systems, partial differential equations (PDEs), kernel methods, and variational quantum eigensolvers. The central motivation is to transform the spectrum of the original operator and promote rapid convergence in Krylov subspace or optimization solvers, while enabling efficient evaluation (often via sparse matrix–vector products or triangular solves) and scalable, parallelizable construction.

1. Mathematical Foundations of Factorized Preconditioning

Classical iterative solvers for linear systems $Ax = b$ rely on a preconditioner $M \approx A^{-1}$ that clusters the eigenvalues/singular values of the preconditioned operator $MA$ or $AM$ near unity. Factorized preconditioning refers to constructing $M$ via explicit factorization, i.e., $M = G_1 G_2 \dots G_k$, where each $G_i$ is chosen to be easily invertible or evaluable.
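To make the composition concrete, the following Python sketch (illustrative only: the test matrix, the function names, and the symmetric Gauss–Seidel choice of factors are assumptions, not taken from any cited work) applies $M^{-1} = G_k^{-1} \cdots G_1^{-1}$ as a sequence of per-factor solves and wraps the result as a preconditioner for a SciPy Krylov solver.

```python
# Illustrative sketch: apply a factorized preconditioner M = G_1 G_2 ... G_k
# as a chain of per-factor solves, here for the symmetric Gauss-Seidel choice
# M = (D + L_A) D^{-1} (D + U_A). All names are assumptions for this example.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def apply_factored_inverse(factor_solves, v):
    """Return M^{-1} v for M = G_1 G_2 ... G_k, given one callable per factor."""
    y = v
    for solve in factor_solves:       # solve(w) computes G_i^{-1} w
        y = solve(y)
    return y

# SPD test matrix: 1D Laplacian (purely illustrative).
n = 500
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
D = sp.diags(A.diagonal())
DL = (D + sp.tril(A, k=-1)).tocsr()   # G_1: lower-triangular factor
DU = (D + sp.triu(A, k=1)).tocsr()    # G_3: upper-triangular factor

factor_solves = [
    lambda w: spla.spsolve_triangular(DL, w, lower=True),
    lambda w: D @ w,                  # G_2 = D^{-1}, so G_2^{-1} w = D w
    lambda w: spla.spsolve_triangular(DU, w, lower=False),
]

M = spla.LinearOperator((n, n), matvec=lambda v: apply_factored_inverse(factor_solves, v))
x, info = spla.cg(A, np.ones(n), M=M)
print("CG converged:", info == 0)
```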

Traditional examples include:

  • Incomplete LU (ILU/IC): $A \approx LU$ (LU) or $A \approx LL^T$ (Cholesky for SPD $A$), with $L, U$ computed to match the sparsity of $A$; $M = LU$ or $LL^T$ used as preconditioner (Hosaka et al., 2023).
  • Sparse Approximate Inverse, FSAI: $M^{-1} = G^T G$ with $G$ lower-triangular and sparse, constructed by enforcing $G^T G A \approx I$ via local dense solves and scaling (Isotton et al., 2020).
  • Block and Multilevel Factorizations: Decompose $A$ recursively over blocks or multilevel Schur complements, yielding product-form preconditioners such as $M = M_1 M_2 \cdots$ or via hierarchical interpolative factorizations (Feliu-Fabà et al., 2020, Feliu-Fabà et al., 2018).
  • Data-driven/Neural Factorizations: Neural operators or GNNs produce $M$ via sparse lower/upper triangular neural parameters; “learned” incomplete factorizations reproduce and extend IC/ILU structure in a factorized way (Li et al., 10 Dec 2024, Häusner et al., 2023, Häusner et al., 12 Sep 2024).

The factorized architecture is distinguished from direct inverse approximations by its parameterization and the ability to compose/adapt factors (e.g., by sparsity pattern, block structure, or neural correction), crucial for large-scale or GPU-accelerated computation.

2. Classical and Modern Variants

Several canonical factorized preconditioners have been extensively developed and analyzed:

2.1 Incomplete LU and Cholesky

For a sparse $A$, the incomplete LU preconditioner $A \approx \tilde L \tilde U$ is built by retaining only those entries of $L, U$ present in a prescribed pattern (typically the pattern of $A$ or a superset). The preconditioned system $M^{-1} A x = M^{-1} b$ is then solved either by left or right application, exploiting fast sparse triangular solves (Hosaka et al., 2023). With a good ILU, iterative solvers need an order of magnitude fewer steps than the hundreds or thousands required without preconditioning.
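As a quick illustration of that effect, the sketch below (an assumed test matrix and drop tolerances, not taken from the cited paper) compares unpreconditioned GMRES with GMRES preconditioned by SciPy's SuperLU-based incomplete LU, counting iterations through a callback.

```python
# Illustrative comparison: unpreconditioned vs. ILU-preconditioned GMRES.
# The ILU object's solve() applies M^{-1} = (L U)^{-1}, including permutations.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(0)
n = 1000
# A mildly nonsymmetric sparse test matrix (illustrative only).
A = sp.diags([-1.0, 4.0, -2.0], [-1, 0, 1], shape=(n, n), format="csc")
b = rng.standard_normal(n)

def run(label, M=None):
    it = [0]
    def cb(_):
        it[0] += 1
    x, info = spla.gmres(A, b, M=M, callback=cb, callback_type="pr_norm")
    print(f"{label}: iterations = {it[0]}, converged = {info == 0}")

run("unpreconditioned")

ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)       # A ~ L U with limited fill
M = spla.LinearOperator((n, n), matvec=ilu.solve)        # applies (L U)^{-1} v
run("ILU-preconditioned", M=M)
```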

2.2 Adaptive FSAI

FSAI constructs $M^{-1} = G^T G$ such that $G$ is sparse and lower-triangular, and $G^T G A \approx I$ (Isotton et al., 2020). A core component is the adaptive sparsity strategy: the sparsity pattern of $G$ is grown row by row, guided by the gradient of the Kaporin diagnostic, until a user-specified error reduction is achieved. Because each row can be computed independently, FSAI is particularly well suited to distributed-memory and GPU acceleration, achieving strong and weak scaling to thousands of GPUs.
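A minimal static-pattern variant can be sketched as follows (illustrative: it fixes the pattern to the lower triangle of $A$ instead of growing it adaptively, and the function name is an assumption), showing the independent row-by-row construction and scaling.

```python
# Illustrative static-pattern FSAI: build sparse lower-triangular G with
# M^{-1} = G^T G and diag(G A G^T) = 1, using the lower-triangular pattern of
# A itself. The adaptive, Kaporin-gradient-guided pattern growth of aFSAI is
# intentionally not reproduced here.
import numpy as np
import scipy.sparse as sp

def fsai_static(A):
    """A: SPD scipy.sparse matrix. Returns G (CSR) with M^{-1} = G.T @ G."""
    A = A.tocsr()
    n = A.shape[0]
    rows, cols, vals = [], [], []
    for i in range(n):
        # Pattern J_i: columns j <= i where A[i, j] != 0 (always includes i).
        start, end = A.indptr[i], A.indptr[i + 1]
        J = sorted(set(j for j in A.indices[start:end] if j <= i) | {i})
        p = J.index(i)
        Asub = A[J, :][:, J].toarray()      # small dense principal submatrix
        e = np.zeros(len(J)); e[p] = 1.0
        g = np.linalg.solve(Asub, e)        # rows are independent -> parallelizable
        g = g / np.sqrt(g[p])               # scaling enforces (G A G^T)_{ii} = 1
        rows += [i] * len(J); cols += list(J); vals += list(g)
    return sp.csr_matrix((vals, (rows, cols)), shape=(n, n))

# Applying the preconditioner needs only matvecs: M^{-1} v = G.T @ (G @ v).
```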

2.3 Hierarchical Interpolative Factorization (HIF/PHIF)

Hierarchical factorizations build $A \approx G G^T$ or $A \approx F$ by recursive elimination of well-chosen blocks (local Cholesky or Schur complements), skeletonization (interpolative decomposition), and local preconditioning (block Jacobi or incomplete factorizations at each level). The resulting factorized preconditioner supports $O(N)$ (2D) or $O(N \log N)$ (3D) complexity and yields iteration counts with little or no $N$ dependence, in contrast to Cholesky or classic IC (Feliu-Fabà et al., 2020, Feliu-Fabà et al., 2018).

2.4 Block and Multilevel/Recursive Preconditioners

Recursive multilevel or block preconditioners (e.g., AMES) partition $A$ into blocks, recursively assemble Schur complements and incomplete factorizations on block/leaf levels, and then combine explicit and implicit approximate inverses across the hierarchy (Bu et al., 2015). Overlap strategies and sparse explicit inverse blocks are key for robustness and reducing iteration counts at fixed memory and computational cost.
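The sketch below (illustrative: the block split index, the diagonal Schur-complement approximation, and the helper name are assumptions rather than the AMES algorithm itself) shows how a 2x2 block factorization is applied as a forward block solve, an approximate Schur solve, and a backward correction.

```python
# Illustrative two-block factorized preconditioner: exact solves with the A11
# block and an inexpensive approximate Schur complement
# S ~ A22 - A21 diag(A11)^{-1} A12, applied in block-LDU form.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def block_schur_preconditioner(A, k):
    """Split A at index k into a 2x2 block form and return M^{-1} as a LinearOperator."""
    A = A.tocsc()
    A11, A12 = A[:k, :k], A[:k, k:]
    A21, A22 = A[k:, :k], A[k:, k:]
    lu11 = spla.splu(A11)                              # factor the (1,1) block
    d_inv = 1.0 / A11.diagonal()
    S_approx = (A22 - A21 @ sp.diags(d_inv) @ A12).tocsc()
    luS = spla.splu(S_approx)                          # factor the approximate Schur complement

    def apply(r):
        r1, r2 = r[:k], r[k:]
        y1 = lu11.solve(r1)                            # forward block solve
        y2 = luS.solve(r2 - A21 @ y1)                  # Schur-complement solve
        y1 = y1 - lu11.solve(A12 @ y2)                 # backward correction
        return np.concatenate([y1, y2])

    n = A.shape[0]
    return spla.LinearOperator((n, n), matvec=apply)

# Usage: M = block_schur_preconditioner(A, k); x, info = spla.gmres(A, b, M=M)
```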

3. Data-Driven and Neural Factorized Preconditioners

Recent advances leverage neural networks—especially graph neural networks (GNNs)—to build or refine factorized preconditioners:

3.1 GNN-Enhanced IC/ILU

GNNs are trained to predict either corrections to classical IC factors (“delta” L) or directly learn the triangular or block factors. Architectures mirror the underlying triangular solves by use of directed, positional edge features and local node statistics (Li et al., 10 Dec 2024, Häusner et al., 12 Sep 2024). Enforced positive definiteness and sparsity ensure the resulting operator is stable and cheap to apply. Training is stochastic, often using matrix–vector products and Hutchinson estimators of Frobenius losses.

Key insight: Directly predicting triangular factors tends to allocate model capacity to diagonals already well-handled by IC; adding neural corrections (“IC+GNN delta”) enables precise refinement of off-diagonals, reducing PCG iteration counts by ≈25% vs. IC alone (Li et al., 10 Dec 2024).
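A compact sketch of such a stochastic objective (illustrative plain NumPy; the function name and probe count are assumptions, not the cited implementations) estimates the Frobenius residual of a candidate factor from matrix–vector products alone.

```python
# Illustrative Hutchinson estimate of the training loss ||L L^T - A||_F^2 for a
# learned factor, using E_z ||(L L^T - A) z||^2 with Rademacher probes z.
import numpy as np

def hutchinson_factor_loss(L, A, num_probes=16, rng=None):
    """Estimate ||L L^T - A||_F^2 using only sparse matrix-vector products."""
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    est = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)    # Rademacher probe, E[z z^T] = I
        r = L @ (L.T @ z) - A @ z              # (L L^T - A) z via matvecs only
        est += r @ r
    return est / num_probes
```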

3.2 Neural Incomplete Factorization

Approaches like NeuralIF (Häusner et al., 2023) construct $M = L L^T$ with a GNN-parameterized sparse $L$. Message-passing steps are explicitly designed to reflect the lower-triangular structure, with skip connections and two-direction passes mimicking the actions of $L$ and $L^T$. Consistency, SPD structure, and absence of fill-in outside the pattern of $A$ are maintained. Models are compact (∼2k parameters), generalize to matrices outside the training distribution, and are shown to match IC(0) performance at reduced build time.

3.3 Direct NN-based Cholesky (Compile/Online-Time)

A sparse two-layer linear network, with masked weights corresponding to the incomplete Cholesky pattern, learns the lower-triangular $L$ directly by regression on a set of randomly sampled matrix–vector products (Booth et al., 1 Mar 2024). The cost is amortized if multiple right-hand sides are solved; crucially, neural Cholesky always succeeds (never fails on indefinite pivots) and provides an SPD preconditioner in all tested cases.
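The idea can be sketched in a few lines of PyTorch (illustrative and dense for clarity; the masked weight, training loop, and hyperparameters are assumptions, and unlike the cited approach this simple version does not by itself guarantee positive definiteness).

```python
# Illustrative sketch: learn a masked lower-triangular factor L so that
# L L^T x ~= A x on randomly sampled probes, mimicking a two-layer linear
# network whose weights are masked to the incomplete-Cholesky pattern.
import torch

def train_masked_cholesky(A, mask, epochs=500, batch=32, lr=1e-2):
    """A: (n, n) SPD tensor; mask: (n, n) 0/1 lower-triangular sparsity pattern."""
    n = A.shape[0]
    W = torch.eye(n, requires_grad=True)       # initialize the factor at L = I
    opt = torch.optim.Adam([W], lr=lr)
    for _ in range(epochs):
        L = W * mask                           # enforce the prescribed pattern
        x = torch.randn(n, batch)              # sampled matrix-vector probes
        loss = ((L @ (L.T @ x) - A @ x) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return (W * mask).detach()                 # learned lower-triangular factor

# Example pattern (IC(0)-style): mask = torch.tril((A != 0).float())
```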

4. Quantum and Specialized Factorized Preconditioning

The systematic integration of factorized preconditioning into non-classical settings demonstrates its architectural flexibility:

  • Quantum Variational Linear Solvers (VQLS): Preconditioning via incomplete LU ($M = \tilde L \tilde U$) is incorporated into the variational quantum architecture by classically computing $M$ and using quantum circuits to minimize the preconditioned cost $C(\theta) = 1 - |\langle \tilde b | \tilde A | x(\theta) \rangle|^2 / \langle x(\theta) | \tilde A^\dagger \tilde A | x(\theta) \rangle$, yielding a dramatic (2–3×) reduction in the required quantum circuit depth, argued as essential for the NISQ regime (Hosaka et al., 2023); a classical-side sketch of this preprocessing follows this list.
  • Kernel and Block Structures: Block-wise Schur complement/low-rank plus sparse factorized corrections (as in AFN for regularized kernel systems) (Zhao et al., 2023), admitting near-optimal Nyström approximations plus a sparse FSAI inverse correction, can be constructed in $O(n \log n)$ time with $O(1)$ iteration scaling.
  • Spectral-Element and PDE Systems: Sum-factorized and diagonalization-based preconditioners folding interior/face blocks into minimal, block-structured factorizations enable $O(N)$ runtime for high-order Helmholtz problems (Huismann et al., 2016).
  • Deep Learning Optimizers: Block Kronecker and two-level (coarse+fine) factorizations in Fisher-matrix preconditioning for natural gradient optimization (e.g., 2L-KFAC) efficiently capture global and local curvature (Tselepidis et al., 2020).
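For the VQLS case, the classical side of the preconditioning step can be sketched as follows (illustrative only: a small assumed test system, with the variational circuit and cost minimization omitted).

```python
# Illustrative classical preprocessing for a preconditioned VQLS run: form
# (A_tilde, b_tilde) = (M^{-1} A, M^{-1} b) with M = L~ U~ from incomplete LU,
# then hand the better-conditioned pair to the quantum variational routine.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 8                                          # small system, a few qubits' worth
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

ilu = spla.spilu(A, drop_tol=1e-2)             # M = L~ U~
A_tilde = np.column_stack([ilu.solve(col) for col in A.toarray().T])  # M^{-1} A
b_tilde = ilu.solve(b)                                                 # M^{-1} b

# The preconditioned operator is far better conditioned, so the ansatz
# minimizing C(theta) over |x(theta)> can be kept shallow.
print(np.linalg.cond(A.toarray()), np.linalg.cond(A_tilde))
```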

5. Implementation Principles and Performance

5.1 Workflow Summary

A generic workflow for factorized preconditioning in modern settings consists of:

  1. Pattern Selection: Define sparsity or block structure for the factors (e.g., lower triangle of $A$, block partitions, geometric levels).
  2. Factor Computation:
    • Classical: Compute incomplete LU/Cholesky or block-ILU via dropping/filling, with parallel QR/SVD when using explicit inverse/factorized sparse approximate inverse.
    • Neural: Train a GNN or neural operator with graph-structured features, optimized on data with a (stochastic) matrix–vector loss, e.g., $\|L L^T x - A x\|^2$ or spectrum-control surrogates.
    • Quantum: Classically precondition the matrix/vector and adapt the variational ansatz/state preparation to the preconditioned problem.
  3. Deployment: Factors are stored (often in CSR/COO format for compatibility with sparse libraries); the preconditioning step in PCG/GMRES/other Krylov solvers is performed by sparse triangular solves or matrix–vector products (see the sketch after this list).
  4. Parameter Tuning: Sparsity levels, block sizes, and training mini-batch sizes are user-tunable, frequently with negligible increase in memory overhead over classical approaches.
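For the deployment step, the two most common application modes look as follows (illustrative helper names; both return operators that SciPy's Krylov solvers accept through their M argument).

```python
# Illustrative deployment helpers: stored CSR factors are applied either by two
# sparse triangular solves (incomplete factorization, M = L L^T) or by two
# sparse matvecs (FSAI, M^{-1} = G^T G). The surrounding Krylov loop is unchanged.
import scipy.sparse.linalg as spla

def prec_from_ic_factor(L_csr):
    """M^{-1} v = L^{-T} (L^{-1} v) via two sparse triangular solves."""
    n = L_csr.shape[0]
    U_csr = L_csr.T.tocsr()
    def apply(v):
        y = spla.spsolve_triangular(L_csr, v, lower=True)
        return spla.spsolve_triangular(U_csr, y, lower=False)
    return spla.LinearOperator((n, n), matvec=apply)

def prec_from_fsai_factor(G_csr):
    """M^{-1} v = G^T (G v): matrix-vector products only, no solves."""
    n = G_csr.shape[0]
    Gt = G_csr.T.tocsr()
    return spla.LinearOperator((n, n), matvec=lambda v: Gt @ (G_csr @ v))

# Usage: x, info = spla.cg(A, b, M=prec_from_ic_factor(L))
```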

5.2 Scaling and Robustness

Factorized preconditioners are typically $O(\mathrm{nnz}(A))$ in both setup and application cost when sparsity is controlled. Modern advances enable strong and weak scaling to thousands of processors/GPUs (e.g., aFSAI in (Isotton et al., 2020)), and are robust to matrix size, problem domain, and distributional shift (as in (Li et al., 10 Dec 2024, Häusner et al., 2023)). The capacity to handle indefinite or highly ill-conditioned systems depends on architectural choices, e.g., hierarchical block-Jacobi (PHIF) can ensure preconditioner stability for problems with contrast up to $10^4$ (Feliu-Fabà et al., 2018).

6. Applications, Trade-Offs, and Extensions

Factorized preconditioning architectures are now ubiquitous in classical and quantum scientific computing, PDE solvers, kernel methods, deep learning, and Bayesian inference. Their principal advantages include:

  • Efficiency and Parallelism: Embarrassingly parallel builds (columnwise in explicit factor approximations, block/row in aFSAI), ideal for GPU and distributed settings.
  • Adaptability: Amenable to hybrid neural/classical corrections, block/multilevel extensions, and problem-specific structural reuse.
  • Architectural Generality: The recipe extends to any setting where an approximate inverse, diagonalization, or spectrum transformation is beneficial—quantum VQLS, kernel preconditioning, deep curvature methods, variational inference via flow-based MCMC with factorized normalizing flows.

Trade-offs include the cost of one-time setup (often amortized), memory/storage tradeoffs in block size or pattern expansion, and the necessity of careful regularization and stabilization (e.g., positive definiteness, minimum diagonal entries) in data-driven settings.

Ongoing research investigates deeper neural architectures for learned sparsity, multilevel/factorized neural algebraic multigrid (NAMG), and high-dimensional probabilistic flow-MCMC preconditioning where splitting linear and highly nonlinear blocks—i.e., factorizing the preconditioner between domains—yields both superior exploration and training/data efficiency (Nabergoj et al., 4 Nov 2025).


In summary, a factorized preconditioning architecture leverages structured operator products to achieve rapid spectrum transformation and scalable application within iterative and optimization algorithms. The approach encompasses both classic algorithmic and data-driven paradigms, accelerating scientific computing across traditional and emerging domains.
