DerivKit: Numerical Differentiation for Inference
- DerivKit is a Python package for robust numerical differentiation and derivative-based inference, particularly for models unsuited to automatic differentiation.
- It employs adaptive high-order finite difference stencils and polynomial fitting methods to accurately compute gradients, Hessians, and higher-order tensors from noisy simulation outputs.
- The toolkit bridges rapid Fisher analysis with full MCMC sampling by integrating diagnostics, uncertainty quantification, and bias correction into a unified inference workflow.
DerivKit is a Python package designed for stable numerical differentiation and derivative-based statistical inference in scientific computing. It provides robust utilities for obtaining gradients and higher-order tensors, and for constructing Fisher-matrix forecasts and non-Gaussian likelihood approximations with minimal implementation overhead, particularly for models where automatic differentiation (autodiff) is impractical, such as black-box, tabulated, or noisy simulation outputs. DerivKit's design philosophy centers on diagnostics-driven numerical methods, supporting both rapid uncertainty quantification via Fisher analysis and advanced inference techniques that bridge deterministic forecasts with full Markov chain Monte Carlo (MCMC) posterior sampling (Šarčević et al., 8 Feb 2026).
1. Motivation and Core Scope
Scientific applications frequently involve models that are not conducive to autodiff—due to the use of implicit solvers, lookup tables, or inherent numerical noise. Derivative-based inference methods, including Fisher information forecasts and higher-order likelihood expansions (e.g., DALI), demand stable and accurate derivatives of model predictions with respect to their parameters. Standard finite difference (FD) approaches may be unstable or introduce bias under such circumstances due to noise, parameter stiffness, or irregular grid structures, directly compromising the reproducibility and reliability of scientific inference. DerivKit addresses these challenges by delivering adaptive, robust numerical differentiation routines and integrating diagnostics for all steps of the differentiation and inference process (Šarčević et al., 8 Feb 2026).
2. Numerical Differentiation Algorithms
DerivKit's central engine, DerivativeKit, implements two primary strategies: high-order finite difference stencils with adaptive stabilization and local polynomial fitting (PF).
- Finite Difference (FD) Stencils: Central $n$-point rules for derivatives up to order 4. For example, the 3-point central difference for a scalar $f(x)$:

$$f'(x) \approx \frac{f(x+h) - f(x-h)}{2h} + \mathcal{O}(h^2).$$

Higher-order stencils (e.g., 5-, 7-, 9-point) provide increased formal accuracy; the 5-point rule, for instance, reads

$$f'(x) \approx \frac{f(x-2h) - 8f(x-h) + 8f(x+h) - f(x+2h)}{12h} + \mathcal{O}(h^4).$$
Stabilization mechanisms include Richardson extrapolation (building a sequence of estimates across step sizes and extrapolating to $h \to 0$), Ridders' method (adaptive extrapolation for noisy evaluations), and probabilistic uncertainty quantification for the extrapolated derivative.
- Polynomial Fitting (PF): Least-squares or Chebyshev-grid polynomial fits are applied to neighborhoods of sample points $\{(x_i, f(x_i))\}$, with both fixed-window and fully adaptive window/degree selection. Diagnostics (e.g., condition number, residuals) determine method fallback.
Unified interfaces expose both methods, with automatic step-size heuristics, convergence thresholds, and metadata introspection (e.g., step sizes, condition numbers, stencil geometry).
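As a concrete illustration of these building blocks (a minimal sketch in plain NumPy, not DerivKit's actual API — all function names here are invented for exposition), a 5-point stencil with Richardson extrapolation and a local polynomial-fit derivative might look like:

```python
import numpy as np

def central_diff_5pt(f, x, h):
    """5-point central difference, O(h^4) truncation error."""
    return (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h)) / (12*h)

def richardson(f, x, h0, levels=4):
    """Richardson extrapolation over a geometric ladder of step sizes,
    extrapolating the stencil estimates toward h -> 0."""
    T = np.zeros((levels, levels))
    for i in range(levels):
        T[i, 0] = central_diff_5pt(f, x, h0 / 2**i)
    for j in range(1, levels):
        # Error powers of the symmetric stencil are h^4, h^6, ...,
        # so eliminating the j-th one uses the factor 2^(2j+2) = 4^(j+1).
        fac = 4.0**(j + 1)
        for i in range(j, levels):
            T[i, j] = (fac * T[i, j-1] - T[i-1, j-1]) / (fac - 1.0)
    return T[-1, -1]

def polyfit_derivative(f, x, half_width=0.1, n_pts=9, degree=4):
    """Least-squares polynomial fit on a local window; derivative from the fit.
    With the window centered on x, the linear coefficient is f'(x)."""
    xs = np.linspace(x - half_width, x + half_width, n_pts)
    coeffs = np.polynomial.polynomial.polyfit(xs - x, f(xs), degree)
    return coeffs[1]
```

For a smooth test function such as `np.sin`, all three routes recover the analytic derivative; the extrapolated estimate is typically several orders of magnitude more accurate than the raw stencil at the same base step.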
3. Derivative Assembly and Tensor Construction
DerivativeKit generates raw partial derivatives, which are then assembled by CalculusKit into higher-level mathematical objects:
- Scalars: Gradients ($\nabla_\theta f$) and Hessians ($\nabla^2_\theta f$).
- Vectors: Jacobians ($J_{ai} = \partial m_a / \partial \theta_i$) and higher-order derivative tensors for vector-valued models $m(\theta)$.
- Computation Strategy: Partial derivatives are computed by coordinate-wise calls to DerivativeKit and then arranged in NumPy arrays conforming to the target tensor structure (Šarčević et al., 8 Feb 2026).
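A plain-NumPy sketch of this coordinate-wise assembly pattern (the helper names are illustrative stand-ins for per-parameter DerivativeKit calls, not DerivKit's interface):

```python
import numpy as np

def jacobian(model, theta, h=1e-5):
    """Assemble J[a, i] = d m_a / d theta_i by coordinate-wise
    central differences on a vector-valued model."""
    theta = np.asarray(theta, dtype=float)
    m0 = np.atleast_1d(model(theta))
    J = np.empty((m0.size, theta.size))
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = h
        J[:, i] = (np.atleast_1d(model(theta + e))
                   - np.atleast_1d(model(theta - e))) / (2*h)
    return J

def hessian(f, theta, h=1e-4):
    """Hessian of a scalar f via symmetric second-order central differences."""
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    H = np.empty((p, p))
    for i in range(p):
        for j in range(p):
            ei = np.zeros(p); ei[i] = h
            ej = np.zeros(p); ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4*h*h)
    return H
```

The resulting arrays already conform to the tensor layout described above, so downstream Fisher or DALI assembly is pure index bookkeeping.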
4. Fisher Information Forecasting and Bias Correction
In the Gaussian (linear-response) regime, DerivKit constructs the Fisher information matrix

$$F_{ij} = \sum_a \frac{1}{\sigma_a^2} \frac{\partial m_a}{\partial \theta_i} \frac{\partial m_a}{\partial \theta_j},$$

where $m_a$ are model observables and $\sigma_a^2$ their variances. Optional Gaussian priors are incorporated by diagonal precision augmentation. Fisher bias estimates propagate biases $\delta m_a$ in the data to parameter shifts:

$$\delta\theta_i = \sum_j (F^{-1})_{ij} \sum_a \frac{\partial m_a}{\partial \theta_j} \frac{\delta m_a}{\sigma_a^2}.$$

These workflows provide rapid, interpretable uncertainty quantification and bias analysis using only model evaluations and their derivatives.
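Given a Jacobian of model observables, the Fisher matrix and Fisher bias formulas reduce to a few lines of linear algebra. A hypothetical helper pair (illustrative only, not DerivKit's API) could be:

```python
import numpy as np

def fisher_matrix(J, sigma):
    """F_ij = sum_a (1/sigma_a^2) dm_a/dtheta_i dm_a/dtheta_j,
    given the Jacobian J[a, i] and per-observable standard deviations."""
    w = 1.0 / np.asarray(sigma)**2
    return J.T @ (w[:, None] * J)

def fisher_bias(J, sigma, delta_m):
    """Parameter shifts induced by a data-vector bias delta_m:
    delta_theta = F^{-1} J^T (delta_m / sigma^2)."""
    F = fisher_matrix(J, sigma)
    b = J.T @ (np.asarray(delta_m) / np.asarray(sigma)**2)
    return np.linalg.solve(F, b)
```

A Gaussian prior with standard deviation $s_i$ on parameter $i$ would correspond to adding $1/s_i^2$ to the matching diagonal entry of `F` before inversion.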
5. Non-Gaussian Likelihood Expansions (DALI Framework)
DerivKit generalizes Fisher analysis by supporting higher-order expansions via the Derivative Approximation for Likelihoods (DALI) methodology. The log-likelihood around the fiducial $\theta_0$ is approximated as

$$\ln L(\theta) \approx \ln L(\theta_0) - \frac{1}{2} F_{ij}\,\Delta\theta_i \Delta\theta_j - \frac{1}{2} S_{ijk}\,\Delta\theta_i \Delta\theta_j \Delta\theta_k - \frac{1}{8} Q_{ijkl}\,\Delta\theta_i \Delta\theta_j \Delta\theta_k \Delta\theta_l + \dots,$$

with $S_{ijk}$ and $Q_{ijkl}$ the third- and fourth-order log-likelihood derivative tensors. DerivKit assembles these via the ForecastKit interface, permitting non-Gaussian credible region approximation and posterior reconstruction (Šarčević et al., 8 Feb 2026).
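Assuming tensors named `F`, `S`, and `Q` for the second-, third-, and fourth-order terms of the expansion (names chosen here purely for illustration), evaluating the truncated DALI log-likelihood offset is a contraction exercise:

```python
import numpy as np

def dali_loglike(delta, F, S, Q):
    """Truncated DALI expansion of ln L(theta) - ln L(theta_0)
    for parameter offsets delta = theta - theta_0."""
    d = np.asarray(delta, dtype=float)
    quad = np.einsum('i,ij,j->', d, F, d)
    cubic = np.einsum('ijk,i,j,k->', S, d, d, d)
    quartic = np.einsum('ijkl,i,j,k,l->', Q, d, d, d, d)
    return -0.5*quad - 0.5*cubic - 0.125*quartic
```

With `S` and `Q` set to zero this reduces to the pure Fisher (Gaussian) approximation, which is a useful consistency check when comparing the two forecasts.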
6. Bridging Fisher and MCMC: Practical Inference Workflows
DerivKit is designed to bridge rapid, approximate analytic forecasts and computationally intensive sampling-based inference (e.g., MCMC):
- Fisher Covariance as Proposal: The Fisher covariance is typically used as a Gaussian proposal for MCMC methods, improving mixing efficiency.
- DALI Corrections: These higher-order likelihood corrections can serve as pre-conditioners or provide efficient, non-Gaussian proposal landscapes.
- Integrated APIs: ForecastKit utilities provide a unified interface to compute derivatives, assemble likelihood expansions, and launch MCMC sampling. This enables direct comparison and cross-validation of forecast and posterior approaches within a single workflow. Typical timings: Fisher (seconds), DALI (tens of seconds), full MCMC (minutes to hours depending on model cost and dimensionality).
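The "Fisher covariance as proposal" idea above can be sketched with a minimal random-walk Metropolis sampler, written here independently of ForecastKit's actual API (function name and arguments are assumptions for illustration):

```python
import numpy as np

def mh_sample(loglike, theta0, fisher_cov, n_steps=20000, seed=0):
    """Random-walk Metropolis using the Fisher covariance as the
    Gaussian proposal covariance (scaled by the usual 2.38^2 / dim)."""
    rng = np.random.default_rng(seed)
    dim = len(theta0)
    L = np.linalg.cholesky((2.38**2 / dim) * fisher_cov)
    theta = np.asarray(theta0, dtype=float)
    lp = loglike(theta)
    chain = np.empty((n_steps, dim))
    for t in range(n_steps):
        prop = theta + L @ rng.standard_normal(dim)
        lp_prop = loglike(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept step
            theta, lp = prop, lp_prop
        chain[t] = theta
    return chain
```

When the posterior is close to Gaussian, this proposal already mixes well; DALI-style corrections would matter most when the true posterior is visibly skewed or curved relative to the Fisher ellipse.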
7. Applications to Topological Data Analysis and Advanced Symbolic Differentiation
DerivKit's architecture readily supports differentiable topological data analysis (TDA), particularly gradient-based learning over persistence barcodes, as formalized in (Leygonie et al., 2019). It also integrates categorical and combinatory symbolic differentiation paradigms, exposing Fréchet derivatives and their adjoints using compositional linear algebra as developed in (Elsman et al., 2022). This yields optimizations for high-dimensional tensor computation and matrix-free reverse-mode differentiation, exploiting data-parallel building blocks and symbolic representations for efficient backpropagation in large-scale inference problems.
8. Limitations, Diagnostics, and Best Practices
- Numerical Stability: Careful monitoring of finite difference cancellation (for small step sizes $h$) and polynomial fit conditioning is required; diagnostics and fallbacks are integral to the interface.
- Dimensionality Constraints: Higher-order DALI corrections scale combinatorially in dimensionality, limiting practical non-Gaussian expansions beyond 5–7 parameters without significant computational cost.
- Method Selection: FD + extrapolation excels for smooth, low-noise models; adaptive polynomial fits are recommended for stiff, noisy, or tabulated models.
- Diagnostics: Users should regularly inspect metadata (condition numbers, residuals, convergence rates) and cross-validate approximate covariances against short pilot MCMC chains.
- Not a Substitute for Full Sampling: While Gaussian/DALI proposals are efficient, credible intervals requiring exactness demand full MCMC runs (Šarčević et al., 8 Feb 2026).
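The diagnostics-first pattern recommended above can be mimicked in a few lines; `polyfit_with_diagnostics` and its condition-number threshold are illustrative inventions, not part of DerivKit:

```python
import numpy as np

def polyfit_with_diagnostics(xs, ys, degree, cond_max=1e8):
    """Least-squares polynomial fit that reports conditioning and residuals,
    so callers can decide whether to trust the fit or fall back."""
    X = np.vander(xs - xs.mean(), degree + 1, increasing=True)
    cond = np.linalg.cond(X)            # ill-conditioning flags a bad window/degree
    coeffs, *_ = np.linalg.lstsq(X, ys, rcond=None)
    rms = np.sqrt(np.mean((X @ coeffs - ys)**2))
    return coeffs, {"cond": cond, "rms_residual": rms, "ok": cond < cond_max}
```

Inspecting such metadata before accepting a derivative estimate, and cross-checking against a second method or a short pilot chain, is exactly the workflow the bullet points above prescribe.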
In sum, DerivKit operationalizes a cohesive, diagnostics-rich pathway from non-intrusive model evaluation to robust uncertainty quantification and statistical inference—spanning black-box simulation, Fisher analysis, higher-order approximations, and MCMC acceleration without reliance on autodiff-specific architectures or matrix-centric implementations.