Adjoint-State Method Overview
- The adjoint-state method calculates sensitivity derivatives by solving one forward and one adjoint problem, at a cost that is essentially independent of the parameter count.
- It encompasses continuous and discrete formulations, effectively handling smooth dynamics as well as discontinuities in high-dimensional models.
- Widely employed in optimal control, uncertainty quantification, and inverse problems, the method balances computational cost with robust, accurate results.
The adjoint-state method is a foundational technique for efficiently computing sensitivity derivatives in systems described by high-dimensional or complex constraint equations, such as partial differential equations (PDEs), ordinary differential equations (ODEs), or large-scale algebraic systems. Conceptually, the method provides a way to evaluate the gradient of a scalar-valued objective or response function with respect to a high-dimensional vector of parameters with a cost that is, to leading order, independent of the number of parameters—requiring only one solution of the forward problem and one adjoint problem per output of interest. The adjoint-state method is implemented in multiple forms, most notably as the continuous adjoint (derived at the PDE/ODE level) and as the discrete adjoint (derived from the discretized forward system), each with distinct properties and computational implications. These approaches are integral to sensitivity analysis, uncertainty quantification, optimal control, and Bayesian inverse problems in application areas ranging from multiphase flow and acoustics to atmospheric modeling and data-driven discovery of dynamical systems.
1. Mathematical Principles and Derivation
The adjoint-state method is based on a Lagrangian functional that augments the original optimization or sensitivity problem with the constraint equations (often PDEs or ODEs) using Lagrange multipliers, the "adjoint variables". For a time-dependent PDE system written in general form,
$$\mathcal{F}(u, p) = 0,$$
where $u$ is the solution field and $p$ a (possibly high-dimensional) parameter vector, the objective is typically a scalar response
$$R(u, p) = \int_0^T \!\! \int_\Omega r(u, p)\, dx\, dt.$$
The Lagrangian is
$$\mathcal{L}(u, p, \lambda) = R(u, p) - \int_0^T \!\! \int_\Omega \lambda^{T} \mathcal{F}(u, p)\, dx\, dt,$$
where $\lambda$ is the adjoint variable. Differentiating with respect to $u$ and requiring stationarity yields the adjoint (or dual) equation, typically to be solved backward in time (for time-dependent systems) with appropriately defined final or boundary conditions. The derivative $dR/dp$ is then given in terms of the adjoint solution and the parameter dependence of the constraint, circumventing the direct computation of $\partial u/\partial p$, which would be prohibitive when $p$ is high-dimensional (Hu et al., 2018, Melicher et al., 2016).
Two principal frameworks arise depending on when the adjoint equation is constructed relative to discretization:
- Continuous adjoint: The forward PDE is first differentiated analytically at the PDE level; the resulting adjoint PDE is then discretized. The continuous formulation yields analytical expressions for the adjoint operators (coefficient matrices, for multi-equation systems) and produces gradients via time-space integrals involving the adjoint variable (Hu et al., 2018).
- Discrete adjoint: The forward problem is discretized first (e.g., finite difference, finite volume, finite element), yielding a large-scale algebraic system. The discrete adjoint is then derived by applying the Lagrangian method to the algebraic (discrete) system, which results in a linear algebraic adjoint system involving the Jacobian transpose of the discretized forward model (Hu et al., 2018, Faucher et al., 2020).
2. Continuous and Discrete Adjoint Methodologies
Continuous Adjoint System
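For a 1D two-fluid model, the continuous adjoint system is a linear first-order PDE system in the adjoint variable $\boldsymbol{\phi}(x, t)$, which can be written schematically as
$$\mathbf{A}\,\frac{\partial \boldsymbol{\phi}}{\partial t} + \mathbf{B}\,\frac{\partial \boldsymbol{\phi}}{\partial x} + \mathbf{C}\,\boldsymbol{\phi} = \mathbf{Q},$$
with terminal and boundary conditions. Here, $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ are system matrices analytically derived from the PDE coefficients, and $\mathbf{Q}$ is the adjoint source (e.g., the derivative of the response integrand with respect to the state). Sensitivities $dR/dp$ of the response $R$ to the parameters $p$ are then calculated as space–time integrals of products of the adjoint with the parameter derivatives of the forward equations, plus possible boundary and initial-time terms (Hu et al., 2018).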
Discrete Adjoint System
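The discrete adjoint method starts from the fully discretized (usually implicit) forward problem $\mathbf{F}(\mathbf{U}, \mathbf{p}) = \mathbf{0}$, where $\mathbf{U}$ is the discrete state vector and $\mathbf{p}$ the parameter vector. Differentiating the discrete response $R(\mathbf{U}, \mathbf{p})$ and introducing Lagrange multipliers $\boldsymbol{\lambda}$ yields the adjoint system:
$$\left(\frac{\partial \mathbf{F}}{\partial \mathbf{U}}\right)^{T} \boldsymbol{\lambda} = \left(\frac{\partial R}{\partial \mathbf{U}}\right)^{T},$$
and the reduced gradient is
$$\frac{dR}{d\mathbf{p}} = \frac{\partial R}{\partial \mathbf{p}} - \boldsymbol{\lambda}^{T}\, \frac{\partial \mathbf{F}}{\partial \mathbf{p}}.$$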
Because the forward discretization couples only nearby time levels or mesh cells, the adjoint can often be solved efficiently by reverse stepping or block back substitution (Hu et al., 2018). Modern implementations utilize algorithmic differentiation or finite-difference Jacobians (Faucher et al., 2020).
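The discrete adjoint recipe can be illustrated on a small steady problem. The following is a minimal sketch, not the implementation from the cited work: it assumes a hypothetical linear forward model with residual F(U, p) = A(p) U - b, a linear response R = c^T U, and the sign convention dR/dp = -lambda^T dF/dp with A^T lambda = c. The function names (`assemble`, `forward`, `adjoint_gradient`) and the tridiagonal form of A(p) are illustrative assumptions.

```python
import numpy as np

def assemble(p):
    """Hypothetical forward operator A(p): tridiagonal, one parameter per row.

    Row i reads (2*p[i] + 1) on the diagonal and -p[i] on both off-diagonals,
    so A is strictly diagonally dominant (hence invertible) for p > 0.
    """
    n = p.size
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = 2.0 * p[i] + 1.0
        if i > 0:
            A[i, i - 1] = -p[i]
        if i < n - 1:
            A[i, i + 1] = -p[i]
    return A

def forward(p, b):
    """Solve the discrete forward problem F(U, p) = A(p) U - b = 0."""
    return np.linalg.solve(assemble(p), b)

def adjoint_gradient(p, b, c):
    """Gradient of R = c^T U with respect to all parameters via one adjoint solve."""
    A = assemble(p)
    U = np.linalg.solve(A, b)        # one forward solve
    lam = np.linalg.solve(A.T, c)    # one adjoint solve with the Jacobian transpose
    # dF/dp_i is nonzero only in row i: 2*U[i] - U[i-1] - U[i+1] (zero outside the grid).
    U_pad = np.concatenate(([0.0], U, [0.0]))
    dF_dp_row = 2.0 * U_pad[1:-1] - U_pad[:-2] - U_pad[2:]
    # Reduced gradient: dR/dp_i = -lam^T (dF/dp_i), and dR/dp (explicit) is zero here.
    return -lam * dF_dp_row
```

A central finite-difference check on `c @ forward(p, b)` is a simple way to validate such a gradient; in this sketch the two agree to roundoff because the adjoint is derived from the same discrete system that the forward solve uses.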
Comparison Table: Key Adjoint Variants
| Method | Construction | Adjoint Equation Type | Sensitivity Formula | Cost per Output |
|---|---|---|---|---|
| Continuous | Differentiate PDE, then discretize | PDE (analytical operators) | Integral of adjoint × param. derivatives | 1 forward + 1 adjoint PDE |
| Discrete | Discretize forward model, then differentiate | Linear algebraic system | Dot-product (adjoint vector × param. Jacobian) | 1 forward + 1 adjoint linear system |
Both methods provide gradients at a cost independent of parameter dimension, but differ in accuracy and robustness depending on the discretization and model smoothness.
3. Computational Cost, Accuracy, and Robustness
In benchmark tests, both continuous and discrete adjoint methods provided sensitivities accurate to 1–2% in regimes where the underlying forward system and response functions are smooth. For steady-state faucet flow with an analytic solution, continuous adjoint sensitivities delivered slightly better accuracy (errors below 1%) and were approximately 20–30% faster because the coefficient matrices were analytic (Hu et al., 2018). For more challenging transient benchmarks, such as the BFBT two-phase flow scenario, both methods yielded physically reasonable sensitivities that matched perturbation or finite-difference reference solutions. However, the discrete adjoint method was superior in the presence of sharply discontinuous source terms; it avoided spurious oscillations and exhibited better robustness and systematic sub-2% error, even when the continuous adjoint failed to converge or lost accuracy due to Gibbs phenomena (Hu et al., 2018).
The cost per output is essentially one forward and one adjoint solve, in contrast to the roughly $N_p$ forward solves needed in direct sensitivity or perturbation approaches when $N_p$ parameters are considered (Hu et al., 2018, Melicher et al., 2016). Once the adjoint solution is available, sensitivities to any number of parameters are obtainable at negligible additional expense.
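The scaling argument above can be made concrete by counting linear solves. The sketch below is an illustration under stated assumptions rather than a benchmark from the cited studies: it uses a hypothetical model A(p) = K + diag(p) with a fixed tridiagonal K, a response R = c^T U, and a small `CountingSolver` wrapper (an invented helper) around `np.linalg.solve`. The perturbation approach here is a one-sided finite difference, which needs one baseline solve plus one solve per parameter.

```python
import numpy as np

class CountingSolver:
    """Hypothetical helper: wraps np.linalg.solve and counts the calls."""
    def __init__(self):
        self.count = 0

    def solve(self, A, rhs):
        self.count += 1
        return np.linalg.solve(A, rhs)

def assemble_operator(p):
    """Illustrative operator A(p) = K + diag(p), with K a fixed 1D Laplacian stencil."""
    n = p.size
    K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return K + np.diag(p)

def gradient_adjoint(p, b, c, solver):
    """Two solves in total, regardless of how many parameters there are."""
    A = assemble_operator(p)
    U = solver.solve(A, b)          # forward solve
    lam = solver.solve(A.T, c)      # adjoint solve
    # dA/dp_i = e_i e_i^T, so dR/dp_i = -lam[i] * U[i].
    return -lam * U

def gradient_perturbation(p, b, c, solver, eps=1e-6):
    """One-sided finite differences: one baseline solve plus one per parameter."""
    R0 = c @ solver.solve(assemble_operator(p), b)
    g = np.empty_like(p)
    for i in range(p.size):
        p_pert = p.copy()
        p_pert[i] += eps
        g[i] = (c @ solver.solve(assemble_operator(p_pert), b) - R0) / eps
    return g

def count_solves(n):
    """Return (adjoint solves, perturbation solves, max gradient mismatch) for n parameters."""
    rng = np.random.default_rng(n)
    p = 1.0 + rng.random(n)
    b = rng.random(n)
    c = rng.random(n)
    s_adj, s_fd = CountingSolver(), CountingSolver()
    g_adj = gradient_adjoint(p, b, c, s_adj)
    g_fd = gradient_perturbation(p, b, c, s_fd)
    return s_adj.count, s_fd.count, float(np.max(np.abs(g_adj - g_fd)))
```

Calling `count_solves(n)` for increasing `n` shows the adjoint count staying at two while the perturbation count grows as `n + 1`. In this toy model the number of parameters equals the system size, which is a simplification of the sketch and not a property of the method.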
4. Handling Discontinuities and Nonsmoothness
A key limitation of the continuous adjoint approach is its difficulty in handling sharp discontinuities, such as those induced by piecewise-defined source terms or closure relations in boiling and two-phase flows. Central-difference discretizations of the adjoint PDE can produce spurious oscillations and poor convergence (Gibbs phenomena), especially when the source term derivative is a discrete delta function. In such cases, the discrete adjoint approach, which inherits the numerical properties and upwinding of the primal discretization, remains robust and produces accurate sensitivities (Hu et al., 2018). This robustness is crucial in safety-critical or highly nonlinear applications.
5. Application Domains and Algorithmic Implementation
The adjoint-state method is widely applied across computational science and engineering. In multiphase and two-phase flow simulations, it supports sensitivity analysis and uncertainty propagation for void fractions under complex operating conditions (Hu et al., 2018, Hu et al., 2018). In reduced-order and data-driven modeling, adjoint-based training minimizes trajectory misfit functionals robustly, without explicit estimation of time derivatives (hence enhancing noise resilience) (Liu et al., 12 Jan 2026). Hybridizable Discontinuous Galerkin (HDG) frameworks, relevant to high-frequency wave propagation and seismic inversion, realize the adjoint state on reduced (trace-only) spaces for major computational benefits (Faucher et al., 2020).
Algorithmically, adjoint computations follow a standard loop: a forward solution is obtained and stored; then, using this trajectory, the adjoint equation is solved backward; finally, gradients or sensitivity coefficients are evaluated via quadrature or matrix-vector products. Discretizations (finite element, finite volume, finite difference) for the adjoint use either analytic expressions for system operators (continuous adjoint) or the numerical Jacobian and transposed stencils of the primal solver (discrete adjoint), which is especially significant in the presence of limiters or nonlinear upwinding (Hu et al., 2018, Botchorishvili et al., 2018).
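The forward-store, backward-adjoint, accumulate-gradient loop described above can be sketched for a small time-dependent problem. This is a minimal illustration under explicit assumptions, not the scheme of any cited solver: a scalar logistic ODE du/dt = r u (1 - u/K) with parameters p = (r, K), explicit Euler time stepping, and a trajectory misfit R = 0.5 dt sum_n (u_n - target)^2. All function names are hypothetical. The adjoint recursion is obtained by differentiating the Euler update itself, so it is a discrete adjoint.

```python
import numpy as np

def f(u, p):
    """Assumed right-hand side: logistic growth with rate r and capacity K."""
    r, K = p
    return r * u * (1.0 - u / K)

def df_du(u, p):
    r, K = p
    return r * (1.0 - 2.0 * u / K)

def df_dp(u, p):
    r, K = p
    return np.array([u * (1.0 - u / K), r * u**2 / K**2])

def forward(p, u0, dt, nsteps):
    """Explicit Euler forward sweep; the whole trajectory is stored for the adjoint."""
    u = np.empty(nsteps + 1)
    u[0] = u0
    for n in range(nsteps):
        u[n + 1] = u[n] + dt * f(u[n], p)
    return u

def response(u, target, dt):
    """Discrete trajectory misfit over steps 1..N."""
    return 0.5 * dt * np.sum((u[1:] - target) ** 2)

def adjoint_gradient(p, u0, target, dt, nsteps):
    u = forward(p, u0, dt, nsteps)             # 1) forward solve, stored
    lam = dt * (u[nsteps] - target)            # terminal condition: lambda_N
    grad = np.zeros(2)
    for n in range(nsteps - 1, -1, -1):        # 2) adjoint sweep, backward in time
        # 3) accumulate lambda_{n+1} * dt * df/dp evaluated at u_n
        grad += lam * dt * df_dp(u[n], p)
        if n >= 1:
            # lambda_n = dR/du_n + lambda_{n+1} * (1 + dt * df/du(u_n))
            lam = dt * (u[n] - target) + lam * (1.0 + dt * df_du(u[n], p))
    return response(u, target, dt), grad
```

Storing the full trajectory keeps the sketch short; for long simulations, checkpointing schemes trade recomputation for memory, which is outside the scope of this example.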
6. Limitations, Recommendations, and Best Practices
Based on comparative studies, the following guidelines are advised for the selection of adjoint approaches (Hu et al., 2018):
- Employ the continuous adjoint when:
  - The forward model and all relevant source and closure terms are smooth.
  - Closed-form analytical operators are available and code infrastructure supports their construction.
  - Computational efficiency is critical and the system size is large.
  - Analytic insight into sensitivity structure or boundary-condition dependencies is desired.
- Employ the discrete adjoint when:
  - The forward problem contains piecewise-defined source terms, discontinuities, or employs nonlinear flux limiters.
  - The primal solver uses proprietary stencils or ad-hoc numerical schemes difficult to differentiate analytically.
  - Robustness and absolute accuracy are critical across a wide range of parametric regimes, particularly with discontinuous physics (e.g., nucleate boiling, flashing).
For real-world two-phase reactor safety analysis and multiphase transient computations, the discrete adjoint is strongly preferred (Hu et al., 2018).
7. Extensions and Broader Impact
The adjoint-state method underpins widely used algorithms for high-dimensional sensitivity analysis, optimization, control, and inverse problems not only in thermal-hydraulics but also in atmospheric modeling, geophysics, reduced-order modeling, and beyond. Its essential principle, that gradients of constrained objectives can be computed by a backward-in-time or reverse-mode computation whose cost is comparable to that of the forward simulator, is a cornerstone of modern scientific computing (Hu et al., 2018, Liu et al., 12 Jan 2026, Melicher et al., 2016). The method continues to evolve, extending towards non-smooth, non-unique, and data-driven settings, and integrating into automatic differentiation and machine learning pipelines.