Least Squares Shadowing for Chaotic Systems

Updated 3 October 2025
  • Least Squares Shadowing (LSS) is a computational framework that uses constrained optimization to obtain robust sensitivity derivatives for ergodic averages in chaotic dynamical systems.
  • It overcomes the exponential divergence in conventional sensitivity methods by shadowing reference trajectories with a time-transformed path, ensuring bounded and reliable gradients.
  • The approach has been successfully applied to systems such as the Lorenz attractor, aero-elastic oscillators, and chaotic PDEs, demonstrating its effectiveness where conventional sensitivity methods fail.

Least Squares Shadowing (LSS) is a mathematical and computational framework developed for sensitivity analysis of long-time averaged quantities in chaotic dynamical systems. Standard tangent or adjoint sensitivity methods diverge exponentially in such settings due to the presence of positive Lyapunov exponents, causing resulting gradients to be either unbounded or dominated by noise. LSS replaces the ill-conditioned initial value formulation with a well-posed constrained optimization—"shadowing" the reference trajectory with a nearby trajectory under a time transformation—thereby producing accurate and robust derivatives of ergodic averages. This method is grounded in the theory of hyperbolic dynamical systems and leverages shadowing lemmas to ensure the existence of nearby "shadow" solutions for perturbed parameters.

1. Theoretical Motivation and Ill-Conditioning of Conventional Methods

Classical sensitivity analysis in chaotic systems involves propagating the tangent or adjoint of the system's ordinary differential equations (ODEs). For a parameter-dependent ODE $\frac{du}{dt} = f(u, s)$ and a long-time average observable

$$\overline{J}^{(T)}(s) = \frac{1}{T}\int_0^T J(u(t; s), s)\, dt,$$

the corresponding tangent sensitivity $v(t)$, satisfying $dv/dt = (\partial f/\partial u)\, v + \partial f/\partial s$, can exhibit exponential growth due to positive Lyapunov exponents. When seeking the sensitivity of $\overline{J}^{(\infty)}$, the exponential divergence makes direct differentiation with respect to $s$ unreliable: numerically computed derivatives commonly differ by orders of magnitude from their true values. This phenomenon is a direct consequence of the ill-posedness of differentiating initial value problems in systems with chaotic attractors (Wang et al., 2012).
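For concreteness, the sketch below (an illustration written for this summary, not code from the cited papers) integrates the conventional tangent equation for the Lorenz system with parameter $s = \rho$ and prints the norm of the tangent solution, which grows exponentially:

```python
import numpy as np

# Conventional tangent sensitivity for the Lorenz system with parameter s = rho:
# dv/dt = (df/du) v + df/drho, integrated here with forward Euler for brevity.
sigma, beta, rho = 10.0, 8.0 / 3.0, 28.0

def f(u):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def dfdu(u):
    x, y, z = u
    return np.array([[-sigma, sigma, 0.0],
                     [rho - z, -1.0, -x],
                     [y, x, -beta]])

def dfdrho(u):
    return np.array([0.0, u[0], 0.0])

h = 0.001
u = np.array([1.0, 1.0, 28.0])
v = np.zeros(3)                                   # tangent state
for step in range(1, 40001):
    v = v + h * (dfdu(u) @ v + dfdrho(u))
    u = u + h * f(u)
    if step % 10000 == 0:
        print(f"t = {step * h:5.1f}   ||v|| = {np.linalg.norm(v):.3e}")
# ||v|| grows roughly like exp(lambda_1 t) with lambda_1 ~ 0.9 for Lorenz, so any
# sensitivity extracted from v is swamped by this exponential divergence.
```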

2. LSS Formulation: Shadowing and Time Transformation

LSS recasts the sensitivity problem as a minimization over all possible trajectories and time reparametrizations ("time dilation") that respect the governing dynamics:

$$\min_{u(\cdot),\, \tau(\cdot)}\ \frac{1}{T} \int_0^T \left[ \|u(\tau(t)) - u_r(t)\|^2 + \alpha^2 \left(\frac{d\tau}{dt} - 1\right)^2 \right] dt,$$

subject to $du/dt = f(u, s)$, where $u_r(t)$ is a reference trajectory (typically at $s$ or a nearby parameter value) and $\alpha$ scales the contribution of time dilation. The variable $\tau(t)$ allows for dynamic phase adjustment, absorbing the neutral direction associated with time invariance and preventing secular growth due to phase mismatch. This formulation is justified under ergodicity assumptions, as long-time averages are independent of initial condition and phase (Wang et al., 2012, Wang, 2013).
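The linearized problem of the next section follows from a first-order expansion of this formulation (a brief sketch of the standard argument): writing the shadow trajectory and time transformation at a perturbed parameter $s + \delta s$ as

$$u(\tau(t)) \approx u_r(t) + \delta s\, v(t), \qquad \frac{d\tau}{dt} \approx 1 + \delta s\, \eta(t),$$

substituting into the governing equation via $\frac{d}{dt}\, u(\tau(t)) = \frac{d\tau}{dt}\, f\big(u(\tau(t)),\, s + \delta s\big)$, and collecting the $\mathcal{O}(\delta s)$ terms yields the tangent constraint of Section 3, while the objective divided by $\delta s^2$ becomes the quadratic functional in $v$ and $\eta$.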

3. Linearization and the LSS Sensitivity Algorithm

For infinitesimal perturbations $\delta s$, the nonlinear LSS problem is linearized about the reference solution:

$$\min_{v(t),\, \eta(t)}\ \frac{1}{T} \int_0^T \left[ \|v(t)\|^2 + \alpha^2\, \eta(t)^2 \right] dt,$$

subject to

$$\frac{dv}{dt} = \left( \frac{\partial f}{\partial u} \right) v + \frac{\partial f}{\partial s} + \eta(t)\, f(u_r(t), s).$$

Here, $v(t)$ represents the shadowing direction (the bounded-in-time linear response along the optimal shadow trajectory), while $\eta(t)$ is the time dilation sensitivity. This constrained minimization ensures that the unstable directions responsible for exponential divergence are penalized, yielding a unique, bounded solution (Wang et al., 2012, Chater et al., 2015). The sensitivity of the infinite-time average is then given by

$$\frac{d\overline{J}^{(\infty)}}{ds} \approx \frac{1}{T} \int_0^T \left[ \left(\frac{\partial J}{\partial u}\right) v(t) + \frac{\partial J}{\partial s} + \eta(t) \left(J(u_r(t), s) - \overline{J}\right) \right] dt,$$

where $\overline{J}$ is the time average of $J$ over $u_r$.
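A minimal end-to-end sketch for the Lorenz system follows (illustrative only: it assumes the classical parameter values, a simple midpoint discretization of the tangent constraint, a penalty $\alpha^2 = 40$, and a direct sparse solve of the Schur-complement system; the cited papers use their own discretizations and solvers):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Lorenz system with parameter s = rho; observable J = z.
sigma, beta = 10.0, 8.0 / 3.0

def f(u, rho):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def dfdu(u, rho):
    x, y, z = u
    return np.array([[-sigma, sigma, 0.0],
                     [rho - z, -1.0, -x],
                     [y, x, -beta]])

def dfds(u, rho):
    return np.array([0.0, u[0], 0.0])          # derivative of f w.r.t. rho

def rk4_step(u, rho, h):
    k1 = f(u, rho); k2 = f(u + 0.5 * h * k1, rho)
    k3 = f(u + 0.5 * h * k2, rho); k4 = f(u + h * k3, rho)
    return u + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Reference trajectory on the attractor (spin-up discards the transient).
rho, h, N, n = 28.0, 0.01, 4000, 3
u = np.array([1.0, 1.0, 28.0])
for _ in range(4000):
    u = rk4_step(u, rho, h)
traj = np.empty((N + 1, n))
traj[0] = u
for i in range(N):
    traj[i + 1] = rk4_step(traj[i], rho, h)

# Linearized LSS: minimize ||v||^2 + alpha^2 ||eta||^2 subject to the discrete
# tangent constraint.  Unknowns w = [v_0 .. v_N, alpha*eta_0 .. alpha*eta_{N-1}];
# constraint block i (midpoint discretization):
#   (v_{i+1} - v_i)/h - A_mid (v_i + v_{i+1})/2 - eta_i f_mid = df/ds|_mid
alpha = np.sqrt(40.0)                          # alpha^2 ~ 40 as reported for Lorenz
B = sp.lil_matrix((N * n, (N + 1) * n + N))
rhs = np.empty(N * n)
E = np.eye(n)
for i in range(N):
    umid = 0.5 * (traj[i] + traj[i + 1])
    A = dfdu(umid, rho)
    B[i*n:(i+1)*n, i*n:(i+1)*n]     = -E / h - 0.5 * A
    B[i*n:(i+1)*n, (i+1)*n:(i+2)*n] =  E / h - 0.5 * A
    B[i*n:(i+1)*n, (N+1)*n+i:(N+1)*n+i+1] = -f(umid, rho)[:, None] / alpha
    rhs[i*n:(i+1)*n] = dfds(umid, rho)
B = B.tocsr()

# Minimum-norm solution via the Schur complement: (B B^T) lam = rhs, w = B^T lam.
lam = spla.spsolve((B @ B.T).tocsc(), rhs)
w = B.T @ lam
v = w[:(N + 1) * n].reshape(N + 1, n)
eta = w[(N + 1) * n:] / alpha

# Sensitivity of the time-averaged observable <z> with respect to rho
# (dJ/du = (0, 0, 1) and dJ/ds = 0 for this observable).
Jbar = traj[:, 2].mean()
zmid = 0.5 * (traj[:-1, 2] + traj[1:, 2])
dJds = v[:, 2].mean() + np.mean(eta * (zmid - Jbar))
print("d<z>/d rho approx", dJds)               # roughly 1 for the Lorenz system
```

With this trajectory length ($T = 40$ time units) the printed value should land near the commonly quoted $d\langle z\rangle/d\rho \approx 1.01$, up to an $\mathcal{O}(1/\sqrt{T})$ sampling error.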

4. Well-Posedness, Convergence, and Theoretical Foundations

The rigorous foundation for LSS is established under the uniform hyperbolicity assumption, in which the dynamical system exhibits exponential contraction/expansion on invariant subspaces. The shadowing lemma guarantees that parameter-perturbed orbits can be closely shadowed by true system orbits with suitably chosen initial conditions and time shifts. The convergence of the computed sensitivity to the true derivative of the ergodic average is demonstrated as both the integration time $T \to \infty$ and the discretization timestep $h \to 0$:

  • The error decays as $\mathcal{O}(1/\sqrt{T})$ due to central-limit-theorem scaling for averages (a quick numerical check follows this list).
  • Finite-horizon and discretization errors can dominate for short $T$ or coarse resolution.
  • Practical convergence proofs quantify exponential decay of the error in the shadowing direction away from the integration endpoints, and precise convergence rates of the ergodic average derivative are established (Wang, 2013, Chater et al., 2015).
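The central-limit scaling can be checked numerically; the sketch below (illustrative choices of step size and horizon, plain forward Euler) measures the spread of finite-time averages of $z$ for the Lorenz system over non-overlapping windows of increasing length:

```python
import numpy as np

# Spread of finite-time averages of z over non-overlapping windows of length T:
# it should shrink roughly like 1/sqrt(T), as the central limit theorem suggests.
sigma, beta, rho, h = 10.0, 8.0 / 3.0, 28.0, 0.001

def f(u):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

u = np.array([1.0, 1.0, 28.0])
for _ in range(50_000):                 # discard the initial transient
    u = u + h * f(u)

n_steps = 640_000                       # total horizon of 640 time units
z = np.empty(n_steps)
for i in range(n_steps):
    u = u + h * f(u)
    z[i] = u[2]

for T_steps in (4_000, 16_000, 64_000):            # T = 4, 16, 64
    windows = z[: (n_steps // T_steps) * T_steps].reshape(-1, T_steps)
    spread = windows.mean(axis=1).std()
    print(f"T = {T_steps * h:5.0f}   std of finite-time <z> = {spread:.3f}")
# Quadrupling T should roughly halve the spread, i.e. O(1/sqrt(T)) scaling.
```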

5. Numerical Implementation and Algorithmic Structure

The linearized LSS problem yields a boundary value problem in time for $v$ and $\eta$. Numerically, this translates into a large Karush–Kuhn–Tucker (KKT) system with block structure when the ODE is discretized in time. For high-dimensional systems or long trajectories, the KKT system becomes large but is sparse and typically block tridiagonal, enabling efficient direct or iterative solvers (a matrix-free sketch follows the list below). Multigrid-in-time schemes are effective, especially when:

  • Krylov subspace methods serve as smoothers.
  • Coarse grid operators are constructed via variational principles or high-order averaging.
  • The penalty parameter $\alpha$ is tuned to optimize conditioning (with observed optimal values, e.g., $\alpha^2 \approx 40$ for the Lorenz system) (Blonigan et al., 2013).
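As a matrix-free illustration of the iterative route (a sketch that assumes the constraint matrix B and right-hand side rhs assembled in the Section 3 code above are in scope; unpreconditioned conjugate gradients is used only to expose the structure, whereas a practical solver would add preconditioning or a multigrid-in-time cycle):

```python
import scipy.sparse.linalg as spla

# Matrix-free solve of the Schur-complement system (B B^T) lam = rhs, where B and
# rhs come from the discrete LSS sketch in Section 3.  B B^T is symmetric positive
# definite and block tridiagonal; it is never formed explicitly here, only products
# with B and B^T are needed, which is what makes Krylov smoothers attractive.
m = B.shape[0]
S = spla.LinearOperator((m, m), matvec=lambda x: B @ (B.T @ x))
lam, info = spla.cg(S, rhs, maxiter=20_000)    # unpreconditioned CG: slow but simple
assert info == 0, "CG did not converge; use a preconditioner or multigrid cycle"
w = B.T @ lam                                  # minimum-norm shadowing solution
```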

The overall computational cost is polynomial in the product of trajectory length and state dimension, and, unlike naive tangent/adjoint methods, the computed sensitivities remain bounded regardless of the largest Lyapunov exponent.

6. Applications and Empirical Performance

LSS has been successfully applied to a spectrum of systems:

  • Van der Pol oscillator: For stable limit cycles, LSS accurately recovers sensitivity of oscillation peak norms with respect to damping parameters, matching finite-difference results and outperforming them in efficiency, especially for short trajectories (Wang et al., 2012).
  • Lorenz system: For chaotic attractors, LSS reliably computes sensitivity of time-averaged quantities (e.g., $\langle z \rangle$ with respect to $\rho$). Its convergence rate ($\mathcal{O}(T^{-1/2})$ decay in error) produces stable and reproducible derivatives, unlike erratic values from finite-difference or classical tangent approaches.
  • Aero-elastic oscillators: LSS detects sensitivity transitions at bifurcations and resolves features inaccessible to finite-difference for short trajectories (Wang et al., 2012).
  • Partial differential equations: For the modified Kuramoto–Sivashinsky equation, LSS remains robust in regimes with strong turbulence and ergodicity but may yield degraded accuracy in weakly mixing or non-ergodic regimes (Blonigan et al., 2013).

A summary of applications is given below:

| System | Quantity of Interest | LSS Outcomes |
| --- | --- | --- |
| Van der Pol | $L^8$ norm of $\dot{y}$ | Accurate, low-noise gradients |
| Lorenz | Time-averaged $z$ | Robust, convergent sensitivities |
| Aero-elastic LCO | $L^8$ norm of angle | Resolution of transitions and bifurcations |
| Modified KS eqn | Volume/energy average | Reliable except in non-ergodic regimes |

7. Limitations, Practical Considerations, and Outlook

Despite the rigorous foundation and practical successes, LSS faces several challenges:

  • Finite trajectory length: Error in sensitivity estimation has two components, averaging error ($\mathcal{O}(T^{-1/2})$) and endpoint error ($\mathcal{O}(T^{-1})$); short trajectories may yield significant bias.
  • High dimensionality: While LSS's sparse structure theoretically mitigates computational cost, scaling to very high-dimensional PDEs requires tailored solvers and, often, adjoint or multigrid preconditioners.
  • Ergodicity and hyperbolicity: The method assumes strict ergodicity and (quasi-)hyperbolicity. Non-hyperbolic features or weak mixing can degrade the shadowing assumption, leading to inaccurate or biased gradients.
  • Extensions: Ongoing developments include alternative shadowing-based schemes (periodic shadowing, multiple shooting), non-intrusive and adjoint variants (NILSS, NILSAS), and frequency-domain generalizations for extremely large or stiff systems.

Future work is directed at broadening applicability to non-uniformly hyperbolic systems, improving solver scalability, and rigorous validation in complex climate and turbulence models (Wang, 2013, Chater et al., 2015, Blonigan et al., 2013).


LSS establishes a mathematically consistent, computationally tractable methodology for sensitivity analysis of ergodic averages in chaotic systems, circumventing the pathologies of conventional approaches by searching for bounded shadowing directions and systematically accounting for time neutrality. Its convergence and robustness are underpinned by the geometry of the underlying attractor and the properties of hyperbolic dynamics.
