Infinite Horizon Optimal Switching
- The infinite horizon optimal switching problem is a stochastic control framework that optimizes regime changes over an indefinite period by managing switching, running, and default costs.
- It employs dynamic programming, variational inequalities, and probabilistic methods like BSDEs to rigorously characterize optimal policies under uncertainty.
- The approach is crucial in applications such as energy, finance, and production planning, where threshold-based rules dictate regime changes and risk management.
An infinite horizon optimal switching problem is a class of stochastic control problems in which a system, typically governed by a stochastic process, operates in one of a finite number of regimes or modes, and the controller (or agent) may switch between modes at stopping times of its choosing. Each switch incurs a mode-dependent cost, and the system is penalized (or rewarded) according to running costs as well as possible terminal or default costs. The controller's objective is to maximize (or minimize) a cumulative, discounted reward (or cost) over an infinite time horizon, with well-posedness ensured by discounting together with suitable growth and integrability conditions. Infinite horizon formulations are central in applications where the planning period is not naturally truncated, notably the operation of power assets, financial investment under regime risk, real options, production planning, and energy markets.
1. Mathematical Formulation and Dynamic Programming
The canonical infinite horizon optimal switching problem involves a controlled Markov (or, more generally, strong Markov) diffusion process $X^{x}$ on $\mathbb{R}^{d}$, a finite set of modes $\mathcal{I} = \{1, \dots, m\}$, and a value function $v_i$ for each mode $i \in \mathcal{I}$. The agent can switch from the current mode $i$ to any $j \neq i$ at a chosen stopping time, incurring a cost $g_{ij}$. At all times $t \geq 0$, an instantaneous running cost (or reward) $\psi_i(X^x_t)$ is incurred, and in many models an additional default or abandonment cost $\chi_i$ is permitted (see (Asri, 2012)). In a standard notation, the value function for starting in state $x$ and mode $i$ under optimal switching and possible default is defined as
$$
v_i(x) = \sup_{\delta}\ \mathbb{E}\Big[\int_0^{\theta} e^{-\rho t}\,\psi_{u_t}\big(X^x_t\big)\,dt
\;-\;\sum_{k \ge 1} e^{-\rho \tau_k}\, g_{\zeta_{k-1}\zeta_k}\,\mathbf{1}_{\{\tau_k < \theta\}}
\;-\; e^{-\rho \theta}\,\chi_{u_{\theta}}\big(X^x_{\theta}\big)\,\mathbf{1}_{\{\theta < \infty\}}\Big],
\qquad \zeta_0 = i.
$$
Here $\delta = \big((\tau_k, \zeta_k)_{k \ge 1}, \theta\big)$ denotes the switching/suspension/default strategy ($\tau_k$ the switching times, $\zeta_k$ the modes selected, $\theta$ the default time, $u_t$ the resulting mode indicator), and $\rho > 0$ is a discount factor ensuring finiteness of total cost/reward.
The dynamic programming principle states that the value functions solve a system of variational inequalities with interconnected (mode- and default-dependent) obstacles,
$$
\min\Big\{ \rho\, v_i(x) - \mathcal{L} v_i(x) - \psi_i(x),\;\; v_i(x) - \max_{j \neq i}\big(v_j(x) - g_{ij}\big),\;\; v_i(x) + \chi_i(x) \Big\} = 0, \qquad i \in \mathcal{I},
$$
where $\mathcal{L}$ is the infinitesimal generator of the diffusion $X$ with drift $b$ and diffusion coefficient $\sigma$:
$$
\mathcal{L}\varphi(x) = b(x) \cdot \nabla \varphi(x) + \tfrac{1}{2}\,\mathrm{Tr}\big(\sigma(x)\sigma^{\top}(x)\, \nabla^2 \varphi(x)\big).
$$
Existence and uniqueness of viscosity solutions to this system, under polynomial growth, Lipschitz continuity, and structural (e.g., switching cost) assumptions, are established (see Theorem 6 in (Asri, 2012)). This system provides the deterministic part of the verification argument in Markovian settings.
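To make the dynamic programming characterization concrete, the sketch below solves the coupled obstacle system by value iteration on a Markov-chain (finite-difference) approximation for a hypothetical two-mode example: an Ornstein-Uhlenbeck-type state, running rewards $+x$ and $-x$, and a symmetric switching cost. All parameters are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

# Illustrative two-mode model (assumed parameters; not from the cited papers)
rho   = 0.1                      # discount rate
sigma = 0.3                      # diffusion coefficient
kappa, x_bar = 0.5, 0.0          # OU drift b(x) = kappa * (x_bar - x)
g = np.array([[0.0, 0.4],        # switching costs g[i, j] from mode i to j
              [0.4, 0.0]])
def psi(x, i):                   # running reward: +x in mode 0, -x in mode 1
    return x if i == 0 else -x

# Markov-chain approximation of the generator on a 1-D grid (upwind scheme)
x  = np.linspace(-3.0, 3.0, 121)
dx = x[1] - x[0]
b  = kappa * (x_bar - x)
dt = 0.9 / (sigma**2 / dx**2 + np.abs(b).max() / dx)     # keeps probabilities in [0, 1]
p_up   = (0.5 * sigma**2 / dx**2 + np.maximum(b, 0.0) / dx) * dt
p_down = (0.5 * sigma**2 / dx**2 + np.maximum(-b, 0.0) / dx) * dt
p_stay = 1.0 - p_up - p_down
disc   = np.exp(-rho * dt)

def expect(v):
    """One-step conditional expectation under the approximating chain (reflecting ends)."""
    up   = np.concatenate([v[1:], v[-1:]])
    down = np.concatenate([v[:1], v[:-1]])
    return p_up * up + p_down * down + p_stay * v

# Value iteration on the coupled obstacle system
v = np.zeros((2, x.size))
for _ in range(60000):
    v_new = np.empty_like(v)
    for i in range(2):
        cont = psi(x, i) * dt + disc * expect(v[i])            # keep running in mode i
        v_new[i] = np.maximum(cont, v[1 - i] - g[i, 1 - i])    # or switch to the other mode
    if np.abs(v_new - v).max() < 1e-8:
        v = v_new
        break
    v = v_new

# The resulting policy is threshold-type: switch 0 -> 1 when x is low enough
region = x[v[0] <= v[1] - g[0, 1] + 1e-9]
print("switch 0 -> 1 for x <=", region.max() if region.size else "never")
```

The obstacle term in the update is exactly the interconnected obstacle of the variational inequality: continuation in the current mode competes with the other mode's value net of the switching cost.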
2. Probabilistic and Numerical Approaches: BSDEs and Monte Carlo
Modern approaches leverage backward stochastic differential equations (BSDEs) and reflected BSDEs to characterize the value function, particularly in non-Markovian and high-dimensional settings. For the -mode problem, a minimal solution to a multidimensional reflected BSDE (RBSDE) represents the optimal value; the reflection (obstacle) structure encodes that switching only occurs when another mode offers a strictly better value net of switching cost (Shigeta, 2016). When ambiguity (model uncertainty) is present, the RBSDE includes additional infimum (worst-case) terms modeling aversion to model misspecification.
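For orientation, a schematic form of such an interconnected reflected BSDE system (written here on a truncated horizon $[0,T]$ with the notation of Section 1; the infinite-horizon formulation incorporates discounting in the driver) is
$$
\begin{aligned}
Y^i_t &= \xi^i + \int_t^T f_i(s, Y^i_s, Z^i_s)\,ds - \int_t^T Z^i_s\, dB_s + K^i_T - K^i_t, \\
Y^i_t &\ge \max_{j \ne i}\big(Y^j_t - g_{ij}\big), \qquad
\int_0^T \Big(Y^i_t - \max_{j \ne i}\big(Y^j_t - g_{ij}\big)\Big)\, dK^i_t = 0,
\end{aligned}
$$
with each $K^i$ nondecreasing; the minimal solution $(Y^i)_{i \in \mathcal{I}}$ gives the value of the switching problem started in mode $i$, and under ambiguity the driver $f_i$ carries an additional infimum over the ambiguity set.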
For efficient high-dimensional computations, probabilistic numerical methods combine time truncation, Monte Carlo simulation, and regression-based approximations of conditional expectations to implement backward dynamic programming. Memory reduction techniques, such as reconstructing trajectories on-demand rather than storing full path histories, allow scaling to problems with thousands of simulation paths and large numbers of time steps (Aïd et al., 2012).
Representative numerical recursion in discretized form, on a time grid $0 = t_0 < t_1 < \dots < t_N = T$ with step $\Delta t = t_{k+1} - t_k$:
$$
v_i^{(k)}(X_{t_k}) = \max\Big\{ \psi_i(X_{t_k})\,\Delta t + e^{-\rho \Delta t}\, \mathbb{E}\big[ v_i^{(k+1)}(X_{t_{k+1}}) \,\big|\, X_{t_k} \big],\;\; \max_{j \neq i}\big( v_j^{(k)}(X_{t_k}) - g_{ij} \big) \Big\},
$$
where the conditional expectation is approximated by regression on local basis functions over a suitably partitioned state space.
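A minimal sketch of this backward recursion in the regression Monte Carlo style is given below, assuming a two-mode problem on simulated geometric Brownian paths, a zero terminal condition at the truncation horizon, and a global cubic polynomial basis; the local-basis partitioning and memory-reduction devices of (Aïd et al., 2012) are omitted, and all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-mode problem (assumed parameters)
rho, T, N, M = 0.1, 2.0, 50, 20000        # discount, truncation horizon, time steps, paths
dt = T / N
disc = np.exp(-rho * dt)
mu, sigma, x0 = 0.05, 0.25, 1.0           # GBM state (e.g., a price)
g = np.array([[0.0, 0.1], [0.1, 0.0]])    # switching costs g[i, j]
psi = [lambda x: x - 1.0,                  # mode 0: operate, earn a spread
       lambda x: np.zeros_like(x)]         # mode 1: idle

# Simulate state paths forward
X = np.empty((N + 1, M))
X[0] = x0
for k in range(N):
    dW = rng.standard_normal(M) * np.sqrt(dt)
    X[k + 1] = X[k] * np.exp((mu - 0.5 * sigma**2) * dt + sigma * dW)

def regress(y, x):
    """Approximate E[y | x] by a global degree-3 polynomial least-squares fit."""
    A = np.vander(x, 4)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

# Backward induction on the discretized recursion
V = [np.zeros(M) for _ in range(2)]        # zero terminal condition at truncation
for k in range(N - 1, -1, -1):
    cont = [psi[i](X[k]) * dt + disc * regress(V[i], X[k]) for i in range(2)]
    V = [np.maximum(cont[i], cont[1 - i] - g[i, 1 - i]) for i in range(2)]

print("estimated value at t=0 per mode:", [float(np.mean(v)) for v in V])
```

With strictly positive switching costs, comparing against the other mode's continuation value (rather than its full value) loses nothing, since an immediate double switch is never optimal.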
3. Infinite Horizon Structural Properties and Applications
The infinite horizon structure critically influences both theory and applications:
- Discounting is required: The presence of a strictly positive discount rate $\rho > 0$ ensures cumulative costs/rewards remain finite, and, together with assumptions on the drift and diffusion coefficients (continuity, polynomial or linear growth, Lipschitz), guarantees well-posedness of the switching problem (Asri, 2012, Ding et al., 19 Jun 2025).
- Default/terminal decisions: Default (running down to a terminal state) incurs a one-time cost and is modeled by an additional obstacle, creating richer structure in the set of variational inequalities (Asri, 2012).
- Interconnected obstacles: In all modes, obstacles involve other value functions, not only in alternative modes but also due to possible default. This leads to strong coupling across the equations.
- Threshold (free-boundary) structure: In one-dimensional regimes, the optimal switching policy is characterized by thresholds in the state variable. In applications such as investment, energy production, or inventory, threshold crossing triggers regime change (Asri et al., 30 Jul 2024, Liang et al., 2013). For example, in two-player switching games, the value function’s QVI solution structure reduces to identifying a finite set of state thresholds at which switches are triggered.
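As a toy illustration of such a threshold rule (the thresholds and dynamics below are assumed, not derived from any cited model), the following sketch simulates a mean-reverting state and applies a two-threshold policy with a hysteresis band: switch to mode 1 when the state falls below the lower threshold and back to mode 0 above the upper one, accumulating discounted rewards net of switching costs.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters (assumptions, not from the cited references)
rho, dt, T = 0.1, 0.01, 200.0
kappa, sigma = 0.5, 0.3
lo, hi = -0.8, 0.8          # switch 0 -> 1 below lo, 1 -> 0 above hi (hysteresis band)
g01 = g10 = 0.4             # switching costs
reward = lambda x, m: x if m == 0 else -x

x, mode, value, t = 0.0, 0, 0.0, 0.0
while t < T:
    # threshold rule: regime change triggered by crossing a boundary
    if mode == 0 and x < lo:
        mode, value = 1, value - np.exp(-rho * t) * g01
    elif mode == 1 and x > hi:
        mode, value = 0, value - np.exp(-rho * t) * g10
    value += np.exp(-rho * t) * reward(x, mode) * dt
    x += kappa * (0.0 - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    t += dt

print("discounted reward under the threshold policy:", round(value, 3))
```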
Applications span energy (valuation of power plants with multiple production modes and shutdown/default decisions), production/investment planning under regime risk, financial portfolio selection, and real options with embedded switching or abandonment (Asri, 2012, Aïd et al., 2012, Hu et al., 2021).
4. Extensions: Ambiguity, Indefinite Weights, and Mean-Field Dynamics
Advanced models accommodate scenarios where:
- Ambiguity/model uncertainty is present: The problem features additional infimum over model parameters or ambiguity sets, modifying drivers in the underlying RBSDE and PDE. This leads to strategies that are robust to worst-case scenarios, impacting optimal switching thresholds and policies (Shigeta, 2016).
- Negative switching costs and benefits: Switching can incur negative costs (i.e., rebates or incentives), as long as a strict triangular ("no free loop") condition is met, preventing infinite profit from cycling between regimes (Shigeta, 2016, Asri et al., 30 Jul 2024); the condition is displayed after this list.
- Indefinite quadratic weights: In mean-field and LQ-type infinite-horizon control with switching, the cost-functional weights need not be positive semidefinite (they may even be negative definite). Solvability is ensured via regularization (e.g., penalizing large controls), and the equivalence of open- and closed-loop solvability is established via coupled algebraic Riccati equations and infinite-horizon BSDEs (Mei et al., 22 Mar 2025).
- Mean-field interactions: Conditional expectations arise both in state dynamics and running cost, requiring orthogonal decomposition and the solution of coupled Riccati equations and BSDEs dependent on regime (Mei et al., 1 Jan 2025, Ding et al., 19 Jun 2025).
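The no-free-loop condition referenced above is commonly stated, in the notation of Section 1, as
$$
g_{ij} + g_{jk} > g_{ik} \quad \text{for all distinct } i, j, k, \qquad g_{ij} + g_{ji} > 0 \quad \text{for all } i \ne j,
$$
which rules out generating value by an instantaneous loop of switches even when some individual costs $g_{ij}$ are negative.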
5. Solution Concepts: Viscosity, Feedback, and Stochastic Maximum Principle
The solution in the infinite-horizon setting is typically formulated in terms of viscosity solutions to a system of coupled variational inequalities (for Markovian problems), which, under suitable polynomial growth and continuity conditions, are unique (Asri, 2012). Viscosity methods are crucial where the value function is nonsmooth due to mode switching, obstacles, or singularities. Threshold-type policies are rigorously characterized via these solutions.
For feedback controls, in quadratic and mean-field models, optimal controls are typically synthesized using coupled algebraic Riccati equations with regime-switching structure; the state feedback is constructed so that the closed-loop system is stable and minimization is achieved (Wu et al., 1 Mar 2024, Hu et al., 2021). Theoretical results guarantee that—under stabilizability and suitable convexity—open- and closed-loop solvability are equivalent, and explicit optimal policies can be constructed.
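To illustrate the feedback synthesis, the sketch below runs a simple fixed-point iteration on the coupled algebraic Riccati equations of a hypothetical two-regime LQ problem: at each pass the cross-regime coupling is frozen at the current iterates and each regime's equation is solved as a standard ARE via scipy.linalg.solve_continuous_are. The matrices and the chain generator are illustrative assumptions, and this is one possible iterative scheme rather than the algorithm of the cited papers.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative two-regime LQ data (assumptions)
A = [np.array([[0.0, 1.0], [-1.0, -0.5]]),
     np.array([[0.0, 1.0], [-2.0, -0.2]])]
B = [np.array([[0.0], [1.0]]), np.array([[0.0], [0.5]])]
Q = [np.eye(2), 2.0 * np.eye(2)]
R = [np.array([[1.0]]), np.array([[1.0]])]
Pi = np.array([[-0.3, 0.3],          # generator of the switching Markov chain
               [ 0.5, -0.5]])        # rows sum to zero

# Coupled AREs:  A_i'P_i + P_i A_i + Q_i - P_i B_i R_i^{-1} B_i' P_i + sum_j Pi[i,j] P_j = 0
P = [np.zeros((2, 2)) for _ in range(2)]
for _ in range(200):
    P_new = []
    for i in range(2):
        # absorb the diagonal rate into the drift and freeze the off-diagonal coupling
        A_tilde = A[i] + 0.5 * Pi[i, i] * np.eye(2)
        Q_tilde = Q[i] + sum(Pi[i, j] * P[j] for j in range(2) if j != i)
        P_new.append(solve_continuous_are(A_tilde, B[i], Q_tilde, R[i]))
    if max(np.abs(P_new[i] - P[i]).max() for i in range(2)) < 1e-10:
        P = P_new
        break
    P = P_new

# Regime-dependent state feedback u = -K_i x
K = [np.linalg.solve(R[i], B[i].T @ P[i]) for i in range(2)]
print("feedback gains:", [k.round(3) for k in K])
```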
The stochastic maximum principle provides sufficient conditions for optimal controls, even in regime-switching infinite-horizon problems. When the Hamiltonian is convex in controls and states, any control minimizing the Hamiltonian (with respect to the adjoint process) is optimal (Ding et al., 13 Jun 2025). This approach enables derivation and verification of optimal strategies in systems with regime switching and long-run objectives.
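Schematically, for a regime-switching diffusion with regime process $\alpha_t$ and adjoint pair $(p, q)$, the Hamiltonian takes the standard form
$$
H(x, i, u, p, q) = b(x, i, u) \cdot p + \mathrm{Tr}\big(\sigma(x, i, u)^{\top} q\big) + f(x, i, u),
$$
and the sufficient condition reads: if $H$ is convex in $(x, u)$ and $\hat u_t$ attains $\min_u H(\hat X_t, \alpha_t, u, p_t, q_t)$ along the adjoint processes, together with an appropriate infinite-horizon transversality condition, then $\hat u$ is optimal.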
6. Numerical Implementation and Algorithmic Schemes
In practice, numerical schemes for infinite-horizon optimal switching problems must handle two main challenges:
- Curse of dimensionality: The combined state-and-mode space is high-dimensional. Regression-Monte Carlo methods, with local basis regression, adaptive partitioning, and memory reduction techniques, are suited for such problems (Aïd et al., 2012).
- Infinite-horizon truncation and approximation: The problem on $[0, \infty)$ is approximated by truncation to a finite horizon $[0, T]$ with an appropriate terminal condition, then solved by backward induction. The error from time truncation and discretization is controlled via rates derived in the analysis; a crude horizon-selection rule is sketched below.
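For instance, if the running reward is bounded by a constant $M$, the tail beyond the truncation horizon $T$ contributes at most $(M/\rho)\, e^{-\rho T}$ to the discounted value, which yields a crude rule for choosing $T$. A hypothetical helper illustrating this bound:

```python
import math

def truncation_horizon(M: float, rho: float, tol: float) -> float:
    """Smallest T with (M / rho) * exp(-rho * T) <= tol,
    i.e. the discounted tail of a bounded running reward falls below tol."""
    return math.log(M / (rho * tol)) / rho

# Example: reward bounded by 5, discount 0.1, tolerance 1e-3 on the tail
print(truncation_horizon(5.0, 0.1, 1e-3))   # ~108.2 time units
```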
For smooth, multidimensional diffusion, solving the coupled QVI directly with finite element or semi-Lagrangian dynamic programming methods is sometimes feasible (Kalise et al., 2018). In high dimensions or for non-Markovian problems, probabilistic representations via BSDEs or RBSDEs inform simulation-based and regression schemes.
Neural network-based approximation of the value and continuation functions has been demonstrated for discrete-time, risk-aware, non-Markovian infinite-horizon switching (e.g., hydropower planning with delayed information; (Martyr et al., 2019)).
7. Summary Table
Mathematical Tool/Property | Role in Infinite Horizon Optimal Switching | Principal Reference(s) |
---|---|---|
Discounting ($\rho > 0$) | Ensures finiteness of discounted cost/reward | (Asri, 2012, Ding et al., 19 Jun 2025) |
Viscosity solution of PDE/QVI | Main well-posedness, uniqueness characterization | (Asri, 2012, Asri et al., 30 Jul 2024) |
Multidimensional (reflected) BSDE | Probabilistic representation, non-Markovian cases | (Shigeta, 2016, Fuhrman et al., 2018) |
Algebraic Riccati equation (ARE) | Feedback synthesis in LQ/mean-field models | (Mei et al., 1 Jan 2025, Mei et al., 22 Mar 2025, Wu et al., 1 Mar 2024) |
Memory reduction algorithm | Efficient high-dimensional simulation | (Aïd et al., 2012) |
Threshold structure (switching region) | Policy characterization in one-dimensional cases | (Asri et al., 30 Jul 2024, Liang et al., 2013) |
Risk/ambiguity | Robust control via worst-case expected value | (Shigeta, 2016) |
Nonpositive/negative switching costs | No-free-loop condition prevents cycling arbitrage; richer policies | (Shigeta, 2016, Asri et al., 30 Jul 2024) |
The infinite horizon optimal switching problem subsumes a wide range of regimes, dynamics, and evaluation functionals, connecting PDE, stochastic analysis, variational inequality, and numerical methods. Its theoretical underpinnings—ensuring existence, uniqueness, and structural properties—directly support applications in energy systems, finance, and operations research. The modern literature rigorously extends classical finite-horizon results to the infinite-horizon regime, accommodates ambiguity and singular costs, and provides scalable computational tools for real-world implementation.