
Switching Control Strategy: DRL-PSS Integration

Updated 4 November 2025
  • Switching Control Strategy is a method that dynamically selects controllers based on system state, using DRL and PSS integration to optimize performance in power systems.
  • It combines local PSS for baseline stability with DRL-based controllers for wide-area damping, thereby reducing unnecessary communication and computation.
  • Performance gains include faster oscillation damping, improved robustness to delays, and successful transfer of control from linear models to nonlinear systems.

A switching control strategy is a control methodology wherein the active controller or the control structure is selected dynamically from a set of possible controllers or modes based on the instantaneous system state, measured outputs, or other operational criteria. Such strategies are prevalent in applications requiring coordinated operation of multiple control schemes—often with differing levels of performance, robustness, and resource requirements—or where system properties (e.g., nonlinearity, operational constraints) change abruptly. Switching control can provide performance enhancement, improved robustness, efficient resource utilization, or guaranteed safety by appropriate selection and sequencing of controllers. The following sections present a detailed exposition of switching control strategies with special reference to the integration of Deep Reinforcement Learning (DRL)-based adaptive control and Power System Stabilizers (PSSs) for inter-area oscillation damping (Liang et al., 2023).

1. Integration of Deep Reinforcement Learning and Conventional Controllers

Switching control strategies are frequently used to combine the strengths of diverse controllers. In modern power systems, local PSSs are widely deployed for their simplicity and robustness in providing baseline small-signal damping of generator oscillations. However, these local controllers do not exploit the growing availability of wide-area measurement systems (e.g., Phasor Measurement Units, PMUs) which can be leveraged by data-driven global controllers to improve damping of low-frequency inter-area oscillations (IAOs).

In (Liang et al., 2023), a Deep Reinforcement Learning (DRL)-based controller—trained using a Markov Decision Process with eigenvalue-based reward—provides supplementary wide-area damping, while local PSSs ensure immediate and fail-safe stabilization. The switching control architecture adaptively activates the DRL controller when its contribution is crucial, and deactivates it when the local control suffices, thus reducing unnecessary communication and computation overhead.

  • Local PSS control law:

u_{loc,i}(s) = k_i \cdot \frac{T_{w,i}s}{1+T_{w,i}s} \cdot \frac{1+T_{n1,i}s}{1+T_{d1,i}s} \cdot \frac{1+T_{n2,i}s}{1+T_{d2,i}s} \cdot \dot\theta_i(s)

  • Wide-area DRL control law:

\boldsymbol{u}_{wac}(t) = -\boldsymbol{K}\boldsymbol{x}(t)

  • Overall excitation control signal:

\boldsymbol{u}(t) = \boldsymbol{u}_{loc}(t) + \boldsymbol{u}_{wac}(t)
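As an illustration, the three control laws above can be sketched with SciPy's LTI tools. The PSS time constants, feedback gain matrix, and swing signal below are hypothetical placeholders, not values from the paper:

```python
import numpy as np
from scipy import signal

def pss_transfer_function(k, Tw, Tn1, Td1, Tn2, Td2):
    """Local PSS u_loc,i(s): gain * washout * two lead-lag stages,
    mapping rotor speed deviation to the stabilizing signal."""
    num = np.polymul(np.polymul([k * Tw, 0.0], [Tn1, 1.0]), [Tn2, 1.0])
    den = np.polymul(np.polymul([Tw, 1.0], [Td1, 1.0]), [Td2, 1.0])
    return signal.TransferFunction(num, den)

def wide_area_control(K, x):
    """Wide-area DRL signal u_wac(t) = -K x(t)."""
    return -K @ x

# Illustrative parameters (placeholders, not values from the paper).
pss = pss_transfer_function(k=20.0, Tw=10.0, Tn1=0.05, Td1=0.02, Tn2=3.0, Td2=5.4)
t = np.linspace(0.0, 10.0, 1000)
speed_dev = 0.01 * np.exp(-0.3 * t) * np.sin(2 * np.pi * 0.5 * t)  # toy 0.5 Hz swing
_, u_loc, _ = signal.lsim(pss, U=speed_dev, T=t)

K = np.array([[0.5, 1.2]])            # hypothetical feedback gain
x = np.array([0.01, -0.02])           # rotor angle / speed deviations
u_total = u_loc[-1] + wide_area_control(K, x)[0]   # u(t) = u_loc(t) + u_wac(t)
```

The wide-area term is a static state feedback, so only the PSS contributes dynamics; summing the two signals at the excitation input mirrors the overall control law above.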

2. Switching Logic and Rationale

The core logic of a switching control strategy lies in the state- or performance-based decision rules that dictate when to engage or disengage specific controllers. (Liang et al., 2023) formulates a system-level ‘energy-like’ performance index, which quantifies the severity of inter-area oscillations in terms of kinetic and potential energy surrogates computed from generator frequencies and rotor angles.

  • Switching index:

P(t) = \kappa_1 \sum_{i=1}^{n_g} \sum_{j=1}^{n_g} [\omega_i(t) - \omega_j(t)]^2 + \kappa_2 \sum_{i=1}^{n_g-1} \omega_i^2(t) + \kappa_3 \sum_{i=1}^{n_g-1} [\theta_i(t) - \theta_{ref}(t)]^2

  • The DRL controller is activated if P(t) > \hat{r} and deactivated otherwise:

\boldsymbol{u}(t) = \begin{cases} \boldsymbol{u}_{loc}(t) + \boldsymbol{u}_{wac}(t), & P(t) > \hat{r} \\ \boldsymbol{u}_{loc}(t), & P(t) \leq \hat{r} \end{cases}

This approach ensures that wide-area DRL-based control is deployed only during transient intervals where its superior damping capability is justified, and withdrawn during post-disturbance steady-state operation to conserve resources.
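A minimal sketch of this switching rule, with hypothetical weights κ and toy three-generator measurements standing in for real PMU data:

```python
import numpy as np

def energy_index(omega, theta, theta_ref, k1=1.0, k2=1.0, k3=1.0):
    """Energy-like index P(t): pairwise frequency spread plus speed and
    relative-angle deviation terms (weights kappa are hypothetical)."""
    pair = sum((wi - wj) ** 2 for wi in omega for wj in omega)
    return (k1 * pair
            + k2 * np.sum(omega[:-1] ** 2)
            + k3 * np.sum((theta[:-1] - theta_ref) ** 2))

def control(u_loc, u_wac, P, r_hat):
    """Add the wide-area DRL signal only while P(t) exceeds r_hat."""
    return u_loc + u_wac if P > r_hat else u_loc

# Toy measurements for three generators (illustrative numbers).
omega = np.array([0.02, -0.01, 0.0])   # per-unit speed deviations
theta = np.array([0.3, 0.1, 0.2])      # rotor angles (rad)
P = energy_index(omega, theta, theta_ref=0.2)
u = control(u_loc=1.0, u_wac=0.5, P=P, r_hat=0.01)
```

Because the index is built from squared deviations, it vanishes in steady state and the rule automatically withdraws the wide-area signal once the disturbance has been damped.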

3. Markov Decision Process-Based Controller Design and Model Integration

The DRL controller is synthesized by posing the IAO damping problem as a Markov Decision Process (MDP), where the state is the vector of dynamic system variables (rotor angles, frequencies), the action is the feedback gain matrix, and the reward is (a negative function of) eigenvalue placement for the system's state matrix. The controller is trained using Deep Deterministic Policy Gradient (DDPG) in a linearized state-space model, but is evaluated for transfer to the original nonlinear system.
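The effect the action (a feedback gain matrix) has on the reward can be seen on a toy linearized model: state feedback u = −Kx shifts the closed-loop eigenvalues leftward. The matrices and gain below are illustrative assumptions, not the paper's system:

```python
import numpy as np

# Toy 2-state linearized model x = [rotor angle dev, speed dev] with one
# excitation input (placeholder matrices, not from the paper).
A = np.array([[0.0, 1.0], [-4.0, -0.1]])
B1 = np.array([[0.0], [1.0]])

K = np.array([[0.0, 1.0]])            # illustrative action: a feedback gain matrix
A_cl = A - B1 @ K                     # closed loop under u = -K x

# Damping is read off the closed-loop eigenvalues: an eigenvalue-based
# reward drives their real parts further into the left half-plane.
open_damping = -np.linalg.eigvals(A).real.max()
closed_damping = -np.linalg.eigvals(A_cl).real.max()
```

Here the feedback roughly decuples the damping of the oscillatory pair, which is exactly the behavior the eigenvalue-based reward is designed to encourage.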

  • Linearized model used for DRL design:

\dot{\boldsymbol{x}}(t) = \boldsymbol{A} \boldsymbol{x}(t) + \boldsymbol{B}_1 \boldsymbol{u}(t) + \boldsymbol{B}_2 \boldsymbol{\eta}(t)

  • Reward function based on closed-loop eigenvalues:

J(\boldsymbol{K}(t)) = \alpha \sum_i [\mathrm{Re}(\lambda_i)^2 - \mathrm{Re}(\hat{\lambda}_i)^2] + \beta \sum_i [\mathrm{Im}(\lambda_i)^2]

  • Eigenvalue estimation for reward calculation is achieved through data-driven Dynamic Mode Decomposition (DMD), using PMU data.
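The DMD eigenvalue estimate and the reward evaluation can be sketched as follows; the exact-DMD formulation and the noise-free toy system are illustrative assumptions, not the paper's implementation:

```python
import numpy as np
from scipy.linalg import expm

def dmd_eigenvalues(X, dt):
    """Estimate continuous-time eigenvalues from a snapshot matrix X
    (n_states x n_samples, e.g. PMU measurements) via exact DMD:
    fit x_{k+1} ~ A_d x_k, then map each mu to log(mu) / dt."""
    X1, X2 = X[:, :-1], X[:, 1:]
    A_d = X2 @ np.linalg.pinv(X1)
    mu = np.linalg.eigvals(A_d).astype(complex)
    return np.log(mu) / dt

def eigenvalue_reward(lam, lam_target, alpha=1.0, beta=0.1):
    """Reward J(K) per the expression above: real parts relative to a
    target placement, plus a penalty on imaginary parts."""
    return (alpha * np.sum(lam.real ** 2 - lam_target.real ** 2)
            + beta * np.sum(lam.imag ** 2))

# Noise-free sanity check on a toy 2-state system.
A = np.array([[0.0, 1.0], [-4.0, -0.4]])
dt, n = 0.01, 50
M = expm(A * dt)                      # exact one-step propagator
X = np.empty((2, n))
X[:, 0] = [1.0, 0.0]
for k in range(n - 1):
    X[:, k + 1] = M @ X[:, k]
lam = dmd_eigenvalues(X, dt)          # recovers the eigenvalues of A
J = eigenvalue_reward(lam, np.linalg.eigvals(A).astype(complex))
```

On clean snapshots DMD recovers the true modes exactly; with real PMU data a truncated SVD step is typically added for noise robustness.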

Additionally, the switching strategy is applied only to a selected subset of generators with dominant participation in the critical IAO mode, further reducing communication and computational burden.

4. Performance, Cost, and Robustness Benefits

Evaluation of the switching control strategy within the IEEE-39 New England power system—using both linear and nonlinear dynamic models—demonstrates several key benefits:

  • Oscillation damping is accelerated: the switching control strategy (SCS) achieves more rapid decay and lower steady-state values of the performance index P(t) than PSS-only or always-on wide-area strategies.
  • Resource efficiency: Communication and computation costs are directly reduced, since wide-area control signals are transmitted and computed only when indicated by significant system oscillations.
  • Robustness: The closed-loop performance shows resilience to communication delays up to 800 ms, with only minor degradation in settling times and transient response.
  • Transferability: The DRL controller trained on a linear system model is successfully validated on the full nonlinear system, indicating robustness against modeling inaccuracies.
| Aspect | SCS Approach | Benefit |
|---|---|---|
| Controller types | Wide-area DRL + local PSS | Combines adaptive and baseline damping |
| Switching logic | Based on system energy P(t) | Activates DRL according to oscillation severity |
| Criterion | P(t) > \hat{r}: both active; else PSS only | Efficient deployment of control resources |
| Effectiveness | Fast damping, low cost, robust to delay | Confirmed in simulation studies |

5. Practical Implementation and Parameterization

Implementation of a switching control strategy as described requires:

  • Deployment of PMUs and local PSSs at generator sites for state measurement and basic damping.
  • Centralized or distributed computation of the system performance index P(t) using real-time data streams.
  • Threshold optimization: The switching threshold \hat{r} is tuned offline to minimize the cumulative system energy, balancing performance and resource cost.
  • Selective control: Only generators with high global participation factors in the targeted IAO eigenmode are provided with wide-area control, minimizing system-wide intervention.
  • Seamless controller transition: The excitation system is designed so that the wide-area signal can be summed with the PSS output without destabilizing the local feedback loop.
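The offline threshold-tuning step can be sketched as a grid search; the scalar surrogate simulator and the cost weighting below are stand-ins for the actual system studies, assumed here purely for illustration:

```python
import numpy as np

def tune_threshold(simulate, candidates, weight=0.01):
    """Offline grid search for r_hat: score each candidate by the
    cumulative energy index plus a penalty on wide-area activation time
    (a stand-in for communication/computation cost)."""
    best, best_cost = None, np.inf
    for r_hat in candidates:
        P, active = simulate(r_hat)          # P(t) samples and on/off flags
        cost = np.sum(P) + weight * np.sum(active)
        if cost < best_cost:
            best, best_cost = r_hat, cost
    return best, best_cost

def toy_simulate(r_hat, steps=500, dt=0.01):
    """Toy surrogate: the index decays slowly under PSS alone and faster
    while the DRL controller is switched in."""
    P = np.empty(steps)
    active = np.empty(steps)
    p = 1.0
    for k in range(steps):
        on = p > r_hat
        p *= np.exp(-(1.5 if on else 0.3) * dt)
        P[k], active[k] = p, on
    return P, active

r_hat_best, cost = tune_threshold(toy_simulate, np.linspace(0.0, 0.5, 11))
```

The weight on activation time is the tuning knob that trades damping performance against communication cost; the paper's threshold is obtained from system-level studies rather than a scalar surrogate like this one.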

Simulation studies confirm that such an architecture can be retrofitted into existing wide-area control frameworks and preserves stability during controller transitions.

6. Impact, Significance, and Extensions

Switching control strategies, as formalized for the integration of DRL-based damping controllers and PSSs, provide a principled and efficient mechanism for leveraging advanced data-driven control technologies in legacy power system architectures. The approach fundamentally decouples high-performance transient damping from steady-state stabilization, yielding enhanced resilience, cost savings, and compatibility with existing hardware and communication infrastructure. The architecture also lays the groundwork for broader application of state-dependent or performance-index-based switching in future grid control, especially under cyber-physical constraints and in systems with heterogeneous controller and communication capabilities.

Future directions include adaptive tuning of switching thresholds based on real-time operational context, incorporation of QoS constraints from communication networks, and extension to distributed or fully decentralized switching strategies for large-scale grids.

References

| Source | Key Contributions |
|---|---|
| (Liang et al., 2023) | DRL/PSS switching control architecture, system-energy-based switching index, robustness to communication delay |
