- The paper’s main contribution is an AI-enabled hybrid cyber-physical framework that combines distributed RL controllers (ADP, PPO, DQN) with advanced cyber resilience for adaptive smart grid control.
- It employs a rigorous three-layer architecture integrating dynamic physical models, stochastic cyber disturbances, and AI-driven EMS for decentralized, cost-efficient management.
- Empirical validation on an IEEE 33-bus testbed demonstrates stable voltage regulation, low control costs, and rapid recovery from FDI attacks and renewable variability.
AI-Enabled Hybrid Cyber-Physical Framework for Adaptive Control in Smart Grids
The paper "An AI-Enabled Hybrid Cyber-Physical Framework for Adaptive Control in Smart Grids" (2511.21590) formalizes a three-layer smart grid architecture composed of physical (power system), cyber (communication/measurement), and adaptive control layers, centered around an energy management system (EMS). The framework models the smart grid as an abstract tuple $G_{SG} = \langle P, C, E, U, O \rangle$, enabling structurally rigorous integration of distributed control, cyber-physical net modeling, and AI-driven resilience.
The physical layer implements dynamic models that combine steady-state AC power flow with detailed generator differential-algebraic equations (DAEs), including swing, exciter, and governor dynamics; bus-level constraints enter as the nonlinear, network-coupled algebraic equations of the DAE system. The cyber layer models PMUs, smart meters, edge sensors, cloud communication, and the associated delays, noise, and cyber threats, with explicit stochastic models of latency, packet loss, and data corruption.
Figure 1: Layered architecture outlining hybrid cyber-physical integration for adaptive smart grid control.
Figure 2: Schematic illustration of cyber-physical interaction, depicting feedback between the power system and communication/edge infrastructure.
The EMS optimizes multi-objective operational cost under nonlinear, physical and cyber-induced constraints, executing optimization both locally and, contingently, in the cloud. The explicit inclusion of agent-based and game-theoretic frameworks enables decentralized, prosumer-driven optimization, with strategic Nash-type equilibria determining resource allocation under price and network constraints.
AI-Based Control Strategies
Adaptive Dynamic Programming (ADP)
ADP serves as the foundation for near-optimal, model-free sequential decision making, using learned value functions approximated by neural networks. Both cloud and edge variants are implemented: cloud ADP operates on the full, uncorrupted global state (subject to delay), while edge ADP operates on local, possibly delayed and noisy information. The controllers minimize a cumulative quadratic cost over system deviation and control effort, approximating Bellman-optimal policies for nonlinear, high-dimensional environments.
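The quadratic cost and the Bellman-style target that an ADP critic would regress toward can be sketched as follows. This is a minimal illustration, not the paper's implementation: the weights `q` and `r`, the discount `gamma`, and all function names are assumptions, and the critic's estimate at the next state is passed in as a plain number rather than produced by a neural network.

```python
def quadratic_stage_cost(x, u, q=1.0, r=0.1):
    """Stage cost q*||x||^2 + r*||u||^2 for state deviation x and control u.

    q and r are hypothetical scalar weights chosen for illustration.
    """
    return q * sum(xi * xi for xi in x) + r * sum(ui * ui for ui in u)


def bellman_target(x, u, next_value, gamma=0.95, q=1.0, r=0.1):
    """One-step target: stage cost plus the discounted critic estimate
    of the cost-to-go at the next state (next_value)."""
    return quadratic_stage_cost(x, u, q, r) + gamma * next_value


def cumulative_cost(states, controls, gamma=0.95, q=1.0, r=0.1):
    """Discounted sum of stage costs along a trajectory."""
    return sum(
        (gamma ** k) * quadratic_stage_cost(x, u, q, r)
        for k, (x, u) in enumerate(zip(states, controls))
    )
```

In an actor-critic ADP loop, the critic would be trained so that its output at the current state matches `bellman_target`, and the actor would be updated to reduce that value.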
Proximal Policy Optimization (PPO)
PPO is deployed for continuous-action control (e.g., inverter dispatch, BESS scheduling), with policy and value networks trained on local observations subjected to delay, packet loss, and cyber attacks. The surrogate, clipped policy objective ensures bounded updates, improving stability under nonstationary, cyber-physical noise-driven feedback. The reward incorporates frequency (f) and voltage (V) deviation penalties and regularization of control magnitude.
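The clipped surrogate objective and a reward of the kind described above can be sketched in a few lines. This is an illustrative sketch using the standard PPO clipping rule; the clip range `eps`, the reward weights `w_f`, `w_v`, `w_u`, and the function names are assumptions, not values taken from the paper.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    logp_new / logp_old are per-sample log-probabilities of the taken
    actions under the updated and the data-collecting policy.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Taking the minimum removes any incentive to push the ratio
        # outside [1 - eps, 1 + eps].
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def reward(freq_dev, volt_dev, control, w_f=1.0, w_v=1.0, w_u=0.01):
    """Negative weighted penalty on frequency deviation, voltage
    deviation, and control magnitude (weights are hypothetical)."""
    effort = sum(u * u for u in control)
    return -(w_f * freq_dev ** 2 + w_v * volt_dev ** 2 + w_u * effort)
```

Because the objective takes the minimum of the clipped and unclipped terms, a single noisy or attacked batch cannot move the policy arbitrarily far from the one that collected the data, which is the bounded-update property the text refers to.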
Figure 3: Detailed hybrid architecture illustrating integration of cyber and physical phenomena with AI-adaptive control.
Deep Q-Network (DQN)
DQN is used for discrete-operation decision making (battery/EV mode, on/off states, load shedding triggers). Q-functions are approximated by deep convolutional architectures; network updates leverage experience replay and slow target updates for stabilizing convergence under severe nonlinearities and stochastic disturbances. The discrete-action space complements PPO's continuous regimes, and both RL algorithms are orchestrated by a supervisory selection layer for real-time optimal controller arbitration.
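The two stabilizing ingredients named above, experience replay and slowly updated targets, can be sketched as follows. This is a generic DQN sketch rather than the paper's code: the buffer capacity, `gamma`, and `tau` are assumed values, and the Polyak-style `soft_update` is one common way to realize "slow target updates" (a periodic hard copy is another).

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest entries are evicted

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the temporal correlation of transitions.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.99):
    """Bootstrapped target using Q-values from the target network."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def soft_update(target_params, online_params, tau=0.01):
    """Move target parameters a small step toward the online parameters."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]
```

Here `next_q_values` stands for the target network's outputs over the discrete actions (for example charge, idle, discharge) at the next state.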
Cyber-Physical Security and Resilience
The framework explicitly integrates cyber-security mechanisms, focusing on resilience to False Data Injection (FDI) attacks. Measurement streams are modeled as the ground truth corrupted by Gaussian noise, communication latency and jitter, and coordinated FDI vectors designed to evade standard residual-based detection. A resilience index is introduced, defined as
$$R = 1 - \frac{\sum_k \lVert x_k - x_k^{\mathrm{nom}} \rVert^2}{\sum_k \lVert x_k^{\mathrm{nom}} \rVert^2}$$
where $x_k$ and $x_k^{\mathrm{nom}}$ denote the perturbed and nominal trajectories. This index quantifies online self-healing and adaptive recovery after an attack or disturbance.
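As a concrete reading of the resilience index, the following sketch computes it over a window of samples. The function name and the list-of-vectors trajectory representation are assumptions made for illustration.

```python
def resilience_index(x, x_nom):
    """R = 1 - sum_k ||x_k - x_nom_k||^2 / sum_k ||x_nom_k||^2.

    x and x_nom are equal-length sequences of state vectors
    (perturbed and nominal trajectories).
    """
    num = sum(
        sum((a - b) ** 2 for a, b in zip(xk, nk))
        for xk, nk in zip(x, x_nom)
    )
    den = sum(sum(b * b for b in nk) for nk in x_nom)
    if den == 0.0:
        raise ValueError("nominal trajectory has zero energy; R is undefined")
    return 1.0 - num / den
```

When the perturbed trajectory coincides with the nominal one the index equals 1, and it decreases as the squared deviation grows relative to the energy of the nominal trajectory. Evaluating it over a sliding window would give a time series of the kind a resilience plot shows.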
Simulation Testbed and Empirical Findings
The approach is validated using a detailed IEEE 33-bus radial distribution network, comprising dynamic loads, variable renewable generation, DER sites, EV charging, and distributed storage. Both physical and cyber layers are subject to realistic perturbations, including:
- Aggressive load/renewable variability (randomized, abrupt ramps)
- BESS/EV interaction and stochastic mode switching
- Measurement noise, bounded/variable communication delay ($0$ to $250$ ms), and packet drops ($p_{\mathrm{drop}} = 0.05$)
- FDI attacks targeting voltage and frequency channels.
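The cyber-layer perturbations listed above can be mimicked on a scalar measurement stream as in the sketch below. It is a simplified stand-in, not the paper's simulator: the sampling period `dt_ms`, the noise level `sigma`, the uniform delay distribution, the hold-last-value behavior on a dropped packet, and the `attack` dictionary mapping step index to injected bias are all assumptions.

```python
import random


def corrupt_stream(truth, dt_ms=50.0, sigma=0.005, max_delay_ms=250.0,
                   p_drop=0.05, attack=None, seed=0):
    """Return the measurement sequence a controller would receive.

    truth:  list of true scalar values, one per sampling step
    attack: optional {step: bias} map modelling an FDI injection
    """
    rng = random.Random(seed)
    attack = attack or {}
    received = []
    last = truth[0]  # value held when a packet is lost
    for k in range(len(truth)):
        # Random delay in [0, max_delay_ms], expressed in whole steps.
        delay_steps = int(rng.uniform(0.0, max_delay_ms) // dt_ms)
        src = max(0, k - delay_steps)
        if rng.random() < p_drop:
            received.append(last)  # packet dropped: reuse last value
            continue
        y = truth[src] + rng.gauss(0.0, sigma) + attack.get(k, 0.0)
        received.append(y)
        last = y
    return received
```

For example, `corrupt_stream(voltages, attack={100: 0.03})` would inject a 0.03 p.u. bias at step 100 on top of noise, delay, and drops, where `voltages` is a hypothetical list of per-unit bus voltages.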
Key empirical findings highlight:
1. Feeder and Renewable Profiles: Wind generation displays high-frequency volatility; PV exhibits smooth diurnal trends; these drive corresponding fluctuations in feeder active power.
Figure 4: Feeder active power trajectory shows the effect of volatile renewables and demand on feeder flows.
Figure 5: Wind power profile demonstrates typical short-term variability, testing controller adaptability.
Figure 6: Solar generation profile highlights smooth baseline with stochastic dips.
2. System Stability: At sensitive buses (e.g., Bus 5), voltage is consistently maintained within the recommended band ($0.94 \le V \le 1.05$ p.u.) despite substantial disturbances, indicating effective disturbance rejection.
Figure 7: Bus 5 voltage evolution underscores effective regulation under local and global fluctuations.
3. Controller Dispatch Actions: Battery and inverter controllers dynamically compensate for load/generation fluctuations; load curtailment is minimized and seldom invoked, reflecting judicious correction only during extreme events.
Figure 8: Overlay of normalized load, PV, and wind profiles at Bus 5 emphasizes temporal coordination and system volatility.
Figure 9: Daily load profile indicates pronounced peak/off-peak cycling, challenging adaptive response mechanisms.
4. Resilience: The system maintains a high resilience index ($R > 0.95$ for most intervals), with transients caused by attacks or renewable volatility swiftly mitigated by the hybrid controller.
Figure 10: Resilience index time series quantifies disturbance recovery; system rarely falls below R=0.7 even under FDI/load shocks.
5. Controller Cost Efficiency: Total control cost remains moderate; PPO achieves stable, low-variance corrections, while fast-reacting ADP and DQN edge controllers provide additional robustness under communication and cyber uncertainties, as reflected in lower per-controller quadratic costs.
Figure 11: Evolution of aggregate, PPO, ADP, and DQN costs, with edge methods demonstrating fast adaptation.
Practical and Theoretical Implications
The architectural unification of multi-level RL with a cyber-physical system formalism is intended to make the framework scalable to future grid deployments with high penetration of DER, IoT, and variable renewables, where edge-cloud coordination is essential. Treating resilience as a quantifiable metric lays a foundation for systematic, AI-driven operation under adversarial and volatile regimes, rather than only under steady-state or conventionally modeled disturbance conditions.
The direct parallelization of edge-based ADP/DQN and cloud-PPO agents, with hybrid controller arbitration based on cost minimization, offers a practical template for large-scale, heterogeneous multi-controller smart grid deployments. From a theoretical perspective, the results support the claim that model-free RL controllers combined with domain-inspired supervisory arbitration can achieve stable, self-healing operation in realistic, attack-prone, communication-limited grid environments. These are conditions under which classic model-predictive or centralized optimization schemes fail due to communication and cyber-physical fragility.
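One simple way to picture cost-based arbitration between parallel controllers is to pick, at each step, the controller with the lowest recent average cost. The sketch below is a hypothetical rule written for illustration only; the summary does not specify the actual selection logic, and the `window` length and controller names are assumptions.

```python
def select_controller(recent_costs, window=5):
    """Return the name of the controller with the lowest mean cost
    over its last `window` recorded steps.

    recent_costs: {controller_name: [cost_0, cost_1, ...]}
    Controllers with no recorded cost are skipped.
    """
    best_name, best_mean = None, float("inf")
    for name, costs in recent_costs.items():
        tail = costs[-window:]
        if not tail:
            continue
        mean = sum(tail) / len(tail)
        if mean < best_mean:
            best_name, best_mean = name, mean
    return best_name
```

A call such as `select_controller({"ppo": [1.0, 1.0], "adp": [0.5, 0.7], "dqn": [2.0]})` would choose `"adp"`; a practical supervisor would likely add hysteresis so that control does not switch on every small cost difference.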
Conclusion
The paper presents a rigorous AI-enabled, hybrid cyber-physical architecture for adaptive smart grid control. By structuring the network in three collaborative layers and embedding multi-agent RL and ADP methodologies at both edge and cloud, the system achieves high resilience and robust dynamic performance under a realistic spectrum of disturbances, cyber threats, and renewable variability. The explicit modeling of cyber-physical security enables direct quantification and enhancement of resilience, while empirical results on the IEEE 33-bus testbed demonstrate reliable control with low cost and rapid recovery.
Future directions include hardware-in-the-loop validation and seamless extension to emerging DER domains, with substantial promise for AI-driven, resilient, and scalable grid management in high-renewable, adversarial, and communication-constrained environments.