Enhancing Security of Power Grid Against Strategic Attacks: A Stage-Wise Matrix Game Framework Based on Siamese Relational Nash—Double Deep Q-Network.
| Title: | Enhancing Security of Power Grid Against Strategic Attacks: A Stage-Wise Matrix Game Framework Based on Siamese Relational Nash—Double Deep Q-Network. |
|---|---|
| Authors: | Zhang, Jianhua (AUTHOR); Xie, Jun (AUTHOR); Li, Fei (AUTHOR); Song, Bo (AUTHOR), 6020090003@jsnu.edu.cn |
| Source: | Energies (19961073). May 2026, Vol. 19, Issue 10, p2319. 25p. |
| Subject Terms: | *Nash equilibrium, *Zero sum games, *Reinforcement learning, *Game theory, *Cyberterrorism, *Adversarial machine learning, *System failures, *Energy security |
| Abstract: | Modern power grids are increasingly vulnerable to strategic malicious attacks that can trigger large-scale cascading failures. Existing multi-step Markov game formulations often struggle to align with the instantaneous nature of cascading dynamics, potentially introducing estimation bias in multi-agent learning. To address this issue, we formulate the attack–defense interaction as a stage-wise zero-sum matrix game, enabling direct approximation of the underlying payoff structure without temporal credit assignment. Based on this formulation, we propose a Siamese Relational Nash Double Deep Q-Network (SR-Nash-DDQN), which incorporates a structured relational pooling mechanism to capture high-dimensional strategic dependencies. The framework further integrates physics-driven counterfactual experience replay for improved sample efficiency and adopts a two-timescale learning scheme to stabilize adversarial training. Extensive evaluations on the IEEE 9-bus, 39-bus, and 118-bus systems demonstrate that the proposed method consistently approximates Nash equilibria and maintains strategic diversity across independent trials. Moreover, zero-shot generalization across 100 unseen operating conditions shows that the learned policy effectively improves the security lower bound and reduces worst-case damage under severe uncertainties. [ABSTRACT FROM AUTHOR] |
| Database: | Energy & Power Source |
| ISSN: | 19961073 |
| DOI: | 10.3390/en19102319 |
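
The abstract frames each attack–defense encounter as a zero-sum matrix game and speaks of approximating its Nash equilibrium and of a "security lower bound". As a rough illustration of those two ideas only, the sketch below runs fictitious play on a small payoff matrix. It is not the SR-Nash-DDQN method from the paper: the function name `fictitious_play`, the 2x2 `damage` matrix, and its numbers are hypothetical, chosen so the exact game value (2.5) can be checked by hand.

```python
# Minimal sketch, assuming a known payoff matrix (the paper instead learns
# the payoffs with a neural network; nothing here reproduces that method).
# Rows are attacker actions, columns are defender actions, and each entry is
# the damage the attacker wants to maximize and the defender wants to minimize.

def fictitious_play(payoff, iterations=20000):
    """Approximate a mixed Nash equilibrium of a zero-sum matrix game.

    Returns (attacker_mix, defender_mix, lower, upper), where `lower` is the
    damage the attacker's empirical mixture guarantees and `upper` is the
    damage the defender's empirical mixture cannot be pushed above.
    The true game value always lies between the two.
    """
    n_att, n_def = len(payoff), len(payoff[0])
    att_counts = [0] * n_att
    def_counts = [0] * n_def
    att_gain = [0.0] * n_att  # cumulative damage of each attacker action vs. past defender plays
    def_loss = [0.0] * n_def  # cumulative damage of each defender action vs. past attacker plays
    i, j = 0, 0               # arbitrary opening moves

    for _ in range(iterations):
        att_counts[i] += 1
        def_counts[j] += 1
        for a in range(n_att):
            att_gain[a] += payoff[a][j]
        for d in range(n_def):
            def_loss[d] += payoff[i][d]
        # Each side best-responds to the opponent's empirical history.
        i = max(range(n_att), key=lambda a: att_gain[a])
        j = min(range(n_def), key=lambda d: def_loss[d])

    attacker_mix = [c / iterations for c in att_counts]
    defender_mix = [c / iterations for c in def_counts]
    lower = min(def_loss) / iterations  # worst case for the attacker's mixture
    upper = max(att_gain) / iterations  # worst case for the defender's mixture
    return attacker_mix, defender_mix, lower, upper


# Hypothetical damage matrix with no pure-strategy saddle point.
damage = [[4.0, 1.0],
          [2.0, 3.0]]

attacker_mix, defender_mix, lower, upper = fictitious_play(damage)
print("attacker mixture:", attacker_mix)
print("defender mixture:", defender_mix)
print("game value lies in [", lower, ",", upper, "]")
```

For this matrix the exact equilibrium can be derived by hand: the attacker plays the rows with probabilities (0.25, 0.75), the defender plays the columns with (0.5, 0.5), and the game value is 2.5. The two printed bounds bracket that value and tighten as `iterations` grows. The `upper` bound is the worst-case damage the defender's mixture allows, which is one plausible reading of the quantity the abstract reports as reduced.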