Markov Game Analysis for Attack and Defense of Power Networks

## Markov Game Analysis for Attack and Defense of Power Networks


**Markov Game Analysis for Attack and Defense of Power Networks**

Chris Y. T. Ma, David K. Y. Yau, Xin Lou, and Nageswara S. V. Rao

**Power Networks are Important Infrastructures (And Vulnerable to Attacks)**

• Growing reliance on electricity
• Aging infrastructure
• More connected digital sensing and control devices have been introduced (and they attract attacks in cyberspace)
• Hard and expensive to protect
• Limited budget
• How to allocate the limited resources?
• Optimal deployment to maximize long-term payoff

**Modeling the Interactions – Game Theoretic Approaches**

• Static game
  • Each player has a set of actions available
  • Outcome and payoff are determined by the actions of all players
  • Players act simultaneously

**Static Game**

• Example
[Figure: 2×2 payoff matrix with cells Defend & No attack, Defend & Attack, No defend & Attack, No defend & No attack]

**Modeling the Interactions – Game Theoretic Approaches**

• Leader-follower game (Stackelberg game)
  • Defender as the leader
  • Adversary as the follower
  • Bi-level optimization – minimax operation
    • Inner level: the follower maximizes its payoff given the leader's strategy
    • Outer level: the leader maximizes its payoff subject to the follower's solution of the inner problem

**Stackelberg Game**

• Example
[Figure: game tree in which the defender first chooses Defend or No defend, and the adversary then chooses No attack or Attack]
• Only models one-time interactions

**Modeling the Interactions – Markov Decision Process**

• Markov Decision Process (MDP)
  • System modeled as a set of states with Markov transitions between them
  • Transitions depend on the action of one player and on passive disruptors with known probabilistic behavior (acts of nature)

**Markov Decision Process (MDP)**

• Example (2 states, each with 2 actions available)
[Figure: two-state transition diagram with states up and down; actions Defend / No defend and Recover / No recover; edge probabilities 0.9, 0.1, 0.1, 0.9 and 0.1, 0.6, 0.4, 0.9 as labeled in the diagram]
• Only models one intelligent player

**Our Approach – Markov Game**

• Generalization of the MDP to an adversarial setting
• Models the continual interactions between multiple players
• Players interact in the new state with different payoffs
• Models probabilistic state transitions caused by inherent uncertainty in the underlying physical system (e.g., random acts of nature)

**Problem Formulation**

• Defender and adversary of a power network
• Two-player zero-sum game
• Game formulation:
  • Adversary
    • Actions: which link to attack
    • Payoff: cost of load shedding by the defender because of the attack
  • Defender
    • Actions: which (up) link to reinforce or which (down) link to recover
    • Payoff: cost of load shedding because of the attack

**Markov Game – Reward Overview**

• Assume five links; link 4 is both attacked and defended
[Figure: transition diagram from state (u,u,u,u,u) to states (u,u,u,u,u) and (u,u,u,d,u), with edge labels p1, 1-p1, p2, 1-p2]
• The immediate reward of these actions is the weighted sum of successful attack and successful defense
• Assume that at state (u,u,u,d,u), link 4 is both attacked and defended again
• The immediate reward at state (u,u,u,d,u) is then the weighted sum of successful recovery and failed recovery
• This immediate reward is further "propagated" back to the original state (u,u,u,u,u) with a discount factor
• Hence, actions taken in a state accrue a long-term reward

**Solving the Markov Game – Value Iteration**

• Dynamic program (value iteration) to solve the Markov game

**Experiment Results**

[Figure: link diagram of the network in state {u,u,u,u,u}]
• Links 4 and 5 both connect to a generator, and the generator at bus 4 has higher output

**Experiment Results**

[Table: payoff matrix of state {u,u,u,u,u} for the static game]
[Table: payoff matrix of state {u,u,u,u,u} for the Markov game (γ = 0.3)]

**Conclusions**

• A Markov game is used to model the attack and defense of a power network between two players
• Results show that the players' actions depend not only on the current state but also on later states
  • To obtain the optimal long-term benefit
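
The slides name value iteration as the solution method but do not show it. As an illustration only, the sketch below applies value iteration to a toy zero-sum Markov game with a single link that is either up or down. Every transition probability and cost in it is a hypothetical placeholder, not a value from the presentation; only the discount factor of 0.3 is taken from the slides. The function names `solve_2x2`, `stage_matrix`, and `value_iteration` are likewise made up for this example. Each iteration builds, for every state, a 2×2 matrix of "immediate cost plus discounted future value" and replaces the state's value with the value of that matrix game, with the adversary maximizing and the defender minimizing.

```python
GAMMA = 0.3        # discount factor (the slides report 0.3)
SHED_COST = 10.0   # hypothetical load-shedding cost when the link ends up down
ATTACK_COST = 1.0  # hypothetical cost paid by the adversary for attacking
DEFEND_COST = 1.0  # hypothetical cost paid by the defender for acting

STATES = ("up", "down")

# Hypothetical dynamics: P_UP[state][(attack, defend)] is the probability
# that the link is up at the next step. In state "down" the defender's
# action means "recover" instead of "reinforce".
P_UP = {
    "up":   {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 0.2, (1, 1): 0.8},
    "down": {(0, 0): 0.0, (0, 1): 0.9, (1, 0): 0.0, (1, 1): 0.5},
}


def solve_2x2(m):
    """Value of a 2x2 zero-sum matrix game for the row (maximizing) player."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # A pure-strategy saddle point exists.
        return maximin
    # No saddle point: closed-form value of the mixed equilibrium.
    return (a * d - b * c) / (a + d - b - c)


def stage_matrix(state, values):
    """Rows: adversary (0 = no attack, 1 = attack). Columns: defender."""
    matrix = []
    for attack in (0, 1):
        row = []
        for defend in (0, 1):
            p_up = P_UP[state][(attack, defend)]
            # Expected immediate cost plus discounted value of the next state.
            expected = (p_up * (GAMMA * values["up"])
                        + (1 - p_up) * (SHED_COST + GAMMA * values["down"]))
            row.append(expected - ATTACK_COST * attack + DEFEND_COST * defend)
        matrix.append(row)
    return matrix


def value_iteration(tol=1e-9, max_iter=1000):
    """Repeat the per-state matrix-game update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    for _ in range(max_iter):
        new_values = {s: solve_2x2(stage_matrix(s, values)) for s in STATES}
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < tol:
            break
    return values


if __name__ == "__main__":
    for state, value in value_iteration().items():
        print(state, round(value, 4))
```

The closed-form `solve_2x2` only covers two actions per player. A game over several links, as in the five-link example from the slides, would need a general matrix-game solver (typically a linear program) in its place, while the outer iteration would stay the same.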