Published by Briana Harless. Modified over 9 years ago.
1
Markov Game Analysis for Attack and Defense of Power Networks Chris Y. T. Ma, David K. Y. Yau, Xin Lou, and Nageswara S. V. Rao
2
Outline
Motivation
What has been done – Markov Decision Process (MDP), static game, and Stackelberg game
Our approach – Markov game
Experiment results
Conclusion
3
Power Networks are Important Infrastructures (And Vulnerable to Attacks)
Growing reliance on electricity
Aging infrastructure
More connected digital sensing and control devices are being introduced (and they attract cyber attacks)
Hard and expensive to protect
Limited budget
How to allocate the limited resources?
– Optimal deployment to maximize long-term payoff
4
Modeling the Interactions – Game Theoretic Approaches
Static game
– Each player has a set of actions available
– Outcome and payoff are determined by the actions of all players
– Players act simultaneously
5
Static Game Example
Four possible outcomes: Defend & Attack, Defend & No Attack, No defend & Attack, No defend & No Attack
6
Static Game Example
[Figure: the defender chooses Defend or No defend; the adversary chooses Attack or No Attack]
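The static game above can be illustrated with a small payoff matrix. The following is a minimal sketch, assuming a zero-sum game and made-up payoff numbers (the slide's own values are not recoverable); the names `PAYOFF`, `pure_security_levels`, and `saddle_points` are hypothetical.

```python
# Hypothetical payoffs to the defender; the adversary receives the negative.
# Rows are defender actions, columns are adversary actions.
DEFENDER_ACTIONS = ["Defend", "No defend"]
ADVERSARY_ACTIONS = ["Attack", "No attack"]
PAYOFF = [
    [-2, -1],   # Defend: small loss if attacked, defense cost otherwise
    [-10, 0],   # No defend: large loss if attacked, nothing otherwise
]


def pure_security_levels(payoff):
    """Return (maximin of the row player, minimax of the column player)."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    maximin = max(min(row) for row in payoff)
    minimax = min(max(payoff[r][c] for r in range(n_rows)) for c in range(n_cols))
    return maximin, minimax


def saddle_points(payoff):
    """List the (row, col) cells that are pure-strategy equilibria."""
    points = []
    for r, row in enumerate(payoff):
        for c, value in enumerate(row):
            column = [payoff[i][c] for i in range(len(payoff))]
            # A saddle point is the minimum of its row and the maximum of its column.
            if value == min(row) and value == max(column):
                points.append((r, c))
    return points


if __name__ == "__main__":
    print(pure_security_levels(PAYOFF))
    for r, c in saddle_points(PAYOFF):
        print(DEFENDER_ACTIONS[r], "/", ADVERSARY_ACTIONS[c])
```

With these assumed numbers both security levels coincide, so the game has a pure-strategy equilibrium; in general a static game may only have an equilibrium in mixed strategies.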
7
Modeling the Interactions – Game Theoretic Approaches
Leader-follower game (Stackelberg game)
– Defender as the leader
– Adversary as the follower
– Bi-level optimization (minimax operation)
  Inner level: follower maximizes its payoff given the leader's strategy
  Outer level: leader maximizes its payoff subject to the follower's solution of the inner problem
8
Stackelberg Game Example
[Figure: the defender (leader) chooses Defend or No defend; the adversary (follower) then chooses Attack or No Attack]
Only models one-time interactions
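The bi-level structure of the leader-follower game can be sketched in a few lines. This is an illustration only, restricted to pure strategies and using invented payoffs; `PAYOFFS`, `follower_best_response`, and `stackelberg_pure` are hypothetical names, not taken from the presentation.

```python
# Hypothetical payoffs: (leader_payoff, follower_payoff) for each action pair.
PAYOFFS = {
    ("Defend", "Attack"): (-2, -3),
    ("Defend", "No attack"): (-1, 0),
    ("No defend", "Attack"): (-10, 5),
    ("No defend", "No attack"): (0, 0),
}
LEADER_ACTIONS = ["Defend", "No defend"]
FOLLOWER_ACTIONS = ["Attack", "No attack"]


def follower_best_response(leader_action):
    """Inner level: the follower maximizes its own payoff given the leader's action."""
    return max(FOLLOWER_ACTIONS, key=lambda f: PAYOFFS[(leader_action, f)][1])


def stackelberg_pure():
    """Outer level: the leader maximizes its payoff, anticipating the follower's reply."""
    best_leader = max(
        LEADER_ACTIONS,
        key=lambda a: PAYOFFS[(a, follower_best_response(a))][0],
    )
    return best_leader, follower_best_response(best_leader)


if __name__ == "__main__":
    print(stackelberg_pure())
```

Under these assumed payoffs the leader commits to defending because an undefended link invites an attack. A full treatment would also allow the leader to commit to a mixed strategy, which this sketch does not cover.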
9
Modeling the Interactions – Markov Decision Process
Markov Decision Process (MDP)
– System modeled as a set of states with Markov transitions between them
– Transitions depend on the action of one player and on passive disruptors with known probabilistic behavior (acts of nature)
10
Markov Decision Process (MDP) Example (2 states, each has 2 actions available)
[Figure: states "up" and "down"; actions Defend, No defend, Recover, No recover; transition probabilities on the edges, as extracted: 0.9, 0.6, 0.1, 0.9, 0.1, 0.4, 0.9, 0.1]
Only models one intelligent player
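A two-state MDP of this kind can be solved with value iteration. The sketch below is an assumption-laden illustration: the assignment of probabilities to transitions, the rewards, and the discount factor of 0.9 are all invented, since the figure's layout is not recoverable. `TRANSITIONS`, `REWARDS`, `value_iteration`, and `greedy_policy` are hypothetical names.

```python
# TRANSITIONS[state][action] = list of (probability, next_state).
# The numbers are illustrative assumptions, not the slide's actual edges.
TRANSITIONS = {
    "up": {
        "defend": [(0.9, "up"), (0.1, "down")],
        "no_defend": [(0.6, "up"), (0.4, "down")],
    },
    "down": {
        "recover": [(0.9, "up"), (0.1, "down")],
        "no_recover": [(0.1, "up"), (0.9, "down")],
    },
}
# REWARDS[state][action] = immediate reward (assumed values).
REWARDS = {
    "up": {"defend": -1.0, "no_defend": 0.0},
    "down": {"recover": -6.0, "no_recover": -5.0},
}


def backup(state, action, values, gamma):
    """Immediate reward plus the discounted expected value of the next state."""
    expected = sum(p * values[nxt] for p, nxt in TRANSITIONS[state][action])
    return REWARDS[state][action] + gamma * expected


def value_iteration(gamma=0.9, tol=1e-9, max_iter=10_000):
    """Repeat Bellman backups until the values stop changing."""
    values = {s: 0.0 for s in TRANSITIONS}
    for _ in range(max_iter):
        new_values = {
            s: max(backup(s, a, values, gamma) for a in TRANSITIONS[s])
            for s in TRANSITIONS
        }
        delta = max(abs(new_values[s] - values[s]) for s in values)
        values = new_values
        if delta < tol:
            break
    return values


def greedy_policy(values, gamma=0.9):
    """Pick, in every state, the action with the best one-step backup."""
    return {
        s: max(TRANSITIONS[s], key=lambda a: backup(s, a, values, gamma))
        for s in TRANSITIONS
    }


if __name__ == "__main__":
    v = value_iteration()
    print(v, greedy_policy(v))
```

Only the single decision maker optimizes here; the disruptions are fixed probabilities, which is exactly the limitation the slide points out.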
11
Weaknesses of Current Formulations
Markov Decision Process
– Only models a single rational player
Static game / Stackelberg game
– Only models a one-time interaction
Security of the power grid should be modeled as continual interactions between two rational players
12
Our Approach – Markov Game
Generalization of the MDP to an adversarial setting
– Models the continual interactions between multiple players
  Players interact in the new state with different payoffs
– Models probabilistic state transitions that arise from inherent uncertainty in the underlying physical system (e.g., random acts of nature)
13
Problem Formulation
Defender and adversary of a power network
– Two-player zero-sum game
Game formulation:
– Adversary
  Actions: which link to attack
  Payoff: cost of load shedding incurred by the defender because of the attack
– Defender
  Actions: which (up) link to reinforce or which (down) link to recover
  Payoff: cost of load shedding because of the attack (a loss to the defender, since the game is zero-sum)
14
Problem Formulation
State of the game
– Status of the system: the set of links that are currently up, e.g.,
  State 0 = all links are up
  State 1 = Link 1 is down
  State 3 = Links 1 & 2 are down
  …
Both players have a limited budget
– Can only defend or attack a limited number of links at a time
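The three example states are consistent with a bitmask numbering in which link k being down sets bit k-1 of the state index. That numbering is an inference from the examples, not something the slide states. A minimal sketch under that assumption, with hypothetical helper names:

```python
def encode_state(down_links):
    """Map a set of down links (1-based) to a state index; link k sets bit k-1."""
    state = 0
    for link in down_links:
        state |= 1 << (link - 1)
    return state


def decode_state(state, num_links):
    """Return the per-link status tuple: 'u' for up, 'd' for down."""
    return tuple("d" if state & (1 << k) else "u" for k in range(num_links))


if __name__ == "__main__":
    print(encode_state(set()))     # all links up
    print(encode_state({1, 2}))    # links 1 and 2 down
    print(decode_state(8, 5))      # only link 4 down in a five-link network
```

With n links this encoding yields 2^n states, which is one reason the limited per-step budget of each player matters for keeping the action sets small.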
15
Markov Game – Reward Overview
Assume five links; link 4 is both attacked and defended
[Figure: from state (u,u,u,u,u), the next state is either (u,u,u,u,u) or (u,u,u,d,u); the two branches carry probabilities p1 and 1 - p1]
The immediate reward of such actions is the weighted sum of the rewards for a successful attack and a successful defense
Assume that at state (u,u,u,d,u), link 4 is both attacked and defended again
[Figure: the two outcomes of this second step carry probabilities p2 and 1 - p2]
The immediate reward at state (u,u,u,d,u) is then the weighted sum of the rewards for a successful recovery and a failed recovery
This immediate reward is further "propagated" back to the original state (u,u,u,u,u) with a discount factor
Hence, actions taken in a state will accrue a long-term reward
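The propagation described above can be written in the standard form used for discounted Markov games. The symbols are assumptions rather than the presentation's own notation: s is the current state, a and o are the two players' actions, R is the immediate reward, T is the transition probability, V is the long-term value of a state, and γ is the discount factor.

```latex
Q(s, a, o) = R(s, a, o) + \gamma \sum_{s'} T(s, a, o, s')\, V(s')
```

In the five-link example, s is (u,u,u,u,u), the sum runs over the two possible next states (u,u,u,u,u) and (u,u,u,d,u), and the weights T are the branch probabilities p1 and 1 - p1.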
16
Solving the Markov Game – Definitions
17
Concerning the Transition Probability
18
Optimal Load Shedding
Formulated as a constrained optimization problem, under the physical constraints of stable power flow
p: power (load or generation)
z: changes in power distribution
19
Finding the Optimal Strategy – Solving a Linear Program
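The slide's own formulation is not recoverable from the transcript. For reference, the textbook linear program for a zero-sum matrix game at a state s is shown below. It is offered as an assumed standard form, with π the maximizing player's mixed strategy over its actions a, o the opponent's actions, Q(s, a, o) the payoff to the maximizing player, and v the value of the game.

```latex
\begin{aligned}
\max_{\pi,\, v} \quad & v \\
\text{s.t.} \quad & \sum_{a} \pi(a)\, Q(s, a, o) \ge v \quad \forall o, \\
& \sum_{a} \pi(a) = 1, \qquad \pi(a) \ge 0 \quad \forall a.
\end{aligned}
```

The constraints say that the strategy π guarantees at least v against every opponent action, so maximizing v yields the maximin strategy.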
20
Solving the Markov Game – Value Iteration Dynamic program (value iteration) to solve the Markov game
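Value iteration for a zero-sum Markov game repeatedly solves a matrix game at every state, using the current value estimates for the successor states. The sketch below is a toy instance, not the authors' implementation: it assumes a single link with two states, two actions per player, invented rewards and transition probabilities, and it replaces the linear program with the closed-form solution that exists for 2x2 games. All names are hypothetical.

```python
GAMMA = 0.3  # discount factor; 0.3 is the value quoted with the experiment results

# State 0 = link up, state 1 = link down.
# Index d: defender action (0 = reinforce/recover, 1 = idle).
# Index a: adversary action (0 = attack, 1 = idle).
# REWARD[s][d][a]: immediate payoff to the defender (assumed numbers).
REWARD = [
    [[-1.0, -1.0], [0.0, 0.0]],    # up: reinforcing costs 1, idling is free
    [[-6.0, -6.0], [-5.0, -5.0]],  # down: load shedding costs 5, recovery 1 more
]
# P_DOWN[s][d][a]: probability that the link is down in the next state (assumed).
P_DOWN = [
    [[0.1, 0.0], [0.8, 0.0]],
    [[0.5, 0.2], [1.0, 1.0]],
]


def matrix_game_value_2x2(m):
    """Value of a 2x2 zero-sum game for the row (maximizing) player."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin >= minimax:  # saddle point in pure strategies
        return maximin
    # No saddle point: closed-form value under optimal mixed strategies.
    return (a * d - b * c) / (a + d - b - c)


def backup(values):
    """One sweep: build and solve the stage game of every state."""
    new_values = []
    for s in range(2):
        stage = [
            [
                REWARD[s][d][a]
                + GAMMA * (P_DOWN[s][d][a] * values[1]
                           + (1 - P_DOWN[s][d][a]) * values[0])
                for a in range(2)
            ]
            for d in range(2)
        ]
        new_values.append(matrix_game_value_2x2(stage))
    return new_values


def solve(tol=1e-10, max_iter=1000):
    """Iterate the backup until the state values converge."""
    values = [0.0, 0.0]
    for _ in range(max_iter):
        new_values = backup(values)
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values


if __name__ == "__main__":
    print(solve())
```

For larger action sets the closed form no longer applies, and each stage game would be solved with a linear program instead.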
21
Experiment Results
[Figure: link diagram, state {u,u,u,u,u}]
Links 4 and 5 both connect to a generator, and the generator at bus 4 has higher output
22
Experiment Results
Payoff matrix of state {u,u,u,u,u} for the static game.
Payoff matrix of state {u,u,u,u,u} for the Markov game (γ = 0.3).
23
Conclusions
A Markov game is used to model the attack and defense of a power network between two players
Results show that the players' actions depend not only on the current state, but also on later states
– To obtain the optimal long-term benefit