Use or disclosure of the information contained herein is subject to specific written CIRA approval 1 PURSUIT – EVASION GAMES GAME THEORY AND ANALYSIS OF.

Presentation transcript:

1 GAME THEORY AND ANALYSIS OF COMPETITIVE DYNAMICS FOR INDUSTRIAL SYSTEMS: PURSUIT – EVASION GAMES. E. De Lellis (PhD programme in Aerospace Engineering, XXIV cycle), L. Garbarino (PhD programme in Aerospace Engineering, XXVI cycle), A. Vitale (PhD programme in Aerospace Engineering, XXV cycle)

2 Outline
- Differential Games
- Pursuit – Evasion Games
- Feedback Nash Equilibrium Definition
- Dynamic Programming and Hamilton-Jacobi-Bellman Equation
- Finding Feedback Nash Equilibrium for PE Games
- Simulation Example

3 Differential Games: Problem Formulation (1/2)
Differential games constitute a class of decision problems wherein the evolution of the state is described by a differential equation and the players act throughout a time interval. Let $x \in S \subseteq \mathbb{R}^N$ describe the state of the system, evolving in time according to an ODE [equation not reproduced in the transcript], where $u_i$ is the control implemented by the i-th player and $f$ is continuous in $t$ and $u_i$ and continuously differentiable in $x$. The i-th player aims at maximizing his own payoff [equation not reproduced in the transcript], where $\psi$ (terminal payoff) is continuous in $T$ and continuously differentiable in $x(T)$, and $L$ (running cost) is continuous in $t$ and $u_i$ and continuously differentiable in $x$.
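Both formulas of this slide are lost in the transcript. A common way of writing the setup is given here as an assumption, not as the slide's own notation. The player indices on $\psi$ and $L$, the start time 0 and the minus sign on the running cost are choices made for this sketch:

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(t, x(t), u_1(t), \dots, u_k(t)\bigr), \qquad x(0) = x_0 \in S \subseteq \mathbb{R}^N, \\
J_i &= \psi_i\bigl(T, x(T)\bigr) - \int_0^T L_i\bigl(t, x(t), u_1(t), \dots, u_k(t)\bigr)\, dt , \qquad i = 1, \dots, k .
\end{aligned}
```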

4 Differential Games: Problem Formulation (2/2)
Differential games can be:
- continuous time or discrete time;
- finite time or infinite time;
- if finite time, of fixed, pre-specified duration, or with an end point that is variable in both state and time, for example [example not reproduced in the transcript].
The saddle-point, Nash and Stackelberg equilibrium solution concepts are still valid for dynamic games.

5 Differential Games: Available Information
The strategy adopted by a player depends on the information available to him at each time. We assume that each player has perfect knowledge of:
- the function $f$ determining the evolution of the system, the initial state $x_0$ and the sets $U_i$ of control values available to the two players;
- the payoff functions $J_i$;
- the instantaneous time $t \in [0, T]$ (i.e. both players have a clock).
Moreover, we distinguish the following cases:
- Open Loop strategies
- Feedback (or Markovian) strategies
- Hierarchical Play
- Delayed Information

6 Pursuit Evasion Games
Pursuit-evasion games are two-person deterministic zero-sum differential games defined by a dynamics and a target. The target $C$ is a subset of $\mathbb{R}^N$. In the pursuit-evasion game, the first player tries to keep the state of the system outside the target for as long as possible, while the second player aims at reaching $C$ as soon as possible (capturability). Sufficient conditions for the existence of a feedback saddle-point equilibrium strategy are provided by a natural two-person extension of the Hamilton-Jacobi-Bellman equation (called the "Isaacs equation"). However, other strategies could be possible.

7 Zero-Sum Games
A two-player game in which the payoffs satisfy $J_1 = -J_2 = J$ is called a zero-sum game. The goal of the first player is to maximize this payoff, while the second player wishes to minimize it. For a zero-sum game the Nash equilibrium coincides with a saddle point: $J(u_1, u_2^*) \le J(u_1^*, u_2^*) \le J(u_1^*, u_2)$ for all admissible $u_1$, $u_2$.
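As a finite illustration of the saddle-point idea, consider a static matrix game rather than a differential game. The matrices and the function name below are made up for this sketch. When the row player maximizes and the column player minimizes, a pure saddle point is an entry that is the largest in its column and the smallest in its row:

```python
def pure_saddle_points(payoff):
    """Return all (row, col) pure saddle points of a zero-sum matrix game.

    payoff[i][j] is what the row player (maximizer) receives from the
    column player (minimizer) when they play row i and column j.
    """
    saddles = []
    for i, row in enumerate(payoff):
        for j, value in enumerate(row):
            column = [r[j] for r in payoff]
            # Row player cannot gain by changing row; column player cannot
            # lower the payoff by changing column.
            if value == max(column) and value == min(row):
                saddles.append((i, j))
    return saddles


# Hypothetical example matrices.
with_saddle = [[3, 5], [1, 4]]
matching_pennies = [[1, -1], [-1, 1]]

print(pure_saddle_points(with_saddle))       # [(0, 0)]
print(pure_saddle_points(matching_pennies))  # []
```

The second matrix has no pure saddle point, so neither player has a pure strategy that is safe against every reply.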

8 Feedback Nash Equilibrium for Differential Games
A set of control functions $(t, x) \mapsto (u_1^*(t, x), \dots, u_k^*(t, x))$ is a Nash equilibrium for the differential game within the class of feedback strategies if, for each $i$, the control function $(t, x) \mapsto u_i^*(t, x)$ provides an optimal feedback for the optimal control problem of the i-th player [equation not reproduced in the transcript] for the system with dynamics [equation not reproduced in the transcript].
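The two formulas referred to on this slide are missing. The sketch below assumes a payoff of the form terminal payoff minus integrated running cost, which is a convention chosen here and not confirmed by the slides. Under it, the optimal control problem faced by the i-th player, when every other player $j \ne i$ keeps the feedback $u_j^*$, could be written as:

```latex
\begin{aligned}
\max_{u_i(\cdot)} \quad & \psi_i\bigl(T, x(T)\bigr) - \int_0^T L_i\bigl(t, x, u_1^*(t, x), \dots, u_i, \dots, u_k^*(t, x)\bigr)\, dt \\
\text{subject to} \quad & \dot{x} = f\bigl(t, x, u_1^*(t, x), \dots, u_i, \dots, u_k^*(t, x)\bigr), \qquad x(0) = x_0 .
\end{aligned}
```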

9 Dynamic Programming (1/2)
Consider the following optimization problem: maximize [equation not reproduced in the transcript] given [equation not reproduced in the transcript], where $U$ is a compact domain and the function $f$ is continuous w.r.t. all variables and continuously differentiable w.r.t. $x$. Moreover, there exists a constant $C$ such that [equation not reproduced in the transcript].
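The three formulas of this slide are not in the transcript. One standard statement of such a problem is assumed here for illustration. The linear growth bound is the usual condition guaranteeing that trajectories exist on the whole interval, and it may differ from the bound on the original slide:

```latex
\begin{aligned}
\text{maximize} \quad & J(u) = \psi\bigl(T, x(T)\bigr) - \int_{0}^{T} L\bigl(t, x(t), u(t)\bigr)\, dt \\
\text{given} \quad & \dot{x}(t) = f\bigl(t, x(t), u(t)\bigr), \qquad x(0) = x_0, \qquad u(t) \in U, \\
\text{with} \quad & \lvert f(t, x, u) \rvert \le C \bigl(1 + \lvert x \rvert\bigr) \quad \text{for all } t, x, u .
\end{aligned}
```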

10 Dynamic Programming (2/2)
Let us introduce the value function [equation not reproduced in the transcript]. Principle of Dynamic Programming: for any initial data $x_0 \in \mathbb{R}^N$ and $0 \le t_0 < t_1 < T$, one has [equation not reproduced in the transcript].
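Neither formula survives in the transcript. Assuming a payoff of the form terminal payoff minus integrated running cost and dynamics $\dot{x} = f(t, x, u)$ with $u(t) \in U$ (assumptions of this sketch), the value function and the dynamic programming principle are usually stated as:

```latex
\begin{aligned}
V(t_0, x_0) &= \sup_{u(\cdot)} \left\{ \psi\bigl(T, x(T)\bigr) - \int_{t_0}^{T} L\bigl(t, x(t), u(t)\bigr)\, dt \right\}, \\
V(t_0, x_0) &= \sup_{u(\cdot)} \left\{ V\bigl(t_1, x(t_1)\bigr) - \int_{t_0}^{t_1} L\bigl(t, x(t), u(t)\bigr)\, dt \right\},
\end{aligned}
```

where $x(\cdot)$ is the trajectory that starts from $x(t_0) = x_0$ under the control $u(\cdot)$. The second line says that an optimal plan on $[t_0, T]$ can be built by paying the running cost up to $t_1$ and then collecting the optimal value from the state reached at $t_1$.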

11 Hamilton Jacobi Bellman Equation
According to the dynamic programming approach, under the assumption of continuous differentiability of $V$, the value function for the given optimization problem satisfies the Hamilton-Jacobi-Bellman (HJB) equation [equation not reproduced in the transcript]. It is not easy to compute $V$. Moreover, the continuous differentiability assumption imposed on $V$ is rather restrictive. Nevertheless, if such a function exists, then the HJB equation provides a means of obtaining the optimal control strategy.
Theorem: If a continuously differentiable function $V$ can be found that satisfies the HJB equation subject to the boundary condition $V(T, x) = \psi(T, x)$, then it generates the optimal strategy through the static (pointwise) maximization of the Hamiltonian on the right-hand side of the HJB equation.
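A minimal numerical check of the theorem on a toy problem that is not taken from the slides. It assumes the HJB equation has the form $V_t + \max_u \{ V_x f - L \} = 0$, since the slide's own formula is missing. The scalar problem is $\dot{x} = u$, $L = u^2/2$, $\psi(T, x) = -x^2/2$, for which $V(t, x) = -x^2 / (2(1 + T - t))$ and the maximizing feedback is $u^*(t, x) = V_x(t, x)$:

```python
T = 1.0  # final time (assumed for this toy problem)


def value(t, x):
    """Candidate value function V(t, x) = -x^2 / (2 (1 + T - t))."""
    return -x * x / (2.0 * (1.0 + T - t))


def hjb_residual(t, x, h=1e-5):
    """V_t + max_u {V_x u - u^2/2}, with derivatives by central differences."""
    v_t = (value(t + h, x) - value(t - h, x)) / (2.0 * h)
    v_x = (value(t, x + h) - value(t, x - h)) / (2.0 * h)
    # The pointwise maximum over u of (v_x * u - u^2 / 2) is attained at u = v_x.
    return v_t + 0.5 * v_x * v_x


def closed_loop_payoff(x0, steps=1000):
    """Payoff psi(x(T)) - integral of L under the feedback u*(t, x) = V_x(t, x)."""
    dt = T / steps
    x, running_cost = x0, 0.0
    for k in range(steps):
        t = k * dt
        u = -x / (1.0 + T - t)  # u* = V_x for this V
        running_cost += 0.5 * u * u * dt
        x += u * dt  # explicit Euler step of x' = u
    return -0.5 * x * x - running_cost


print(hjb_residual(0.3, 1.7))                    # close to 0
print(closed_loop_payoff(2.0), value(0.0, 2.0))  # both close to -1.0
```

The first print checks that the candidate $V$ solves the assumed HJB equation. The second checks that the feedback obtained from the pointwise maximization actually achieves the payoff $V(0, x_0)$.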

12 NE for Zero-Sum Games
For a two-person zero-sum differential game the HJB equation is called the Isaacs equation and can be rewritten as follows: [equation not reproduced in the transcript]. A pair of strategies $(u_1, u_2)$ provides a feedback saddle-point solution if there exists a function $V : [0,T] \times \mathbb{R}^N \to \mathbb{R}$ satisfying the Isaacs PDE. Interchangeability of the min and max operations in the Isaacs equation is often referred to as the Isaacs condition. It holds if both $f$ and $L$ are separable in $u_1$ and $u_2$.
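The Isaacs equation itself is missing from the transcript. The form below assumes that player 1 maximizes and player 2 minimizes the payoff $\psi - \int L\,dt$; this sign convention is chosen for the sketch, and many texts use the opposite one. The second line shows the separable structure under which the min and max can be interchanged:

```latex
\begin{aligned}
& \frac{\partial V}{\partial t} + \min_{u_2} \max_{u_1} \left\{ \frac{\partial V}{\partial x}\, f(t, x, u_1, u_2) - L(t, x, u_1, u_2) \right\} = 0, \\
& f(t, x, u_1, u_2) = f_1(t, x, u_1) + f_2(t, x, u_2), \qquad
  L(t, x, u_1, u_2) = L_1(t, x, u_1) + L_2(t, x, u_2).
\end{aligned}
```

With this structure the bracket splits into a term depending only on $u_1$ and a term depending only on $u_2$, so each player's optimization can be carried out independently and the order of min and max does not matter.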

13 Sufficient Condition for Zero-Sum NE Strategy
Theorem: If
I. a continuously differentiable function $V(t, x)$ exists that satisfies the Isaacs equation,
II. $V(T, x) = \psi(T, x)$ on the boundary of the target set, defined by $l(t, x) = 0$,
III. either $u_1^*(t, x)$ or $u_2^*(t, x)$, as derived from the Isaacs equation, generates trajectories that terminate in finite time (whatever the other player's strategy is),
then $V(t, x)$ is the value function and the pair $(u_1^*(t, x), u_2^*(t, x))$ constitutes a saddle point.
This theorem is valid for pursuit-evasion games for which the duration of the game is not fixed, but is computed through $T = \inf\{t \ge 0 : (t, x(t)) \in \Lambda\}$, where $\Lambda$ is a closed subset, called the target set, whose surface is characterized by the scalar function $l(t,x) = 0$.

14 Necessary Condition for Zero-Sum NE Strategy (1/2)
Theorem: Given a two-person zero-sum differential game, suppose that the pair $(u_1^*, u_2^*)$ provides a saddle-point solution in feedback strategies, with $x^*(t)$ denoting the corresponding state trajectory. Furthermore, let its open-loop representation $\{u_i^*(t) = u_i^*(t, x^*(t)),\ i = 1, 2\}$ also provide a saddle-point solution (in open-loop policies). Then there exists a costate function $p(\cdot) : [0,T] \to \mathbb{R}^N$ such that the following relations are satisfied: [equations and definition of the Hamiltonian not reproduced in the transcript].
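The relations themselves are lost. For orientation, the sketch below assumes that player 1 maximizes and player 2 minimizes $\psi - \int L\,dt$ and that the final time $T$ is fixed. Under those assumptions, conditions of Pontryagin type would take the following form. The signs and the terminal condition on the slide may differ, in particular when the game ends on a target set:

```latex
\begin{aligned}
H(t, p, x, u_1, u_2) &= p^{\top} f(t, x, u_1, u_2) - L(t, x, u_1, u_2), \\
\dot{x}^*(t) &= f\bigl(t, x^*(t), u_1^*(t), u_2^*(t)\bigr), \qquad x^*(0) = x_0, \\
H\bigl(t, p, x^*, u_1, u_2^*\bigr) &\le H\bigl(t, p, x^*, u_1^*, u_2^*\bigr) \le H\bigl(t, p, x^*, u_1^*, u_2\bigr) \quad \text{for all } u_1, u_2, \\
\dot{p}(t) &= -\frac{\partial H}{\partial x}\bigl(t, p(t), x^*(t), u_1^*(t), u_2^*(t)\bigr)^{\top}, \qquad
p(T) = \frac{\partial \psi}{\partial x}\bigl(T, x^*(T)\bigr)^{\top}.
\end{aligned}
```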

15 Simulation Example

16 Simulation Example: Homicidal Chauffeur Game
Coordinate transformation: the new reference frame has the pursuer's position as origin and the x-axis along the pursuer's direction of motion.
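A minimal sketch of the change of coordinates, under modeling assumptions that are not stated on the slide: the pursuer moves at constant speed `v_p` with heading `theta` and turn rate `omega`, and the evader moves at constant speed `v_e` with absolute heading `phi`. In the pursuer-fixed frame with the x-axis along the pursuer's heading, the relative position then obeys $\dot{x} = -v_p + \omega y + v_e \cos\alpha$, $\dot{y} = -\omega x + v_e \sin\alpha$, with $\alpha = \phi - \theta$. The code integrates both descriptions with arbitrary (non-optimal) controls and compares them:

```python
import math


def to_pursuer_frame(xp, yp, theta, xe, ye):
    """Evader position relative to the pursuer, x-axis along the pursuer heading."""
    dx, dy = xe - xp, ye - yp
    x = dx * math.cos(theta) + dy * math.sin(theta)
    y = -dx * math.sin(theta) + dy * math.cos(theta)
    return x, y


def reduced_rhs(x, y, omega, alpha, v_p, v_e):
    """Relative dynamics in the pursuer frame; alpha is the evader heading
    measured from the pursuer heading."""
    dx = -v_p + omega * y + v_e * math.cos(alpha)
    dy = -omega * x + v_e * math.sin(alpha)
    return dx, dy


def simulate(v_p=1.0, v_e=0.4, omega=0.8, dt=1e-3, steps=2000):
    """Integrate the inertial and the reduced model side by side (explicit Euler).

    All numerical values and both control laws are hypothetical test inputs,
    not the optimal strategies of the game.
    """
    xp, yp, theta = 0.0, 0.0, 0.3  # pursuer state in the inertial frame
    xe, ye = 2.0, 1.0              # evader position in the inertial frame
    x, y = to_pursuer_frame(xp, yp, theta, xe, ye)
    for k in range(steps):
        phi = 0.5 + 0.3 * k * dt   # arbitrary absolute evader heading
        dx, dy = reduced_rhs(x, y, omega, phi - theta, v_p, v_e)
        # Inertial-frame update.
        xp += dt * v_p * math.cos(theta)
        yp += dt * v_p * math.sin(theta)
        xe += dt * v_e * math.cos(phi)
        ye += dt * v_e * math.sin(phi)
        theta += dt * omega
        # Reduced-frame update.
        x += dt * dx
        y += dt * dy
    return (x, y), to_pursuer_frame(xp, yp, theta, xe, ye)


reduced, transformed = simulate()
print(reduced, transformed)  # the two pairs should agree up to integration error
```

The reduction replaces five inertial states (two positions and a heading) with two relative coordinates, which is what makes a planar analysis of the game possible.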

17 Assumption: instantaneous rotation of the evader. Target set; usable part of the boundary [formulas not reproduced in the transcript].
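A minimal sketch of how the usable part can be identified, under assumptions not given on the slide: the target set is a disc of capture radius `ell` around the pursuer, the relative dynamics are $\dot{x} = -v_p + \omega y + v_e\cos\alpha$, $\dot{y} = -\omega x + v_e\sin\alpha$ with $|\omega| \le v_p / R$, and a boundary point is called usable when the pursuer can force the state inward whatever the evader does. With these assumptions the outward radial speed at the boundary point at angle `s` from the x-axis is $v_e\cos(\alpha - s) - v_p\cos s$, so the usable part is $\cos s > v_e / v_p$. The code checks this closed form against a brute-force min-max:

```python
import math


def radial_rate(s, omega, alpha, v_p, v_e, ell):
    """Outward radial speed n . f at the boundary point ell * (cos s, sin s)."""
    x, y = ell * math.cos(s), ell * math.sin(s)
    dx = -v_p + omega * y + v_e * math.cos(alpha)
    dy = -omega * x + v_e * math.sin(alpha)
    return math.cos(s) * dx + math.sin(s) * dy


def worst_case_radial_rate(s, v_p, v_e, ell, turn_radius, n_alpha=720):
    """min over pursuer turn rate of max over evader heading of n . f,
    evaluated on a grid of headings (brute force)."""
    omega_max = v_p / turn_radius
    omegas = (-omega_max, 0.0, omega_max)
    alphas = [2.0 * math.pi * k / n_alpha for k in range(n_alpha)]
    return min(
        max(radial_rate(s, w, a, v_p, v_e, ell) for a in alphas)
        for w in omegas
    )


def is_usable(s, v_p, v_e):
    """Closed-form test: the pursuer can force penetration iff cos s > v_e / v_p."""
    return math.cos(s) > v_e / v_p


# Hypothetical parameters: evader half as fast as the pursuer.
v_p, v_e, ell, R = 1.0, 0.5, 0.3, 1.0
for s in (0.0, 0.5, 1.2, 2.5):
    brute = worst_case_radial_rate(s, v_p, v_e, ell, R)
    print(round(s, 2), round(brute, 4), is_usable(s, v_p, v_e))
```

The turn rate drops out of the radial speed on a circular boundary, so under these assumptions the usable part depends only on the speed ratio. The evader's best reply at the boundary is to flee radially, which is consistent with the assumption of instantaneous rotation.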

18 The Barrier: those initial points that allow a solution of the problem. Isaacs solution [not reproduced in the transcript].

19 Final Condition. Solution [not reproduced in the transcript].

20 Solution [not reproduced in the transcript].

