Dynamic Adversarial Conflict with Restricted Information Jason L. Speyer Research Asst.: Ashitosh Swarup Mechanical and Aerospace Engineering Department.


Dynamic Adversarial Conflict with Restricted Information
Jason L. Speyer; Research Asst.: Ashitosh Swarup
Mechanical and Aerospace Engineering Department, UCLA
MURI Review, June

Cooperative and Adversarial Static Strategies with Restricted Information
Static Stochastic Teams [Radner]
- Each team member's strategy is a function only of its local noisy measurement of the state of the world.
- Minimize the expected value of a convex function of the team strategies and the state of the world.
- All a priori statistics and functions are known.
Solution
- Stationary conditions for a general convex cost [Radner]; the local finiteness condition was later relaxed [Krainak, Speyer, Marcus].
- Solutions are available for the LQG and LEG cases.
Static Stochastic Nonzero-Sum Games [Basar]
- Solutions are available for the LQG case.
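The static team problem above can be stated compactly. The following is a standard rendering of Radner's formulation; the symbols are our own, not taken from the slide:

```latex
\min_{\gamma_1,\dots,\gamma_N} \;
\mathbb{E}\big[\, c\big(\gamma_1(y_1),\dots,\gamma_N(y_N),\, x\big) \big],
\qquad y_i = h_i(x, w_i),
```

where $x$ is the state of the world, $y_i$ is member $i$'s local noisy measurement, $\gamma_i$ is its strategy, and $c$ is convex in the strategies.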

Dynamic Team and Game Strategies with Nonclassical Information Patterns
- LQG and LEG team strategies are available for the one-step-delayed information pattern, in which all information except the current measurement is shared.
- The solution is constructed by dynamic programming, with a static game solved at each step of the backward recursion.
- Since information cannot be shared, few results are available for game strategies [Willman]. Willman gives a formal (possible) solution to LQG games of conflict; we interpret his results and discuss new directions.

Formulation of the Dynamic LQG Game with Restricted Information
- Consider a quadratic cost criterion.
- The system dynamics are discrete-time and linear.
- Each player has its own noisy measurements of the state.
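The display equations on this slide were not captured in the transcript. A representative LQG pursuit-evasion formulation consistent with the surrounding slides (our notation and noise assumptions, not the original equations) is:

```latex
J = \mathbb{E}\Big[\, x_N^{\top} Q_N x_N
    + \sum_{i=0}^{N-1} \big( x_i^{\top} Q\, x_i
    + u_i^{\top} R_u\, u_i - v_i^{\top} R_v\, v_i \big) \Big],
```

with dynamics and measurements

```latex
x_{i+1} = A x_i + B u_i + D v_i + w_i, \qquad
y_i^{p} = H_p x_i + n_i^{p}, \qquad
y_i^{e} = H_e x_i + n_i^{e},
```

where the pursuer's control $u$ minimizes and the evader's control $v$ maximizes $J$, and $w$, $n^{p}$, $n^{e}$ are independent Gaussian noise sequences.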

Strategies for Games with Restricted Information
- Define the measurement histories of the pursuer and the evader.
- Define the strategies as general linear functions of these histories.
- Since the strategies are adversarial, there is no cooperation and therefore no possibility of cheating.
- Consider the saddle-point inequality, where (·)* denotes the saddle-point strategies.
- If one player uses a linear strategy, then the resulting LQG problem produces a linear optimal strategy for the other player.
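In notation consistent with the slide (the symbols are ours): with measurement histories $Y_i^{p} = \{y_0^{p},\dots,y_i^{p}\}$ and $Y_i^{e} = \{y_0^{e},\dots,y_i^{e}\}$ and linear strategies $u_i = F_i(Y_i^{p})$, $v_i = G_i(Y_i^{e})$, the saddle-point inequality reads

```latex
J(u^{*}, v) \;\le\; J(u^{*}, v^{*}) \;\le\; J(u, v^{*})
\qquad \text{for all admissible strategies } u, v .
```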

Construction of the Linear Strategies
- The cost can be formed through a nesting of the players' information.
- Assume the pursuer knows the functional form of the evader's strategy.
- Substitute the evader's strategy into the dynamic system; the result involves a growing ((i+1)·n + i·e) × (i·n + (i−1)·e) matrix.
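A sketch of this substitution step, under the notational assumptions introduced earlier (state dimension $n$, evader measurement dimension $e$, evader strategy $v_i = G_i Y_i^{e}$):

```latex
x_{i+1} = A x_i + B u_i + D\, G_i Y_i^{e} + w_i .
```

Stacking the state with the evader's measurement history gives an augmented vector of dimension $(i+1)n + ie$ at step $i$, propagated from one of dimension $in + (i-1)e$, which matches the growing matrix dimensions quoted on the slide.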

Construction (Continued)
- Knowing the evader's strategy, solve the resulting LQG problem of minimizing the cost, where the conditional mean is propagated by the associated filter.
- This yields the pursuer's strategy given the evader's strategy.
- This strategy, known to the evader, must in turn be reduced to a linear function of the pursuer's measurement history.
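The conditional-mean propagation referred to above is a Kalman-type recursion on the augmented system. A schematic sketch in our notation (the gain and matrices below are illustrative, not the slide's):

```latex
\hat{z}_{i+1} = \bar{A}_i \hat{z}_i + \bar{B}_i u_i
    + K_i \big( y_i^{p} - \bar{H}_i \hat{z}_i \big),
```

where $z_i$ is the augmented state, $\bar{A}_i$ absorbs the evader's gains $G_i$, and $K_i$ is the filter gain.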

Convergence to Saddle-Point Strategies
- The strategies are complex functions of the opponent's gains; the strategy gains are determined by an iterative procedure.
- Begin by assuming an adversary's strategy, then solve for the opponent's strategy given that assumption.
- The resulting sequence of LQG minimizations and maximizations oscillates about the saddle point and may converge to the saddle-point strategies.
- Conditions are required for the existence of a saddle point in pure strategies, and for convergence to it.
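The iterative procedure can be illustrated on a toy scalar quadratic game. The cost, parameters, and convergence condition below are our own illustrative choices, not the LQG game from the slides; the point is only that alternating best responses oscillate about the saddle point and converge when the coupling is weak enough.

```python
# Toy scalar quadratic game: J(u, v) = (x + u + v)**2 + r*u**2 - s*v**2.
# The pursuer minimizes over u, the evader maximizes over v (requires s > 1).
# Alternating best responses contract toward the saddle point when
# (1 + r)*(s - 1) > 1.

def pursuer_best_response(v, x, r):
    # dJ/du = 2*(x + u + v) + 2*r*u = 0
    return -(x + v) / (1.0 + r)

def evader_best_response(u, x, s):
    # dJ/dv = 2*(x + u + v) - 2*s*v = 0  (a maximum since s > 1)
    return (x + u) / (s - 1.0)

def iterate_to_saddle(x=1.0, r=1.0, s=2.0, tol=1e-12, max_iter=200):
    u, v = 0.0, 0.0
    for _ in range(max_iter):
        u_new = pursuer_best_response(v, x, r)
        v_new = evader_best_response(u_new, x, s)
        if abs(u_new - u) < tol and abs(v_new - v) < tol:
            return u_new, v_new
        u, v = u_new, v_new
    return u, v

u_star, v_star = iterate_to_saddle()
# For x=1, r=1, s=2 the analytic saddle point is u* = -2/3, v* = 1/3;
# the iterates (-0.5, 0.5), (-0.75, 0.25), (-0.625, 0.375), ... oscillate
# about it, exactly as described on the slide.
```

The contraction factor here is 1/((1+r)(s−1)) = 1/2 per round, so the oscillation is damped and the sequence converges.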

Special Cases
- Full-state-information LQR differential game: if there exists a solution to the saddle-point Riccati equation, then the Riccati equations associated with the sequential min and max operations converge to it. Convergence has been proved.
- Scalar three-stage dynamic game with restricted information: for some parameter values the cost criterion converges to a fixed point, and the saddle-point strategies converge to a fixed point.
- The results indicate a hedging policy by the adversaries relative to their full-state strategies.
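The sequential min/max Riccati iteration for the full-state case can be sketched on a scalar example. The dynamics, costs, and parameter values below are our own illustrative choices, not the slides' example:

```python
# Scalar full-state LQ game: x+ = a*x + b*u + d*v,
# stage cost q*x**2 + r*u**2 - s*v**2 (pursuer min over u, evader max over v).
# Given the evader's gain Kv, the pursuer solves a standard LQR; given the
# pursuer's gain Ku, the evader solves the maximization. Iterating the two
# Riccati solutions approaches the saddle-point Riccati solution.

def lqr_min(a_cl, q_cl, b, r, iters=2000):
    """Scalar minimizing Riccati recursion; returns (P, gain)."""
    P = q_cl
    for _ in range(iters):
        P = q_cl + a_cl**2 * P - (a_cl * b * P)**2 / (r + b**2 * P)
    K = -a_cl * b * P / (r + b**2 * P)
    return P, K

def lqr_max(a_cl, q_cl, d, s, iters=2000):
    """Scalar maximizing Riccati recursion (needs s - d**2*P > 0)."""
    P = q_cl
    for _ in range(iters):
        P = q_cl + a_cl**2 * P + (a_cl * d * P)**2 / (s - d**2 * P)
    K = a_cl * d * P / (s - d**2 * P)
    return P, K

def sequential_minmax(a=0.9, b=1.0, d=0.3, q=1.0, r=1.0, s=5.0, rounds=50):
    Ku, Kv = 0.0, 0.0
    P = 0.0
    for _ in range(rounds):
        # Pursuer: close the loop with the evader's gain, then minimize.
        _, Ku = lqr_min(a + d * Kv, q - s * Kv**2, b, r)
        # Evader: close the loop with the pursuer's gain, then maximize.
        P, Kv = lqr_max(a + b * Ku, q + r * Ku**2, d, s)
    return P, Ku, Kv

P_star, Ku_star, Kv_star = sequential_minmax()
```

For these parameters the concavity condition s − d²·P stays positive throughout, and the alternating gains settle to fixed values, consistent with the convergence claim on the slide.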

Application to SEAD
- Requires generalization of the LQG strategies for adversarial conflict.
- Apply to suppression of enemy air defenses (SEAD): UCAVs allocate resources based on their distributed sensor information, while the SAM site allocates resources based on its own sensor information.

Coordinated Flight: Aerodynamically Coupled Formation Flight of Aircraft (Chichka, Wolfe, & Speyer)
Applications:
- Autonomous aerial refueling.
- Autonomous formation flight for drag reduction.
- UCAV clusters.