OPTIMAL ENGAGEMENT POLICIES

OPTIMAL ENGAGEMENT POLICIES
1st Annual Israel Multinational BMD Conference & Exhibition
Presented By: Dr. N. Israeli, WALES, Ltd., Israel

PRESENTATION TOPICS
- Introduction
- Threat Definition
- Optimality Criteria
- Dynamic Programming
- Optimal Policy Equations
- Three Examples:
  - Example 1: Calculation Walkthrough
  - Example 2: Solvable Case
  - Example 3: A Computationally Intractable Threat Definition and Possible Approximation Schemes
- Roadmap to Practicality
- Summary and Next Steps

INTRODUCTION
- Interceptors are a primary defense-system resource whose management largely determines overall system performance
- When engaging an attacking TBM one faces the dilemma of choosing between:
  - Allocating many interceptors to the current threat to ensure its destruction
  - Saving interceptors for possible future threats
- A key difficulty here is the uncertainty regarding the materialization of future threats
- The dilemma becomes more pronounced as defense systems grow richer in interceptor types and in the number of available interception opportunities

INTRODUCTION (Cont.)
- In this work we pose threat engagement as an optimization problem and seek optimal engagement policies
- By an engagement policy we mean the number and types of interceptors launched at attacking TBMs, as a function of prior intelligence, TBM type and damage capability, attack history, remaining interceptor stock, etc.
- For simplicity we consider cases where:
  - Attacking TBMs appear and are engaged sequentially
  - All allocated interceptors are used, i.e. no Shoot-Look-Shoot capability
- Generalizations of our work to salvo engagement and Shoot-Look-Shoot capability are possible

INTRODUCTION (Cont.)
- For each appearing TBM the defense makes an engagement decision based on:
  - The type of the current threat and its damage-inflicting capability
  - The currently available interceptor stock and its effectiveness against the current threat
  - The attack history
  - Possible attack continuations
[Figure: a sequence of attacking TBMs of types 1 and 2, with unknown future arrivals marked "???"]

SCOPE AND RESERVATIONS
- Our work is based on the assumption that it is possible to quantify the benefit of negating ballistic threats
- Quantifying this benefit should reflect decision makers' priorities and is thus a difficult political, economic and strategic problem
- That problem is outside the scope of the current work; we do not attempt to derive such priorities and only use representative numbers
- We do, however, recognize that this is a difficult issue, and therefore do not claim that our method provides "The Correct" engagement policies
- Rather, we wish to provide decision makers with a tool that lets them explore the implications of their priorities in terms of engagement policies, interceptor consumption, etc.

THREAT DEFINITION
- To speak about optimality we must have a threat definition that represents our knowledge of possible future attacks
- Policies can be evaluated (and therefore optimized) only with respect to a threat definition
- Naturally, the more precise the threat definition, the more effective the optimal engagement policy
- Since we never have perfect intelligence, threat definitions are probability functions that specify the chances that different attacks materialize
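As an illustration, such a threat definition can be written down directly. The sketch below is our construction, not the slides' notation: it takes the simplest case used later in Example 1, a single TBM type whose attack size is uniformly distributed on {1, ..., N}, and computes the conditional probability that an n'th threat materializes.

```python
# Illustrative sketch (our construction, not the slides' notation): the
# simplest threat definition -- one TBM type whose attack size is uniformly
# distributed on {1, ..., N}.  The quantity an engagement policy needs is the
# conditional probability that an n'th threat materializes.

def materialization_prob(n: int, N: int) -> float:
    """P(n) = P(attack size >= n | attack size >= n - 1), size uniform on {1..N}."""
    if n <= 1:
        return 1.0        # the first TBM is taken as given
    if n > N:
        return 0.0        # the attack cannot contain more than N TBMs
    return (N - n + 1) / (N - n + 2)
```

For N=4 this yields P(2)=3/4, P(3)=2/3, P(4)=1/2, the values used in the Example 1 walkthrough.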

OPTIMALITY CRITERIA
- Given a threat definition, we wish to find the optimal policy for spending our interceptor stock
- The optimality criterion depends on what we want to achieve
- In this work we focus on minimizing the expected total damage of the attack
- Generalization of our method to other optimization criteria is certainly possible

DYNAMIC PROGRAMMING
- Dynamic Programming (DP) is an approach developed for solving sequential, or multi-stage, decision problems
- It reduces running time by decomposing the decision problem into overlapping subproblems, each of which is solved only once
- The essence of DP is Richard Bellman's principle of optimality: "An optimal policy has the property that whatever the initial state and the initial decisions are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision"
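The principle can be seen in a toy problem unrelated to the engagement model: spend a budget over three stages, where the cost incurred at each stage depends only on how much is spent there. A minimal memoized sketch, with made-up illustrative costs:

```python
from functools import lru_cache

# Toy illustration of Bellman's principle (illustrative numbers, not the
# slides' model): spend a budget over three stages, where spending k at a
# stage costs (2 - k)**2, so spending about 2 per stage is ideal.  The
# memoized recursion solves each (stage, budget) subproblem once; whatever
# the first decision, the remaining ones solve the resulting subproblem
# optimally.
STAGES = 3

@lru_cache(maxsize=None)
def best(stage: int, budget: int) -> float:
    """Minimal total cost from this stage on, given the remaining budget."""
    if stage == STAGES:
        return 0.0
    return min((2 - k) ** 2 + best(stage + 1, budget - k)
               for k in range(budget + 1))
```

With a budget of 6 the recursion spends 2 at every stage for zero cost (best(0, 6) == 0.0); with a budget of 3 it spreads 1 per stage (best(0, 3) == 3.0).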

FINDING AN OPTIMAL ENGAGEMENT POLICY USING DP
- DP recursion equation (schematically):
  optimal expected damage = min over engagement options of [ expected damage in current stage + expected damage in future stages ]
- The optimal engagement policy is a graph that tells us how to engage in different situations
- The dependence of the policy on the attack history comes from its dependence on the conditional materialization probability of the next threat
- If this probability is a simple function of the attack history, the optimal policy graph will be small
- Complex conditional probabilities may lead to prohibitively large policy graphs

EXAMPLE 1 – CALCULATION WALKTHROUGH
- Threat definition: one type of TBM, damage D=1, attack size uniformly distributed in [1,N]
- Conditional materialization probability of the n'th threat: P(n) = (N-n+1)/(N-n+2)
- Interceptor stock: one type of interceptor with probability of kill Pk; survival probability of the attacking TBM per interceptor Ps = 1-Pk
- The DP recursion equation reduces to
  Δ(n,I) = min over 0 ≤ k ≤ I of [ D·Ps^k + P(n+1)·Δ(n+1,I-k) ]
  where Δ(n,I) is the expected remaining damage when the n'th TBM appears with I interceptors left, and k is the number of interceptors fired at it
- With boundary conditions Δ(N+1,I) = 0

EXAMPLE 1 (Cont.)
Solving for N=4, I1=3, Pk=0.8; materializing probabilities: P(2)=3/4, P(3)=2/3, P(4)=1/2

Optimal expected remaining damage Δ and number of interceptors fired k at each stage:

1st TBM:  I=0: Δ=2.5, k=0 | I=1: Δ=1.7, k=1 | I=2: Δ=1.1,  k=1 | I=3: Δ=0.7,   k=1
2nd TBM:  I=0: Δ=2,   k=0 | I=1: Δ=1.2, k=1 | I=2: Δ=0.666, k=1 | I=3: Δ=0.4,  k=1
3rd TBM:  I=0: Δ=1.5, k=0 | I=1: Δ=0.7, k=1 | I=2: Δ=0.3,  k=1 | I=3: Δ=0.14,  k=2
4th TBM:  I=0: Δ=1,   k=0 | I=1: Δ=0.2, k=1 | I=2: Δ=0.04, k=2 | I=3: Δ=0.008, k=3

Engagement-option evaluations behind these values:

2nd TBM, I=1: shoot 0: 1+P(3)·Δ(3,1)=1.466 | shoot 1: 0.2+P(3)·Δ(3,0)=1.2
2nd TBM, I=2: shoot 0: 1+P(3)·Δ(3,2)=1.2 | shoot 1: 0.2+P(3)·Δ(3,1)=0.666 | shoot 2: 0.04+P(3)·Δ(3,0)=1.04
2nd TBM, I=3: shoot 0: 1+P(3)·Δ(3,3)=1.093 | shoot 1: 0.2+P(3)·Δ(3,2)=0.4 | shoot 2: 0.04+P(3)·Δ(3,1)=0.507 | shoot 3: 0.008+P(3)·Δ(3,0)=1.008
3rd TBM, I=1: shoot 0: 1+P(4)·Δ(4,1)=1.1 | shoot 1: 0.2+P(4)·Δ(4,0)=0.7
3rd TBM, I=2: shoot 0: 1+P(4)·Δ(4,2)=1.02 | shoot 1: 0.2+P(4)·Δ(4,1)=0.3 | shoot 2: 0.04+P(4)·Δ(4,0)=0.54
3rd TBM, I=3: shoot 0: 1+P(4)·Δ(4,3)=1.004 | shoot 1: 0.2+P(4)·Δ(4,2)=0.22 | shoot 2: 0.04+P(4)·Δ(4,1)=0.14 | shoot 3: 0.008+P(4)·Δ(4,0)=0.508
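The walkthrough can be reproduced mechanically. The sketch below is our reconstruction of the slide's recursion under its stated parameters (N=4, D=1, Pk=0.8); the names delta and best_k are ours, since the slide's symbols did not survive transcription.

```python
from functools import lru_cache

# Sketch of the walkthrough's recursion (our names, the slide's parameters):
# delta(n, I) is the expected remaining damage when the n'th TBM appears
# with I interceptors left.
N, D, PK = 4, 1.0, 0.8
PS = 1.0 - PK                          # TBM survival probability per interceptor

def P(n):
    """Conditional materialization probability of the n'th threat."""
    return 0.0 if n > N else (N - n + 1) / (N - n + 2)

@lru_cache(maxsize=None)
def delta(n, I):
    if n > N:                          # boundary condition: no further threats
        return 0.0
    return min(D * PS**k + P(n + 1) * delta(n + 1, I - k) for k in range(I + 1))

def best_k(n, I):
    """Number of interceptors the optimal policy fires at the n'th TBM."""
    return min(range(I + 1),
               key=lambda k: D * PS**k + P(n + 1) * delta(n + 1, I - k))
```

Running it reproduces the table above: delta(1, 3) ≈ 0.7 with best_k(1, 3) = 1, and at the 3rd TBM with a full stock best_k(3, 3) = 2.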

EXAMPLE 2 – A SOLVABLE CASE
Assessed threat definition:
- Two types of TBMs (T=1,2) with damage D(T=1)=1 and D(T=2)=4
- Joint attack size uniformly distributed in [1,N]
- Probability of a threat being of type 1 or type 2 is 0.7 and 0.3 respectively
- Conditional materialization probability of the n'th threat is thus P(n) = (N-n+1)/(N-n+2)

Interceptors:
- Two types of interceptors (a, b)
- Probability of kill:

  Pk              TBM type 1   TBM type 2
  interceptor a      0.8          0.6
  interceptor b      0.5          0.9

- No more than 6 engagements per TBM

Solution complexity: policy size and running time grow with the maximal attack size, the initial interceptor stocks, and the number of possible engagements of a single TBM
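This case can be handled by the same recursion with the state extended to the two stocks. The sketch below is our reconstruction, not the original implementation: it uses the slide's damages, type probabilities and Pk matrix, but a small assumed N=5 so it runs instantly.

```python
from functools import lru_cache

# Our reconstruction of the two-interceptor-type case: state is
# (TBM index n, stock of a, stock of b).  Parameters follow the slide
# except N, which is kept small here for speed.
N = 5                                   # assumed small attack size for the sketch
D = {1: 1.0, 2: 4.0}                    # damage per TBM type
Q = {1: 0.7, 2: 0.3}                    # probability of each TBM type
PK = {('a', 1): 0.8, ('a', 2): 0.6,     # kill probabilities, interceptor x type
      ('b', 1): 0.5, ('b', 2): 0.9}
MAX_SALVO = 6                           # no more than 6 engagements per TBM

def P(n):
    """Conditional materialization probability of the n'th threat."""
    return 0.0 if n > N else (N - n + 1) / (N - n + 2)

def survival(t, ka, kb):
    """Probability a type-t TBM survives ka a-interceptors and kb b-interceptors."""
    return (1 - PK[('a', t)]) ** ka * (1 - PK[('b', t)]) ** kb

@lru_cache(maxsize=None)
def delta(n, ia, ib):
    """Expected remaining damage when the n'th TBM appears with stocks (ia, ib)."""
    if n > N:
        return 0.0
    total = 0.0
    for t, q in Q.items():              # the engagement may depend on the observed type
        total += q * min(
            D[t] * survival(t, ka, kb) + P(n + 1) * delta(n + 1, ia - ka, ib - kb)
            for ka in range(min(ia, MAX_SALVO) + 1)
            for kb in range(min(ib, MAX_SALVO - ka) + 1))
    return total
```

With no interceptors the expected damage is just the expected damage per TBM (0.7·1 + 0.3·4 = 1.9) times the expected attack size (N+1)/2, which the recursion reproduces; adding stock can only reduce it.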

EXAMPLE 2 (Cont.)
[Policy graph for N=5 and initial interceptor stock (7a,2b). Nodes are the remaining stocks when the n'th TBM appears, edges the prescribed engagements, which depend on the observed TBM type (type 1: Prob. 0.7, D=1; type 2: Prob. 0.3, D=4; Pk matrix as above):
1st TBM: (7a,2b)
2nd TBM: (5a,2b), (6a,1b)
3rd TBM: (3a,2b), (4a,1b), (5a,0b)
4th TBM: (1a,2b), (2a,1b), (3a,0b), (2a,0b)
5th TBM: (0a,2b), (1a,1b), (1a,0b), (2a,0b), (0a,0b)
with engagements such as (2a,0b), (1a,1b), (0a,2b), (3a,0b), (1a,0b) along the edges]

EXAMPLE 2 (Cont.) – POLICY TOPOLOGY
[Figure: optimal policy for N=100 and interceptor stocks between 0 and 150 a's and 0 and 40 b's. Separate panels show the prescribed interceptor a and interceptor b engagements (shoot 0 through shoot 6) against type 1 and type 2 TBMs, as functions of TBM index and remaining a and b stocks]

EXAMPLE 2 (Cont.)
N=100 and initial interceptor stock of 150 a's and 40 b's
[Figure: example engagement sequences for attack mixes of 90%, 70% and 20% type 1 TBMs (10%, 30% and 80% type 2), showing interceptor a and b engagements, remaining a and b stocks, and expected damage as functions of TBM index]

EXAMPLE 2 (Cont.)
N=100 and initial interceptor stocks between 0 to 150 a's and 0 to 40 b's
[Figure: threat expected damage under the optimal engagement policy as a function of the a and b interceptor stocks]

Comparison with fixed policies ("-" marks cells not recoverable from the transcript):

Initial stock   Fixed eng., type 1 TBM   Fixed eng., type 2 TBM   Fixed damage   Optimal damage
150 a, 40 b     2 a, 0 b                 1 a, 1 b                 4.4            2.75
75 a, 20 b      1 a, 0 b                 -                        16.2           14.25
50 a, 12 b      -                        -                        33.52          27.03
30 a, 8 b       0 a, 0 b                 -                        49.95          42.87

EXAMPLE 3 – AN INTRACTABLE THREAT DEFINITION
- Consider a threat definition with five TBM types, each with its own expected damage and a uniformly distributed attack size
  [Table: expected damage and attack-size range per TBM type; ranges such as 50-100 and 20-40 appear in the transcript]
- The optimal policy for, say, the 100th TBM should be a function of the number of TBMs of each type among the first 99 attacking threats: the response changes as the different types approach depletion
- The problem is that the number of possible partitions of the first 99 TBMs among the five types is huge
- Accordingly, the optimal policy will be prohibitively complex

EXAMPLE 3 (Cont.)
Possible approximation scheme:
- Approximate the previous detailed threat definition by a simpler one
- Generate a large ensemble of attacks from the original threat definition and calculate the materialization probability of each threat type as a function of the joint attack size
- With this simpler function the problem reduces to one similar to Example 2
- Note, however, that information is lost here: in particular, we may assign a finite probability to a TBM appearing even though the enemy has already fired the entire ORBAT of the relevant type
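One way to realize the suggested ensemble step is a simple Monte Carlo estimate. The per-type attack-size ranges below are made-up illustrative numbers, not the Example 3 table; the structure, sampling whole attacks and reducing them to per-index type probabilities, is what matters.

```python
import random
from collections import defaultdict

# Sketch of the suggested ensemble step, with assumed illustrative ranges:
# sample whole attacks from a detailed threat definition, then estimate the
# materialization probability of each threat type as a function of the joint
# attack index n.
random.seed(0)
SIZES = {1: (5, 10), 2: (2, 4)}    # assumed uniform attack-size range per type

def sample_attack():
    """One joint attack: per-type sizes drawn uniformly, arrival order shuffled."""
    tbms = [t for t, (lo, hi) in SIZES.items()
            for _ in range(random.randint(lo, hi))]
    random.shuffle(tbms)
    return tbms

def estimate_probs(samples=10_000):
    """probs[n][t]: P(an n'th TBM appears and is of type t | n-1 TBMs appeared)."""
    counts = defaultdict(lambda: defaultdict(int))
    for _ in range(samples):
        for n, t in enumerate(sample_attack(), start=1):
            counts[n][t] += 1
    probs = {}
    for n, by_type in counts.items():
        appeared = samples if n == 1 else sum(counts[n - 1].values())
        probs[n] = {t: c / appeared for t, c in by_type.items()}
    return probs
```

This also exhibits the slide's caveat: the estimate can assign a type a nonzero probability at indices where, in some sampled attacks, that type's ORBAT is already exhausted.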

SUMMARY
- TBM attack engagement has been posed as an optimal sequential decision problem
- We showed that for certain types of threat definitions the optimization problem can easily be solved with Dynamic Programming
- For these solvable cases the optimal policy is simple in certain parts of the attack phase space (early attack stages with plentiful interceptors) but tends to bifurcate as the interceptor stock diminishes
- Some types of threat definitions imply complex attack materialization probabilities, which lead to prohibitively large policy graphs
- For these complex cases we suggest approximate solutions