ECES 741: Stochastic Decision & Control Processes
Chapter 1: The DP Algorithm

To do:
- sequential decision-making
- state
- random elements
- discrete-time stochastic dynamic system
- optimal control/decision problem
- actions vs. strategy (information gathering, feedback)

These ideas are illustrated via examples; the general model is described later on.
Example: Inventory Control Problem

Quantity of a certain item, e.g. gas in a service station, oil in a refinery, cars in a dealership, spare parts in a maintenance facility, etc.

The stock is checked at equally spaced points in time, e.g. every morning, at the end of each week, etc. At those times, a decision must be made as to what quantity of the item to order, so that demand over the present period is "satisfactorily" met (we will give a quantitative meaning to this).

[Timeline figure: periods 0, 1, ..., N; at the start of the kth period the stock is checked and an order is placed.]
Example: Inventory Control Problem

Stochastic difference equation:

  x_{k+1} = x_k + u_k - w_k

- x_k: stock at the beginning of the kth period
- u_k: quantity ordered at the beginning of the kth period; assumed delivered during the kth period
- w_k: demand during the kth period; {w_k} is a stochastic process

Assume real-valued variables.
Example: Inventory Control Problem

Negative stock is interpreted as excess demand, which is backlogged and filled as soon as possible.

Cost of operation:
1. purchasing cost: c·u_k (c = cost per unit)
2. H(x_{k+1}): penalty for holding and storage of extra quantity (x_{k+1} > 0), or for shortage (x_{k+1} < 0)

Cost for period k = c·u_k + H(x_k + u_k - w_k) = g(x_k, u_k, w_k), where x_{k+1} = x_k + u_k - w_k.
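The per-period cost can be sketched as follows; the piecewise-linear form of H and the values of c, h, p are illustrative assumptions, not given on this slide:

```python
def stage_cost(x, u, w, c=2.0, h=1.0, p=3.0):
    """Cost for one period: purchase cost c*u plus the holding/shortage
    penalty H applied to next-period stock x + u - w.
    The piecewise-linear H and the parameter values are assumed."""
    x_next = x + u - w                                  # system equation
    H = h * max(x_next, 0.0) + p * max(-x_next, 0.0)    # holding / shortage
    return c * u + H
```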
Example: Inventory Control Problem

Let H be, for example, piecewise linear:

  H(y) = h·max(0, y) + p·max(0, -y)

i.e. a holding cost of h per unit of extra stock (y > 0), or a shortage penalty of p per unit backlogged (y < 0).
Example: Inventory Control Problem

Objective: to minimize, in some meaningful sense, the total cost of operation over a finite number of periods (a finite "horizon"):

  total cost over N periods = Σ_{k=0}^{N-1} g(x_k, u_k, w_k) = Σ_{k=0}^{N-1} [ c·u_k + H(x_k + u_k - w_k) ]
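Summing the stage costs along a trajectory gives the N-period total cost; a minimal sketch, with an assumed piecewise-linear H and illustrative parameters c, h, p:

```python
def total_cost(x0, orders, demands, c=2.0, h=1.0, p=3.0):
    """Total N-period cost for a fixed order schedule.
    The form of H and the parameter values are assumed for illustration."""
    x, cost = x0, 0.0
    for u, w in zip(orders, demands):
        x_next = x + u - w                                   # system equation
        cost += c * u + h * max(x_next, 0.0) + p * max(-x_next, 0.0)
        x = x_next
    return cost
```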
Example: Inventory Control Problem

Two distinct situations can arise.

Deterministic case: x_0 is perfectly known, and the demands are known in advance to the manager.

1. At k = 0, all future demands {w_0, w_1, ..., w_{N-1}} are known. Select all orders at once, so as to exactly meet the demand:

     x_1 = x_2 = ... = x_{N-1} = 0
     0 = x_1 = x_0 + u_0 - w_0  ⇒  u_0 = w_0 - x_0
     u_k = w_k,  1 ≤ k ≤ N-1

   This is a fixed order schedule (assume x_0 ≤ w_0, so u_0 ≥ 0).
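The fixed order schedule above can be computed directly; a small sketch (the function name is ours):

```python
def deterministic_schedule(x0, demands):
    """Fixed order schedule for the deterministic case:
    u_0 = w_0 - x_0 and u_k = w_k for k >= 1, so the stock at the start
    of every later period is exactly zero (assumes x0 <= demands[0])."""
    return [demands[0] - x0] + list(demands[1:])
```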
Example: Inventory Control Problem

What we do is select a set of fixed "actions" (numbers, i.e. a precomputed order schedule).

2. At the beginning of period k, w_k becomes known (perfect forecast). Hence, we must gather information and make decisions sequentially.

   "Strategy": a rule for making decisions based on information as it becomes available (here, the forecast).
Stochastic Case

x_0 is perfectly known (we can generalize to the case where only its distribution is known), but {w_k} is a random process.

Assume that the w_k are i.i.d., real-valued random variables with pdf f_w, independent of k, i.e.

  P_w(B) = Prob(w_k ∈ B) = ∫_B f_w(w) dw

P_w: probability distribution or measure, i.e. P_w(B) is the probability that w_k takes a value in the set B.
Stochastic Case

Note that the stock x_k is now a random variable.

Alternatively, we can describe the evolution of the system in terms of a transition law:

  Prob(x_{k+1} ∈ B | x_k, u_k) = Prob(x_k + u_k - w_k ∈ B) = P_w({w : x_k + u_k - w ∈ B})
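Since x_{k+1} = x_k + u_k - w_k, the transition density is just the demand pdf evaluated at w = x_k + u_k - x_{k+1}; a sketch, where the exponential f_w is only an illustrative choice:

```python
import math

def transition_density(x_next, x, u, f_w):
    """Density of x_{k+1} given (x_k, u_k) = (x, u): since
    x_{k+1} = x + u - w, we have p(x_next | x, u) = f_w(x + u - x_next)."""
    return f_w(x + u - x_next)

# e.g. with an (assumed) unit-rate exponential demand pdf:
f_w = lambda w: math.exp(-w) if w >= 0 else 0.0
```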
Stochastic Case

The cost is also a random quantity, so we minimize the expected cost:

  E[ Σ_{k=0}^{N-1} g(x_k, u_k, w_k) ]

Action: select all orders (numbers) at k = 0. Most likely not "optimal" (this reduces to a nonlinear programming problem).

vs.

Strategy: select a sequence of functions μ_k such that u_k = μ_k(x_k), where x_k is the information available at the kth period. This is a difficult problem: the optimization is over a function space.
Stochastic Dynamic Program

Let π = (μ_0, μ_1, ..., μ_{N-1}): control/decision strategy (policy, law), and let Π be the set of all admissible strategies (e.g. those with μ_k(x) ≥ 0).

Then the stochastic DP problem is:

  minimize  J_π(x_0) = E[ Σ_{k=0}^{N-1} g(x_k, μ_k(x_k), w_k) ]  over π ∈ Π,
  subject to  x_{k+1} = x_k + μ_k(x_k) - w_k.

If the problem is feasible, then there exists an optimal strategy π*, i.e.

  J_{π*}(x_0) = min_{π ∈ Π} J_π(x_0).
Summary of the Problem

1. Discrete-time stochastic system:
   - system equation: x_{k+1} = x_k + u_k - w_k
   - note: with no backlogging, the transition law becomes x_{k+1} = max(0, x_k + u_k - w_k)
Stochastic Dynamic Program

2. Stochastic element: {w_k}, assumed i.i.d.; we will generalize to distributions depending on x_k and u_k.

3. Control constraint: u_k ≥ 0; for example, if there is a maximum storage capacity M, then 0 ≤ u_k ≤ M - x_k.

4. Additive cost:

  Σ_{k=0}^{N-1} g(x_k, u_k, w_k)
Stochastic Dynamic Program

5. Optimization over admissible strategies:

  min_{π ∈ Π} J_π(x_0)

We will see later on that this problem has a neat closed-form solution:

  μ_k(x) = T_k - x  if x < T_k,
  μ_k(x) = 0        if x ≥ T_k,

for some threshold levels T_k: a base-stock policy.
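The base-stock form can be written as a one-line rule; the function name and scalar threshold T are our illustrative stand-ins for the slide's T_k:

```python
def base_stock_policy(x, T):
    """Base-stock policy mu_k: order up to the threshold T when stock is
    below it (u = T - x if x < T), otherwise order nothing."""
    return max(T - x, 0.0)
```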
Role of Information: Actions vs. Strategies

Example: consider a two-stage problem where w_0 is a random variable taking each of two values with given probabilities. [The system and cost equations on this slide were lost in transcription.]
Role of Information: Actions vs. Strategies

Problem A: choose actions (u_0, u_1) at time 0 (open loop, a control schedule) to minimize the expected cost. Equivalently, with N = 2, minimize the two-stage expected cost subject to the system equation (*).
Role of Information: Actions vs. Strategies

Solution A:

Case (i): [derivation lost in transcription]

Case (ii): [derivation lost in transcription] The remaining variable can be anything; we then choose it appropriately.

No information gathering: we choose (u_0, u_1) at the start and do not take into consideration x_1 at the beginning of stage 1.
Role of Information: Actions vs. Strategies

Problem B: choose u_0 and u_1 sequentially, using the observed value of x_1. This is sequential decision-making, i.e. feedback control. Thus, to take decision u_1, we wait until the outcome x_1 becomes available, and act accordingly.

Solution B: from (*), we select u_1 as a function of the observed x_1.
Role of Information: Actions vs. Strategies

Note: information gathering doesn't always help. In the deterministic case (the w_k known in advance), we do not gain anything by making decisions sequentially.
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

1. Discrete-time stochastic dynamic system (t or k can index time or events):

  x_{k+1} = f_k(x_k, u_k, w_k),  k = 0, 1, ..., N-1

- x_k ∈ S_k: state space at time k
- u_k ∈ C_k: control space
- w_k ∈ D_k: disturbance space (countable)

Also, depending on the state of the system, there are constraints on the actions that can be taken:

  u_k ∈ U_k(x_k) ⊆ C_k,  U_k(x_k) a non-empty subset.
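A trajectory of this general model can be simulated directly from the system equation; all function names here are illustrative, not from the slides:

```python
def simulate(x0, policy, f, w_sample, N):
    """Simulate one trajectory of x_{k+1} = f_k(x_k, u_k, w_k) under a
    policy (a list of functions mu_k); w_sample(k, x, u) draws w_k, whose
    distribution may depend on k, x_k and u_k."""
    xs, x = [x0], x0
    for k in range(N):
        u = policy[k](x)                    # u_k = mu_k(x_k)
        x = f(k, x, u, w_sample(k, x, u))   # system equation
        xs.append(x)
    return xs
```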
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

2. Stochastic disturbance {w_k}: w_k has probability measure (distribution) P_k(· | x_k, u_k), which may depend explicitly on time, the current state and the action, but not on the previous disturbances w_{k-1}, ..., w_0.
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

3. Admissible control/decision laws (strategies, policies). Define information patterns!

- Feasible policies: π = (μ_0, ..., μ_{N-1}) with μ_k(x_k) ∈ U_k(x_k) for all x_k ∈ S_k  (*)
- Markov policies: each μ_k depends only on the current state x_k, and (*) holds
  - deterministic
  - randomized
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

4. Finite-horizon optimal control/decision problem: given an initial state x_0 and cost functions g_k, k = 0, ..., N-1, find a policy π that minimizes the cost functional

  J_π(x_0) = E[ Σ_{k=0}^{N-1} g_k(x_k, μ_k(x_k), w_k) ]

subject to the system equation constraint

  x_{k+1} = f_k(x_k, μ_k(x_k), w_k),  k = 0, ..., N-1.
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

We say that π* is optimal for the initial state x_0 if

  J_{π*}(x_0) = min_π J_π(x_0) =: J*(x_0),

the optimal N-stage cost (or value) function.

Likewise, for a given ε > 0, a policy π_ε is said to be ε-optimal if

  J_{π_ε}(x_0) ≤ J*(x_0) + ε.
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

This stochastic optimal control problem is difficult: we are optimizing over strategies.

The Dynamic Programming algorithm will give us necessary and sufficient conditions to decompose this problem into a sequence of coupled minimization problems over actions, from which we will obtain the optimal policy. DP is the only general approach for sequential decision-making under uncertainty.
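As a preview of that decomposition, the backward recursion J_k(x) = min_u E_w[ g(x,u,w) + J_{k+1}(f(x,u,w)) ] can be sketched for a finite toy model; the terminal cost g_N, the k-independent stage cost, and the enumeration over finite state/action/disturbance sets are all assumptions for illustration:

```python
def dp_backward(states, actions, f, g, g_N, w_dist, N):
    """Backward DP recursion for a finite-horizon, finite toy model:
    J_N(x) = g_N(x);  J_k(x) = min_u sum_w p(w) * [g(x,u,w) + J_{k+1}(f(x,u,w))].
    w_dist is a list of (w, prob) pairs; actions(x) enumerates U(x)."""
    J = {x: g_N(x) for x in states}          # terminal cost
    policy = []
    for k in reversed(range(N)):
        J_new, mu = {}, {}
        for x in states:
            best_u, best_q = None, float("inf")
            for u in actions(x):
                q = sum(p * (g(x, u, w) + J[f(x, u, w)]) for w, p in w_dist)
                if q < best_q:
                    best_u, best_q = u, q
            J_new[x], mu[x] = best_q, best_u
        J, policy = J_new, [mu] + policy     # prepend mu_k
    return J, policy
```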
Alternative System Description

Given a dynamic description of a system via a system equation

  x_{k+1} = f_k(x_k, u_k, w_k),

we can alternatively describe the system via a transition law.
Alternative System Description

Given x_k and u_k, the next state x_{k+1} has the distribution

  P(x_{k+1} ∈ B | x_k, u_k) = P_k({w : f_k(x_k, u_k, w) ∈ B} | x_k, u_k),

i.e. the system equation plus the disturbance distribution determine the system transition law P.
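When the disturbance takes finitely many values, the transition law can be computed from the system equation by lumping together disturbances that lead to the same next state; a sketch with illustrative names:

```python
from collections import defaultdict

def transition_law(f, w_dist, x, u):
    """Turn a system equation x' = f(x, u, w) plus a finite disturbance
    distribution (list of (w, prob) pairs) into the transition law
    P(x' | x, u), as a dict mapping next states to probabilities."""
    law = defaultdict(float)
    for w, p in w_dist:
        law[f(x, u, w)] += p    # lump together w's mapping to the same x'
    return dict(law)
```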