Published by Richard Goodman. Modified over 9 years ago.
Slide 1 (p. 49)
ECES 741: Stochastic Decision & Control Processes – Chapter 1: The DP Algorithm

DP can give a complete quantitative solution
Example 1: Discrete, finite-capacity inventory control problem
S_k = C_k = D_k = {0, 1, 2}
x_{k+1} = max(0, x_k + u_k − w_k)
x_k + u_k ≤ 2 (finite capacity), so u_k ≤ 2 − x_k
Prob{w_k = 0} = 0.1, Prob{w_k = 1} = 0.7, Prob{w_k = 2} = 0.2
No backlogging: U(x_k) = {0, …, 2 − x_k}
Slide 2 (p. 50)

DP can give a complete quantitative solution
Example 1 continued: Inventory control problem
N = 3
g_N(x_N) = 0
g_k(x_k, u_k, w_k) = u_k + 1·max(0, x_k + u_k − w_k) + 3·max(0, w_k − x_k − u_k)
                     (order)  (holding)                (lost demand)
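This small finite problem can be solved exactly by the backward DP recursion. A minimal sketch, assuming only the data on the two slides above (the variable names are mine, not the course's):

```python
# Backward DP for the 3-stage inventory problem above (a sketch; it
# minimizes the expected order + holding + lost-demand cost).
P = {0: 0.1, 1: 0.7, 2: 0.2}           # demand distribution for w_k
N, CAP = 3, 2                          # horizon, storage capacity

def stage_cost(x, u, w):
    # order cost (1/unit) + holding cost (1/unit) + lost-demand penalty (3/unit)
    return u + 1 * max(0, x + u - w) + 3 * max(0, w - x - u)

J = {x: 0.0 for x in range(CAP + 1)}   # terminal cost g_N = 0
policy = []                            # policy[k][x] = optimal order u at stage k
for k in reversed(range(N)):
    Jk, muk = {}, {}
    for x in range(CAP + 1):
        best_u, best_cost = 0, float("inf")
        for u in range(CAP - x + 1):   # U(x) = {0, ..., 2 - x}
            cost = sum(p * (stage_cost(x, u, w) + J[max(0, x + u - w)])
                       for w, p in P.items())
            if cost < best_cost:
                best_u, best_cost = u, cost
        Jk[x], muk[x] = best_cost, best_u
    J, policy = Jk, [muk] + policy

print(J)       # J_0: expected cost-to-go from each initial stock level
print(policy)  # optimal order quantities mu_k(x) for k = 0, 1, 2
```

The nested loops enumerate every state, control, and demand outcome, which is exactly why DP gives a complete quantitative solution for a problem this small.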
Slide 3 (p. 51)

DP can give a closed-form solution
Example 2: A gambling model
A gambler bets in N successive plays; at each play he can bet any nonnegative amount up to his present fortune.
P(win) = p, P(lose) = q = 1 − p (Bernoulli trials)
What betting strategy maximizes his final fortune?
Solution: For convenience, and with no loss of generality, we maximize the log of the final fortune; that is, the utility of fortune/wealth is U(x) = log(x). The model is as follows.
Slide 4 (p. 52)

DP can give a closed-form solution
Example 2 continued: Variable definitions
x_k = fortune at the beginning of the kth play (after the outcome of the (k − 1)th play, before the kth)
u_k = bet for the kth play, as a fraction of x_k
w_k = +1 (win) w.p. p; −1 (lose) w.p. q = 1 − p
x_{k+1} = x_k + w_k·u_k·x_k = (1 + w_k·u_k)·x_k
g_k(x_k, u_k, w_k) = 0 for 0 ≤ k ≤ N − 1
g_N(x_N) = −log(x_N), so minimizing the total cost maximizes E[log(x_N)]
Slide 5 (p. 53)

DP can give a closed-form solution
Example 2 continued: DP algorithm for the problem (written as maximization of expected log fortune)
J_N(x_N) = log(x_N)
J_k(x_k) = max_{0 ≤ u_k ≤ 1} E_{w_k}[ J_{k+1}((1 + w_k·u_k)·x_k) ]
Slide 6 (p. 54)

DP can give a closed-form solution
Example 2 continued: Solving the DP at k = N − 1
J_{N−1}(x_{N−1}) = max_{0 ≤ u ≤ 1} [ p·log((1 + u)·x_{N−1}) + q·log((1 − u)·x_{N−1}) ]
Thus:
if p = 1 (q = 0): u*_{N−1} = 1, bet it all!
if 1/2 < p < 1: u*_{N−1} = p − q
if 0 ≤ p < 1/2: u*_{N−1} = 0, since p < q means the q·log(1 − u_{N−1}) term dominates:
p·log(1 + u_{N−1}) + q·log(1 − u_{N−1}) < q·log(1 − u²_{N−1}) ≤ 0 for u_{N−1} > 0
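The claimed maximizer can be checked numerically by brute force over a grid of bet fractions. A sketch (the particular values p = 0.7 and p = 0.4 are arbitrary choices, not from the slides):

```python
# Brute-force check of the one-stage gambling objective
# f(u) = p*log(1+u) + q*log(1-u) over u in [0, 1).
import math

def f(u, p):
    # expected log growth of the fortune for bet fraction u
    q = 1 - p
    return p * math.log(1 + u) + q * math.log(1 - u)

grid = [i / 10000 for i in range(10000)]     # u = 0, 0.0001, ..., 0.9999
u_star = max(grid, key=lambda u: f(u, 0.7))  # p > 1/2: expect u* = p - q = 0.4
u_zero = max(grid, key=lambda u: f(u, 0.4))  # p < 1/2: expect u* = 0
print(u_star, u_zero)
```

The grid search lands on u* = p − q = 0.4 for the favorable game and on u* = 0 for the unfavorable one, matching the case analysis above.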
Slide 7 (p. 55)

DP can give a closed-form solution
Example 2 continued: Find critical points
d/du [ p·log(1 + u) + q·log(1 − u) ] = p/(1 + u) − q/(1 − u) = 0
⇒ p·(1 − u) = q·(1 + u) ⇒ u*_{N−1} = p − q
Slide 8 (p. 56)

DP can give a closed-form solution
Example 2 continued: Closed-form solution for k = N − 1
Hence, substituting u*_{N−1} = p − q (for p > 1/2):
J_{N−1}(x_{N−1}) = log(x_{N−1}) + C, where C = log 2 + p·log p + q·log q
Slide 9 (p. 57)

DP can give a closed-form solution
Example 2 continued: Closed-form solution for k = N − 1
Hence, the optimal controls can be viewed either as constant functions (controls = fractions of the current fortune) or as feedback policies (total bet = u*_{N−1}·x_{N−1}).
Slide 10 (p. 58)

DP can give a closed-form solution
Example 2 continued: Solving the DP at k = N − 2
Proceeding one stage (play) back: since J_{N−1}(x) = log(x) + C, the same maximization appears again, giving u*_{N−2} = p − q and J_{N−2}(x_{N−2}) = log(x_{N−2}) + 2C.
Slide 11 (p. 59)

DP can give a closed-form solution
Example 2 continued: General closed-form DP solution
By induction, for p > 1/2:
J_k(x_k) = log(x_k) + (N − k)·C and u*_k = p − q for all k.
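The closed form can be verified by a single numerical Bellman backup: if J_{k+1}(x) = log(x) + (N − k − 1)·C, one maximization step should return J_k(x) = log(x) + (N − k)·C. A sketch, with p = 0.7, N = 5, and the test point (k, x) = (2, 3.0) as arbitrary choices:

```python
# Verify J_k(x) = log(x) + (N - k)*C, with C = log 2 + p*log p + q*log q,
# by applying one step of the DP recursion to the claimed closed form.
import math

p, q, N = 0.7, 0.3, 5                      # example parameters (p > 1/2)
C = math.log(2) + p * math.log(p) + q * math.log(q)

def J(k, x):
    return math.log(x) + (N - k) * C       # claimed closed form

def backup(k, x):
    # one Bellman backup using the closed form at stage k+1
    return max(p * J(k + 1, (1 + u) * x) + q * J(k + 1, (1 - u) * x)
               for u in (i / 10000 for i in range(10000)))

err = abs(backup(2, 3.0) - J(2, 3.0))
print(err)   # essentially zero: the closed form is a fixed point
```

The maximizing u on the grid is again p − q = 0.4, and the backed-up value matches the closed form to within grid and floating-point error.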
Slide 12 (p. 60)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3: A stock option model
x_k: price of a given stock at the beginning of the kth day
x_{k+1} = x_k + w_k, with {w_k} i.i.d. and w_k ~ F(·): a random walk
Slide 13 (p. 61)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3 continued: A stock option model
Actions: you hold an option to buy one share of the stock at a fixed price c, with N days left to exercise it.
If you buy when the stock's price is s, the profit is s − c (which can be negative).
What strategy maximizes profit?
This is a terminating process (Bertsekas, Prob. 8, Ch. 1).
Slide 14 (p. 62)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3 continued: Solution
Slide 15 (p. 63)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3 continued: Solution
However, the process terminates (see Prob. 8, Ch. 1): when u_k = B (buy), the system moves to a fictitious termination state T. The state space thus mixes symbolic and numeric states, i.e., a discrete event system.
Slide 16 (p. 64)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3 continued: Solution
The cost structure is changed accordingly: the profit x_k − c is collected when the option is exercised (u_k = B), and the terminated state T earns nothing.
There is no simple analytical solution for J_k(x_k) or u*_k = μ*_k(x_k), but we can obtain some qualitative properties (structure) of the solutions.
Slide 17 (p. 65)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3 continued: DP algorithm for the problem
J_N(x_N) = max(x_N − c, 0)
J_k(x_k) = max{ x_k − c, E_{w_k}[ J_{k+1}(x_k + w_k) ] }: the expected "profit-to-go"
Slide 18 (p. 66)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3 continued: Lemma (Ross)
(i) J_k(x_k) − x_k + c is decreasing in x_k (adding the constant c does not affect the property); beyond a certain stock price, the extra profit-to-go from holding rather than exercising is exhausted.
(ii) J_k(x_k) is increasing and continuous in x_k (shown by backward induction).
Slide 19 (p. 67)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3 continued: Theorem (Ross)
There exist numbers s_1 ≤ s_2 ≤ … ≤ s_{N−k} ≤ … ≤ s_N (critical stock values) such that, at stage k (i.e., with N − k periods remaining), it is optimal to exercise the option if and only if x_k ≥ s_{N−k}.
These results can be used to solve the problem numerically, or to gain insight into the process.
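The threshold structure can be observed numerically on a discretized version of the model. This is only a sketch: the integer price grid, its cap, the symmetric three-point distribution for w_k, and the values c = 10, N = 5 are my assumptions, not from the slides.

```python
# Backward DP for the stock-option problem on a capped integer price grid.
P = {-1: 0.3, 0: 0.4, 1: 0.3}      # assumed daily price-change distribution
c, N, MAX = 10, 5, 30              # strike price, horizon, grid cap (assumed)

prices = range(MAX + 1)
J = {x: max(x - c, 0) for x in prices}   # last day: exercise iff profitable
thresholds = []                          # critical price with n days remaining
for k in reversed(range(N)):
    # expected profit-to-go if we hold one more day (prices clipped to the grid)
    hold = {x: sum(p * J[min(max(x + w, 0), MAX)] for w, p in P.items())
            for x in prices}
    Jk = {x: max(x - c, hold[x]) for x in prices}
    # smallest price at which immediate exercise is (weakly) optimal
    thresholds.append(min(x for x in prices if x - c >= hold[x]))
    J = Jk

print(thresholds)   # nondecreasing: more days remaining, higher critical price
```

The computed thresholds increase with the number of remaining days, exactly the s_1 ≤ s_2 ≤ … ≤ s_N ordering the theorem asserts.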
Slide 20 (p. 68)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3 continued: Remark
For a deterministic problem, optimizing over feedback policies gives no advantage over optimizing over actions (open-loop sequences of controls/decisions). Hence, the optimization problem can be solved using linear/nonlinear programming. Furthermore, for a deterministic problem with finite state and action spaces, we can equivalently formulate the problem as a shortest-path problem on an acyclic graph.
Slide 21 (p. 69)

DP can be used to obtain qualitative properties (structure) of optimal solutions
Example 3 continued: Forward search
There are efficient ways to find a shortest path, e.g., branch-and-bound algorithms. However, DP has some advantages:
- it always leads to a global optimum
- it can handle difficult constraint sets
[Figure: a staged acyclic graph with states 1, 2, 3 at stages k = 0, 1, 2, …, N − 1, N, arc costs c_01, c_02, c_03, …, c_ij, a start node, and an artificial end node.]
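The equivalence can be illustrated on a tiny staged graph: backward DP over (stage, state) nodes computes the shortest path to the artificial end node. The arc costs below are made up for illustration, not taken from the figure.

```python
# Deterministic finite-state DP as shortest path on a staged acyclic graph.
N, states = 3, [0, 1, 2]
cost = {  # cost[k][(i, j)] = arc cost from state i at stage k to state j
    k: {(i, j): abs(i - j) + 1 for i in states for j in states}
    for k in range(N)
}

J = {j: 0.0 for j in states}          # terminal (artificial end node) cost
for k in reversed(range(N)):
    # Bellman backup = relaxing all arcs of one stage of the acyclic graph
    J = {i: min(cost[k][(i, j)] + J[j] for j in states) for i in states}

print(J[0])   # shortest-path length from state 0 at stage 0
```

With these costs the cheapest arc out of any node has cost 1 (stay in place), so the shortest path from any start state has length 3 over the three stages; the backward sweep visits each arc exactly once, which is the usual acyclic shortest-path argument.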
Slide 22 (p. 70)

DP can handle difficult constraint sets
Example 4: Integer-valued variables
Remark: the reachable set from x_0 = 1 is contained in Z (the integers).
N = 2, with no cost at the final stage (g_2 = 0).
Slide 23 (p. 71)

DP can handle difficult constraint sets
Example 4 continued: Solution for k = 2 and k = 1
[Equations not shown: the terminal cost J_2, the one-stage cost, and the singleton constraint set at k = 1.]
Slide 24 (p. 72)

DP can handle difficult constraint sets
Example 4 continued: Solution for k = 0
Slide 25 (p. 73)

DP can handle difficult constraint sets
Example 4 continued: Optimal policy for k = 0