DP for Optimum Strategies in Games
J.-S. Roger Jang (張智星)
jang@mirlab.org
http://mirlab.org/jang
MIR Lab, CSIE Dept., National Taiwan University
Outline
- Game of dice sum
- Game of colored Jenga
Game of Dice Sum
Description
- Toss a die 8 times. Right after each toss, place the value into an empty digit of one of 4 two-digit numbers.
- Add up the 4 numbers. If the total is greater than 150, your score is 0; otherwise your score is the total.
Your goal
- Find the optimum strategy for playing the game, such that the expected score is maximized.
Credit: Peter Norvig at Google; CS283: AI Programming Techniques (1989 at UC Berkeley)
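The scoring rule can be sketched in a few lines of Python. This is only an illustration of the rules as stated above; the names `score` and `play` and the "T"/"O" labels (tens digit, ones digit) are assumptions made for this example, not part of the original game description.

```python
from typing import List

LIMIT = 150  # totals above this limit score 0, per the rules above


def score(total: int) -> int:
    """Final score for a finished game with the given total."""
    return total if total <= LIMIT else 0


def play(tosses: List[int], slots: List[str]) -> int:
    """Score one finished game.

    tosses: the 8 die values, in the order they were tossed.
    slots:  for each toss, "T" (a tens digit) or "O" (a ones digit);
            exactly 4 of each, since there are 4 two-digit numbers.
    """
    if len(tosses) != 8 or len(slots) != 8:
        raise ValueError("a game has exactly 8 tosses")
    if slots.count("T") != 4 or slots.count("O") != 4:
        raise ValueError("exactly 4 tens digits and 4 ones digits are needed")
    # A value in a tens digit contributes 10 times its face value.
    total = sum(10 * d if s == "T" else d for d, s in zip(tosses, slots))
    return score(total)
```

Note that the total depends only on whether each toss goes to a tens digit or a ones digit, not on which of the 4 numbers receives it. For instance, `play([6, 1, 5, 2, 3, 4, 1, 6], list("OTOTTOTO"))` puts 1, 2, 3, 1 in tens digits and 6, 5, 4, 6 in ones digits, for a total of 70 + 21 = 91.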
Three-step Formula of DP: Step 1
Optimum-value function
D(p, q, s) = the maximum expected score attainable from the current state, where
- p: number of tens positions left
- q: number of ones positions left
- s: current sum of the game
Example game state: (1, 2, 67)
Credit: 賀正翔 (Dept. of Electrical Engineering)
Three-step Formula of DP: Steps 2 and 3
Step 2: Recurrence formula for the optimum-value function
Step 3: Answer: D(4, 4, 0)
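One recurrence consistent with the definition of D(p, q, s) in Step 1 and with the scoring rule is sketched below. It is a reconstruction under stated assumptions, not necessarily the formula in its original form: d denotes the value of the next toss, and the die is assumed fair, so each face has probability 1/6.

```latex
\begin{aligned}
D(0,0,s) &= \begin{cases} s, & s \le 150 \\ 0, & s > 150 \end{cases} \\[4pt]
D(p,q,s) &= \frac{1}{6}\sum_{d=1}^{6}
            \max\bigl(D(p-1,\,q,\,s+10d),\; D(p,\,q-1,\,s+d)\bigr),
            && p \ge 1,\ q \ge 1 \\[4pt]
D(p,0,s) &= \frac{1}{6}\sum_{d=1}^{6} D(p-1,\,0,\,s+10d), && p \ge 1 \\[4pt]
D(0,q,s) &= \frac{1}{6}\sum_{d=1}^{6} D(0,\,q-1,\,s+d), && q \ge 1
\end{aligned}
```

The first line is the boundary condition (no positions left, so the score is fixed). The second line averages over the six faces and, for each face, takes the better of the two placements. The last two lines cover the cases where only one kind of position remains, so there is no choice to make.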
Strategy during the Game
Recurrence formula for the optimum-value function
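The strategy follows from the optimum-value function D(p, q, s): after each toss d, place the value wherever the resulting state has the larger D. The Python sketch below assumes a fair six-sided die and a recurrence that averages, over the six faces, the better of the two placements. The names `D` and `best_move` and the return labels "tens"/"ones" are chosen for this example, and memoization via `functools.lru_cache` is one convenient way to fill the table, not necessarily the one used in the original slides.

```python
from functools import lru_cache

LIMIT = 150          # totals above this limit score 0
FACES = range(1, 7)  # assumption: a fair six-sided die


@lru_cache(maxsize=None)
def D(p: int, q: int, s: int) -> float:
    """Maximum expected score with p tens positions and q ones positions
    left, when the current sum is s."""
    if p == 0 and q == 0:
        # Boundary condition: the game is over and the score is fixed.
        return float(s) if s <= LIMIT else 0.0
    total = 0.0
    for d in FACES:
        options = []
        if p > 0:
            options.append(D(p - 1, q, s + 10 * d))  # place d as a tens digit
        if q > 0:
            options.append(D(p, q - 1, s + d))       # place d as a ones digit
        total += max(options)
    return total / 6


def best_move(p: int, q: int, s: int, d: int) -> str:
    """Where to place toss d in state (p, q, s): "tens" or "ones".
    Ties are broken in favor of "tens" (an arbitrary choice)."""
    if p == 0 and q == 0:
        raise ValueError("no positions left")
    if p == 0:
        return "ones"
    if q == 0:
        return "tens"
    tens_value = D(p - 1, q, s + 10 * d)
    ones_value = D(p, q - 1, s + d)
    return "tens" if tens_value >= ones_value else "ones"
```

Calling `D(4, 4, 0)` gives the expected score of the whole game under optimal play, and `best_move(p, q, s, d)` can be consulted after every toss. For example, in state (1, 1, 100) a toss of 6 should go to a ones digit, because placing it as a tens digit would push the sum to 160 and force a score of 0.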
Game of Colored Jenga
Description: http://codeforces.com/problemset/problem/424/E
Techniques
- Dynamic programming
- Hash table