Discrete Optimization MA2827 Fondements de l’optimisation discrète https://project.inria.fr/2015ma2827/ Dynamic programming (Part 2) Material based on.

Discrete Optimization MA2827 Fondements de l’optimisation discrète Dynamic programming (Part 2) Material based on the lectures of Erik Demaine at MIT and Pascal Van Hentenryck at Coursera

Outline
- Dynamic programming: guitar fingering
- Quiz: bracket sequences
- More dynamic programming: Tetris, Blackjack

Dynamic programming
- DP ≈ “careful brute force”
- DP ≈ recursion + memoization + guessing
- Divide the problem into subproblems that are connected to the original problem
- The graph of subproblems has to be acyclic (a DAG)
- Time = #subproblems · time/subproblem

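5 easy steps of DP
1. Define subproblems (analysis: count #subproblems)
2. Guess part of the solution (analysis: count #choices)
3. Relate subproblems with a recurrence (analysis: time/subproblem)
4. Recurse + memoize OR build the DP table bottom-up; check that the subproblem graph is acyclic, i.e. has a topological order (analysis: total time = #subproblems · time/subproblem)
5. Solve the original problem (analysis: extra time)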

Guitar fingering Task: find the best way to play a melody

Guitar fingering
Task: find the best way to play a melody
Input: a sequence of notes to play with the right hand
- One note at a time!
- Which finger to use? Fingers 1, 2, …, F (F = 5 for humans)
- Measure d( p, f, q, g ) of the difficulty of going from note p with finger f to note q with finger g
Examples of rules:
- crossing fingers: 1 < f < g and p > q => uncomfortable
- stretching: p ≪ q => uncomfortable
- legato (smooth): ∞ if f = g

Guitar fingering
Task: find the best way to play a melody
Goal: minimize overall difficulty
Subproblems: min. difficulty for suffix note[ i : ]
    #subproblems = O( n ) where n = #notes
Guesses: finger f for the first note, note[ i ]
    #choices = F
Recurrence: DP[ i ] = min{ DP[ i + 1 ] + d( note[ i ], f, note[ i + 1 ], next finger ) }
Not enough information! DP[ i + 1 ] does not record which finger plays note[ i + 1 ], so the transition cost cannot be evaluated.

Guitar fingering
Task: find the best way to play a melody
Goal: minimize overall difficulty
Subproblems: min. difficulty for suffix note[ i : ] when finger f is on note[ i ]
    #subproblems = O( n F )
Guesses: finger g for the next note, note[ i + 1 ]
    #choices = F
Recurrence: DP[ i, f ] = min{ DP[ i + 1, g ] + d( note[ i ], f, note[ i + 1 ], g ) | all g }
Base case: DP[ n, f ] = 0 (empty suffix; the transition out of the last note, note[ n - 1 ], costs nothing)
    time/subproblem = O( F )


Guitar fingering
Task: find the best way to play a melody
Topological order:
    for i = n-1, n-2, …, 0:
        for f = 1, …, F:
            compute DP[ i, f ]
    total time = O( n F^2 )
Final problem: find the minimal DP[ 0, f ] over f = 1, …, F (guessing the first finger)
(Figure: DP table with notes along one axis and fingers along the other.)
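The bottom-up computation for guitar fingering can be sketched in Python as follows. This is a minimal illustration, not code from the lecture: the function names `best_fingering` and `toy_difficulty` are hypothetical, fingers are numbered 0, …, F-1 instead of 1, …, F, and the toy difficulty measure is invented purely to make the example runnable. The base case is placed on the last note (DP[ n-1, f ] = 0), which is equivalent to DP[ n, f ] = 0 when nothing is charged after the last note.

```python
import math

def best_fingering(notes, num_fingers, difficulty):
    """Return (minimal total difficulty, list of fingers) for a melody.

    difficulty(p, f, q, g) is the cost of moving from note p with finger f
    to note q with finger g. Fingers are 0-based here (an assumption).
    """
    n = len(notes)
    if n == 0:
        return 0, []
    F = num_fingers
    # dp[i][f]: min difficulty of playing notes[i:] with finger f on notes[i]
    dp = [[0.0] * F for _ in range(n)]
    choice = [[0] * F for _ in range(n)]  # best next finger, for reconstruction
    for i in range(n - 2, -1, -1):        # topological order: suffixes grow
        for f in range(F):
            best, arg = math.inf, 0
            for g in range(F):            # guess the finger for notes[i + 1]
                cost = dp[i + 1][g] + difficulty(notes[i], f, notes[i + 1], g)
                if cost < best:
                    best, arg = cost, g
            dp[i][f], choice[i][f] = best, arg
    # Final problem: guess the finger used on the first note.
    f = min(range(F), key=lambda k: dp[0][k])
    total, fingers = dp[0][f], [f]
    for i in range(n - 1):
        f = choice[i][f]
        fingers.append(f)
    return total, fingers

def toy_difficulty(p, f, q, g):
    """Invented measure: reusing a finger is forbidden (legato rule);
    otherwise penalize a mismatch between pitch step and finger step."""
    if f == g:
        return math.inf
    return abs((q - p) - (g - f))

total, fingers = best_fingering([60, 62, 64], 5, toy_difficulty)
```

The three nested loops over i, f and g give the O( n F^2 ) running time stated on the slide.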

Quiz: bracket sequences
Consider sequences of brackets: ( ) [ ] { }
A sequence of brackets is correct when
1. each opening bracket is matched by a closing bracket of the same type
2. the substring inside a matching pair is correct
Examples:
    [ () () { [ ] } ]    correct
    ) ( ) ( ) (          incorrect
    [ ] [ ( ) }          incorrect
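The definition of a correct sequence can be checked with a stack. The sketch below is only an illustration of the definition; the name `is_correct` is hypothetical, and ignoring non-bracket characters such as spaces is an assumption made so that the slide's examples can be passed in as written.

```python
CLOSERS = {")": "(", "]": "[", "}": "{"}  # closing bracket -> its opener
OPENERS = set(CLOSERS.values())

def is_correct(seq):
    """Return True if seq is a correct bracket sequence."""
    stack = []
    for ch in seq:
        if ch in OPENERS:
            stack.append(ch)
        elif ch in CLOSERS:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != CLOSERS[ch]:
                return False
        # Any other character (e.g. a space) is ignored: an assumption.
    return not stack  # every opener must have been closed
```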

Quiz: bracket sequences
Consider sequences of brackets: ( ) [ ] { }
A sequence of brackets is correct when
1. each opening bracket is matched by a closing bracket of the same type
2. the substring inside a matching pair is correct
Task 1: How many correct sequences of length 2n exist?
Task 2: Given a (possibly incorrect) sequence of length n, what is the minimum number of symbols you need to add to make the sequence correct?
Example: ( { ] ) => ( { } [ ] )
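One possible way to approach both quiz tasks with dynamic programming is sketched below; it is not claimed to be the solution intended in the lecture. The names `count_correct` and `min_insertions` are hypothetical. Task 1 splits a correct sequence at the partner of its first bracket; Task 2 uses subproblems over substrings s[ i : j ] and guesses whether s[ i ] is matched by an existing symbol or by a newly added one.

```python
from functools import lru_cache

PAIRS = {"(": ")", "[": "]", "{": "}"}

def count_correct(n, kinds=3):
    """Number of correct sequences with n bracket pairs (length 2n)."""
    cnt = [1] + [0] * n          # cnt[m]: sequences with m pairs
    for m in range(1, n + 1):
        # First bracket: `kinds` types; k pairs inside it, m-1-k after it.
        cnt[m] = kinds * sum(cnt[k] * cnt[m - 1 - k] for k in range(m))
    return cnt[n]

def min_insertions(s):
    """Minimum number of brackets to add so that s becomes correct."""
    @lru_cache(maxsize=None)
    def dp(i, j):
        # Cost of repairing the substring s[i:j].
        if i >= j:
            return 0
        # Guess 1: s[i] is paired with a newly inserted bracket.
        best = 1 + dp(i + 1, j)
        # Guess 2: s[i] is an opener paired with an existing closer s[k].
        if s[i] in PAIRS:
            for k in range(i + 1, j):
                if s[k] == PAIRS[s[i]]:
                    best = min(best, dp(i + 1, k) + dp(k + 1, j))
        return best
    return dp(0, len(s))
```

For Task 2 there are O( n^2 ) subproblems with O( n ) choices each, so this sketch runs in O( n^3 ) time.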

Tetris Task: win in the game of Tetris!

Tetris
Task: win in the game of Tetris!
Input: a sequence of n Tetris pieces and an empty board of small width w
- Choose an orientation and a position for each piece
- Each piece must be dropped until it hits something
- Full rows do not clear
Goal: survive, i.e., stay within height h

Tetris
Task: stay within height h
Subproblem: can we survive the suffix of pieces [ i : ] given a particular column profile (the current height of each of the w columns)?
    #subproblems = O( n h^w )
Guesses: where to drop piece i?
    #choices = O( w )
Recurrence: DP[ i, p ] = max { DP[ i + 1, q ] | q is a valid move from p } (max over booleans, i.e. logical OR)
Base case: DP[ n, p ] = true for all profiles p
    time/subproblem = O( w )

Tetris
Task: stay within height h
Topological order:
    for i = n - 1, n - 2, …, 0:
        for p = 0, …, h^w - 1:
            compute DP[ i, p ]
    total time O( n w h^w )
Final problem: DP[ 0, empty ]
(Figure: DP table with pieces along one axis and profiles along the other.)
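A memoized version of the Tetris recurrence can be sketched as follows. The piece encoding is an assumption introduced for this example only: each piece is a list of orientations, and each orientation lists, for every column the piece covers, a pair (bottom, top) giving the piece's lowest and highest extent in that column relative to its lowest cell. The names `can_survive`, `drop`, `O_PIECE` and `I_PIECE` are hypothetical. Memoized recursion replaces the explicit loop over all profiles, so only reachable profiles are evaluated.

```python
from functools import lru_cache

def can_survive(pieces, w, h):
    """True if all pieces can be dropped on a width-w board within height h."""
    n = len(pieces)

    def drop(profile, shape, x):
        """Drop `shape` with its left edge at column x; return the new
        column profile, or None if the stack would exceed height h."""
        # The piece falls until some column of it touches the stack.
        base = max(profile[x + c] - b for c, (b, t) in enumerate(shape))
        new = list(profile)
        for c, (b, t) in enumerate(shape):
            new[x + c] = base + t
        return tuple(new) if max(new) <= h else None

    @lru_cache(maxsize=None)
    def dp(i, profile):
        if i == n:                      # base case: no pieces left
            return True
        for shape in pieces[i]:         # guess the orientation
            for x in range(w - len(shape) + 1):   # guess the position
                q = drop(profile, shape, x)
                if q is not None and dp(i + 1, q):
                    return True
        return False

    return dp(0, (0,) * w)              # final problem: empty board

# Two example pieces in the assumed encoding.
O_PIECE = [[(0, 2), (0, 2)]]                        # 2x2 square
I_PIECE = [[(0, 4)], [(0, 1), (0, 1), (0, 1), (0, 1)]]  # vertical, horizontal
```

For instance, two squares fit side by side on a board of width 4 and height 2, while a third square does not.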

Blackjack Task: beat the blackjack (twenty-one)!

Blackjack
Task: beat the blackjack!
Rules of Blackjack (simplified):
- The player and the dealer are initially given 2 cards each
- Each card gives points:
    - Cards 2-10 are valued at the face value of the card
    - Face cards (King, Queen, Jack) are valued at 10
    - The Ace can be valued at either 11 or 1
- The player's goal is to score more points than the dealer without exceeding 21; a player who goes over 21 loses (busts)
- The player can take any number of cards (hits)
- After that the dealer hits deterministically: until ≥ 17 points

Perfect-information Blackjack
Task: beat the blackjack with a marked deck!
Input: a deck of cards c_0, …, c_{n-1}
- Player vs. dealer one-on-one
Goal: maximize winnings for a fixed bet of $1
- Might benefit from losing to get a better deck

Perfect-information Blackjack
Task: beat the blackjack with a marked deck!
Subproblem: BJ[ i ] = best play of c_i, …, c_{n-1}
    #subproblems = O( n )
Guesses: how many times does the player hit?
    #choices ≤ n
Recurrence: BJ[ i ] = max{ outcome ∈ {-1, 0, 1} + BJ[ i + 4 + #hits + #dealer hits ] | for #hits = 0, …, n if valid play }

Perfect-information Blackjack
Detailed recursion:
    def BJ(i):
        if n - i < 4: return 0                    # not enough cards
        outcome = []
        for p in range(2, n - i - 1):             # number of cards the player takes
            player = c[i] + c[i+2] + sum(c[i+4 : i+p+2])
            if player > 21:                       # bust
                outcome.append(-1 + BJ(i + p + 2))
                break
            for d in range(2, n - i - p):         # number of cards the dealer takes
                dealer = c[i+1] + c[i+3] + sum(c[i+p+2 : i+p+d])
                if dealer >= 17: break
            if dealer > 21: dealer = 0            # bust
            outcome.append(cmp(player, dealer) + BJ(i + p + d))
        return max(outcome)
Here cmp(a, b) is the sign of a - b: 1 if the player wins, 0 for a tie, -1 if the player loses.

Perfect-information Blackjack
Task: beat the blackjack with a marked deck!
Subproblem: BJ[ i ] = best play of c_i, …, c_{n-1}
    #subproblems = O( n )
Guesses: how many times does the player hit?
    #choices ≤ n
Recurrence: BJ[ i ] = max{ outcome ∈ {-1, 0, 1} + BJ[ i + 4 + #hits + #dealer hits ] | for #hits = 0, …, n if valid play }
    time/subproblem = O( n^2 )
Topological order: for i = n-1, …, 0:
    total time O( n^3 )
Final problem: BJ[ 0 ]
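A runnable version of the Blackjack recurrence might look like the sketch below. It rests on several assumptions that go beyond the slides: the function name `best_winnings` is hypothetical, each card is given directly as a point value (so an Ace has whatever fixed value the caller supplies), and if the deck runs out while the dealer is still below 17, the dealer simply stands with the current total. Cards are dealt alternately (player, dealer, player, dealer), then the player's hits, then the dealer's hits, as in the detailed recursion.

```python
from functools import lru_cache

def best_winnings(cards):
    """Maximum total winnings ($1 bet per round) on a fully known deck."""
    c = tuple(cards)
    n = len(c)

    @lru_cache(maxsize=None)
    def bj(i):
        if n - i < 4:
            return 0                     # not enough cards for another round
        outcomes = []
        # p = number of cards the player ends up with (2 initial + hits).
        for p in range(2, n - i - 1):
            player = c[i] + c[i + 2] + sum(c[i + 4:i + p + 2])
            if player > 21:              # player busts and loses $1
                outcomes.append(-1 + bj(i + p + 2))
                break
            # Dealer starts with 2 cards and hits until reaching 17
            # (or until the deck is exhausted: an assumption).
            d = 2
            dealer = c[i + 1] + c[i + 3]
            while dealer < 17 and i + p + d < n:
                dealer += c[i + p + d]
                d += 1
            if dealer > 21:
                dealer = 0               # dealer busts
            result = (player > dealer) - (player < dealer)  # 1, 0 or -1
            outcomes.append(result + bj(i + p + d))
        return max(outcomes)

    return bj(0)                         # final problem: start of the deck
```

Each call tries O( n ) values of p and does O( n ) work per value, which matches the O( n^2 ) time per subproblem and O( n^3 ) total on the slide.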