Slide 1: Reinforcement Learning and Tetris
Jared Christen
Slide 2: Tetris
- Tetris can be framed as a Markov decision process
- Large state space: a standard 10 x 20 board alone admits up to 2^200 filled/empty configurations
- Requires long-term strategy without long-term knowledge of the piece sequence
Slide 3: Background
- Hand-coded algorithms can clear more than 1,000,000 lines
- A genetic algorithm by Roger Llima averages 42,000 lines
- A reinforcement learning algorithm by Kurt Driessens averages 30-40 lines
Slide 4: Goals
- Develop a Tetris agent that improves on previous reinforcement learning implementations
- Secondary goals:
  - Use as few handpicked features as possible
  - Encourage risk-taking
  - Include rarely-studied features of Tetris
Slide 5: Approach
Slide 6: Neural Net Control
- Inputs:
  - Raw state (filled and empty blocks)
  - Handpicked features
- Outputs:
  - Movements
  - Placements
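The deck only names the network's inputs and outputs, so the following is a minimal sketch of what such a control net could look like, scoring candidate placements (one of the two output types above). The layer sizes, the 10 x 20 board, the count of handpicked features, and the placement count are all assumptions, not details from the slides:

    import numpy as np

    BOARD_W, BOARD_H = 10, 20     # standard Tetris board (assumed)
    N_RAW = BOARD_W * BOARD_H     # raw state: one input per filled/empty cell
    N_HAND = 4                    # handpicked features, e.g. max height, holes (assumed)
    N_PLACEMENTS = 34             # candidate placements per piece (assumed upper bound)
    N_HIDDEN = 64                 # hidden layer width (assumed)

    rng = np.random.default_rng(0)
    W1 = rng.normal(0.0, 0.1, (N_HIDDEN, N_RAW + N_HAND))
    W2 = rng.normal(0.0, 0.1, (N_PLACEMENTS, N_HIDDEN))

    def evaluate_placements(board, handpicked):
        """Map the current state to one value per candidate placement."""
        x = np.concatenate([board.ravel().astype(float), handpicked])
        hidden = np.tanh(W1 @ x)
        return W2 @ hidden        # the agent plays argmax over these values

    # Example: empty board, zeroed handpicked features
    values = evaluate_placements(np.zeros((BOARD_H, BOARD_W)), np.zeros(N_HAND))
    best = int(np.argmax(values))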
Slide 7: Contour Matching
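The slide does not spell the heuristic out, but the usual idea behind contour matching is to describe the board surface by the height differences between adjacent columns and count how many of those steps a tetromino's underside profile reproduces. A rough sketch under that assumption:

    import numpy as np

    def column_heights(board):
        """Height of the topmost filled cell per column (board is a
        boolean array with row 0 at the top)."""
        heights = board.shape[0] - np.argmax(board, axis=0)
        heights[~board.any(axis=0)] = 0   # empty columns have height 0
        return heights

    def contour(heights):
        """Surface contour: height change between adjacent columns."""
        return np.diff(heights)

    def match_length(piece_profile, board_contour, col):
        """Consecutive contour steps, starting at `col`, that the
        piece's underside profile lines up with."""
        n = 0
        for p, b in zip(piece_profile, board_contour[col:]):
            if p != b:
                break
            n += 1
        return n

    # Example: a one-high ledge in columns 0-2, then a drop
    board = np.zeros((20, 10), dtype=bool)
    board[19, :3] = True
    steps = contour(column_heights(board))   # [0, 0, -1, 0, ...]
    flat = [0, 0, 0]                          # underside of a horizontal I-piece
    print(match_length(flat, steps, 0))       # -> 2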
Slide 8: Structure
- Active tetromino
- Next tetromino
- Held tetromino
- Placement 1 score
- Placement 1 match length
- Placement 1 value
- ...
- Placement n score
- Placement n match length
- Placement n value
- Hold value
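Reading this slide as a network diagram, the three tetromino identities plus the per-placement (score, match length) pairs look like inputs, with one value per placement and a hold value as outputs. A sketch of assembling that input vector; the one-hot piece encoding and the input/output split are assumptions:

    N_PIECES = 7   # I, O, T, S, Z, J, L

    def build_input(active, nxt, held, placements):
        """Input vector: one-hot encodings of the active, next, and held
        tetrominoes, followed by (score, match length) per candidate
        placement. The network would then emit one value per placement
        plus a hold value."""
        x = []
        for piece in (active, nxt, held):
            onehot = [0.0] * N_PIECES
            if piece is not None:            # hold slot may be empty
                onehot[piece] = 1.0
            x.extend(onehot)
        for score, mlen in placements:
            x.extend([float(score), float(mlen)])
        return x

    # Example: active I-piece (0), next T-piece (2), nothing held,
    # two candidate placements
    x = build_input(0, 2, None, [(120, 3), (80, 1)])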
Slide 9: Experiments
- 200 learning games
- Averaged over 30 runs
- Two-piece and six-piece configurations
- Compared to a benchmark contour matching agent
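A sketch of the evaluation protocol on this slide; make_agent and play_game are hypothetical hooks standing in for the learner and a Tetris simulator, neither of which the deck specifies:

    import numpy as np

    def run_experiment(make_agent, play_game, n_runs=30, n_games=200):
        """200 learning games per run, averaged over 30 independent runs."""
        lines = np.zeros((n_runs, n_games))
        for r in range(n_runs):
            agent = make_agent()                # fresh learner each run
            for g in range(n_games):
                lines[r, g] = play_game(agent)  # agent updates while playing
        return lines.mean(axis=0)               # averaged learning curve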
Slide 10: Results
[Learning-curve plots for the two-piece and six-piece configurations]
Slide 11: Results

                                         Score   Lines cleared
    Best match                            7242        55
    Two-piece                             5996        45
    Six-piece                             7085        53
    Six-piece with height differences     7144        53
    Six-piece with placement heights      6981        52
Slide 12: Conclusions
- Accidentally developed a heuristic that beats previous reinforcement learning techniques
- The six-piece configuration's outperformance of the two-piece one suggests some pseudo-planning is going on
- A better way to generalize the board state may be necessary