1
Training a Neural Network
Tic-Tac-Toe
Outline:
- Learning method: reinforcement learning
- Generating training data
- Training a multilayer perceptron
2
Reinforcement Learning
Goal: Find the probability P(si) of winning from any state si
Idea: After each game, update P(si) for all states si encountered
Algorithm:
1. Initialize all states with P(si) = 0.5
2. Play a game (choose each move by comparing P(next state) over all possible next states, or make a random decision)
3. Set P(final state) according to the outcome of the game
4. Update intermediate states: P(si) := P(si) + a[P(sk) - P(si)], where si is the state before sk and a is the learning rate
5. Go back to (2)
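The update rule can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function name `td_update`, the dictionary `P`, and the choice to walk backwards from the final state are assumptions. The value of the final state is passed in as `final_value` because the slide text does not preserve it; the 1.0 used in the example is only an illustrative stand-in for a win.

```python
def td_update(P, states, final_value, a=0.1):
    """Apply P(si) := P(si) + a * (P(sk) - P(si)) along one finished game.

    P           -- dict mapping state -> estimated win probability
    states      -- states visited in the game, in order of play
    final_value -- value assigned to the last state (supplied by the caller)
    a           -- learning rate
    """
    P[states[-1]] = final_value
    # Assumption: update from the end of the game towards the start,
    # so each state sees the freshly updated value of its successor.
    for si, sk in zip(reversed(states[:-1]), reversed(states[1:])):
        p_si = P.get(si, 0.5)          # unseen states start at 0.5
        P[si] = p_si + a * (P.get(sk, 0.5) - p_si)
    return P


# Example: a three-state game with a deliberately large learning rate.
P = {}
td_update(P, ["s0", "s1", "s2"], final_value=1.0, a=0.5)
print(P)  # {'s2': 1.0, 's1': 0.75, 's0': 0.625}
```

States are used as plain dictionary keys here; for a real board they would need to be hashable, for example a tuple of nine cell values.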
3
Reinforcement Learning
Convergence: If a decreases with time, all P(si) converge.
[Figure: P(si) for sample states 1, 2 and 3]
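The slide only states that the learning rate a should decrease with time. One common way to do that is a hyperbolic decay over the number of games played; the sketch below assumes that form, and the names `learning_rate`, `a0` and `tau` as well as their default values are illustrative, not taken from the presentation.

```python
def learning_rate(game, a0=0.5, tau=1000.0):
    """Learning rate for the given game index (0-based).

    Assumed schedule: a = a0 / (1 + game / tau), which starts at a0
    and halves after tau games.
    """
    return a0 / (1.0 + game / tau)


for game in (0, 1000, 9000):
    print(game, learning_rate(game))
# 0 0.5
# 1000 0.25
# 9000 0.05
```

A schedule of this 1/t type keeps shrinking the step size without cutting learning off abruptly, which is why it is a frequent default for this kind of iterative update.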
4
Optimizing lookup table
Prune lookup table by exploiting symmetry: 5890 entries => 825 entries
Training data
Generate training data: current state and best move (target), pruned to 537 entries
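Pruning by symmetry can be illustrated by mapping every board to one canonical representative among its rotations and reflections. The sketch below assumes a board stored as a tuple of 9 values in row-major order, and picks the lexicographically smallest variant as the representative; the helper names and that choice of representative are assumptions, and the code does not attempt to reproduce the entry counts given on the slide.

```python
def rotate(b):
    """Rotate the 3x3 board 90 degrees clockwise."""
    return tuple(b[(2 - c) * 3 + r] for r in range(3) for c in range(3))


def mirror(b):
    """Flip the 3x3 board left to right."""
    return tuple(b[r * 3 + (2 - c)] for r in range(3) for c in range(3))


def symmetries(b):
    """All 8 variants of a board: 4 rotations, each with its mirror image."""
    variants = []
    for _ in range(4):
        variants.append(b)
        variants.append(mirror(b))
        b = rotate(b)
    return variants


def canonical(b):
    """Representative used as the lookup-table key for all symmetric boards."""
    return min(symmetries(b))


corner = (1, 0, 0,
          0, 0, 0,
          0, 0, 0)
print(len(set(symmetries(corner))))              # 4 distinct corner boards
print(canonical(corner) == canonical(rotate(corner)))  # True
```

Storing only `canonical(board)` as the table key means symmetric positions share a single entry. A stored move would have to be transformed back with the inverse symmetry before it is played, which this sketch leaves out.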
5
Training an MLP
Input layer: current state (9 values: X=+1, O=-1)
Output layer: 9 neurons, target = max
Classification rate by architecture:
- 1 hidden layer (9 neurons): 80%
- 1 hidden layer (27 neurons): 93%
- 3 hidden layers (27 neurons each): 98.5%
Result:
Combine MLP and lookup table:
- Less memory (only weights and a strongly reduced lookup table)
- Faster
- Can achieve perfect playing
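The combination of network and reduced lookup table might look like the following pure-Python sketch. Several things here are assumptions rather than details from the slides: empty squares are encoded as 0, the hidden and output units use tanh, the move is the empty square with the highest output, and the reduced table (called `exceptions`) holds the moves for states the network would otherwise get wrong. The weights are random, so the network is untrained and only the data flow is shown.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Create random weights for layer sizes such as [9, 27, 9]."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(layers, x):
    """Forward pass; every layer uses tanh (an assumed activation)."""
    for weights, biases in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


def choose_move(layers, state, exceptions=None):
    """Use the reduced lookup table if it has the state, else the MLP."""
    if exceptions and state in exceptions:
        return exceptions[state]
    scores = forward(layers, state)
    empty = [i for i, v in enumerate(state) if v == 0]  # assumed: 0 = empty
    return max(empty, key=lambda i: scores[i])


net = init_mlp([9, 27, 9])                 # 1 hidden layer with 27 neurons
state = (1, -1, 0, 0, 1, 0, 0, 0, -1)      # X=+1, O=-1
print(choose_move(net, state))             # index of an empty square
print(choose_move(net, state, {state: 3})) # 3, taken from the table
```

With this split, the table only needs entries for the small fraction of states the trained network misclassifies, which is one way to read the memory saving claimed on the slide.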