1
Rule Learning for Go
Introduction
Data Extraction
Bad Move Problems
Ripper Explanation
Example
Experimental Results
2
Objective
Increase the playing strength of Explorer through the use of machine-learned rules for local play
Why Learn Rules?
How We Learn Rules
Cohen’s RIPPER Algorithm (Repeated Incremental Pruning to Produce Error Reduction)
Local Move Generation Speed
3
Overview
Professional Games → Data Extraction → Ripper → Explorer → Tests vs. GNUGo
4
Data Extraction
Ripper Format:
0 – Off-Board
1 – Own Stone
2 – Own Territory
3 – Possible Own Territory
4 – Near Own Territory
5 – Neutral
… up to 11
Comma-Separated Lists:
5,5,1,9,5, 1,5,1,10,5, 5,5,5,5,5, 5,5,5,9,5, goodMove.
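To make the encoding concrete, here is a minimal Python sketch (not the presenter's actual extraction code) of how a local window might be flattened into a comma-separated Ripper instance. The encode_window helper and the sample window are illustrative, and the slide does not say what codes 6–11 mean, so the comment about them is an assumption.

```python
# Hypothetical sketch of turning a local board window into a Ripper instance.
# Codes 0-5 follow the slide; the remark about 6-11 is an assumption.

CODES = {
    "off_board": 0,
    "own_stone": 1,
    "own_territory": 2,
    "possible_own_territory": 3,
    "near_own_territory": 4,
    "neutral": 5,
    # codes 6-11: presumably the opponent-side categories (assumption)
}

def encode_window(window, label):
    """Flatten a 2-D window of integer codes into one comma-separated line.

    window: list of rows, each a list of codes (0-11).
    label:  'goodMove' or 'badMove'.
    """
    values = [str(v) for row in window for v in row]
    return ",".join(values) + "," + label + "."

# Example: the window shown on the slide, labelled as a good move.
window = [
    [5, 5, 1, 9, 5],
    [1, 5, 1, 10, 5],
    [5, 5, 5, 5, 5],
    [5, 5, 5, 9, 5],
]
print(encode_window(window, "goodMove"))
# -> 5,5,1,9,5,1,5,1,10,5,5,5,5,5,5,5,5,5,9,5,goodMove.
```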
5
Processing Options
Rotation
Mirroring
Nearby Bad Moves
Farther Bad Moves
Random Bad Moves
Stone Data Only
Limited Territory Data
All Territory Data
Extra Weighting

Original window:
5,5,1,9,5, 1,5,1,10,5, 5,5,5,5,5, 5,5,5,9,5, goodMove.
Mirrored about the Y axis:
5,9,1,5,5, 5,10,1,5,1, 5,5,5,5,5, 5,9,5,5,5, goodMove.
Mirrored about the X axis:
5,5,5,5,5, 5,5,5,9,5, 1,5,1,10,5, 5,5,1,9,5, goodMove.
Mirrored about X & Y:
5,5,5,5,5, 5,9,5,5,5, 5,10,1,5,1, 5,9,1,5,5, goodMove.
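As a rough illustration of the mirroring option (a sketch, not the actual preprocessing code), the snippet below generates the Y-axis, X-axis, and combined reflections of a window and emits each as an additional training line. Whether the slide's X/Y labels correspond to exactly these reflections is an assumption.

```python
# Illustrative data-augmentation sketch: each training window is reflected
# about the X axis, the Y axis, or both, and every variant is written out
# as an extra example with the same label.

def mirror_y(window):          # reflect left-right (about the Y axis)
    return [list(reversed(row)) for row in window]

def mirror_x(window):          # reflect top-bottom (about the X axis)
    return list(reversed(window))

def mirrored_variants(window):
    """Return the original window plus its three mirror images."""
    return [
        window,
        mirror_y(window),
        mirror_x(window),
        mirror_x(mirror_y(window)),   # X & Y combined
    ]

window = [
    [5, 5, 1, 9, 5],
    [1, 5, 1, 10, 5],
    [5, 5, 5, 5, 5],
    [5, 5, 5, 9, 5],
]
for variant in mirrored_variants(window):
    print(",".join(str(v) for row in variant for v in row) + ",goodMove.")
```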
6
How Ripper Works
Make GrowSet, PruneSet with 2/3 & 1/3 of each class respectively
Do {  // Generate Ruleset
    While NewRule covers negative examples in GrowSet
        For All Attributes of the data
            For All Possible Values of the Attribute
                Evaluate GainOf( Attribute == Value )
                If Negation is Allowed
                    Evaluate GainOf( Attribute != Value )
        Add Condition for which Max. Gain occurred to NewRule
    For All possible deletions of conditions from NewRule
        Evaluate ((correctPositives - incorrectNegatives) / totalCovered) on the PruneSet
    Implement deletion with maximum value
    Add NewRule to Ruleset
} While #GoodMoves in GrowSet != 0 && Ruleset desc. length < min desc. length + D bits
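The grow-then-prune step above can be sketched in Python roughly as follows. This is a simplified reading of the pseudocode, not Explorer's or Cohen's implementation: examples are (attributes, label) pairs, a rule is a list of equality conditions, negated conditions are omitted, and GainOf follows the Gain definition given on the next slide.

```python
# Minimal sketch of growing one rule on the GrowSet and pruning it on the PruneSet.
import math

def covers(rule, example):
    attrs, _ = example
    return all(attrs[i] == v for i, v in rule)

def info(rule, data):
    """Information of a rule on data: -log2(goodMoves covered / total covered)."""
    covered = [ex for ex in data if covers(rule, ex)]
    pos = sum(1 for _, label in covered if label == "goodMove")
    if pos == 0:
        return float("inf")
    return -math.log2(pos / len(covered))

def grow_rule(grow_set, n_attrs, values):
    """Add the single best condition until no negatives in GrowSet are covered."""
    rule = []
    while any(covers(rule, ex) and ex[1] == "badMove" for ex in grow_set):
        best, best_gain = None, 0.0
        for i in range(n_attrs):                 # For All Attributes
            for v in values:                     # For All Possible Values
                cand = rule + [(i, v)]
                pos = sum(1 for ex in grow_set
                          if covers(cand, ex) and ex[1] == "goodMove")
                gain = pos * (info(rule, grow_set) - info(cand, grow_set))
                if gain > best_gain:
                    best, best_gain = (i, v), gain
        if best is None:                         # no condition improves the rule
            break
        rule.append(best)
    return rule

def prune_rule(rule, prune_set):
    """Drop trailing conditions if that improves (pos - neg) / total on the PruneSet."""
    def score(r):
        covered = [ex for ex in prune_set if covers(r, ex)]
        if not covered:
            return -1.0
        pos = sum(1 for _, label in covered if label == "goodMove")
        return (pos - (len(covered) - pos)) / len(covered)
    best = rule
    for k in range(len(rule) - 1, 0, -1):
        if score(rule[:k]) >= score(best):
            best = rule[:k]
    return best
```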
7
How Ripper Works (Continued)
Repeat #Optimizations Times {  // Optimize Ruleset
    For Each Rule in Ruleset
        Create a Replacement for the Rule by removing it from the Ruleset, growing a new rule, and pruning it so as to minimize the error of the entire ruleset (with the replacement in place of the original) on the PruneSet
        Create a Revision by greedily adding conditions to the original rule
        Evaluate the Revision and the Replacement by the description length of the ruleset with each one switched in for the original; keep the choice with minimum description length
    Add new rules again  // As in previous slide
}

Description Length: the number of bits required to encode the information in a binary string.
Gain: (# moves covered by old rule still covered by new rule) * (Information of Old Rule – Information of New Rule)
Information: -log2(goodMoves covered / total # moves covered by rule)
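As a concrete illustration of the Information and Gain definitions above, the short snippet below works through the arithmetic. The coverage counts are made up purely to show the calculation; nothing here comes from the presenter's data.

```python
# Worked illustration of the Information and Gain definitions on this slide.
import math

def information(pos_covered, total_covered):
    """Information of a rule: -log2(goodMoves covered / total moves covered)."""
    return -math.log2(pos_covered / total_covered)

def gain(still_covered, info_old, info_new):
    """Gain: examples kept by the new rule times the drop in information."""
    return still_covered * (info_old - info_new)

# Suppose the old rule covers 12 good moves out of 40 covered moves, and adding
# one condition leaves 10 good moves out of 14 covered (hypothetical counts).
info_old = information(12, 40)        # about 1.737 bits
info_new = information(10, 14)        # about 0.485 bits
print(gain(10, info_old, info_new))   # about 12.5: the condition is worth adding
```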
8
Ripper Example
[Diagram: Input → Filter → Positive Input Data / Negative Input Data]
9
Example Data
[Tables of candidate gain values per Position (Value/Position columns) for each step of growing the rule; the highest-gain condition is added at each step.]
Conditions added in order: Position1 = ‘1’, then Position4 = ‘1’, then Position2 = ‘1’.
10
Example Data (Continued)
[Table of candidate gain values for the final growing step.]
Conditions now in the rule: Position1 = ‘1’, Position4 = ‘1’, Position2 = ‘1’, Position3 = ‘2’.
No negatives are covered and all positives have been removed, so the attempt at making a second rule will fail.
Revision: nothing occurs, as all good moves are covered.
Replacement: the same rule is learned.
Final hypothesis:
goodMove :- pos1='1', pos4='1', pos2='1', pos3='2' (12/0).
default badMove (48/0).
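Applying the final hypothesis to a new instance is then just a conjunction test. The sketch below is one illustrative reading of how such a rule could be used as a move filter; the dict-based representation and the classify helper are assumptions, not Explorer's actual interface.

```python
# Sketch of using the learned hypothesis as a move classifier: an instance is
# goodMove only if it satisfies every condition of the learned rule; otherwise
# it falls through to the default badMove rule.
# Keys are 1-based position indices to match the slide's pos1..pos4 naming.

LEARNED_RULE = {1: '1', 4: '1', 2: '1', 3: '2'}
# i.e. goodMove :- pos1='1', pos4='1', pos2='1', pos3='2'

def classify(attributes):
    """attributes: dict mapping position index -> code string for one instance."""
    if all(attributes.get(pos) == code for pos, code in LEARNED_RULE.items()):
        return "goodMove"
    return "badMove"          # default rule

print(classify({1: '1', 2: '1', 3: '2', 4: '1'}))   # goodMove
print(classify({1: '1', 2: '5', 3: '2', 4: '1'}))   # badMove
```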
11
Current Results
18 rulesets covering a range of processing options.
Only one completed test vs. GNUGo; no improvement seen.
Ruleset used: 5x5 Square, Mirrors Added, Random Bad Moves, All Territory Data, Double-Weighted.