
1 UM Stratego
Colin Schepers, Daan Veltman, Enno Ruijters, Leon Gerritsen, Niek den Teuling, Yannick Thimister

2 Content
Introduction (Yannick)
The game of Stratego (Daan)
Evaluation Function (Leon)
Monte Carlo (Colin)
Genetic Algorithm (Enno)
Opponent modeling and strategy (Niek)
Conclusion (Yannick)

3 The game of Stratego
Board of 10x10
Setup field of 4x10

4 The game of Stratego
B: Bombs
1: Marshal
2: General
3: Colonels
4: Majors
5: Captains
6: Lieutenants
7: Sergeants
8: Miners
9: Scouts
S: Spy
F: Flag

5 The game of Stratego
Win: capturing the enemy flag, or the opponent having no movable pieces left
Draw: neither player has movable pieces, or the maximum number of moves is reached

6 Starting Positions
Flag placed
Bombs placed
Remaining pieces placed randomly

7 Starting Positions
Distance to Freedom
Being bombed in
Partial obstruction
Adjacency
Flag defence
Startup Pieces

8 Starting Positions
Distance to Freedom

9 Starting Positions
Startup Pieces

10 Evaluation Function
Sub-functions of the evaluation function:
Material value
Information value
Near enemy piece value
Near flag value
Progressive bonus value
First-move penalty

11 Evaluation Function
How it works:
Each sub-function returns a value
These values are weighted and summed
The higher the total, the better the move is for the player
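The weighted-sum scheme above can be sketched as follows. The sub-function names, weights, and board representation here are illustrative assumptions, not the authors' actual code.

```python
# Hypothetical sketch of the weighted-sum evaluation described above.
# Sub-function names and weights are illustrative, not the authors' values.

def evaluate(board, weights, subfunctions):
    """Weighted sum of sub-function scores; higher favors the player."""
    return sum(w * f(board) for w, f in zip(weights, subfunctions))

# Toy usage: two stub sub-functions on a dummy board.
material = lambda board: board["my_material"] - board["opp_material"]
information = lambda board: board["opp_info"] - board["my_info"]

board = {"my_material": 10, "opp_material": 7, "opp_info": 3, "my_info": 1}
score = evaluate(board, weights=[1.0, 0.5], subfunctions=[material, information])
# material = 3, information = 2, so score = 1.0*3 + 0.5*2 = 4.0
```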

12 Evaluation Function
Material value:
Used to compare the two players' board strengths
Each piece type has a value
The opponent's total board value is subtracted from the player's
A positive result means the player's board is stronger; a negative result means it is weaker

13 Evaluation Function
Information value:
Encourages gathering information about opponent pieces while concealing one's own
Each piece type has an information value
The values for each side are summed and the totals subtracted from each other
A Marshal being discovered is worse than a Scout being discovered

14 Evaluation Function
Near enemy piece value:
Checks whether a movable piece can defeat a piece next to it
If the adjacent piece can be defeated, return a positive score
If not, return a negative one
If the piece is unknown, return 0
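A minimal sketch of that rule is below. For simplicity it assumes higher numbers mean stronger pieces (the reverse of the slide's rank labels, where 1 is the Marshal) and ignores Stratego's special cases such as the Spy beating the Marshal or Miners defusing bombs; the function name and bonus are assumptions.

```python
def near_enemy_value(my_rank, enemy_rank, bonus=10):
    """Score an adjacent enemy piece, as described on the slide.
    Simplifying assumption: higher rank number = stronger piece;
    enemy_rank is None when the enemy piece is still unknown."""
    if enemy_rank is None:
        return 0  # unknown piece: no basis for a score
    return bonus if my_rank > enemy_rank else -bonus
```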

15 Evaluation Function
Near flag value:
Encourages defending the own flag and attacking the enemy's
Constructs an array of possible enemy flag locations
If an enemy piece is near the own flag, return a negative number
If an own piece is near a possible enemy flag location, return a positive number

16 Evaluation Function
Progressive bonus value:
Encourages advancing pieces towards the enemy lines
Returns a positive value if a piece moves forward, a negative one if it moves backward

17 Evaluation Function
First-move penalty:
Discourages giving away information by moving pieces
Keeps the number of unmoved pieces high

18 Monte Carlo
A subset of all possible moves is played out
No strategy or weights are used during the playouts
An evaluation value is obtained after every move
At the end, comparing the evaluation values determines the best move
A depth limit keeps the tree from growing too big and guarantees the algorithm terminates
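The scheme above, depth-limited random playouts scored by the evaluation function, might look like this. The function arguments (`legal_moves`, `apply_move`, `evaluate`) are assumed interfaces, not the authors' actual code.

```python
import random

def monte_carlo_move(state, legal_moves, apply_move, evaluate,
                     playouts=20, depth=4, rng=random.Random(0)):
    """Pick the move whose depth-limited random playouts evaluate best.
    Sketch of the slide's scheme: play a move, continue with random moves
    up to the depth limit, and average the evaluation over the playouts."""
    best_move, best_score = None, float("-inf")
    for move in legal_moves(state):
        total = 0.0
        for _ in range(playouts):
            s = apply_move(state, move)
            for _ in range(depth - 1):  # depth limit keeps the tree small
                moves = legal_moves(s)
                if not moves:
                    break
                s = apply_move(s, rng.choice(moves))
            total += evaluate(s)
        score = total / playouts
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

As a toy usage, with states as integers, moves that add their value, and the state itself as the evaluation, the method prefers the move leading to higher values.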

19 Monte Carlo
Advantages:
Simple to implement
Can be changed quickly
Behavior is easy to observe
Well documented
Well suited to partial-information situations

20 Monte Carlo
Disadvantages:
Generally not smart
Dependent on the evaluation function
Computationally slow
The tree grows very fast

21 Monte Carlo Experiments
MC against lower-depth MC

Player   Wins   Losses   Draws
MC       28     59       49
MC-LD    59     28       49

22 Monte Carlo Experiments
MC against no-depth MC

Player   Wins   Losses   Draws
MC       15     2        12
MC-ND    2      15       12

23 Monte Carlo Experiments
MC against deeper-depth but narrower MC

Player   Wins   Losses   Draws
MC       5      2        11
MC-DDN   2      5        11

24 Monte Carlo Experiments
MC against narrower MC

Player   Wins   Losses   Draws
MC       62     18       85
MC-N     18     62       85

25 Genetic Algorithm
Evolve the weights of the terms in the evaluation function
The AI uses a standard expectiminimax search tree
Evolution strategies (evolution parameters are themselves evolved)

26 Genetic Algorithm
Genome:
Mutation:

27 Genetic Algorithm
Crossover:
σ and α of the parents are averaged
Weights: averaged if …, else randomly chosen from the parents
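The slides' genome, mutation, and crossover formulas are not in the transcript, so the following is a textbook self-adaptive evolution-strategy sketch, not the authors' exact operators: step sizes σ are mutated multiplicatively and then used to perturb the weights, σ is averaged in crossover, and weights are picked from either parent (the slide's condition for averaging weights is not shown, and α is omitted).

```python
import math
import random

def mutate(weights, sigmas, rng, tau=0.1):
    """Textbook self-adaptive ES mutation: the step sizes (sigmas) are
    themselves evolved, matching 'evolution parameters are evolved'."""
    new_sigmas = [s * math.exp(tau * rng.gauss(0, 1)) for s in sigmas]
    new_weights = [w + s * rng.gauss(0, 1)
                   for w, s in zip(weights, new_sigmas)]
    return new_weights, new_sigmas

def crossover(parent_a, parent_b, rng):
    """Average the strategy parameters; take each weight from either parent
    at random (one reading of the slide; the averaging condition is unknown)."""
    (wa, sa), (wb, sb) = parent_a, parent_b
    sigmas = [(x + y) / 2 for x, y in zip(sa, sb)]
    weights = [x if rng.random() < 0.5 else y for x, y in zip(wa, wb)]
    return weights, sigmas
```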

28 Genetic Algorithm
Fitness function:
Win bonus
Number of own pieces left
Number of turns spent

29 Genetic Algorithm
Reference AI: Monte Carlo AI
Self-selecting reference genome:
Select the average genome from each generation
Pick the winner between this genome and the previous reference

30 Hill climbing
The GA takes too long to train
Hill climbing is faster
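A generic hill-climbing pass over the evaluation weights is sketched below; the neighborhood (perturb one weight) and acceptance rule (keep only improvements) are assumptions, since the slides do not give the authors' details.

```python
import random

def hill_climb(weights, fitness, steps=100, step_size=0.1,
               rng=random.Random(0)):
    """Simple hill climbing: perturb one weight at a time and keep the
    change only if fitness improves. Generic sketch, not the authors' code."""
    best = list(weights)
    best_fit = fitness(best)
    for _ in range(steps):
        candidate = list(best)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.uniform(-step_size, step_size)
        f = fitness(candidate)
        if f > best_fit:  # greedy: only accept improvements
            best, best_fit = candidate, f
    return best
```

Because each step needs only one fitness evaluation and no population, this is far cheaper per iteration than a generation of the GA, which is the speed advantage the slide refers to.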

31 Opponent modeling
Observing moves
Ruling out pieces
Stronger pieces are moved towards you
Weaker pieces are moved away

32 Opponent modeling
No knowledge about enemy pieces at the start
Updating the probabilities:
Update the probability of the moving piece
Update the probabilities of all other pieces
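One simple way to realize the update described above is a belief distribution per hidden piece: start proportional to the remaining counts, then rule out immovable types (Bomb, Flag) once a piece moves and renormalize. The function names and data layout are illustrative assumptions.

```python
def initial_beliefs(counts):
    """Belief over hidden piece types, proportional to remaining counts.
    'counts' maps rank name -> number of that rank still hidden (assumed)."""
    total = sum(counts.values())
    return {rank: n / total for rank, n in counts.items()}

def observe_move(beliefs, immovable=("Bomb", "Flag")):
    """A piece that moves cannot be a bomb or the flag: zero those
    probabilities out and renormalize the rest, as on the slide."""
    updated = {r: (0.0 if r in immovable else p) for r, p in beliefs.items()}
    z = sum(updated.values())
    return {r: p / z for r, p in updated.items()}
```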

33 Monte Carlo Experiments
MC against MC with opponent modeling, using a database of human versus human games

Player   Wins   Losses   Draws
MC       39     44       58
MC-OM    44     39       58

34 Monte Carlo Experiments
MC against MC with opponent modeling, using a database of MC versus MC games

Player   Wins   Losses   Draws
MC
MC-OM

35 Strategy
Split the game into phases:
Exploration phase: until 25% of enemy pieces are identified
Elimination phase: until 70% of enemy pieces are killed
End-game phase
Alter the evaluation function per phase
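The phase thresholds above translate directly into a small selector; the function name and fraction-based interface are assumptions.

```python
def game_phase(identified_frac, killed_frac):
    """Phase selection per the slide's thresholds: exploration until 25% of
    enemy pieces are identified, elimination until 70% are killed, then
    end-game. Each phase would use its own evaluation-function weights."""
    if identified_frac < 0.25:
        return "exploration"
    if killed_frac < 0.70:
        return "elimination"
    return "end-game"
```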

36 Conclusion
Both AIs are very slow
The genetic AI takes too long to train
For Stratego, tweaking a few weights may not be an optimal way to create an intelligent player
