Published by Ophelia Hudson. Modified over 9 years ago.
1
Artificial Intelligence in Game Design: Introduction to Learning
2
Learning and Games
Learning in AI:
– Creating new rules automatically
  Observation of world
  Examples of good/bad actions to take
– Major goal of AI
Would be very useful in gaming
– Automatic adaptation to player tactics
– Infinite replayability
  Would be impossible for player to create a strategy that would win forever
“You can defeat me now, but I shall return smarter than ever!”
3
Learning and Games
Fairer and more plausible NPC behavior
Characters should have same learning curve as players
– Start out inexperienced
– Become more competent over time
Example: Simple “cannon fire game”
– Could use physics to compute exact angle, but would win first turn!
– Character should miss badly at first
– Should “learn” to get closer over time
4
Learning in AI
Basic components:
– Inputs from environment
– Current rules
– Actions indicated by rules
– Critic: determines how good or bad action was
  Often in terms of some error
– Learning element: determines how to change rules in order to decrease error
5
Learning in AI
Learning algorithms in AI
– Neural networks
– Probabilistic learning
– Genetic learning algorithms
Common attributes
– Requires time
  Usually thousands of cycles
– Results are unpredictable
  Will create any rules that decrease error, not necessarily the ones that make the most sense in a game
  Not what you want to happen in a game! (“Create an opponent that can defeat Data”)
Still very limited
– No algorithm to automatically generate something as complex as an FSM
6
Online Learning
Learning most useful if it occurs during game
– Must be as efficient as possible
Simple methods best
– Hill climbing
– N-Gram prediction
– Decision tree learning
Most successful methods often specific to game
– Example: Negative influences at character destruction locations
[Figure: map cells marked -2 around the spot where our unit was destroyed by an unknown enemy unit; other units steer around this area]
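The negative-influence example could be sketched roughly as follows. This is only an illustration, not the slide author's implementation: the class name `InfluenceMap`, the grid coordinates, and the one-cell radius are assumptions, and the -2 penalty is taken from the figure labels.

```python
class InfluenceMap:
    """Grid of penalties that units consult when choosing where to move."""

    def __init__(self):
        self.penalty = {}  # (x, y) -> accumulated influence (negative = dangerous)

    def record_destruction(self, x, y, amount=-2.0, radius=1):
        # Mark the cell where a unit was destroyed, and its neighbours, as dangerous.
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                cell = (x + dx, y + dy)
                self.penalty[cell] = self.penalty.get(cell, 0.0) + amount

    def safest(self, candidates):
        # Pick the candidate cell with the highest (least negative) influence.
        return max(candidates, key=lambda cell: self.penalty.get(cell, 0.0))


imap = InfluenceMap()
imap.record_destruction(5, 5)
next_cell = imap.safest([(5, 5), (7, 5)])  # prefers the unmarked cell (7, 5)
```

Each destruction is a single cheap update, which fits the requirement that online learning be as efficient as possible.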
7
Scripted Learning
Can “fake” appearance of learning
– Player performs action
– Game AI knows best counteraction, but does not perform it
– Game AI allows player certain number of that action before beginning to perform counteraction
  Like timeout
  Number could be chosen at random
– Gives appearance that character has “learned” to perform counteraction
Example: Player allowed to attack from right for certain number of turns; AI begins to defend from right after that point
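A minimal sketch of this "timeout" idea, assuming hypothetical action names (`"attack_right"`, `"defend_right"`, `"default_behavior"`) and a randomly chosen allowance; none of these names come from the slides.

```python
import random


class ScriptedCounter:
    """Pretends to 'learn' a counteraction after the player repeats an action."""

    def __init__(self, min_allowed=2, max_allowed=5, rng=None):
        rng = rng or random.Random()
        # How many times the player may use the action before the AI reacts.
        self.allowed = rng.randint(min_allowed, max_allowed)
        self.seen = 0

    def respond(self, player_action):
        # This sketch watches a single action only (an assumption).
        if player_action == "attack_right":
            self.seen += 1
            if self.seen > self.allowed:
                return "defend_right"  # the counteraction the AI knew all along
        return "default_behavior"


ai = ScriptedCounter(min_allowed=3, max_allowed=3)
responses = [ai.respond("attack_right") for _ in range(5)]
```

With the allowance fixed at 3, the first three attacks from the right go unanswered and the AI defends from the fourth onward.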
8
Scripted Learning
Scripting in cannon game:
– Compute actual best trajectory using physics
– Add error factor E to computation
– Decrease error E over time at rate ΔE
  Test different values of ΔE to make sure character learns at “same rate” as typical player
  Can also use different values of ΔE to set “difficulty” level
[Figure: cannon trajectories labeled Correct trajectory, Large E, and Small E]
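One way this scripted cannon could look in code is sketched below. Several things are assumptions rather than anything the slides specify: flat ground with no air resistance (so the range is v² sin(2Ө) / g), an error drawn uniformly from [-E, E], and E shrinking linearly by ΔE per shot.

```python
import math
import random

G = 9.81  # gravitational acceleration in m/s^2


def exact_angle(distance, speed):
    """Low-trajectory launch angle (radians) that lands at `distance` on flat ground."""
    return 0.5 * math.asin(G * distance / speed ** 2)


def scripted_shots(distance, speed, error, delta_e, shots, rng=None):
    """Yield aim angles whose maximum error shrinks by delta_e after each shot."""
    rng = rng or random.Random()
    best = exact_angle(distance, speed)
    for _ in range(shots):
        yield best + rng.uniform(-error, error)
        error = max(0.0, error - delta_e)  # never let the error go negative


angles = list(scripted_shots(50.0, 30.0, error=0.3, delta_e=0.1, shots=5,
                             rng=random.Random(0)))
```

A larger `delta_e` makes the character appear to learn faster, which is how the same script could serve as a difficulty setting.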
9
Hill Climbing
Simple technique for learning optimal parameter values
Character AI described in terms of configuration of parameter values V = (v_1, v_2, …, v_n)
– Example: Action probabilities for Oswald
– V = (P_left, P_right, P_defend)
Attack Left    45%
Attack Right   30%
Defend         25%
Oswald’s current V = (0.45, 0.30, 0.25)
10
Hill Climbing
Each configuration of parameter values V = (v_1, v_2, …, v_n) has error measure E(V)
– Often an estimate based on success of last action(s)
  Example: Total damage taken by Oswald minus total damage caused by Oswald’s last 3 actions
  Good enough for hill climbing
Goal of learning: Find V such that E(V) is minimized
– Or at least “good enough”
Configuration with low error measure:
Attack Left    35%
Attack Right   25%
Defend         40%
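The damage-based error estimate might be computed as in the sketch below. It reads the example as "damage taken minus damage caused, over the last 3 actions", which is an interpretation of the slide; the class name `DamageCritic` and its methods are hypothetical.

```python
from collections import deque


class DamageCritic:
    """Estimates E(V) from the outcome of the most recent actions."""

    def __init__(self, window=3):
        # Each entry is (damage_taken, damage_caused) for one action;
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.recent = deque(maxlen=window)

    def record(self, damage_taken, damage_caused):
        self.recent.append((damage_taken, damage_caused))

    def error(self):
        taken = sum(t for t, _ in self.recent)
        caused = sum(c for _, c in self.recent)
        return taken - caused  # lower (more negative) is better


critic = DamageCritic()
for taken, caused in [(10, 0), (5, 5), (0, 10), (2, 8)]:
    critic.record(taken, caused)
current_error = critic.error()  # uses only the last three actions
```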
11
Hill Climbing
Hill climbing works best for
– Single parameter
– Correctness measure which is easy to compute
Example: “cannon game”
– Only parameter: Angle Ө of cannon
– Error measure: Distance between target and actual landing point
[Figure: cannon shot labeled with angle Ө and Error]
12
Error Space
Graphical representation of relationship between parameter value and correctness
Hill climbing = finding “lowest point” in this space
[Figure: plot of Error against Ө; at the optimal Ө, Error = 0 (maximum correctness)]
13
Hill Climbing Algorithm
Assumption:
– Small change in one direction increases correctness
– Will eventually reach optimal value if we keep changing in that direction
[Figure: plot of Error against Ө, with successive values Ө_1, Ө_2, Ө_3 moving in the direction of decreasing error]
14
Hill Climbing Algorithm
Estimate direction of slope in local area of error space
– Must sample values near E(Ө): E(Ө + ε) and E(Ө - ε)
Move in direction of decreasing error
– Increase/decrease Ө by some given step size δ
– If E(Ө + ε) < E(Ө - ε) then Ө = Ө + δ
– Else Ө = Ө - δ
[Figure: error curve with points Ө - ε, Ө, Ө + ε and the next value Ө + δ]
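The update rule on this slide can be sketched directly in code. The error function `landing_error` and its target of 45 are stand-ins invented for the example; in the cannon game the error would come from where the shot actually lands.

```python
def hill_climb_step(theta, error, epsilon, delta):
    """One update: probe both sides of theta, then move toward the lower error."""
    if error(theta + epsilon) < error(theta - epsilon):
        return theta + delta
    return theta - delta


def landing_error(theta, target=45.0):
    # Hypothetical error measure: distance from an assumed best angle.
    return abs(theta - target)


theta = 10.0
for _ in range(20):
    theta = hill_climb_step(theta, landing_error, epsilon=0.5, delta=2.0)
```

With a fixed step size the angle climbs steadily toward the optimum and then bounces back and forth around it, never closer than the step size allows.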
15
Multidimensional Error Space
Exploring multiple parameters simultaneously
– Probabilities for Attack Left, Attack Right, Defend
– Ability to control “powder charge” C for cannon as well as angle Ө
Vary parameters slightly in all dimensions
– E(Ө + ε, C + ε)   E(Ө + ε, C - ε)
– E(Ө - ε, C + ε)   E(Ө - ε, C - ε)
Choose combination with lowest error
[Figure: cannon at Ө_1, C_1: “I need to increase both the angle and the charge”]
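A possible generalization of the probe-and-step rule to several parameters is sketched here. Using the same ε and δ for every parameter follows the slide's notation; the quadratic error function and its optimum at angle 45, charge 3 are made up for the example.

```python
from itertools import product


def hill_climb_step_nd(params, error, epsilon, delta):
    """Probe every +/- epsilon combination and step toward the best one."""
    best_signs = min(
        product((+1, -1), repeat=len(params)),
        key=lambda signs: error([p + s * epsilon for p, s in zip(params, signs)]),
    )
    return [p + s * delta for p, s in zip(params, best_signs)]


def cannon_error(v):
    # Hypothetical error surface with its minimum at angle 45, charge 3.
    angle, charge = v
    return (angle - 45.0) ** 2 + (charge - 3.0) ** 2


new_params = hill_climb_step_nd([40.0, 1.0], cannon_error, epsilon=0.1, delta=0.5)
```

Note that the number of probes is 2^n for n parameters, so each added parameter doubles the cost of a single step.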
16
Multidimensional Error Space
Can have too many parameters
– n parameters = n-dimensional error space
– Will usually “wander” space, never finding good values
If using learning, keep problem simple
– Few parameters (one or two best)
– Make sure parameters have independent effect on error
  Increased charge, angle both increase distance
[Figure: cannon at Ө_1, C_1: “I could also move up a hill, or check the wind direction…”]
17
Hill Climbing Step Size
Choosing a good step size δ
– Too small: learning takes too long
– Too large: learning will “jump over” optimal value
[Figure: two pairs of shots Ө_1, Ө_2 illustrating the two cases; caption: “This guy is an idiot!”]
18
Hill Climbing Step Size
Adaptive Resolution
– Keep track of previous error E(Ө_{T-1})
If E(Ө_T) < E(Ө_{T-1}), assume moving in correct direction
– Increase step size to get there faster: δ = δ + κ
[Figure: successive shots Ө_1, Ө_2, Ө_3]
19
Hill Climbing Step Size
If E(Ө_T) > E(Ө_{T-1}), assume overshot optimal value
– Decrease step size to avoid overshooting on way back: δ = δ × ρ, ρ < 1
– Idea: decrease step size fast
Main goal: Make character actions plausible to player
– Should make large changes if miss badly
– Should make small changes if near target
[Figure: successive shots Ө_1, Ө_2, Ө_3]
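The adaptive step-size rules (grow by κ while improving, shrink by ρ after an overshoot) might be combined as in this sketch. Reversing direction on an overshoot, starting in the increasing direction, and treating an unchanged error as an overshoot are all assumptions; the slides give only the two step-size updates.

```python
def adapt_step(delta, direction, prev_error, current_error, kappa, rho):
    """Return the new (step size, direction) after comparing successive errors."""
    if current_error < prev_error:
        return delta + kappa, direction  # improving: speed up, same direction
    return delta * rho, -direction       # overshot (or no gain): slow down, turn back


def adaptive_hill_climb(theta, error, delta, kappa, rho, steps):
    direction = 1  # assumed initial direction; corrected after the first bad step
    prev_error = error(theta)
    for _ in range(steps):
        theta += direction * delta
        current = error(theta)
        delta, direction = adapt_step(delta, direction, prev_error, current, kappa, rho)
        prev_error = current
    return theta


def landing_error(theta, target=45.0):
    # Hypothetical error measure: distance from an assumed best angle.
    return abs(theta - target)


final_theta = adaptive_hill_climb(10.0, landing_error, delta=2.0, kappa=1.0,
                                  rho=0.5, steps=30)
```

In this sketch the step does not shrink to zero: because it grows additively after every improvement, the shots keep bracketing the target at a scale set by κ and ρ. That may be acceptable when the goal is plausible behavior rather than exact convergence.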
20
Local Minima in Error Space
Major assumption: Error space monotonically decreases as we move towards goal
[Figure: multiple shots with same result, no decrease in error]
21
Local Minima in Error Space
Local minima in error space
– Places where apparent error does not decrease as we get closer to optimum value
– Simple hill climbing can get stuck
[Figure: plot of Error against Ө showing the optimal Ө and local minima; hill climbing will not escape!]
22
Local Minima in Error Space
May need to restart with different initial value
– Use randomness
– Something very different from last starting point
– Plausible behavior: if current actions not working, try something new
[Figure: multiple shots with same result, then a very different result]
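A restart rule along these lines could be sketched as follows. The stall test (recent errors all within a tolerance), the minimum jump distance, and the function names are assumptions chosen for illustration.

```python
import random


def is_stuck(recent_errors, tolerance=1e-6):
    """True when the recent shots (non-empty list) gave essentially the same error."""
    return max(recent_errors) - min(recent_errors) <= tolerance


def restart_value(last_start, low, high, min_jump, rng=None):
    """Pick a random starting value in [low, high] at least min_jump from the last one.

    Assumes such a value exists; otherwise this loop would never finish.
    """
    rng = rng or random.Random()
    while True:
        candidate = rng.uniform(low, high)
        if abs(candidate - last_start) >= min_jump:
            return candidate


errors = [12.0, 12.0, 12.0]
if is_stuck(errors):
    new_start = restart_value(10.0, low=0.0, high=90.0, min_jump=20.0,
                              rng=random.Random(1))
```

Requiring a minimum jump is one way to get "something very different from last starting point" instead of a restart that lands back in the same flat region.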
23
Memory and Learning
What if player moves?
– Should not have to restart learning process
– Should keep appearance that character is slowly improving aim
Should quickly adapt to changes in player strategy
[Figure: successive shots Ө_1, Ө_2, Ө_3, Ө_4]
24
Memory and Learning
Remember previous actions and effects
– Store each angle Ө tried and resulting distance D(Ө)
– If player moves to location L, start from Ө whose D(Ө) is closest to L
[Figure: shots Ө_1, Ө_2, Ө_3 with landing distances D(Ө_1), D(Ө_2), D(Ө_3); closest to new player location is Ө_2]
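The memory lookup described above is small enough to sketch in full. The class name `ShotMemory`, the fallback `default` angle, and the sample numbers are hypothetical.

```python
class ShotMemory:
    """Remembers each angle tried and where the shot landed."""

    def __init__(self):
        self.shots = []  # list of (theta, landing_distance) pairs

    def record(self, theta, distance):
        self.shots.append((theta, distance))

    def starting_angle(self, player_location, default):
        # With no history yet, fall back to a caller-supplied default angle.
        if not self.shots:
            return default
        # Otherwise reuse the angle whose landing point was closest to the player.
        theta, _ = min(self.shots, key=lambda shot: abs(shot[1] - player_location))
        return theta


memory = ShotMemory()
memory.record(20.0, 40.0)
memory.record(30.0, 55.0)
memory.record(40.0, 70.0)
start = memory.starting_angle(player_location=58.0, default=45.0)
```

Hill climbing can then resume from `start` instead of from scratch, so the character still appears to be improving its aim after the player moves.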