Teck H. Ho, April 8, 2004

Outline
- In-Class Experiment and Motivation
- Adaptive Experience-Weighted Attraction (EWA) Learning in Games: Camerer and Ho (Econometrica, 1999)
- Sophisticated EWA Learning and Strategic Teaching: Camerer, Ho, and Chong (JET, 2002)
- Self-tuning EWA Learning (EWA Lite): Ho, Camerer, and Chong (2004)
Median Action Game
Continental Divide Game
The Learning Setting
- Normal-form game in which every player knows the payoff table
- Player i's strategy space consists of discrete choices indexed by j (e.g., 1, 2, ..., 13, 14)
- The game is repeated for several rounds
- At each round, all players observe:
  - the action history of all other players
  - their own payoff history
Research Question
To develop a good descriptive model of adaptive learning that predicts the probability of player i (i = 1, ..., n) choosing strategy j at round t.
Criteria of a "Good" Model
- Uses (potentially) all the information subjects receive, in a sensible way
- Satisfies plausible principles of behavior (i.e., conforms with other sciences such as psychology)
- Fits and predicts choice behavior well
- Generates new insights
- Is as simple as the data allow
Competing Models
- Introspection (P^j): requires too much human cognition
  - Nash equilibrium (Nash, 1950)
  - Quantal response equilibrium (McKelvey and Palfrey, 1995)
- Evolution (P^j(t)): players are pre-programmed
  - Replicator dynamics (Friedman, 1991)
  - Genetic algorithm (Holland, 1975; Ho, 1996)
- Learning (P_i^j(t)): uses about the right level of cognition
  - Reinforcement (Roth and Erev, 1995)
  - Belief-based learning
    - Cournot best-response dynamics (Cournot, 1838)
    - Simple fictitious play (Brown, 1951)
    - Weighted fictitious play (Fudenberg and Levine, 1998)
  - Directional learning (Selten, 1991)
  - Imitation (Schlag, 1998, 1999)
  - Experience-weighted attraction learning (Camerer and Ho, 1999)
Information Usage in Learning
- Choice reinforcement learning (Thorndike, 1911; Bush and Mosteller, 1955; Herrnstein, 1970; Erev and Roth, 1998): successful strategies are played again
- Belief-based learning (Cournot, 1838; Brown, 1951; Fudenberg and Levine, 1998): form beliefs based on opponents' action history and choose according to expected payoffs
- Reinforcement learning uses only a player's own payoff history; belief-based models use only opponents' action history
- EWA uses both kinds of information
"Laws" of Effects in Learning
- Law of actual effect: successes increase the probability of chosen strategies
  - Teck is more likely than Colin to "stick" to his previous choice (other things being equal)
- Law of simulated effect: strategies with simulated successes will be chosen more often
  - Colin is more likely to switch to T than to M
- Law of diminishing effect: the incremental effect of reinforcement diminishes over time
  - $1 has more impact in round 2 than in round 7

Row player's payoff table (rows T, M, B; columns L, R):

         L    R
    T    8    8
    M    5   10
    B    4    9

Their opponents chose L; Teck chose M and received 5; Colin chose B and received 4.
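The simulated effect can be made concrete with a minimal one-step sketch, using the L column of the payoff table on this slide; the value delta = 0.5 and the zero initial attractions are illustrative assumptions, not from the slides:

```python
# One-step attraction increments for Colin after the opponents play L.
payoffs_vs_L = {"T": 8.0, "M": 5.0, "B": 4.0}   # L column of the payoff table
chosen = "B"                                     # Colin's actual choice (paid 4)
delta = 0.5                                      # weight on foregone payoffs (assumed)

# Chosen strategy gets its realized payoff in full; unchosen strategies
# get their foregone payoff discounted by delta.
increments = {s: (1.0 if s == chosen else delta) * p
              for s, p in payoffs_vs_L.items()}
# Simulated effect: T's foregone success (8) earns a larger increment
# than M's (5), so Colin is more likely to switch to T than to M.
```

With delta = 0, the unchosen increments vanish (pure reinforcement); with delta = 1, foregone and realized payoffs count equally (belief learning).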
Assumptions of Reinforcement and Belief Learning
- Reinforcement learning ignores the simulated effect
- Belief learning predicts that actual and simulated effects are equally strong
- EWA learning allows a positive simulated effect that is smaller than the actual effect
The EWA Model
- Initial attractions and experience: A_i^j(0) and N(0)
- Updating rules:
    N(t) = rho * N(t-1) + 1
    A_i^j(t) = [phi * N(t-1) * A_i^j(t-1) + (delta + (1 - delta) * I(s_i^j, s_i(t))) * pi_i(s_i^j, s_-i(t))] / N(t)
  where I(., .) is an indicator equal to 1 for the strategy actually chosen
- Choice probabilities (logit):
    P_i^j(t+1) = exp(lambda * A_i^j(t)) / sum_k exp(lambda * A_i^k(t))
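The three components of the model can be sketched as follows, assuming the notation of Camerer and Ho (1999) (delta, phi, rho, lambda); folding (delta + (1 - delta) * I) into a single per-strategy weight is a simplification of mine:

```python
import math

def ewa_update(A, N, chosen, payoffs, delta, phi, rho):
    """One EWA update for a single player.

    A       : list of current attractions A^j(t-1), one per strategy j
    N       : experience weight N(t-1)
    chosen  : index of the strategy actually played at round t
    payoffs : payoff pi(j, s_-i(t)) each strategy j would have earned
              against the opponents' actual play this round
    delta   : weight on foregone (simulated) payoffs
    phi     : decay of past attractions
    rho     : decay of past experience
    """
    N_new = rho * N + 1.0
    A_new = []
    for j, pi_j in enumerate(payoffs):
        weight = 1.0 if j == chosen else delta   # actual vs simulated effect
        A_new.append((phi * N * A[j] + weight * pi_j) / N_new)
    return A_new, N_new

def choice_probs(A, lam):
    """Logit rule: P^j = exp(lam * A^j) / sum_k exp(lam * A^k)."""
    exps = [math.exp(lam * a) for a in A]
    total = sum(exps)
    return [e / total for e in exps]
```

Setting delta = 0 recovers reinforcement-style updating; delta = 1 weights foregone and realized payoffs equally, as belief learning does.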
EWA Model and Laws of Effects
- Law of actual effect: successes increase the probability of chosen strategies (positive incremental reinforcement increases attraction and hence choice probability)
- Law of simulated effect: strategies with simulated successes will be chosen more often (delta > 0)
- Law of diminishing effect: the incremental effect of reinforcement diminishes over time (N(t) >= N(t-1), so each new payoff is averaged over growing experience)
The EWA Model: An Example

Row player's payoff table (rows T, B; columns L, R):

         L    R
    T    8    9
    B    4   10

History: Period 1 = (B, L)
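One EWA update for the history (B, L) can be worked through numerically; the payoff values follow the slide's 2x2 table, while A(0) = (0, 0), N(0) = 1, and delta = 0.25 are illustrative assumptions:

```python
# Worked EWA update for period 1 = (B, L), under assumed parameters.
table = {"T": {"L": 8.0, "R": 9.0}, "B": {"L": 4.0, "R": 10.0}}
delta, phi, rho = 0.25, 1.0, 1.0
A, N = {"T": 0.0, "B": 0.0}, 1.0

N_new = rho * N + 1.0                            # N(1) = 2
A_new = {s: (phi * N * A[s]
             + (1.0 if s == "B" else delta) * table[s]["L"]) / N_new
         for s in A}
# B's realized payoff (4) counts fully; T's foregone payoff (8) is
# discounted by delta, so A_T(1) = 1.0 and A_B(1) = 2.0.
```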
Reinforcement Model: An Example

Row player's payoff table (rows T, B; columns L, R):

         L    R
    T    8    9
    B    4   10

History: Period 1 = (B, L)
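On the same history (B, L), a cumulative choice-reinforcement update (Roth-Erev style) reinforces only the chosen strategy; this sketch assumes zero initial attractions:

```python
# Cumulative choice reinforcement: only the chosen strategy's
# attraction is reinforced, by its realized payoff.
def reinforce(A, chosen, payoff, phi=1.0):
    return {s: phi * a + (payoff if s == chosen else 0.0)
            for s, a in A.items()}

A1 = reinforce({"T": 0.0, "B": 0.0}, chosen="B", payoff=4.0)
# A1 == {"T": 0.0, "B": 4.0}: T's foregone payoff of 8 leaves no
# trace, i.e., there is no simulated effect.
```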
Belief-Based (BB) Model: An Example

Row player's payoff table (rows T, B; columns L, R):

         L    R
    T    8    9
    B    4   10

Bayesian learning with Dirichlet priors
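The Dirichlet belief update can be sketched for the same example; the symmetric prior counts (1, 1) are an illustrative assumption:

```python
# Dirichlet belief update after observing the opponent play L, then
# expected payoffs under the new beliefs.
table = {"T": {"L": 8.0, "R": 9.0}, "B": {"L": 4.0, "R": 10.0}}
counts = {"L": 1.0, "R": 1.0}                    # symmetric Dirichlet prior

counts["L"] += 1.0                               # observe L in period 1
total = sum(counts.values())
beliefs = {a: c / total for a, c in counts.items()}   # {"L": 2/3, "R": 1/3}

expected = {s: sum(beliefs[a] * table[s][a] for a in beliefs)
            for s in table}
# Foregone and realized payoffs are weighted identically by the beliefs,
# so the actual and simulated effects are equally strong.
```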
Model Interpretation
- Simulation or attention parameter (delta): measures the degree of sensitivity to foregone payoffs
- Exploitation parameter (kappa): measures how rapidly players lock in to a strategy (average versus cumulative reinforcement)
- Stationarity or motion parameter (phi): measures players' perception of the degree of stationarity of the environment
Model Interpretation
Special cases of EWA at boundary parameter values: Cournot best-response dynamics, weighted fictitious play, fictitious play, average reinforcement, and cumulative reinforcement.
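One of these special cases can be checked numerically: with delta = 1 and phi = rho = 1, and initial attractions set to expected payoffs under the prior beliefs, EWA reproduces fictitious play. The 2x2 payoff table and prior counts below are illustrative assumptions:

```python
# Fictitious play recovered as a special case of EWA (delta = 1, phi = rho = 1).
table = [[8.0, 9.0], [4.0, 10.0]]        # rows T, B; columns L, R (illustrative)

def ewa_update(A, N, chosen, pay, delta, phi, rho):
    N_new = rho * N + 1.0
    return [(phi * N * A[j] + (1.0 if j == chosen else delta) * pay[j]) / N_new
            for j in range(len(A))], N_new

# Initialize EWA from the prior: N(0) = total prior count,
# A(0) = expected payoffs under the prior beliefs.
counts = [1.0, 1.0]                       # Dirichlet prior over L, R
N0 = sum(counts)
A0 = [sum(c * table[j][k] for k, c in enumerate(counts)) / N0 for j in range(2)]

# Opponent plays L (column 0); foregone payoffs are the L column.
A1, N1 = ewa_update(A0, N0, chosen=0, pay=[row[0] for row in table],
                    delta=1.0, phi=1.0, rho=1.0)

# Direct fictitious play: update counts, recompute expected payoffs.
counts[0] += 1.0
beliefs = [c / sum(counts) for c in counts]
E1 = [sum(b * table[j][k] for k, b in enumerate(beliefs)) for j in range(2)]
# A1 matches E1: the EWA attractions coincide with the fictitious-play
# expected payoffs.
```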
New Insight
Reinforcement and belief learning were thought to be fundamentally different for 50 years. For instance: "... in rote [reinforcement] learning success and failure directly influence the choice probabilities. ... Belief learning is very different. Here experiences strengthen or weaken beliefs. Belief learning has only an indirect influence on behavior." (Selten, 1991)
EWA shows that belief and reinforcement learning are related: both are special cases of EWA learning.
Actual versus Belief-Based Model Frequencies: Median Action Game
Actual versus Reinforcement Model Frequencies: Median Action Game
Actual versus EWA Model Frequencies: Median Action Game
Estimation and Results
Actual versus Belief-Based Model Frequencies: Continental Divide Game (CDG)
Actual versus Reinforcement Model Frequencies: CDG
Actual versus EWA Model Frequencies: CDG
28
28 Teck H. Ho April 8, 2004 Extensions Heterogeneity (Camerer and Ho, 1999) Payoff learning (Camerer, Ho, and Wang, 2000) Sophistication and strategic teaching (Camerer, Ho, and Chong, 2002) Self-tuning EWA (or EWA Lite) (Ho, Camerer, and Chong, 2004) Applications: Signaling games (Anderson and Camerer, 1999) Auction markets (Camerer, Ho, and Hsia, 2000) Product Choice at Supermarkets (Ho and Chong, 2003)