1
Competition between adaptive agents: learning and collective efficiency
Damien Challet (Oxford University), Matteo Marsili (ICTP, Trieste, Italy)
challet@thphys.ox.ac.uk
● My definition of the Minority Game
● Simple worlds (M = 0): Markovian behavior, neural networks, reinforcement learning
● Multistate worlds (M > 0): cause of large inefficiencies, remedies
● From El Farol to MG and back
2
'Truth is always in the minority' Kierkegaard
3
Zig-Zag-Zoug ● Game played by Swiss children ● 3 players, 3 feet, 3 magic words ● “Ziiig”... “Zaaag”.... “ZOUG!”
4
Minority Game
● Zig-Zag-Zoug with N players
● Aim: to be in the minority
● Outcome = #UP − #DOWN = #A − #B
● Model of competition between adaptive players
Challet and Zhang (1997), from the El Farol Bar problem (Arthur 1994)
5
Initial goals of the MG
● El Farol (1994): impossible to understand
● Drastic simplification, keeping the key ingredients: bounded rationality, reinforcement learning
● Symmetrize the problem: 60/100 -> 50/50
● Understand the symmetric problem
● Generalize the results to the asymmetric problem
6
Repeated games
Why play again? Frustration: the losers are in the majority.
How to play?
● Deduction: rationality, best answer. All lose!
● Induction: limited capabilities; beliefs, strategies, personality; trial and error: learning
7
Minority Game
N agents, i = 1, ..., N
Choice a_i(t) = ±1
A(t) = Σ_i a_i(t)
Payoff of player i: −a_i(t) A(t)
Total losses = A²
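One round of this game fits in a few lines of Python (an illustrative sketch, not code from the talk; the function name is ours):

```python
import random

def play_round(n_agents, rng):
    """One round of the Minority Game with n_agents players."""
    # Each agent independently chooses a_i(t) in {-1, +1}.
    choices = [rng.choice((-1, 1)) for _ in range(n_agents)]
    A = sum(choices)                      # A(t) = sum_i a_i(t)
    # Payoff of player i is -a_i(t) * A(t): positive iff i is on the minority side.
    payoffs = [-a * A for a in choices]
    return choices, A, payoffs

choices, A, payoffs = play_round(101, random.Random(0))
# Total losses: sum_i a_i * A = A * sum_i a_i = A^2.
print(sum(-p for p in payoffs) == A * A)  # prints True
```

With an odd number of players A(t) is never zero, so a minority side always exists.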
8
Markovian learning
'If it ain't broke, don't fix it' (Reents et al., Physica A 2000):
● If I won, I stick to my previous choice
● If I lost, I switch to the other choice with probability p
Results (σ² = ⟨A²⟩):
● pN = x = const (small p): σ² = 1 + 2x(1 + x/6)
● p ~ N^{−1/2}: σ² ~ N
● p ~ 1: σ² ~ N²
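This win-stay / lose-shift rule is easy to simulate; a minimal sketch (our own implementation, estimating σ² as the time average of A²):

```python
import random

def markovian_sigma2(n_agents, p, steps, seed=0):
    """'If it ain't broke, don't fix it' dynamics (after Reents et al.).

    Winners keep their previous choice; losers flip with probability p.
    Returns the time average of A(t)^2, an estimate of sigma^2.
    """
    rng = random.Random(seed)
    a = [rng.choice((-1, 1)) for _ in range(n_agents)]
    total = 0.0
    for _ in range(steps):
        A = sum(a)
        total += A * A
        for i in range(n_agents):
            # a_i * A > 0 means agent i was on the majority (losing) side.
            if a[i] * A > 0 and rng.random() < p:
                a[i] = -a[i]
    return total / steps

# p ~ 1/N keeps fluctuations O(1); p ~ 1 produces O(N^2) oscillations.
small_p = markovian_sigma2(101, p=1 / 101, steps=2000)
large_p = markovian_sigma2(101, p=0.5, steps=2000)
print(small_p < large_p)  # prints True
```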
9
Markovian learning II
Problem: if N is unknown, how to choose p?
Try p = f(t), e.g. p = t^{−k}:
● Convergence for any N
● Freezing: when to stop?
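The annealed schedule p(t) = t^{−k} can be sketched by extending the same win-stay / lose-shift dynamics (our implementation; the parameter values are ours). Late-time fluctuations die down, which is the convergence, but the dynamics also gradually freezes, which is the "when to stop?" problem:

```python
import random

def annealed_sigma2(n_agents, k=0.5, steps=20000, tail=1000, seed=0):
    """Win-stay / lose-shift with decreasing flip probability p(t) = t^(-k).

    No knowledge of N is needed, but the dynamics freezes as p(t) -> 0.
    Returns (overall average of A^2, average over the last `tail` steps).
    """
    rng = random.Random(seed)
    a = [rng.choice((-1, 1)) for _ in range(n_agents)]
    total = tail_total = 0.0
    for t in range(1, steps + 1):
        A = sum(a)
        total += A * A
        if t > steps - tail:
            tail_total += A * A
        p = t ** (-k)
        for i in range(n_agents):
            if a[i] * A > 0 and rng.random() < p:
                a[i] = -a[i]
    return total / steps, tail_total / tail
```

The overall average is dominated by the large early oscillations (p close to 1), while the tail average reflects the nearly frozen late regime.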
10
Neural networks
Simple perceptrons, learning rate R (Metzler et al. 1999):
σ² = N + N(N−1) F(N, R)
min σ² = N (1 − 2/π) ≈ 0.363 N
11
Reinforcement learning
● Each player has a register D_i
● D_i > 0: + is better
● D_i < 0: − is better
● D_i(t+1) = D_i(t) − A(t)
● Choice: prob(+ | D_i) = f(D_i), with f′(x) > 0 (RL)
12
Reinforcement learning II
● Central result: agents minimize ⟨A⟩² (predictability) for all f
● Stationary state: ⟨A⟩ = 0
● Fluctuations ⟨A²⟩ = ?
● Example: f(x) = (1 + tanh(Kx))/2 (exponential learning, K = learning rate)
● K < K_c: σ² ~ N
● K > K_c: σ² ~ N²
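A sketch of this exponential-learning dynamics (our own minimal implementation; the 1/N rescaling of the score update is our choice, made for numerical convenience):

```python
import math
import random

def rl_sigma2(n_agents, K, steps, seed=0):
    """Reinforcement-learning MG with f(x) = (1 + tanh(K x)) / 2.

    Each agent keeps a register D_i, updated as D_i <- D_i - A(t) / N,
    and plays +1 with probability f(D_i).
    Returns the time averages of A and A^2.
    """
    rng = random.Random(seed)
    D = [0.0] * n_agents
    mean_A = mean_A2 = 0.0
    for _ in range(steps):
        a = [1 if rng.random() < (1 + math.tanh(K * d)) / 2 else -1
             for d in D]
        A = sum(a)
        mean_A += A / steps
        mean_A2 += A * A / steps
        for i in range(n_agents):
            D[i] -= A / n_agents
    return mean_A, mean_A2

# Below the critical learning rate, fluctuations stay O(N);
# above it, the scores overshoot and sigma^2 becomes O(N^2).
```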
13
Reinforcement learning III
Market impact: each agent has an influence on the outcome
● Naive agents: payoff −A = −A_{−i} − a_i
● Non-naive agents: payoff −A + c a_i
● Smart agents: payoff −A_{−i} (cf. WLU, AU)
● Central result 2: non-naive agents minimize ⟨A²⟩ (fluctuations) for all f -> Nash equilibrium, σ² ~ 1
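The effect of the impact-correction term c·a_i can be sketched by modifying the score update of the previous dynamics (our implementation; c = 0 recovers naive agents, c = 1 removes the self-impact entirely):

```python
import math
import random

def impact_sigma2(n_agents, K, c, steps, seed=0):
    """RL Minority Game where agents correct for their own market impact.

    Naive score update:     D_i <- D_i - A / N.
    Non-naive score update: D_i <- D_i - (A - c * a_i) / N,
    i.e. the perceived payoff is -A + c * a_i. For c = 1 the agents
    effectively respond to A_{-i} = A - a_i, polarize, and freeze into
    a state with small fluctuations.
    """
    rng = random.Random(seed)
    D = [0.0] * n_agents
    mean_A2 = 0.0
    for _ in range(steps):
        a = [1 if rng.random() < (1 + math.tanh(K * d)) / 2 else -1
             for d in D]
        A = sum(a)
        mean_A2 += A * A / steps
        for i in range(n_agents):
            D[i] -= (A - c * a[i]) / n_agents
    return mean_A2
```

The self-reward c·a_i/N makes each agent stick to its own side, so the population splits into two nearly equal frozen groups instead of oscillating in a crowd.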
14
Summary
15
Minority Games with memory
If an agent believes that the outcome depends on the past results, the outcome will depend on the past results:
● Sunspot effects
● Self-fulfilling prophecies
● Fallacies of causal inference
Consequence: the other agents will change their behavior accordingly.
16
Minority Games with memory: naive agents
[Figure: σ²/N vs α = P/N]
● Fixed, randomly drawn strategies = quenched disorder
● Tools of statistical physics give the exact solution (in principle)
● Agents minimize the predictability
● Predictability = Hamiltonian: an optimization problem
Numeric: Savit et al., PRL 1999. Analytic: Challet et al., PRL 1999; Coolen et al., J. Phys. A 2002.
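The full game with memory can be sketched as follows (our own implementation of the standard Challet-Zhang setup, with S = 2 quenched strategies per agent and random tie-breaking):

```python
import random

def mg_sigma2_per_agent(n_agents, M, S=2, steps=3000, seed=0):
    """Minority Game with memory M: P = 2^M histories mu.

    Each agent holds S fixed random strategies (quenched disorder),
    plays the one with the best virtual score, and every strategy is
    scored as if it had been played. Returns sigma^2 / N.
    """
    rng = random.Random(seed)
    P = 2 ** M
    # strategies[i][s][mu] in {-1, +1}: answer of strategy s to history mu
    strategies = [[[rng.choice((-1, 1)) for _ in range(P)]
                   for _ in range(S)] for _ in range(n_agents)]
    scores = [[0.0] * S for _ in range(n_agents)]
    mu = 0
    total = 0.0
    for _ in range(steps):
        a = []
        for i in range(n_agents):
            best = max(scores[i])
            s = rng.choice([k for k in range(S) if scores[i][k] == best])
            a.append(strategies[i][s][mu])
        A = sum(a)
        total += A * A
        for i in range(n_agents):
            for s in range(S):
                # virtual payoff: reward answers on the minority side
                scores[i][s] -= strategies[i][s][mu] * A
        winning_bit = 1 if A < 0 else 0       # minority side this round
        mu = (2 * mu + winning_bit) % P       # shift the history string
    return total / steps / n_agents

# sigma^2 / N is large in the crowded phase (small alpha = P/N) and
# close to the coin-toss value 1 at large alpha.
```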
17
Minority Games with memory: low efficiency
[Figure: σ²/N vs α = P/N]
18
Minority Games with memory: low efficiency
α = P/N is not the right scaling for large fluctuations
19
Minority Games with memory: origin of low efficiency
Stochastic dynamical equation for the strategy score U_i: slowly varying part (signal) + correlated noise.
● Term I (signal): size independent
● Term II (noise) = K P^{−1/2}
● When I ≪ II: large fluctuations
● Transition at I/K = G/P^{1/2}
● Critical signal-to-noise ratio = G/P^{1/2}
20
Minority Games with memory: origin of low efficiency
Check: determine G, then predict the critical points.
[Figure: I/K vs G/P^{1/2}]
21
Minority Games with memory: origin of low efficiency
[Figure panels: BEFORE / AFTER]
22
Minority Games with memory: origin of low efficiency
23
Minority Games with memory: sophisticated agents
● Agents minimize the fluctuations
● Optimization problem again
24
Reverse problem
Many variations, different global utility functions:
● Grand canonical game (play or not play)
● Time window of scores (exponential moving average)
● Any payoff
Hence, given a task (global utility function), one knows how to design the agents (local utility).
Example: optimal defect combinations (cf. Neil's talk)
25
From El Farol to MG and back
El Farol: threshold L anywhere in [0, N]. MG: L = N/2.
Differences, similarities? Which results from the MG are valid for El Farol?
26
From El Farol to MG and back
Theorem: all results from the MG apply to El Farol.
Everything scales like (L/N − ⟨a⟩)/S · P^{1/2}.
The El Farol problem with P states of the world is solved.
27
From El Farol to MG and back: new results
If (L/N − ⟨a⟩)/S · P^{1/2} → 0 and P > P_c = 2S² / [π (L/N − ⟨a⟩)²]: no more phase transition.
28
Summary
● AU/WLU suppresses large fluctuations -> Nash equilibrium
● Design: agents must know that they have an impact; knowledge of the exact impact is not crucial
● Reverse problem also possible
● MG: simple, rich, fun, and useful
www.unifr.ch/econophysics/minority (102 commented references)