Published by Andrew Fox. Modified over 9 years ago.
1
26 December 2015. All Rights Reserved, Edward Tsang
Evolutionary Bargaining
Player A and Player B: “What is my share?”
Game theory: two-player alternating-offers game
– Subgame perfect equilibrium found
– Perfect rationality assumption
– Slight game modification ⇒ laborious work on new solutions
Technical details (non-trivial)
– Co-evolution
– Incentive methods invented
Proposal: EC for approximating solutions on new games
2
Evolutionary Bargaining Strategies
Edward Tsang, Nanlin Jin
Acknowledgement: Abhinay Muthoo
3
Evolutionary Bargaining Conclusions
Demonstrated GP’s flexibility
– Models with known and unknown solutions
– Outside option
– Incomplete, asymmetric and limited information
Co-evolution is an alternative approximation method to find game theoretical solutions
– Perfect rationality assumption relaxed
– Relatively quick for approximate solutions
– Relatively easy to modify for new models
Genetic Programming with incentive / constraints
– Constraints helped to focus the search in promising spaces
Lots remain to be done…
4
Two-player alternating-offers game
Player 1: How about 70% for me, 30% for you?
Player 2: No, I want 60%
Player 1: No, how about 50-50?
Player 2: No, I want at least 55%
If neither player has any incentive to compromise, this can go on for ever
Payoff drops over time: an incentive to compromise
A’s payoff = $x_A \exp(-r_A t \Delta)$
Let $r_1$ be 0.1 and $\Delta$ be 1
– t = 0: Player 1’s payoff is 70%
– t = 2: Player 1’s payoff is $50\% \times e^{-0.1 \times 2} \approx 41\%$
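The discounting arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the stated payoff rule, share × exp(−r·t·Δ); the function name `discounted_payoff` is illustrative and does not come from the slides.

```python
import math


def discounted_payoff(share, rate, t, delta=1.0):
    """Payoff of a share agreed in round t: share * exp(-rate * t * delta)."""
    return share * math.exp(-rate * t * delta)


# Player 1 with r = 0.1 and delta = 1, as on the slide.
print(round(discounted_payoff(0.70, 0.1, 0), 2))  # 0.7
print(round(discounted_payoff(0.50, 0.1, 2), 2))  # 0.41
```

Holding out for two rounds to move from 70% to 50% therefore costs Player 1 more than the nominal 20 points.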
5
Payoff decreases over time
6
Bargaining in Game Theory
Rubinstein 1982 Model:
– Cake to share between A and B (size 1)
– A and B make alternate offers
– $x_A$ = A’s share ($x_B = 1 - x_A$)
– $r_A$ = A’s discount rate
– t = number of rounds, at time $\Delta$ per round
A’s payoff from $x_A$ drops as time goes by:
A’s payoff = $x_A \exp(-r_A t \Delta)$
Important assumptions:
– Both players rational
– Both players know everything
Equilibrium solution for A:
$\mu_A = (1 - \delta_B) / (1 - \delta_A \delta_B)$, where $\delta_i = \exp(-r_i \Delta)$
Notice: no time t here
Optimal offer: $x_A = \mu_A$ at t = 0
In reality: offer at time t = $f(r_A, r_B, t)$
Is it necessary? Is it rational? (What is rational?)
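As a hedged illustration of the equilibrium formula, the sketch below computes the first proposer's share from the two discount factors, written here as d_i = exp(−r_i Δ). The function names are assumptions made for this example. It also checks the indifference property behind the solution: what A leaves for B equals B's own proposer share discounted by one round.

```python
import math


def discount_factor(rate, delta=1.0):
    """Per-round discount factor exp(-rate * delta)."""
    return math.exp(-rate * delta)


def rubinstein_share(d_own, d_other):
    """Equilibrium share of the player who proposes first."""
    return (1.0 - d_other) / (1.0 - d_own * d_other)


print(round(discount_factor(0.1), 4))  # 0.9048

d_a, d_b = 0.4, 0.6
share_a = rubinstein_share(d_a, d_b)
# Share B would claim if B were the first proposer.
share_b_as_proposer = rubinstein_share(d_b, d_a)

print(round(share_a, 4))  # 0.5263
# B is indifferent between accepting now and proposing one round later.
print(math.isclose(1.0 - share_a, d_b * share_b_as_proposer))  # True
```

The value 0.5263 for discount factors (0.4, 0.6) matches the Rubinstein column of the results table later in the deck.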
7
Evolutionary Bargaining Strategies
Prisoners’ Dilemma
– Co-evolution
– GA vs PBIL
Rubinstein’s Models
– Offer at time t = $f(r_A, r_B, t)$
– $x_A^*$, $x_B^*$ emerged as best results
– Other solutions evolved occasionally
Current work
– Asymmetric information
– Outside options
8
Evolutionary Rubinstein Bargaining Overview
Game theorists solved the Rubinstein bargaining problem
– Subgame Perfect Equilibrium (SPE)
Slight problem alterations lead to different solutions
– E.g. outside option, asymmetric information, different time scales
Evolutionary computation
– Succeeded in solving a wide range of problems
– What can it do for finding subgame perfect equilibria?
Co-evolution is an alternative approximation method to find game theoretical solutions
– Less time for approximate SPEs
– Fewer modifications for new problems
9
Evolutionary Computation for Bargaining Technical Details
10
Issues Addressed in EC for Bargaining
Representation
– Should t be in the language?
One or two populations?
How to evaluate fitness
– Fixed or relative fitness?
How to contain the search space?
Discourage irrational strategies:
– Ask for $x_A > 1$?
– Ask for more over time?
– Ask for more when $\delta_A$ is low?
Details of GP
[Figure: example expression tree built from /, 1, $\delta_A$ and $\delta_B$]
11
Two populations: co-evolution
We want to deal with asymmetric games
– E.g. two players may have different information
One population for training each player’s strategies
Co-evolution, using relative fitness
– Alternative: use absolute fitness
[Figure: the Player 1 and Player 2 populations evolve over time]
12
Representation of Strategies
A tree represents a mathematical function g
Terminal set: {1, $\delta_A$, $\delta_B$}
Function set: {+, −, ×, ÷}
Given g, a player with discount rate r plays $g \times (1 - r)^t$ at time t
Language can be enriched:
– Could have added e or time t to the terminal set
– Could have added power ^ to the function set
Richer language ⇒ larger search space ⇒ harder search problem
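One way to picture this representation is a nested-tuple expression tree over the terminals 1 and the two discount factors, with the four arithmetic operators as internal nodes. The sketch below is an assumption-laden illustration: the terminal labels `"dA"` and `"dB"`, the tuple encoding and the function names are invented here, and the slides do not say how division by zero is handled, so plain division is used.

```python
import operator

# Function set {+, -, *, /}; plain (unprotected) division is an assumption.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(tree, d_a, d_b):
    """Evaluate a nested-tuple expression tree to a number."""
    if tree == "dA":
        return d_a
    if tree == "dB":
        return d_b
    if isinstance(tree, (int, float)):
        return float(tree)
    op, left, right = tree
    return OPS[op](evaluate(left, d_a, d_b), evaluate(right, d_a, d_b))


def demand_at(g, rate, t):
    """Demand played at time t: g * (1 - rate) ** t."""
    return g * (1.0 - rate) ** t


# The tree (1 - dB) / (1 - dA * dB).
tree = ("/", ("-", 1, "dB"), ("-", 1, ("*", "dA", "dB")))
g = evaluate(tree, 0.4, 0.6)
print(round(g, 4))                     # 0.5263
print(round(demand_at(g, 0.1, 2), 4))  # 0.4263
```

A tree whose denominator evaluates to zero would raise `ZeroDivisionError` in this sketch; a working GP system would need some rule for that case.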
13
Incentive Method: Constrained Fitness Function
No magic in evolutionary computation
– Larger search space ⇒ less chance to succeed
Constraints are heuristics to focus a search
– Focus on space where promising solutions may lie
Incentives for the following properties in the function returned:
– The function returns a value in (0, 1]
– Everything else being equal, lower $\delta_A$ ⇒ smaller share for A
– Everything else being equal, lower $\delta_B$ ⇒ larger share for A
Note: this is the key to search effectiveness
14
Incentives for Bargaining
$$F(g_i) = \begin{cases} GF(s(g_i)) + B & \text{if } g_i \in (0, 1] \text{ and } SM_i > 0 \text{ and } SM_j > 0 \\ GF(s(g_i)) + ATT(i) + ATT(j) & \text{if } g_i \in (0, 1] \text{ and } (SM_i \le 0 \text{ or } SM_j \le 0) \\ ATT(i) + ATT(j) - e^{-1/|g_i|} & \text{if } g_i \notin (0, 1] \end{cases}$$
– $GF(s(g_i))$ is the game fitness (GF) of a strategy (s) based on the function $g_i$, which is generated by genetic programming
– $B \ge 3$ (tournament selection is used)
– When $g_i$ is outside (0, 1], the strategy does not enter the games; the bigger $|g_i|$, the lower the fitness
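The three-case fitness rule on this slide can be sketched as a small function. The game fitness, the two sensitivity measures and the two ATT values are taken as plain numbers here, so the sketch does not commit to how they are computed. The names are illustrative, and treating the penalty as 0 when g is exactly 0 (the limit of exp(−1/|g|)) is an assumption of this example, not something the slide states.

```python
import math

BONUS = 3.0  # the slide's B; it says B is set to 3


def constrained_fitness(g, game_fitness, sm_i, sm_j, att_i, att_j):
    """Fitness of a strategy whose tree evaluates to g."""
    if 0.0 < g <= 1.0:
        if sm_i > 0 and sm_j > 0:
            # Feasible and sensitive in the rational direction: full bonus.
            return game_fitness + BONUS
        # Feasible, but at least one sensitivity has the wrong sign.
        return game_fitness + att_i + att_j
    # Infeasible: no game play, and a penalty that grows with |g|.
    penalty = 0.0 if g == 0 else math.exp(-1.0 / abs(g))
    return att_i + att_j - penalty


print(round(constrained_fitness(0.5, 1.2, 0.1, 0.2, 0.5, 0.5), 4))   # 4.2
print(round(constrained_fitness(0.5, 1.2, -0.1, 0.2, 0.5, 0.5), 4))  # 2.2
print(round(constrained_fitness(2.0, 0.0, 0.1, 0.2, 0.5, 0.5), 4))   # 0.3935
```

Because each ATT value is at most 1, a bonus of 3 ranks any fully rational feasible strategy above one that only collects ATT credit for the same game fitness.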
15
C1: Incentive for Feasible Solutions
The function returns a value in (0, 1]
The function participates in game plays
The game fitness (GF) is measured
A bonus (B) incentive is added to GF
– B is set to 3
– Since tournament selection is used, the absolute value of B does not matter (as long as $B \ge 3$)
16
C2: Incentive for Rational $\delta_A$
Everything else being equal, lower $\delta_A$ ⇒ smaller share for A
Given a function $g_i$:
The sensitivity measure $SM_i(\delta_i, \delta_j, \alpha)$ measures how much $g_i$ changes when $\delta_i$ increases by a fraction $\alpha$; it is positive when $g_i$ increases
$$ATT(i) = \begin{cases} 1 & \text{if } SM_i(\delta_i, \delta_j, \alpha) > 1 \\ e^{-1/SM_i(\delta_i, \delta_j, \alpha)} & \text{if } SM_i(\delta_i, \delta_j, \alpha) \le 1 \end{cases}$$
$0 < ATT(i) \le 1$
17
C3: Incentive for Rational $\delta_B$
Everything else being equal, lower $\delta_B$ ⇒ larger share for A
Given a function $g_i$:
The sensitivity measure $SM_j(\delta_i, \delta_j, \alpha)$ measures how much $g_i$ changes when $\delta_j$ increases by a fraction $\alpha$; it is positive when $g_i$ decreases
$$ATT(j) = \begin{cases} 1 & \text{if } SM_j(\delta_i, \delta_j, \alpha) > 1 \\ e^{-1/SM_j(\delta_i, \delta_j, \alpha)} & \text{if } SM_j(\delta_i, \delta_j, \alpha) \le 1 \end{cases}$$
$0 < ATT(j) \le 1$
18
Sensitivity Measure (SM) for $\delta_i$
$$SM_i(\delta_i, \delta_j, \alpha) = \begin{cases} \dfrac{g_i(\delta_i(1+\alpha), \delta_j) - g_i(\delta_i, \delta_j)}{g_i(\delta_i, \delta_j)} & \text{if } \delta_i(1+\alpha) < 1 \\ \dfrac{g_i(\delta_i, \delta_j) - g_i(\delta_i(1-\alpha), \delta_j)}{g_i(\delta_i, \delta_j)} & \text{if } \delta_i(1+\alpha) \ge 1 \end{cases}$$
$SM_i$ approximates the derivative of $g_i$: the relative increase in $g_i$ when $\delta_i$ increases by a fraction $\alpha$
($g_i$ is the function whose fitness is to be measured)
19
Sensitivity Measure (SM) for $\delta_j$
$$SM_j(\delta_i, \delta_j, \alpha) = \begin{cases} \dfrac{g_i(\delta_i, \delta_j) - g_i(\delta_i, \delta_j(1+\alpha))}{g_i(\delta_i, \delta_j)} & \text{if } \delta_j(1+\alpha) < 1 \\ \dfrac{g_i(\delta_i, \delta_j(1-\alpha)) - g_i(\delta_i, \delta_j)}{g_i(\delta_i, \delta_j)} & \text{if } \delta_j(1+\alpha) \ge 1 \end{cases}$$
$SM_j$ approximates the derivative of $g_i$: the relative decrease in $g_i$ when $\delta_j$ increases by a fraction $\alpha$
($g_i$ is the function whose fitness is to be measured)
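The two sensitivity measures are finite differences, and a short sketch may make the case split concrete. The operator between the two g terms did not survive in the slide text, so subtraction is assumed here; the function names `sm_own` and `sm_opp` are also invented for this example. The test function is the Rubinstein share, for which both measures should come out positive.

```python
def sm_own(g, d_i, d_j, alpha):
    """Relative change in g when the player's own discount factor rises."""
    base = g(d_i, d_j)
    if d_i * (1 + alpha) < 1:
        return (g(d_i * (1 + alpha), d_j) - base) / base
    # Stepping up would leave (0, 1), so step down instead.
    return (base - g(d_i * (1 - alpha), d_j)) / base


def sm_opp(g, d_i, d_j, alpha):
    """Relative drop in g when the opponent's discount factor rises."""
    base = g(d_i, d_j)
    if d_j * (1 + alpha) < 1:
        return (base - g(d_i, d_j * (1 + alpha))) / base
    # Stepping up would leave (0, 1), so step down instead.
    return (g(d_i, d_j * (1 - alpha)) - base) / base


def rubinstein(d_i, d_j):
    """First proposer's equilibrium share, used as a test function."""
    return (1.0 - d_j) / (1.0 - d_i * d_j)


print(sm_own(rubinstein, 0.4, 0.6, 0.1) > 0)  # True
print(sm_opp(rubinstein, 0.4, 0.6, 0.1) > 0)  # True
```

A function that ignores both discount factors, such as a constant, gives 0 for both measures and so would miss the bonus B.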
20
Bargaining Models Tackled
Determinants | Complete Information | Uncertainty, 1-sided | Uncertainty, 2-sided
Discount Factors | * Rubinstein 82 | * Rubinstein 85; x Imprecise info; x Ignorance | x Bilateral ignorance
+ Outside Options | * Binmore 85 | x Uncertainty + Outside Options | More could be done easily
* = game theoretical solutions known
x = game theoretical solutions unknown
21
Models with known equilibriums
Complete Information
Rubinstein 82 model:
– Alternating offers, both A and B know $\delta_A$ and $\delta_B$
Binmore 85 model, outside options:
– As above, but each player has an outside offer, $w_A$ and $w_B$
Incomplete Information
Rubinstein 85 model:
– B knows $\delta_A$ and $\delta_B$
– A knows $\delta_A$
– A knows $\delta_B$ is $\delta_w$ with probability $w_0$, $\delta_s$ ($> \delta_w$) otherwise
22
Models with unknown equilibriums
Modified Rubinstein 85 / Binmore 85 models:
1-sided imprecise information
– B knows $\delta_A$ and $\delta_B$; A knows $\delta_A$ and a normal distribution of $\delta_B$
1-sided ignorance
– B knows both $\delta_A$ and $\delta_B$; A knows $\delta_A$ but not $\delta_B$
2-sided ignorance
– B knows $\delta_B$ but not $\delta_A$; A knows $\delta_A$ but not $\delta_B$
Rubinstein 85 + 1-sided outside option
23
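Equilibrium with Outside Option
$x_A^*$ | Conditions
$\mu_A$ | $w_A \le \delta_A \mu_A$ and $w_B \le \delta_B \mu_B$
$1 - w_B$ | $w_A \le \delta_A (1 - w_B)$ and $w_B > \delta_B \mu_B$
$\delta_B w_A + (1 - \delta_B)$ | $w_A > \delta_A \mu_A$ and $w_B \le \delta_B (1 - w_A)$
$1 - w_B$ | $w_A > \delta_A (1 - w_B)$ and $w_B > \delta_B (1 - w_A)$
$w_A$ | $w_A + w_B > 1$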
24
Equilibrium in Uncertainty (Rubinstein 85)
Condition | $\delta_2 = \delta_w$: $x_1^*$ | $\delta_2 = \delta_w$: $t^*$ | $\delta_2 = \delta_s$: $x_1^*$ | $\delta_2 = \delta_s$: $t^*$
$w_0 < w^*$ | $V_s$ | 0 | $V_s$ | 0
$w_0 > w^*$ | $x^{w_0}$ | 0 | $1 - (1 - x^{w_0}) / \delta_w$ | 1
25
Rubinstein Solution vs Experimental Results
($\delta_A$, $\delta_B$) | Rubinstein solution | Experimental $x_A$, mean μ | Experimental $x_A$, std. dev. σ
(0.4, 0.4) | 0.7143 | 0.8973 | 0.0247
(0.4, 0.6) | 0.5263 | 0.5090 | 0.0096
(0.4, 0.9) | 0.1563 | 0.1469 | 0.1467
(0.9, 0.4) | 0.9375 | 0.9107 | 0.0106
(0.9, 0.6) | 0.8696 | 0.8000 | 0.1419
(0.9, 0.9) | 0.5263 | 0.5065 | 0.1097
(0.9, 0.99) | 0.0917 | 0.1474 | 0.1023
$x_A$: agreement made by the best strategies in the final (300th) generation
Population size 100; crossover rates 0 to 0.1; mutation rates 0.01 to 0.5; tournament size 3
26
Running GP in Bargaining
Representation, Evaluation, Selection, Crossover, Mutation
27
Representation
Given $\delta_A$ and $\delta_B$, every tree represents a constant
[Figure: example expression trees built from +, /, 1, $\delta_A$ and $\delta_B$]
28
Population Dynamics
[Figure: for each of Player 1 and Player 2, a population of strategies passes through Select, Crossover, Mutation to form a new population; Evaluate Fitness sits between the two players’ populations]
29
Evaluation
Given the discount factors, each tree is translated into a constant x
– x is the demand represented by the tree
All trees where x lies outside (0, 1] are evaluated using the rules defined by the incentive method
All trees where $0 < x \le 1$ enter game playing
Every tree for Player 1 is played against every tree for Player 2
30
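Evaluation Through Bargaining
Incentive method ignored here for simplicity
Player 1 demand | vs .46 | vs .31 | vs .65 | vs .20 | Player 1 fitness
.75 | 0 | 0 | 0 | .75 | 0.75
.24 | .24 | .24 | .24 | .24 | 0.96
.36 | .36 | .36 | 0 | .36 | 1.08
.59 | 0 | .59 | 0 | .59 | 1.18
The column headings .46, .31, .65 and .20 are the demands by Player 2’s strategies; each cell is Player 1’s payoff, which is 0 when the two demands add up to more than 1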
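The fitness column on this slide can be reproduced by playing every Player 1 demand against every Player 2 demand. The payoff rule used below (a player receives its own demand when the two demands sum to at most 1, otherwise nothing) is inferred from the numbers on the slide and should be read as an assumption; the function names are illustrative.

```python
def payoff(own_demand, other_demand):
    """Own payoff: the demand if both demands fit in the cake, else 0."""
    return own_demand if own_demand + other_demand <= 1.0 else 0.0


def fitness_against(own_demands, other_demands):
    """Total payoff of each own demand over all opposing demands."""
    return [sum(payoff(x, y) for y in other_demands) for x in own_demands]


player1 = [0.75, 0.24, 0.36, 0.59]
player2 = [0.46, 0.31, 0.65, 0.20]

print([round(f, 2) for f in fitness_against(player1, player2)])
# [0.75, 0.96, 1.08, 1.18]
```

The modest demand .24 is accepted every time yet earns less in total than .59, which is accepted only twice; this is the trade-off the evolving population has to balance.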
31
Selection
Rule (Demand) | Fitness | Normalized | Accumulated
R1 (0.75) | 0.75 | 0.19 | 0.19
R2 (0.24) | 0.96 | 0.24 | 0.43
R3 (0.36) | 1.08 | 0.27 | 0.70
R4 (0.59) | 1.18 | 0.30 | 1.00
Sum | 3.97 | 1 |
A random number r between 0 and 1 is generated
If, say, r = 0.48 (which is > 0.43), then rule R3 is selected
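The selection step described on this slide is fitness-proportionate (roulette-wheel) selection. A minimal sketch follows, using the slide's fitness values 0.75, 0.96, 1.08 and 1.18; the function name is illustrative, and the random number is passed in as an argument so that the slide's example with r = 0.48 can be replayed.

```python
import random
from itertools import accumulate


def roulette_select(fitnesses, r):
    """Return the index of the first rule whose accumulated share reaches r."""
    total = sum(fitnesses)
    for index, running in enumerate(accumulate(fitnesses)):
        if r <= running / total:
            return index
    return len(fitnesses) - 1  # guards against rounding when r is close to 1


fitnesses = [0.75, 0.96, 1.08, 1.18]
print(roulette_select(fitnesses, 0.48))  # 2, i.e. rule R3

# In an actual run, r would be drawn at random for each selection.
picked = roulette_select(fitnesses, random.random())
```

Indices are zero-based, so index 2 corresponds to R3 in the table.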
32
Crossover
[Figure: two parent trees built from +, /, 1, $\delta_A$ and $\delta_B$ exchange subtrees to produce two offspring]
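Subtree crossover, as pictured on this slide, swaps one subtree of each parent. The sketch below assumes trees are nested tuples of the form (operator, left, right), with the labels `"dA"` and `"dB"` standing for the two discount factors; this encoding and the function names are inventions of the example. Crossover points are given explicitly as paths of child indices so the result is reproducible; a GP run would choose them at random.

```python
def get_subtree(tree, path):
    """Follow a path of child indices (1 = left, 2 = right) into the tree."""
    for index in path:
        tree = tree[index]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    head, rest = path[0], path[1:]
    child = replace_subtree(tree[head], rest, new)
    return tree[:head] + (child,) + tree[head + 1:]


def crossover(parent_a, path_a, parent_b, path_b):
    """Swap the subtrees at the two paths and return both offspring."""
    sub_a = get_subtree(parent_a, path_a)
    sub_b = get_subtree(parent_b, path_b)
    return (replace_subtree(parent_a, path_a, sub_b),
            replace_subtree(parent_b, path_b, sub_a))


parent_a = ("+", 1, ("*", "dA", "dB"))
parent_b = ("/", ("-", 1, "dB"), "dA")
child_a, child_b = crossover(parent_a, (2,), parent_b, (1,))
print(child_a)  # ('+', 1, ('-', 1, 'dB'))
print(child_b)  # ('/', ('*', 'dA', 'dB'), 'dA')
```

Tuples are immutable, so the parents are left unchanged and can be reused for further offspring.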
33
Mutation
With a small probability, a branch (or a node) may be randomly changed
[Figure: an expression tree before and after a branch is mutated]
34
Discussions on Bargaining http://cswww.essex.ac.uk/CSP/bargain
35
What is Rationality?
Are we all logical?
What if computation is involved?
Does Consequential Closure hold?
– If we know P is true and P → Q, then we know Q is true
– We know all the rules in Chess, but not the optimal moves due to combinatorial explosion
“Rationality” depends on computation power!
– Think faster ⇒ “more rational” (The CIDER Theory)
36
Combinatorial Explosion
Put 1 penny in square 1, 2 pennies in square 2, 4 pennies in square 3, etc.
[Figure: 8 × 8 chessboard whose squares hold 1, 2, 4, 8, 16, 32, 64, 128, … pennies, reaching the order of $10^{19}$]
Even the world’s richest man can’t afford it
– $10^{19}$p = £100,000,000 billion
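The chessboard figure is easy to verify with exact integer arithmetic. The sketch below assumes the standard 64 squares with the number of pennies doubling on each square; the slide's 10^19 is then the order of magnitude of the total.

```python
SQUARES = 64

last_square = 2 ** (SQUARES - 1)  # pennies on square 64
total_pennies = 2 ** SQUARES - 1  # sum of 1 + 2 + 4 + ... over all squares
total_pounds = total_pennies // 100

print(f"{total_pennies:.2e}")  # 1.84e+19
print(f"{total_pounds:.2e}")   # 1.84e+17
```

At 100 pennies to the pound this is roughly £1.8 × 10^17, the same order as the slide's £100,000,000 billion.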
37
Stochastic Search, Motivation
Schedule 30 jobs to 10 machines:
– Search space: $10^{30}$ leaf nodes
Generously allow:
– Explore one in every $10^{10}$ leaf nodes!
– Examine $10^{10}$ nodes per second!
Problem may take 300 years to solve!!!
– May be lucky to find first solution
– But finding optimality takes time
– Complete methods limited by combinatorial explosion
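The 300-year estimate follows directly from the three numbers on this slide. A quick check, assuming a year of 365.25 days:

```python
leaf_nodes = 10 ** 30        # 10 machines for each of 30 jobs
sampled = leaf_nodes // 10 ** 10   # explore one in every 10**10 leaves
seconds = sampled // 10 ** 10      # examine 10**10 nodes per second

seconds_per_year = 365.25 * 24 * 3600
years = seconds / seconds_per_year
print(round(years))  # 317
```

Even with both generous allowances, the search still needs 10^10 seconds, a little over three centuries.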
38
Game Theory Hall of Fame
1994 Nobel Prize: John Harsanyi, John Nash, Reinhard Selten
2005 Nobel Prize: Robert Aumann, Thomas Schelling
2012 Nobel Prize: Alvin Roth, Lloyd Shapley
39
1994 Nobel Economic Prize Winners
John Harsanyi (Berkeley)
– Incomplete information
John Forbes Nash (Princeton)
– Non-cooperative games
Reinhard Selten (Bonn)
– Bounded rationality (after Herbert Simon)
– Experimental economics
40
1978 Nobel Economic Prize Winner
Herbert Simon (CMU)
– Artificial intelligence
– “For his pioneering research into the decision-making process within economic organizations”
– “The social sciences, I thought, needed the same kind of rigor and the same mathematical underpinnings that had made the "hard" sciences so brilliantly successful.”
– Bounded Rationality: A Behavioral Model of Rational Choice, 1957
Sources: http://nobelprize.org/economics/laureates/1978/ and http://nobelprize.org/economics/laureates/1978/simon-autobio.html
41
2005 Nobel Economic Prize Winners
Robert J. Aumann and Thomas C. Schelling won 2005’s Nobel memorial prize in economic sciences
For having enhanced our understanding of conflict and cooperation through game-theory analysis
Robert J. Aumann, 75; Thomas C. Schelling, 84
Source: http://www.msnbc.msn.com/id/9649575/ (updated 2:49 p.m. ET, Oct. 10, 2005)
42
Robert J Aumann
Winner of 2005 Nobel Economic Prize
Born 1930
Hebrew Univ of Jerusalem & US National Academy of Sciences
“Producer of Game Theory” (Schelling)
Repeated games
Defined “Correlated Equilibrium”
– Uncertainty not random
– But depends on info on opponent
Common knowledge
43
Thomas C Schelling
Winner of 2005 Nobel Economic Prize
Born 1921
University of Maryland
“User of Game Theory” (Schelling)
Book “The Strategy of Conflict” 1960
– Bargaining theory and strategic behavior
Book “Arms and Influence” 1966
– Foreign affairs, national security, nuclear strategy, ...
Paper “Dynamic models of segregation” 1971
– Small preference for one’s neighbours ⇒ segregation
44
Alvin E Roth
Winner of 2012 Nobel Economic Prize
Born 1951
Columbia, Stanford
“Game theory for real world problems”
Case study in game theory
– Stable Marriage Problem
– National Resident Matching Problem
45
Lloyd Shapley
Winner of 2012 Nobel Economic Prize
Born 1923
UCLA
Mathematical economics
Game theory
– Including stochastic games
– Applications include the stable marriage problem
Shapley-Shubik power index
– For ranking planning and group decision-making