Published by Hester Payne. Modified over 9 years ago.
1 Learning by Duopoly Agents
Steve Kimbrough, Fred Murphy
INFORMS, November 7, 2006, 8:00-9:30
File: kimbrough-murphy-informs-2006fm-1.ppt
2 Abstract as Published
Title: Learning by Duopoly Agents in Bidding for Day-Ahead Electricity Supply
Presenting Author: Fred Murphy, Professor, Temple University, The Fox School of Business, 108 Speakman Hall, 1810 N. 13th Street, Philadelphia PA 19122, United States, fmurphy@temple.edu
Co-Author: Steve Kimbrough, Professor, University of Pennsylvania, 3730 Walnut Street, Suite 500, Philadelphia PA 19104, United States, kimbrough@wharton.upenn.edu
Abstract: Standardly, bids by distinct firms in the day-ahead market for electricity are combined to produce a kinked supply curve. We report results from an agent-based model in which a stochastic demand curve for electricity is given exogenously and in which two agents learn to bid to supply electricity. We report on the design of the model, the behavior of various learning regimes, and the conditions under which tacit collusion by the bidding agents may be arrived at, sustained, and destroyed.
3 Problems with Classic Economic Models
Stylized so that it is possible to derive analytic results (the world is complex)
Assume actors have full information (no one has this)
Assume actors have a clear objective function to maximize profits (after one accounting course, it is clear that the definition of profit is unclear)
4 Properties of a Basic Economic Agent
Has a measure of success
Has a data stream to measure its success
Does simple experiments to learn how its actions affect success, or observes the consequences of outside sources of variation
Operates in a potentially noisy environment
We term this Probe and Adjust (PandA)
Note that the first three properties are the minimal set of properties for an economic agent to improve
5 PandA as an Algorithm
Adjusts a continuous parameter (price, quantity, etc.)
Parameters: currentLevel, delta, epsilon, and epochLength
Activity proceeds in epochs, each made up of epochLength episodes. In each episode the agent plays (“bids”) currentLevel + e, where e is drawn uniformly from [-delta, delta]. The agent records its returns from playing above or below currentLevel.
After epochLength episodes, the current epoch concludes, and the agent adjusts currentLevel by ± epsilon, depending on whether playing up or down yielded better rewards.
NB. PandA agents explore & exploit, with the tradeoff specified by the PandA parameters (which can also be learned).
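The episode-and-epoch loop described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `PandAAgent`, its method names, and the choice to compare the mean reward of upward and downward probes are all assumptions, since the slide only says the agent moves toward whichever side "yielded better rewards". The quadratic reward in the usage lines is likewise a made-up stand-in for a market.

```python
import random


def _mean(values):
    """Average of a list; an empty side counts as the worst possible reward."""
    return sum(values) / len(values) if values else float("-inf")


class PandAAgent:
    """Illustrative Probe-and-Adjust learner (names are hypothetical)."""

    def __init__(self, current_level, delta, epsilon, epoch_length, rng=None):
        self.current_level = current_level  # base value the agent probes around
        self.delta = delta                  # half-width of the probe interval
        self.epsilon = epsilon              # step size applied at the end of an epoch
        self.epoch_length = epoch_length    # number of episodes per epoch
        self.rng = rng or random.Random()
        self._up = []                       # rewards from probes at or above current_level
        self._down = []                     # rewards from probes below current_level
        self._last_offset = 0.0

    def play(self):
        """Return this episode's bid: current_level plus a uniform probe e."""
        self._last_offset = self.rng.uniform(-self.delta, self.delta)
        return self.current_level + self._last_offset

    def record(self, reward):
        """Store the reward for the last bid; adjust once the epoch is full."""
        side = self._up if self._last_offset >= 0 else self._down
        side.append(reward)
        if len(self._up) + len(self._down) >= self.epoch_length:
            self._adjust()

    def _adjust(self):
        """Move current_level by epsilon toward the better-performing side."""
        mean_up, mean_down = _mean(self._up), _mean(self._down)
        if mean_up > mean_down:
            self.current_level += self.epsilon
        elif mean_down > mean_up:
            self.current_level -= self.epsilon
        self._up, self._down = [], []


# Usage with an assumed reward that peaks at 10 (not a model from the talk).
agent = PandAAgent(current_level=2.0, delta=0.5, epsilon=0.25,
                   epoch_length=20, rng=random.Random(1))
for _ in range(2000):
    bid = agent.play()
    agent.record(-(bid - 10.0) ** 2)
print(agent.current_level)
```

A larger delta or epsilon makes the agent explore and move faster at the cost of a noisier resting point, which is one way to read the explore/exploit tradeoff mentioned above.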
6 Three Market Contexts
Monopoly
Oligopoly
Perfect competition
We are focusing on oligopoly, where the theory is admittedly unsatisfactory.
But first… monopoly
7 Monopoly
Market context: agent is the only supplier
Agent properties:
–Measure of success is the classic definition of profits
–The data stream is the profits associated with the quantity offered (the agent does not know the demand curve)
–Agent tries different quantities and adjusts the base quantity around which it experiments
9 Monopoly Results
A simple model, PandA, quickly arrives at the vicinity of the monopoly position.
PandA is robust under stochasticity. It can track random-walk changes in the demand function.
PandA performance depends on the parameters delta, epsilon, and epochLength, but these can be tuned or learned by the agent.
Key point: Here is ONE learning policy that is computationally & epistemically undemanding and that arrives at something close to the monopoly position. It reproduces classical theory with much less stringent assumptions.
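For reference, the "monopoly position" that PandA is said to approach can be written out for one textbook case. The slides do not state the demand or cost functions used, so the linear inverse demand $P(Q) = a - bQ$ and the constant unit cost $c$ below are assumptions chosen only to make the benchmark concrete.

```latex
\begin{align*}
\pi(Q) &= \bigl(a - bQ - c\bigr)\,Q
  && \text{profit at quantity } Q \\
\frac{d\pi}{dQ} &= a - c - 2bQ = 0
  && \text{first-order condition} \\
Q^{M} &= \frac{a - c}{2b}, \qquad
P^{M} = a - bQ^{M} = \frac{a + c}{2}
  && \text{monopoly quantity and price}
\end{align*}
```

Under these assumptions, an agent that sees only its own realized profits would be judged by how close its base quantity settles to $Q^{M}$.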
10 Duopoly & Oligopoly: Cournot Competition
Market structure: two or more agents offer quantities into the marketplace
Agent properties:
–Measure of success is the classic profit calculation
–Data stream for each player is its decisions and profits
–Each agent tries different quantities and adjusts the base quantity around which it experiments
11 Cournot Results
PandA players quickly arrive at the vicinity of the Cournot equilibrium.
PandA is robust under stochasticity. It can track random-walk changes in the demand function.
PandA performance depends on the parameters delta, epsilon, and epochLength, but these can be tuned by the agent.
Key point: Reproduces classical theory with much less stringent assumptions.
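A two-agent Cournot experiment of the kind summarized here might look like the following sketch. Everything specific in it is an assumption rather than the authors' setup: the function name `run_cournot`, the linear inverse demand `price = a - b * total_quantity`, the constant unit cost `c`, the parameter values, and the choice to have both agents end their epochs at the same time. For this assumed market the symmetric Cournot quantity is `(a - c) / (3 * b)`, which is 30 with the defaults.

```python
import random


def run_cournot(a=100.0, b=1.0, c=10.0, start=(10.0, 10.0), delta=1.0,
                epsilon=0.5, epoch_length=100, epochs=300, seed=0):
    """Simulate two Probe-and-Adjust quantity setters; return their final levels."""
    rng = random.Random(seed)
    levels = list(start)  # each agent's base quantity (its currentLevel)
    for _ in range(epochs):
        # Per-agent [reward sum, count] for probes above and below the base level.
        up = [[0.0, 0] for _ in levels]
        down = [[0.0, 0] for _ in levels]
        for _ in range(epoch_length):
            probes = [rng.uniform(-delta, delta) for _ in levels]
            quantities = [max(0.0, q + e) for q, e in zip(levels, probes)]
            price = max(0.0, a - b * sum(quantities))
            for i, (q, e) in enumerate(zip(quantities, probes)):
                profit = (price - c) * q  # each agent observes only its own profit
                side = up[i] if e >= 0 else down[i]
                side[0] += profit
                side[1] += 1
        for i in range(len(levels)):
            if up[i][1] == 0 or down[i][1] == 0:
                continue  # one side was never probed, so there is nothing to compare
            mean_up = up[i][0] / up[i][1]
            mean_down = down[i][0] / down[i][1]
            if mean_up > mean_down:
                levels[i] += epsilon
            elif mean_down > mean_up:
                levels[i] -= epsilon
    return levels


cournot_quantity = (100.0 - 10.0) / (3 * 1.0)  # benchmark for the assumed market
print(run_cournot(), cournot_quantity)
```

Each agent's profit signal is disturbed by the other agent's probing, so the base quantities hover around the benchmark rather than settling on it exactly; longer epochs average out more of that noise.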
13 Duopoly & Oligopoly: Bertrand Competition
Market structure: two or more players offer prices. The agent with the lowest price wins the whole market.
Agent properties:
–Measure of success is the classic profit calculation
–Data stream is its decisions and profits
–Agent tries different prices and adjusts the base price around which it experiments
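The winner-takes-all rule above can be stated as a small payoff function. This is a sketch under stated assumptions: the name `bertrand_profits`, the linear demand `quantity = (a - price) / b`, the constant `unit_cost`, and the equal split among players who tie at the lowest price are not given in the slides.

```python
def bertrand_profits(prices, unit_cost=10.0, a=100.0, b=1.0):
    """Return each player's profit when the lowest price serves the whole market."""
    low = min(prices)
    winners = [i for i, p in enumerate(prices) if p == low]
    quantity = max(0.0, (a - low) / b)  # assumed linear demand at the winning price
    share = quantity / len(winners)     # assumed: tied winners split demand equally
    return [(low - unit_cost) * share if i in winners else 0.0
            for i in range(len(prices))]


print(bertrand_profits([50.0, 60.0]))  # the lower-priced player takes the market
print(bertrand_profits([10.0, 10.0]))  # pricing at unit cost earns nothing
```

In this rule a player priced at unit cost earns zero whether or not it wins, while any higher price that occasionally wins earns a positive amount.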
15 Bertrand Results
Players stochastically split the market at the monopoly price.
This is the reverse of classic Bertrand.
The reason is simple: at the competitive equilibrium, profits are 0. A nonzero probability of nonzero profits leads agents to raise prices.
Through their random choices they split the market.
16 Further Bertrand Results
PandA players bid prices, but now there are more than 2 players. What happens?
It depends on the number of players and on the epochLengths they use.
Broadly: if players are added and/or everyone’s epochLength is shortened, a tipping point is eventually reached and the agents “race to the bottom”, achieving the classic Bertrand result.
If one or a few players are more “patient” (have longer epochLengths), this mitigates the effect, even in the presence of impatient players.
18 Where We Stand
Results both confirm and differ from classic models
Market designs we have used are simple
We can show mathematically that the agents are essentially calculating stochastic gradients and moving in the optimizing direction. Thus, our current results can be derived using classic economic tools.
Any early losses from experimenting are more than compensated by higher longer-run returns from learning (the explore/exploit tradeoff)
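One way to see the stochastic-gradient reading is a first-order expansion of the expected reward around the current level. The slides do not give the authors' derivation, so the following is only a sketch under the assumptions that the expected reward $\pi(x)$ is smooth in the level $x$, that the probe $e$ is uniform on $[-\delta, \delta]$, and that the agent compares average rewards above and below $x$.

```latex
\begin{align*}
\mathbb{E}\bigl[\pi(x+e) \mid e > 0\bigr] &\approx \pi(x) + \pi'(x)\,\frac{\delta}{2},
  \qquad \text{since } \mathbb{E}[e \mid e > 0] = \tfrac{\delta}{2} \\
\mathbb{E}\bigl[\pi(x+e) \mid e < 0\bigr] &\approx \pi(x) - \pi'(x)\,\frac{\delta}{2} \\
\Delta(x) &= \mathbb{E}\bigl[\pi(x+e) \mid e > 0\bigr] - \mathbb{E}\bigl[\pi(x+e) \mid e < 0\bigr]
  \approx \pi'(x)\,\delta \\
x &\leftarrow x + \epsilon \,\operatorname{sign}\bigl(\hat{\Delta}(x)\bigr)
\end{align*}
```

Here $\hat{\Delta}(x)$ is the sample estimate of $\Delta(x)$ from one epoch. The second-order terms are the same on both sides and cancel in the difference, so the sign of $\hat{\Delta}(x)$ is a noisy estimate of the sign of $\pi'(x)$ and each epoch takes a fixed step of size $\epsilon$ in the estimated uphill direction.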
19 Where We Want to Go
More complex markets with no simple analytic results
Agency theory results tested using incented agents
A richer set of behaviors to better understand the consequences of market power and the potential for tacit collusion
Alternative agents with minimum capability
20 A Framework for Future Research
Start with agents having the minimum capability and restrict added capabilities to those of real managers (e.g., schemas for organizing data streams)
Reproduce basic economic results
–If the results differ, prove why.
Only then move to complicated markets where analytic results cannot be derived
An analogy is the relationship between queuing theory and simulation