Published by Mark Lilly. Modified over 9 years ago.
1
SIF8072 Distributed Artificial Intelligence and Intelligent Agents
Lecture 2: Multi-agent Interactions. Lecturer: Sobah Abbas Petersen
2
Lecture Outline
Multi-agent Systems
Utility and Preferences
Game Theory and Payoff Matrices
Strategies
Negotiation - Auctions
Summary
3
References
Wooldridge: "Introduction to MAS". Multi-agent Interactions: Chapter 6. Auctions: Chapter 7.
4
Interactions
"The world functions through interacting agents. Each person pursues his/her own goals through encounters with other people or machines." ("Rules of Encounter" by Rosenschein and Zlotkin, 1994)
5
Example 1 Two students decide to work together on their exercises. They have to decide upon a time. One prefers to work on Thursday afternoons after the lecture while the other prefers to work on Friday morning. How do they decide upon a time to do the work?
6
Example 2
A friend invites you out for a drink and the cinema tonight, but your favourite TV program is on tonight. You think:
It would be nice to go out with my friend, but it's cheaper to watch TV.
If you stay at home and watch TV, you might not have a chance to go out with your friend for a long time.
I can always record the program and watch it afterwards.
I can invite my friend home.
7
Multi-agent Systems (MAS)
A multi-agent system contains a number of agents which:
interact with one another through communication;
are able to act in an environment;
have different "spheres of influence";
may be linked by other relationships, e.g. organisational.
It is important to understand the type of interaction. Each agent can be assumed to be self-interested: it has its own preferences and desires about how the world should be.
8
Multi-agent Systems (MAS)
[Figure: agents in an environment, showing spheres of influence, interactions, and organisational relationships.]
9
Utilities and Preferences
Assume we have 2 agents: $Ag = \{i, j\}$. Assume $\Omega = \{\omega_1, \omega_2, \ldots\}$ is the set of "outcomes" that agents have preferences over. We capture preferences by utility functions:
$u_i : \Omega \to \mathbb{R}$
$u_j : \Omega \to \mathbb{R}$
Utility functions lead to preference orderings over outcomes:
$\omega \succeq_i \omega'$ means $u_i(\omega) \geq u_i(\omega')$
$\omega \succ_i \omega'$ means $u_i(\omega) > u_i(\omega')$
The notation $\omega \succeq_i \omega'$ is used to represent preference orderings.
10
What is Utility?
Utility is not money, but money is a useful analogy.
[Figure: typical relationship between utility and money.]
11
Multi-agent Encounters 1
We need a model of the environment in which the agents will act. Agents simultaneously choose an action and, as a result, an outcome in $\Omega$ will result. The actual outcome depends on the combination of actions. The environment's behaviour is given by a state transformer function (reference: p. 31 of textbook):
$\tau : Ac \times Ac \to \Omega$
where the first argument is agent i's action and the second is agent j's action.
State transformer function (ref. p. 31): maps a run (assumed to end with the action of an agent) to a set of possible environment states, namely those that could result from performing the action. Two important points:
Environments are assumed to be history dependent, and not only dependent on the actions performed by the agents.
This definition allows for non-determinism.
12
Multi-agent Encounters 2
Assume that each agent has two possible actions: C (cooperate) and D (defect). Let Ac = {C, D}.
13
State Transformer Functions
Let Ac = {C, D}. In each case the first argument is agent i's action and the second is agent j's action.
Environment sensitive to actions of both agents (every combination of actions gives a different outcome):
$\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_2 \quad \tau(C,D) = \omega_3 \quad \tau(C,C) = \omega_4$
Environment where neither agent has any influence (every combination of actions gives the same outcome, $\omega_1$):
$\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_1 \quad \tau(C,D) = \omega_1 \quad \tau(C,C) = \omega_1$
Environment controlled by j (the outcome depends only on the action of agent j):
$\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_2 \quad \tau(C,D) = \omega_1 \quad \tau(C,C) = \omega_2$
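The three environments above can be sketched as lookup tables. This is only an illustration: the names `tau_both`, `tau_none`, `tau_j_only` and the helper `influences` are assumptions made for the example, and outcomes are written as the strings "w1" to "w4".

```python
# Each table maps (agent i's action, agent j's action) to an outcome label.
ACTIONS = ("C", "D")

tau_both = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
tau_none = {pair: "w1" for pair in tau_both}
tau_j_only = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w1", ("C", "C"): "w2"}


def influences(tau, agent):
    """Return True if `agent` (0 = i, 1 = j) can change the outcome by
    changing its own action while the other agent's action stays fixed."""
    for other_action in ACTIONS:
        outcomes = set()
        for own_action in ACTIONS:
            if agent == 0:
                pair = (own_action, other_action)
            else:
                pair = (other_action, own_action)
            outcomes.add(tau[pair])
        if len(outcomes) > 1:
            return True
    return False


for name, tau in [("both", tau_both), ("none", tau_none), ("j only", tau_j_only)]:
    print(name, "i influences:", influences(tau, 0), "j influences:", influences(tau, 1))
```

In the third table only agent j passes the check, which matches the description "environment controlled by j".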
14
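Agent's Preference
Consider the case where both agents influence the outcome and they have the following utility functions:
$u_i(\omega_1) = 1 \quad u_i(\omega_2) = 1 \quad u_i(\omega_3) = 4 \quad u_i(\omega_4) = 4$
$u_j(\omega_1) = 1 \quad u_j(\omega_2) = 4 \quad u_j(\omega_3) = 1 \quad u_j(\omega_4) = 4$
Written in terms of the actions (agent i's action first), this is:
$u_i(D,D) = 1 \quad u_i(D,C) = 1 \quad u_i(C,D) = 4 \quad u_i(C,C) = 4$
$u_j(D,D) = 1 \quad u_j(D,C) = 4 \quad u_j(C,D) = 1 \quad u_j(C,C) = 4$
Then agent i's preferences are:
$C,C \succeq_i C,D \succ_i D,C \succeq_i D,D$
Agent i prefers all outcomes that arise through C over all outcomes that arise through D. Utilities: if agent i defects, its utility is 1; if it cooperates, its utility is 4.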
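As a rough check of the ordering above, the utilities can be composed with the state transformer function and the action pairs grouped by utility. The helper names `utility_of_actions` and `ranking` are hypothetical and exist only for this sketch; the numbers are the ones on the slide.

```python
# Environment sensitive to both agents: (i's action, j's action) -> outcome.
tau = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
u_i = {"w1": 1, "w2": 1, "w3": 4, "w4": 4}
u_j = {"w1": 1, "w2": 4, "w3": 1, "w4": 4}


def utility_of_actions(u, tau):
    """Utility of each action pair, obtained by looking up its outcome."""
    return {pair: u[outcome] for pair, outcome in tau.items()}


def ranking(u, tau):
    """Group action pairs into indifference classes, most preferred first."""
    by_actions = utility_of_actions(u, tau)
    levels = sorted(set(by_actions.values()), reverse=True)
    return [sorted(p for p, v in by_actions.items() if v == level) for level in levels]


print(ranking(u_i, tau))  # agent i: pairs where i plays C come first
print(ranking(u_j, tau))  # agent j: pairs where j plays C come first
```

Pairs inside one group are equally preferred; a group further left is strictly preferred to every group after it.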
15
Payoff Matrices
We can characterise the previous scenario in a payoff matrix. Agent i is the column player (the payoff received by i is shown in the top right of each cell); agent j is the row player. E.g. top right cell: i cooperates, j defects.
              i defects     i cooperates
j defects     i: 1, j: 1    i: 4, j: 1
j cooperates  i: 1, j: 4    i: 4, j: 4
16
Game Theory
A mathematical theory that studies interactions among self-interested agents.
Essential elements of a game are:
Players (2 or more)
Some choice of action (strategy)
One or more outcomes (someone wins, someone loses)
Information
Suitable for situations where the other agent's (player's) behaviour matters. One reason for using game theory is to use an existing tool to solve practical problems.
Rational agents (definition taken from Wooldridge: "Reasoning about Rational Agents", MIT Press, 2000): we often distinguish between agents that are rational and agents that are not.
17
The Prisoner’s Dilemma 1
Two men are collectively charged with a crime and held in separate cells. They have no way of communicating with each other or making an agreement. They are told:
if one confesses and the other does not, the confessor will be freed and the other jailed for 3 years;
if both confess, then each will be jailed for 2 years;
if neither confesses, then each will be jailed for 1 year.
Confessing => defecting (D)
Not confessing => cooperating (C)
If you were one of the prisoners, what would you do? Discuss your answer with your neighbour.
18
The Prisoner’s Dilemma 2
Payoff matrix for the Prisoner's Dilemma (agent i is the column player, agent j is the row player):
              i defects     i cooperates
j defects     i: 2, j: 2    i: 1, j: 4
j cooperates  i: 4, j: 1    i: 3, j: 3
Top left: if both defect, each gets the punishment for mutual defection.
Top right: if i cooperates and j defects, i gets the sucker's payoff of 1 while j gets 4.
Bottom left: if j cooperates and i defects, j gets the sucker's payoff of 1 while i gets 4.
Bottom right: reward for mutual cooperation.
Numbers in the payoff matrix reflect how good an outcome is for the agent, not the number of years in prison (p. 116 in textbook). With agent i's action written first:
$u_i(D,D) = 2 \quad u_i(D,C) = 4 \quad u_i(C,D) = 1 \quad u_i(C,C) = 3$
$u_j(D,D) = 2 \quad u_j(D,C) = 1 \quad u_j(C,D) = 4 \quad u_j(C,C) = 3$
19
The Prisoner’s Dilemma 3
The individually rational agent will defect!
Defecting guarantees a payoff of no worse than 2.
Cooperating guarantees a payoff of no worse than 1.
Defection is the best response to all possible strategies.
So both agents defect and each gets a payoff of 2. But if both agents cooperated, they would each get a payoff of 3.
Conclusions that have been drawn from this: the game theory notion of rational agents is wrong; somehow the dilemma is being formulated wrongly.
Can we recover cooperation? (The other prisoner is my twin!) A possible answer to why cooperation may still be a better strategy: play the game more than once, i.e. the Iterated Prisoner's Dilemma. By playing the game iteratively, you can adopt a strategy such as tit-for-tat, where you gain by cooperating.
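The argument for defection can be checked numerically. The sketch below uses the Prisoner's Dilemma payoffs from the lecture (sucker's payoff 1, temptation 4, mutual defection 2, mutual cooperation 3); the function names are assumptions made for this illustration.

```python
# Payoffs for agent i, keyed by (i's action, j's action).
u_i = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}
ACTIONS = ("C", "D")


def best_response_of_i(j_action):
    """Action of i with the highest payoff when j's action is fixed."""
    return max(ACTIONS, key=lambda a: u_i[(a, j_action)])


def guaranteed_payoff_of_i(i_action):
    """Worst-case payoff for i when it commits to `i_action`."""
    return min(u_i[(i_action, j_action)] for j_action in ACTIONS)


for j_action in ACTIONS:
    print("if j plays", j_action, "then i should play", best_response_of_i(j_action))
for i_action in ACTIONS:
    print("playing", i_action, "guarantees at least", guaranteed_payoff_of_i(i_action))
```

Whatever j does, i's best response is D, and D has the better worst case (2 against 1), even though mutual cooperation would give each agent 3.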
20
Let's take a minute...
How can we apply the Prisoner's Dilemma to real situations? E.g. arms races: a nuclear weapons compliance treaty between two countries. Can you think of other situations?
21
Strategies
"A strategy is the way an agent behaves in an interaction." (Ref: Rosenschein and Zlotkin, 1994)
From game theory: strategies are actions of agents (Ac).
When two agents meet, the important question is: what should I do?
22
Dominance
Given any particular strategy s (e.g. C or D), there will be a number of possible outcomes. We say that s1 dominates s2 if every outcome possible by i playing s1 is preferred over every outcome possible by i playing s2.
Refer to slide 14 (payoff matrices): there, cooperation is the dominant strategy.
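The definition of dominance above can be written as a short test. This is a sketch that reads "preferred" as strictly higher utility, which is an assumption; the function name `dominates` and the dictionary layout are also chosen only for this example.

```python
def dominates(u, s1, s2, other_actions=("C", "D")):
    """True if every outcome reachable by playing s1 has strictly higher
    utility than every outcome reachable by playing s2.
    `u` is keyed by (own action, other agent's action)."""
    worst_with_s1 = min(u[(s1, other)] for other in other_actions)
    best_with_s2 = max(u[(s2, other)] for other in other_actions)
    return worst_with_s1 > best_with_s2


# Agent i's utilities from the "Agent's Preference" slide.
u_i_preference_slide = {("D", "D"): 1, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 4}
# Agent i's utilities in the Prisoner's Dilemma.
u_i_prisoners_dilemma = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}

print(dominates(u_i_preference_slide, "C", "D"))   # True
print(dominates(u_i_prisoners_dilemma, "D", "C"))  # False
```

Under this strict, set-based reading, D does not dominate C in the Prisoner's Dilemma, because cooperating can yield 3 while defecting can yield 2. Defection is still the better action against each fixed action of the other agent, which is the comparison used in the best-response argument.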
23
Nash Equilibrium
Two strategies s1 and s2 are in Nash Equilibrium if:
under the assumption that agent i plays s1, agent j can do no better than play s2; and
under the assumption that agent j plays s2, agent i can do no better than play s1.
Neither agent has any incentive to deviate from a Nash Equilibrium.
Unfortunately:
Not every interaction scenario has a Nash Equilibrium in pure strategies.
Some interaction scenarios have more than one Nash Equilibrium.
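The two conditions can be checked by brute force for small games. The sketch below looks only at pure strategies; `pure_nash_equilibria` is a hypothetical helper, the Prisoner's Dilemma payoffs are taken from the lecture, and the matching-pennies game is a standard textbook example that does not appear in these slides.

```python
from itertools import product


def pure_nash_equilibria(u_i, u_j, actions_i, actions_j):
    """Return all (i's action, j's action) pairs from which neither agent
    can gain by changing only its own action."""
    equilibria = []
    for a_i, a_j in product(actions_i, actions_j):
        i_cannot_improve = all(u_i[(a_i, a_j)] >= u_i[(alt, a_j)] for alt in actions_i)
        j_cannot_improve = all(u_j[(a_i, a_j)] >= u_j[(a_i, alt)] for alt in actions_j)
        if i_cannot_improve and j_cannot_improve:
            equilibria.append((a_i, a_j))
    return equilibria


# Prisoner's Dilemma, keyed by (i's action, j's action).
pd_i = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}
pd_j = {("D", "D"): 2, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 3}
print(pure_nash_equilibria(pd_i, pd_j, ("C", "D"), ("C", "D")))

# Matching pennies (assumed example): i wins on a match, j wins otherwise.
coins = ("H", "T")
mp_i = {(a, b): (1 if a == b else -1) for a in coins for b in coins}
mp_j = {pair: -value for pair, value in mp_i.items()}
print(pure_nash_equilibria(mp_i, mp_j, coins, coins))
```

The first call finds mutual defection as the only equilibrium; the second finds none, which illustrates a scenario without a pure-strategy equilibrium.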
24
Nash Equilibrium - Example
The Battle of the Sexes
Conflict between a man and a woman, where the man wants to go to a Prize Fight and the woman wants to go to a Ballet. They are deeply in love, so each would make a sacrifice to be with the other.
There are 2 Nash Equilibria:
Strategy combination (Prize Fight, Prize Fight)
Strategy combination (Ballet, Ballet)
[Payoff matrix: Man and Woman each choose Prize Fight or Ballet; legible payoffs: 2, 1.]
Note: the choice of which Nash Equilibrium to play is easy if the man and the woman talk beforehand. Even if they do not talk about it, repetition would justify a Nash Equilibrium.
Ref: "Games and Information", E. Rasmusen, 2001.
25
Let’s play a little game…..
Guess half the average
Choose a number between 0 and 100. Your aim is to choose the number that is closest to half the average of the numbers chosen by all the students. What is your number?
A simple-minded student (student 1) might think that any number would be OK, and so chooses 50. Another, more sophisticated student (student 2) might figure: "if a lot of people think like student 1, then I'll choose 25." Student 3, even more sophisticated, might think that if several students think like student 2, then the choice should be 13 or 12. And so on. In fact, the winning choice turned out to be 13.
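The chain of reasoning above halves the guess at each level of sophistication. A small sketch, with hypothetical function names and an invented set of three choices, shows both the sequence of guesses and how a winner would be determined.

```python
def iterated_guesses(start=50.0, levels=4):
    """Guess at each level of reasoning: each level halves the previous one."""
    guesses = [start]
    for _ in range(levels - 1):
        guesses.append(guesses[-1] / 2)
    return guesses


def winner(choices):
    """Return (winning choice, target), where target is half the average."""
    target = sum(choices) / len(choices) / 2
    return min(choices, key=lambda c: abs(c - target)), target


print(iterated_guesses())      # [50.0, 25.0, 12.5, 6.25]
print(winner([50, 25, 13]))    # made-up class of three students
```

If every student kept reasoning this way indefinitely, the guesses would shrink towards 0.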
26
Competitive and zero-sum Interactions
Strictly competitive: one agent can only get a more preferred outcome at the expense of the other agent.
Zero-sum encounters: $u_i(\omega) + u_j(\omega) = 0$ for all $\omega \in \Omega$, e.g. a football game where only one team can win. Real-life zero-sum encounters are rare.
Non-zero-sum encounters: players' interests are not always in conflict, so there are opportunities for both players to gain, e.g. the Prisoner's Dilemma.
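The zero-sum condition is easy to test for a finite set of outcomes. In this sketch the football utilities (+1 for a win, -1 for a loss) are assumed values chosen for illustration, while the Prisoner's Dilemma payoffs are the ones from the lecture.

```python
def is_zero_sum(u_i, u_j):
    """True if the two utilities cancel out in every outcome."""
    return all(u_i[outcome] + u_j[outcome] == 0 for outcome in u_i)


# Football game where exactly one team wins (assumed utilities).
football_i = {"i wins": 1, "j wins": -1}
football_j = {"i wins": -1, "j wins": 1}

# Prisoner's Dilemma, keyed by (i's action, j's action).
pd_i = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}
pd_j = {("D", "D"): 2, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 3}

print(is_zero_sum(football_i, football_j))  # True
print(is_zero_sum(pd_i, pd_j))              # False
```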
27
Assumptions in Game Theory
All players behave rationally. (Not always the case with all agents!)
Each player knows the rules.
Payoffs are known and fixed.
These are limitations!
28
Multi-agent Interaction: Summary
MAS: a number of agents which interact with one another through communication.
An agent's action results in an outcome in the environment.
Utility functions are used for preference orderings.
Game theory: a mathematical theory that studies interactions among agents.
An agent's action is a strategy: dominant strategies, Nash Equilibrium.
29
Negotiation
"The process of several agents searching for an agreement", e.g. about price. Reaching consensus. ("Rules of Encounter" by Rosenschein and Zlotkin, 1994)
30
Auction: Example 1 Several millions of $ paid for art at auction houses such as Sotheby’s. Ears 2 u, Vincent!
31
Auction: Example 2 Online Auctions
You want to buy some exciting video games. You see that there are some available on eBay. You register at eBay and offer a bid for some of these games.
32
Auctions
An auction takes place between an auctioneer and a collection of bidders. The goal is for the auctioneer to allocate the goods to one of the bidders. In most settings, the auctioneer desires to maximise the price; bidders desire to minimise the price.
[Figure: auctioneer, bidders, price.]
33
Auction Parameters
Value of goods: private, public/common, or correlated
Winner determination: first price or second price
Bids may be: open cry or sealed
Bidding may be: one shot, ascending, or descending
34
English Auctions
English auctions are: first price, open cry, ascending.
Dominant strategy: successively bid a small amount more than the current highest bid until the price reaches your valuation, then withdraw.
Susceptible to the winner's curse: the winner is the one who overvalues the goods on offer and may end up paying more than they are worth.
Susceptible to shills: bogus bidders placed by the auctioneer to artificially raise the current bidding price.
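The dominant strategy described above can be simulated with a simple loop. This is a minimal sketch under stated assumptions: every bidder follows that strategy, valuations are private whole numbers, bids rise by a fixed `increment`, and ties are broken by dictionary order. The function name and the example valuations are invented.

```python
def english_auction(valuations, increment=1, start_price=0):
    """Simulate ascending open-cry bidding; return (winner, price paid)."""
    price = start_price
    leader = None
    while True:
        # Bidders (other than the current leader) still willing to raise.
        challengers = [
            bidder
            for bidder, value in valuations.items()
            if bidder != leader and value >= price + increment
        ]
        if not challengers:
            return leader, price
        leader = challengers[0]  # deterministic tie-break for the sketch
        price += increment


print(english_auction({"a": 10, "b": 7, "c": 4}))  # ('a', 7)
```

The highest-valuation bidder wins, and the price stops near the second-highest valuation rather than at the winner's own valuation.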
35
Dutch Auctions
Dutch auctions are: open cry, descending.
The auctioneer starts at an artificially high price, then continually lowers the offer price until an agent makes a bid equal to the current offer price.
Dominant strategy: none.
Susceptible to the winner's curse.
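The descending mechanism can be sketched in the same style. Since there is no dominant strategy, the price at which each bidder decides to accept is simply an input here; the name `thresholds`, the fixed `decrement` and the example numbers are all assumptions made for illustration.

```python
def dutch_auction(thresholds, start_price, decrement=1):
    """Lower the offer price until some bidder accepts it.
    `thresholds` maps each bidder to the highest price it will accept.
    Returns (winner, price paid), or (None, 0) if nobody ever accepts."""
    price = start_price
    while price > 0:
        takers = [bidder for bidder, limit in thresholds.items() if limit >= price]
        if takers:
            return takers[0], price  # first taker wins in this sketch
        price -= decrement
    return None, 0


print(dutch_auction({"a": 8, "b": 6}, start_price=12))  # ('a', 8)
```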
36
First-price Sealed-bid Auctions
One-shot auction: a single round in which bidders submit a sealed bid for the good.
The good is awarded to the agent that made the highest bid.
The winner pays the price of the highest bid.
Best strategy: bid less than the true value.
37
Vickrey Auctions
Vickrey auctions are: second price, sealed bid.
The good is awarded to the agent that made the highest bid.
The winner pays the price of the second-highest bid.
Best strategy: bid the true value.
Susceptible to anti-social behaviour.
See example in book, page 135.
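The only difference between first-price sealed-bid and Vickrey auctions is the price the winner pays. A minimal sketch of the winner determination rule, with a hypothetical function name and made-up bids:

```python
def sealed_bid_auction(bids, second_price=False):
    """Return (winner, price paid) for a one-shot sealed-bid auction.
    With second_price=True the winner pays the second-highest bid."""
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, highest = ranked[0]
    if second_price and len(ranked) > 1:
        return winner, ranked[1][1]
    return winner, highest


bids = {"a": 10, "b": 7, "c": 4}
print(sealed_bid_auction(bids))                     # ('a', 10)
print(sealed_bid_auction(bids, second_price=True))  # ('a', 7)
```

In the second-price case the winner's own bid does not set the price, which is why bidding the true value is the best strategy there.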
38
Lies and Collusions
Lies:
By the bidders (e.g. in Vickrey auctions)
By the auctioneer (e.g. using shills, or in Vickrey auctions)
Collusion of bidders:
A coalition of bidders agree beforehand to put forward artificially low bids for the good on offer. When the good is obtained, the bidders can then get the true value of the good and share the profits.
39
Limitations of Auctions
Auctions are only concerned with the allocation of goods.
They are not adequate for settling agreements that concern matters of mutual interest; these call for negotiation.
40
Let’s take a minute…… Can you think of any auctions that you have come across? How about offering your notebook to the highest bidder at the end of the year….. Discuss with your neighbour.
41
…..Selecting a Bid
42
Auctions: Summary
An auction takes place between an auctioneer and a collection of bidders. In most settings, the auctioneer desires to maximise the price; bidders desire to minimise the price.
Types of auctions: English auction; Dutch auction; first-price sealed bid; Vickrey (second-price sealed bid).
Useful for allocating goods, but too simple for many other settings.
43
Next Lecture: Negotiation
Will be based on: ”Reaching Agreements”, Chapter 7 in Wooldridge: ”Introduction to MultiAgent Systems” Coordination – Working together, Chapter 9