Distributed Rational Decision Making

Ofer Shadmi

Distributed Rational Decision Making
Automated negotiation is used by self-interested agents in multi-agent encounters.
It saves the labor time of human negotiators.
Automated agents can be more effective at finding beneficial short-term agreements.
December 7, 2018

Negotiation protocols evaluation criteria
Social welfare: the sum of all agents’ payoffs or utilities in a given solution.
Pareto efficiency: a solution x is Pareto efficient (Pareto optimal) if there is no other solution x’ such that at least one agent is better off in x’ than in x and no agent is worse off in x’ than in x.
Individual rationality: an agent’s payoff from participating in the negotiation is no less than its payoff from not participating.

Negotiation protocols evaluation criteria (cont.)
Stability: the mechanism should be designed to be non-manipulable, i.e. it should motivate each agent to behave in the desired manner. Relevant questions are the existence and uniqueness of a Nash equilibrium.
Computational efficiency: mechanisms should be designed so that agents using them need as little computation as possible.
Distribution and communication efficiency: a distributed protocol avoids a performance bottleneck, but this has to be traded off against keeping communication (time, money, ...) to a minimum.

Example: Prisoner’s dilemma
Sometimes efficiency goals and stability goals conflict:
The strategy profile in which both agents cooperate is the unique welfare-maximizing profile, and it is Pareto efficient.
The only dominant-strategy equilibrium, which is also the only Nash equilibrium, is the profile in which both agents defect.
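The conflict between efficiency and stability can be checked mechanically. The sketch below uses an assumed payoff matrix (the numbers are illustrative and do not come from the slides) and tests each strategy profile for the Nash property, Pareto efficiency and social welfare.

```python
from itertools import product

# Hypothetical payoffs (agent 0, agent 1); higher is better.
# "C" = cooperate, "D" = defect. The numbers are assumed for illustration.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
ACTIONS = ("C", "D")


def is_nash(profile):
    """True if no single agent gains by unilaterally changing its action."""
    for agent in (0, 1):
        for alt in ACTIONS:
            deviated = list(profile)
            deviated[agent] = alt
            if PAYOFFS[tuple(deviated)][agent] > PAYOFFS[profile][agent]:
                return False
    return True


def is_pareto_efficient(profile):
    """True if no other profile helps someone without hurting anyone."""
    u = PAYOFFS[profile]
    for w in PAYOFFS.values():
        if all(a >= b for a, b in zip(w, u)) and any(a > b for a, b in zip(w, u)):
            return False
    return True


def social_welfare(profile):
    """Sum of both agents' payoffs."""
    return sum(PAYOFFS[profile])


profiles = list(product(ACTIONS, repeat=2))
nash_profiles = [p for p in profiles if is_nash(p)]
welfare_maximizer = max(profiles, key=social_welfare)
```

With these assumed payoffs, mutual defection is the only Nash profile even though mutual cooperation gives both agents more.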

Voting protocols
A voting mechanism chooses the outcome based on the inputs given by all agents.
An agreement can be reached by any of the following:
Plurality protocol (the alternative with the highest number of votes wins)
Binary protocol (a series of pairwise votes between two alternatives each)
Borda protocol (the sum of the points that the agents’ rankings assign to each alternative)
All three protocols are problematic.

Problematic example of the binary protocol
The agents fall into three groups (35%, 33% and 32% of the agents), each with its own preference ordering over the alternatives a, b, c and d.
Notice that d wins even though every agent prefers c over d.
[Figure: the three preference orderings and the agenda trees of pairwise votes]
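The agenda dependence of the binary protocol can be sketched in a few lines. The slide’s preference orderings did not survive in this transcript, so the profile below is an assumption, chosen only so that the three group sizes match and every agent prefers c over d. The agenda is modeled as a simple sequence in which the survivor of each pairwise vote meets the next alternative; this is one possible agenda shape, not necessarily the one drawn on the slide.

```python
# Assumed profile: (share of agents in %, ranking from most to least preferred).
PROFILE = [
    (35, ["c", "d", "b", "a"]),
    (33, ["a", "c", "d", "b"]),
    (32, ["b", "a", "c", "d"]),
]


def pairwise_winner(x, y, profile):
    """Return whichever of x and y is ranked higher by the larger share."""
    x_share = sum(w for w, r in profile if r.index(x) < r.index(y))
    y_share = sum(w for w, r in profile if r.index(y) < r.index(x))
    return x if x_share > y_share else y


def binary_protocol(agenda, profile):
    """Run pairwise votes in agenda order; the survivor meets the next option."""
    survivor = agenda[0]
    for challenger in agenda[1:]:
        survivor = pairwise_winner(survivor, challenger, profile)
    return survivor


everyone_prefers_c_to_d = all(r.index("c") < r.index("d") for _, r in PROFILE)
winner_when_d_is_last = binary_protocol(["c", "a", "b", "d"], PROFILE)
```

Under this assumed profile, reordering the agenda is enough to make any of the four alternatives win, which is why the order of the pairwise votes matters so much.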

Problematic example of the Borda protocol
[Table: preference orderings of agents 1 to 7 over the alternatives a, b, c and d]
Borda count: c wins with 20, b has 19, a has 18, d loses with 13.
As can be seen, c is the leading alternative.
Notice that by removing the least relevant alternative (d), c becomes the worst alternative.
Borda count with d removed: a wins with 15, b has 14, c loses with 13.
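The reversal can be reproduced with a short Borda count. The preference table is missing from this transcript, so the seven rankings below are an assumption: they were picked because they yield exactly the counts quoted on the slide (20, 19, 18, 13 and then 15, 14, 13), not because they are known to be the original table.

```python
def borda(rankings):
    """Borda count: with m alternatives, first place earns m points, last earns 1."""
    scores = {}
    for ranking in rankings:
        m = len(ranking)
        for place, alt in enumerate(ranking):
            scores[alt] = scores.get(alt, 0) + (m - place)
    return scores


def drop(rankings, removed):
    """Remove one alternative from every ranking, keeping the relative order."""
    return [[alt for alt in r if alt != removed] for r in rankings]


# Assumed rankings for agents 1..7 (most preferred first).
RANKINGS = [
    ["a", "b", "c", "d"],
    ["b", "c", "d", "a"],
    ["c", "d", "a", "b"],
    ["a", "b", "c", "d"],
    ["b", "c", "d", "a"],
    ["c", "d", "a", "b"],
    ["a", "b", "c", "d"],
]

full_count = borda(RANKINGS)
count_without_d = borda(drop(RANKINGS, "d"))
```

Removing d changes no agent’s relative ranking of a, b and c, yet it turns the winner c into the loser.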

Insincere (Strategic) Voters
If an agent can benefit from insincerely declaring its preferences, it will do so.
Knowledge of the agents’ true preferences is seldom available.

Auctions
Unlike voting, auctions usually end with a deal between two agents (the auctioneer and one bidder).
Auctions appear in many practical computer science applications.
There are several successful auction-based commercial websites.

Auction settings
There are three types of settings, which differ in how the value of an item is determined:
Private value
Common value
Correlated value

Auction protocols
Common protocols:
English auction
First-price sealed-bid auction
Dutch auction
Vickrey auction

English auction: first-price open-cry auction
Dominant strategy in the private value setting: always bid a small amount more than the current highest bid, and stop when the private value is reached.
Open-exit variant: a bidder openly declares that it is exiting and cannot re-enter.

First-price, sealed-bid auction
Each bidder submits one bid without knowing the others’ bids.
Best strategy: bid less than the true valuation.
If each agent i has a private value $v_i$ for the good, and the values are uniformly distributed on $[0, \bar{v}]$, there is a Nash equilibrium in which each agent bids $b_i = \frac{|A|-1}{|A|} v_i$.
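As a small worked example of the equilibrium bid $(|A|-1)/|A| \cdot v_i$, the sketch below computes it exactly with fractions. The function name and the sample values are illustrative assumptions.

```python
from fractions import Fraction


def equilibrium_bid(value, num_bidders):
    """Equilibrium bid (|A|-1)/|A| * v_i for uniformly distributed private values."""
    if num_bidders < 1:
        raise ValueError("need at least one bidder")
    return Fraction(num_bidders - 1, num_bidders) * value


# The more bidders there are, the less each bidder shades its bid.
bids_for_value_100 = [equilibrium_bid(100, n) for n in (2, 4, 10)]
```

For a value of 100 the bid rises from 50 with two bidders to 90 with ten, so competition pushes bids toward the true valuation.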

Dutch (descending) auction
The seller continuously lowers the price until one of the bidders takes the item at the current price.
Strategically equivalent to the first-price sealed-bid auction (why?).

Vickrey auction: second-price sealed-bid auction
The highest bidder wins and pays the price of the second-highest bid.
Theorem: A bidder’s dominant strategy in a private value Vickrey auction is to bid its true valuation (why?).
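The theorem can be probed numerically: fix the other bids, try many alternative bids for one bidder, and check that none beats the truthful bid. This is a spot check under assumed numbers, not a proof; ties are broken in favor of the lower index, which is also an assumption.

```python
def vickrey_outcome(bids):
    """Return (winner index, price): highest bid wins, pays the second-highest bid."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner, runner_up = order[0], order[1]
    return winner, bids[runner_up]


def utility(my_value, my_bid, other_bids):
    """Utility of bidder 0, who bids my_bid against the fixed other_bids."""
    winner, price = vickrey_outcome([my_bid] + list(other_bids))
    return my_value - price if winner == 0 else 0


def truthful_is_best(my_value, other_bids, candidate_bids):
    """True if no candidate bid yields more utility than bidding my_value."""
    truthful = utility(my_value, my_value, other_bids)
    return all(utility(my_value, b, other_bids) <= truthful for b in candidate_bids)
```

Overbidding can only win the item at a price above the bidder’s value, and underbidding can only lose an item that would have been profitable, which is the intuition behind the theorem.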

Vickrey auction (cont.)
An agent is best off bidding truthfully, no matter what the other bidders are like.
The agents reveal their preferences truthfully, which allows globally efficient decisions to be made.
There is no need to waste effort on counterspeculating about other agents.

Efficiency of the resulting allocation
Each of the four auction protocols allocates the item Pareto efficiently, to the bidder who values it the most.
However, the protocols with a dominant strategy (English and Vickrey) are more efficient in the sense that no effort is wasted on counterspeculation.

Revenue equivalence and non-equivalence
Which is more profitable for the auctioneer: a second-price or a first-price auction?
Theorem: All four auction protocols produce the same expected revenue for the auctioneer in private value auctions in which the values are independently distributed and the bidders are risk-neutral.
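A Monte Carlo sketch can illustrate the theorem for one special case. It assumes private values drawn independently and uniformly from [0, 1], first-price bidders who shade their bids by the factor (|A|-1)/|A|, and truthful second-price bidders; the function name, seed and trial count are arbitrary choices.

```python
import random


def simulate_revenue(num_bidders, trials, seed=0):
    """Average auctioneer revenue under first-price and second-price rules."""
    if num_bidders < 2:
        raise ValueError("need at least two bidders")
    rng = random.Random(seed)
    shade = (num_bidders - 1) / num_bidders
    first_price = second_price = 0.0
    for _ in range(trials):
        values = sorted(rng.random() for _ in range(num_bidders))
        first_price += shade * values[-1]  # winner pays its own shaded bid
        second_price += values[-2]         # winner pays the runner-up's value
    return first_price / trials, second_price / trials


first_avg, second_avg = simulate_revenue(num_bidders=4, trials=20000)
```

For uniform values both averages should approach (|A|-1)/(|A|+1), which is 0.6 for four bidders, although individual auctions differ in what the winner pays.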

Revenue from the auctions
Among risk-averse bidders, the Dutch and first-price sealed-bid auctions give higher revenue to the auctioneer.
A risk-averse auctioneer achieves higher expected utility via the Vickrey or the English auction protocol.
In practice, most auctions are not pure private value auctions. This causes bidders to increase their valuations, which gives the auctioneer greater expected revenue in the English and Vickrey protocols.

Bidder collusion
None of the four auction protocols is collusion-proof.
First-price sealed-bid and Dutch auctions are preferable for deterring collusion (why?).
Open question: how can a colluding coalition deal with bidders who are not part of it?

Lying auctioneer
In the Vickrey auction, the auctioneer may overstate the second-highest bid to the highest bidder.
Solution: cryptographic electronic signatures.
This cannot happen in the other three protocols.

Lying auctioneer (cont.)
In English auctions: shills.
In sealed-bid auctions: the auctioneer may place a bid himself (reservation price).
In Vickrey auctions: should the auctioneer bid more than its true reservation price?
Winner’s curse

Undesirable private information revelation
Reminder: the dominant strategy in private value Vickrey auctions is to bid truthfully.
Truthful bids may reveal sensitive information (a main reason why the Vickrey auction protocol is not widely used).
This clearly does not occur in first-price sealed-bid auctions.

Interrelated auctions
Bidding strategies may differ when interrelated items are auctioned compared with when each item is auctioned in isolation.
Lookahead is a key feature in auctions of multiple interrelated items.
Auctioneers often allow bidders to pool all of the interrelated items under one entirety bid.

Interrelated auctions (cont.)
Sometimes auctioneers allow bidders to backtrack from commitments by paying penalties.
A different kind of speculation: trying to guess which items will be auctioned in the future, and which agents are going to win those auctions.
Trade-off: (partial) lookahead vs. its cost.

Contract nets
General equilibrium market mechanisms use global prices and a single centralized mediator.
A single mediator might become a communication and computation bottleneck, or a potential point of failure.
In some cases, agents might want to control who receives their sensitive information.

Task allocation negotiation
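The capability to (re)allocate tasks among agents is a key feature of automated negotiation systems.
Definition: A task allocation problem is defined by a set of tasks T, a set of agents A, a cost function $c_i : 2^T \to \mathbb{R} \cup \{\infty\}$ for each agent i, and the initial allocation of tasks among agents $\langle T_1^{init}, \ldots, T_{|A|}^{init} \rangle$, where $\bigcup_{i \in A} T_i^{init} = T$ and $T_i^{init} \cap T_j^{init} = \emptyset$ for all $i \ne j$.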

Task allocation (cont.)
A contract is individually rational (IR) for an agent if that agent is better off with the contract than without it.
A contractee q accepts a contract if it gets paid more than its marginal cost of handling the tasks $T^{contract}$ of the contract, $c_q(T_q \cup T^{contract}) - c_q(T_q)$, where $T_q$ is its current task set.

Task allocation (cont.)
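Similarly, a contractor is willing to give out tasks if it pays the contractee less than it saves by not handling the tasks itself.
Each agent can take on both contractor and contractee roles.
In the contract net protocol, tasks that have been received earlier can be recontracted.
Notice: since every contract is IR, the task allocation can only improve (hill-climbing model).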
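The acceptance rules for contractor and contractee can be written as a small check. In this sketch a cost function maps a frozenset of tasks to a number; the function names and the linear example costs are illustrative assumptions, not part of the original slides.

```python
def marginal_cost(cost, current_tasks, contract_tasks):
    """Extra cost the contractee incurs by adding contract_tasks to its tasks."""
    current = frozenset(current_tasks)
    return cost(current | frozenset(contract_tasks)) - cost(current)


def saving(cost, current_tasks, contract_tasks):
    """Cost the contractor saves by no longer handling contract_tasks."""
    current = frozenset(current_tasks)
    return cost(current) - cost(current - frozenset(contract_tasks))


def is_individually_rational(payment, contractor, contractee, contract_tasks):
    """contractor and contractee are (cost function, current task set) pairs."""
    cost_r, tasks_r = contractor
    cost_q, tasks_q = contractee
    contractee_ok = payment > marginal_cost(cost_q, tasks_q, contract_tasks)
    contractor_ok = payment < saving(cost_r, tasks_r, contract_tasks)
    return contractee_ok and contractor_ok


# Hypothetical agents: r pays 4 per task it handles, q pays only 1 per task.
agent_r = (lambda tasks: 4 * len(tasks), {"t1", "t2"})
agent_q = (lambda tasks: 1 * len(tasks), {"t3"})
```

Moving t1 from r to q saves r a cost of 4 and costs q only 1, so any payment strictly between 1 and 4 makes the contract IR for both sides and lowers the total cost.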

O, C, S and M contracts
Contract types used while trying to converge to the globally optimal task allocation:
Original contract (O): one task is moved from one agent to another (the most common type).
Cluster contract (C): a set of tasks is atomically contracted from one agent to another.
Swap contract (S): a pair of agents swaps a pair of tasks.
Multiagent contract (M): more than two agents are involved in an atomic exchange of tasks.

O, C, S and M contracts (cont.)
Theorem: For each of the four contract types (O, C, S and M) there exist task allocations in which no IR contract of the other three types is possible, but an IR contract of the fourth type is.
Theorem: There are instances of the task allocation problem in which no IR sequence of O-, C-, S- and M-contracts leads from the initial task allocation to the optimal one.
Clearly, no subset of the four types suffices either.

OCSM contracts
Definition: An OCSM contract is defined by a pair $\langle T, \rho \rangle$ of $|A| \times |A|$ matrices. An element $T_{i,j}$ is the set of tasks that agent i gives to agent j, and $\rho_{i,j}$ is the amount that i pays to j.
OCSM contracts allow moving from a task allocation to any other task allocation with a single contract.

OCSM contracts (cont.)
IR sequences from any task allocation to the optimal one always exist if the contracting protocol incorporates OCSM contracts.
A stronger claim: any hill-climbing algorithm (i.e. any sequence of IR contracts) finds the globally optimal task allocation without backtracking.

Insincere agents in task allocation
So far we assumed that agents act on the basis of individual rationality.
In some situations, agents may accept non-IR contracts or reject IR contracts.
Agents sometimes lie about their true cost of some task(s), since they might receive more money for the contract.

Insincere agents in task allocation (cont.)
Agents could also lie about the tasks they have:
Hiding tasks
Phantom tasks
Decoy tasks

Coalition formation
Coordinating with other parties in some domain might save costs.
Example: the prisoner’s dilemma.
The Nash equilibrium is often too weak a solution concept, because subgroups of agents can deviate in a coordinated manner.

Solution I
Strong Nash equilibrium: there is no subgroup that can deviate by changing strategies jointly in a manner that increases the payoff of all of its members.
A strong Nash equilibrium guarantees more stability.
Problem: it is often too strong a solution concept; in many games no such equilibrium exists.

Solution II
Coalition-proof Nash equilibrium: there is no subgroup that can make a mutually beneficial deviation that is itself stable according to the same criterion.
Problem: the deviation may be stable within the deviating group, but not with respect to other agents.
Problem: even this kind of solution does not always exist.

Solution III
Instead of equilibrium analysis, we can study coalition formation in characteristic function games (CFGs).
The value of each coalition S is given by a characteristic function $v_S$.
Notice: in general, the value of a coalition may depend on non-members’ actions, due to positive and negative externalities.

CFGs: externalities
Negative externalities: shared resources, conflicting goals.
Positive externalities: overlapping goals.

Coalition formation in CFG
Coalition formation in CFGs includes three activities:
1. Coalition structure generation: partitioning the set of agents into exhaustive and disjoint coalitions. Such a partition is called a coalition structure (CS).

Example of CS
Consider a game with three agents {1,2,3}.
There are seven possible coalitions: {1}, {2}, {3}, {1,2}, {1,3}, {2,3} and {1,2,3}.
There are five possible coalition structures: {{1}, {2}, {3}}, {{1}, {2,3}}, {{1,2}, {3}}, {{1,3}, {2}} and {{1,2,3}}.
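The counts in this example can be reproduced by enumerating all partitions of the agent set. The recursive generator below is a minimal sketch; its name and the list-of-lists representation are illustrative choices.

```python
def coalition_structures(agents):
    """Yield every partition of agents into disjoint, non-empty coalitions."""
    agents = list(agents)
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for partial in coalition_structures(rest):
        # Put `first` into each existing coalition in turn ...
        for i, coalition in enumerate(partial):
            yield partial[:i] + [[first] + coalition] + partial[i + 1:]
        # ... or let it form a coalition of its own.
        yield [[first]] + partial


structures_for_three = list(coalition_structures([1, 2, 3]))
# Every coalition that appears in at least one structure.
coalitions_for_three = {frozenset(c) for cs in structures_for_three for c in cs}
```

The number of structures grows quickly: three agents give 5, four give 15 and five already give 52.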

Coalition formation in CFG (cont.)
2. Solving the optimization problem of each coalition: pooling the tasks and resources of the agents in the coalition, and solving this joint problem.
The coalition’s objective is to maximize its monetary value: the money received from outside the system for accomplishing tasks, minus the cost of using the resources.

Coalition formation in CFG (cont.)
3. Dividing the value of the generated solution among agents.
Notice: this value may be negative (why?)

Coalition structure generation
Definition: superadditive games are games in which $v_{S \cup T} \ge v_S + v_T$ for all disjoint coalitions $S, T \subseteq A$.
Coalition structure generation (CSG) in superadditive games is trivial, since the agents are best off forming the grand coalition, in which all agents operate together.

Superadditivity
Classically it is argued that almost all games are superadditive (why?).
This argument does not consider the cost of the coalition formation process itself (communication, computation, time, etc.).
In non-superadditive games, CSG becomes highly nontrivial.

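The goal
Maximize the social welfare of the agents by finding a coalition structure $CS^* = \arg\max_{CS} V(CS)$, where $V(CS) = \sum_{S \in CS} v_S$ and the maximum is taken over all partitions of A.
Problem: the number of coalition structures grows faster than exponentially in |A|.
(Some kind of) solution: search through a subset N of all partitions of A, and pick the best coalition structure seen so far, $CS^*_N = \arg\max_{CS \in N} V(CS)$.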

Bounds
We would like to guarantee that this coalition structure is within a worst-case bound from optimal, i.e. that $V(CS^*) / V(CS^*_N) \le k$ for some finite k.
Define $n_{min}$ to be the smallest size of N that allows us to establish such a bound k.

Coalition structure graph (four agents)
Level with four coalitions: {1},{2},{3},{4}
Level with three coalitions: {1},{2},{3,4}; {1,2},{3},{4}; {1},{2,4},{3}; {1,3},{2},{4}; {1},{2,3},{4}; {1,4},{2},{3}
Level with two coalitions: {1},{2,3,4}; {1,2},{3,4}; {2},{1,3,4}; {1,3},{2,4}; {3},{1,2,4}; {1,4},{2,3}; {4},{1,2,3}
Level with one coalition (lowest): {1,2,3,4}

Minimal search to establish a bound
Theorem: To establish a bound k, it suffices to search the lowest two levels of the coalition structure graph.
Proof: To establish a bound, $v_S$ of each coalition S has to be observed (in some CS). The lowest level is the grand coalition. The second-lowest level contains the CSs in which exactly one subset of agents has split away from the grand coalition, so every coalition appears in one of the two levels.

Minimal search to establish a bound (cont.)
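Theorem: With this search, the bound is k = |A|.
Proof: Notice that $CS^*$ can include at most a = |A| coalitions. Therefore, assuming non-negative coalition values, $V(CS^*) \le a \max_S v_S \le a \max_{CS \in N} V(CS) = a\,V(CS^*_N)$.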

Theorem: The number of nodes searched is $n = 2^{|A|-1}$.
Proof: The lowest level contains one CS. The number of coalitions on the second-lowest level is $2^a - 2$ (all subsets except the empty one and the grand coalition). There are two coalitions per CS on this level, so it contains $(2^a - 2)/2 = 2^{a-1} - 1$ CSs, and the two levels together contain $1 + (2^a - 2)/2 = 2^{a-1}$ CSs (nodes).

Theorem: For the algorithm that searches the two lowest levels of the graph, the bound k = a is tight.
Proof: Construct a worst case in which the bound is attained. Choose $v_S = 1$ for every coalition S of size 1, and $v_S = 0$ for all other coalitions. Now $CS^* = \{\{1\}, \{2\}, \ldots, \{a\}\}$ and $V(CS^*) = a$. Then $CS^*_N = \{\{1\}, \{2, \ldots, a\}\}$. Since $V(CS^*_N) = 1$, we get $V(CS^*) / V(CS^*_N) = a$.
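The node count and the tight worst case can be checked by enumerating the two lowest levels directly. This is a sketch under the stated construction (singletons worth 1, every other coalition worth 0); the function names and the tuple representation are illustrative assumptions.

```python
from itertools import combinations


def bottom_two_levels(agents):
    """Grand coalition plus every split of the agents into exactly two coalitions."""
    agents = list(agents)
    structures = [[tuple(agents)]]
    first, rest = agents[0], agents[1:]
    # Keeping `first` on one side avoids listing each two-way split twice.
    for size in range(len(rest)):
        for partners in combinations(rest, size):
            side = (first,) + partners
            other = tuple(a for a in agents if a not in side)
            structures.append([side, other])
    return structures


def value(structure, v):
    """V(CS): the sum of the coalition values in a coalition structure."""
    return sum(v(coalition) for coalition in structure)


def worst_case_v(coalition):
    """Tight example: singletons are worth 1, every other coalition 0."""
    return 1 if len(coalition) == 1 else 0


agents = [1, 2, 3, 4]
searched = bottom_two_levels(agents)
best_found = max(value(cs, worst_case_v) for cs in searched)
optimum = value([(a,) for a in agents], worst_case_v)  # all singletons
```

For four agents the search visits 8 nodes, finds a structure worth 1, and the optimum is worth 4, so the ratio equals the number of agents.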

Conclusions
A worst-case bound from optimum can be guaranteed without seeing all CSs.
Exponentially many coalition structures have to be searched before a bound can be established.
No algorithm for coalition structure generation can establish a bound without searching at least $2^{|A|-1}$ CSs.

Parallelizing CS search among insincere agents
Naturally, the search for a CS can be done more efficiently in parallel.
Self-interested agents prefer greater personal payoffs, so they may ignore a coalition structure that is globally better.
Problem: how can self-interested agents be motivated to follow the socially desirable search exactly?

Parallel search for self-interested agents
1. Deciding what part of the coalition structure graph to search (decided in advance, randomly, dictated by some authority, etc.).
2. Partitioning the search space among the agents: random allocation, payment for unequal amounts of search, etc.

Parallel search for self-interested agents (cont.)
3. Actual search: each agent searches its part and returns the coalition structure with the highest V(CS) in its part of the search space.
4. Enforcement: a second agent re-searches the same part to verify that the first agent performed its search well. If it did not, a penalty can be imposed.

Parallel search for self-interested agents (cont.)
5. Additional search: the previous steps can be repeated if there is time left.
6. Payoff division: many methods are possible (the Shapley value, for example). To make sure that agents do not take the payoff division into account when returning their CS, the possible penalty should be large enough.

Payoff division – Shapley value
Definitions:
Agent i is called a dummy if $v_{S \cup \{i\}} - v_S = v_{\{i\}}$ for every coalition S that does not include i.
Agents i and j are called interchangeable if $v_{(S \setminus \{i\}) \cup \{j\}} = v_S$ for every coalition S that includes i but not j.
$(x_1, \ldots, x_n)$ is a vector of payoffs to the agents.

Shapley value axioms
Symmetry: if i and j are interchangeable, then $x_i = x_j$.
Dummies: if i is a dummy, then $x_i = v_{\{i\}}$.
Additivity: for any two games v and w, $x_i$ in v+w equals $x_i$ in v plus $x_i$ in w, where v+w is the game defined by $(v+w)_S = v_S + w_S$.

Shapley value of an agent
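Theorem: The following is the only payoff division that satisfies the three axioms above: $x_i = \sum_{S \subseteq A,\, i \in S} \frac{(|A| - |S|)!\,(|S| - 1)!}{|A|!} \left(v_S - v_{S \setminus \{i\}}\right)$.
This payoff is called the Shapley value of agent i. It is the marginal contribution of agent i to the coalition structure, averaged over all possible joining orders.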
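The "averaged over all joining orders" reading translates directly into code. The sketch below enumerates every permutation, so it is only practical for a handful of agents; the function name and the two example games are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(agents, v):
    """Average each agent's marginal contribution over all joining orders.

    v maps a frozenset of agents to the coalition's value; v(frozenset()) is 0.
    """
    agents = list(agents)
    totals = {a: Fraction(0) for a in agents}
    orders = list(permutations(agents))
    for order in orders:
        coalition = frozenset()
        for agent in order:
            joined = coalition | {agent}
            totals[agent] += v(joined) - v(coalition)
            coalition = joined
    return {a: total / len(orders) for a, total in totals.items()}


def majority_game(coalition):
    """Hypothetical game: any coalition of two or more agents is worth 1."""
    return 1 if len(coalition) >= 2 else 0


def pair_game(coalition):
    """Hypothetical game: worth 6 once agents 1 and 2 are both present."""
    return 6 if {1, 2} <= coalition else 0


majority_shares = shapley_values([1, 2, 3], majority_game)
pair_shares = shapley_values([1, 2, 3], pair_game)
```

In the second game agent 3 never adds value, so it is a dummy and receives 0, while the interchangeable agents 1 and 2 split the value equally.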

The Shapley value
The joining order does matter!
It always exists and is unique.
It is Pareto efficient (the entire value of the CS gets distributed among the agents).
It guarantees that individual agents and the grand coalition are motivated to stay with the coalition structure.

The Shapley value (cont.)
It does not guarantee that every subgroup of agents is better off in the CS than by breaking off into a coalition of its own.
The marginal contribution of each agent has to be computed over all joining orders (there are |A|! of them).
A possible way to choose the joining order: use a randomizing permutation given by each agent.

Questions?
