Multi-Agent Coalition Formation for Long-Term Task or Mobile Network
Hsiu-Hui Lee and Chung-Hsien Chen
Proposal
Propose a new architecture that integrates case-based reasoning, negotiation, and reinforcement learning to improve the coalition formation process
Suited to executing long-term tasks or accomplishing a task in high-mobility networks
Ubiquitous and mobile networks
The ubiquitous networking system was designed by a group of researchers at AT&T Laboratories Cambridge
Several network-capable devices communicate with each other to achieve a common goal
E.g. locating a person in a building, or connecting to a personal computer via several devices
Notations
Problem
Case-based reasoning
Used to retrieve past coalition cases
Fuzzy match mechanism -> find a similar task in the past
If a similar case is found:
– Send lookup requests to the peer agents that belonged to the past solution set
– If resources are found: negotiate
If not enough resources are found among the peers:
– Broadcast requests to search for agents that have the resources
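The retrieval step above can be sketched as follows. This is a minimal illustration, not the authors' mechanism: the slides do not define the fuzzy match, so the `similarity` score, the `threshold` value, the `Case` structure, and the choice to broadcast when no similar case exists are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """A past coalition case: the task's resource demand and the agents that solved it."""
    demand: dict[str, float]
    solution_agents: list[str]


def similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Hypothetical fuzzy match score in [0, 1]: overlapping demand over total demand."""
    keys = set(a) | set(b)
    overlap = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    total = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return overlap / total if total > 0 else 1.0


def retrieve(task, case_base, threshold=0.7):
    """Return the most similar past case, or None if none reaches the threshold."""
    best = max(case_base, key=lambda c: similarity(task, c.demand), default=None)
    if best is not None and similarity(task, best.demand) >= threshold:
        return best
    return None


def plan_requests(task, case_base, threshold=0.7):
    """Decide whom to ask first: past solution peers, or everyone via broadcast."""
    case = retrieve(task, case_base, threshold)
    if case is None:
        # Assumption: with no similar case, fall back to broadcasting.
        return ("broadcast", [])
    return ("lookup", case.solution_agents)


case_base = [
    Case({"cpu": 4, "disk": 2}, ["a1", "a2"]),
    Case({"gpu": 1}, ["a3"]),
]
print(plan_requests({"cpu": 4, "disk": 1}, case_base))
print(plan_requests({"ram": 8}, case_base))
```

In this sketch a lookup result would still be followed by negotiation with the returned peers, and by a broadcast if those peers cannot cover the demand.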
Leaving Rate
The leaving rate of a peer agent indicates the probability that the agent disappears
Negotiation
Proceeds in continuous rounds
Each round: the agent makes a proposal and sends it to the peer agents
– Each peer agent checks whether the proposal can be accepted
Strategies:
– Linear strategy: drops to its limit steadily
– Tough strategy: drops to its limit immediately when the deadline approaches
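The two strategies can be compared with a small sketch. It assumes the negotiated quantity is a single number that moves from a starting value `start` to the agent's limit `limit` over `deadline` rounds (with `deadline` greater than zero); the `last_rounds` parameter, which fixes what "the deadline approaches" means, and the `accepts` callback standing in for the peer's check are hypothetical.

```python
def linear_offer(start, limit, t, deadline):
    """Linear strategy: move from start to limit in equal steps, reaching limit at the deadline."""
    frac = min(max(t / deadline, 0.0), 1.0)
    return start + (limit - start) * frac


def tough_offer(start, limit, t, deadline, last_rounds=1):
    """Tough strategy: hold at start, then jump to limit once the deadline is within last_rounds."""
    return limit if t >= deadline - last_rounds else start


def negotiate(strategy, start, limit, deadline, accepts):
    """Run rounds 0..deadline; return (round, offer) for the first accepted proposal, else None."""
    for t in range(deadline + 1):
        offer = strategy(start, limit, t, deadline)
        if accepts(offer):  # the peer checks whether the proposal can be accepted
            return t, offer
    return None


# A peer that accepts any offer of 70 or less (an assumed acceptance rule).
peer = lambda offer: offer <= 70
print(negotiate(linear_offer, 100, 60, 10, peer))
print(negotiate(tough_offer, 100, 60, 10, peer))
```

With these example numbers the linear agent reaches agreement one round earlier but the tough agent concedes all the way to its limit in a single step, which illustrates the trade-off between the two schedules.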
Negotiation…
Linear strategy -> agents with a low leaving rate
– Increases the probability of successful negotiation
Tough strategy -> agents with a high leaving rate
– More agents with a low leaving rate and fewer agents with a high leaving rate
Negotiation…
About rewards
– Closer to the idle value -> higher probability of agreement
A partially formed coalition does not have enough resources, although the system does
If an agent leaves an acting coalition -> the task fails to execute
Reinforcement learning
A machine learning mechanism
– An agent perceives the current state in order to take an action
Agents collect experience for better coalition formation
For a given goal, the computer learns how to achieve the goal by trial and error
They do not use this method
Temporal difference learning
Learning rate
– Higher, more experience
Remove reward made by uncertain agents
Similar task in the past
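For reference, the role of the learning rate can be seen in the textbook TD(0) value update. The slides do not give the exact update rule used, so this is the standard form rather than the authors' formula; the state labels, the `alpha` and `gamma` defaults, and the dictionary-based value table are assumptions for illustration.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Textbook TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    v = values.get(state, 0.0)  # unseen states start at 0
    target = reward + gamma * values.get(next_state, 0.0)
    values[state] = v + alpha * (target - v)
    return values[state]


# Hypothetical states: "s0" is a coalition state, "s1" its successor.
values = {}
first = td0_update(values, "s0", 1.0, "s1", alpha=0.5, gamma=0.9)
second = td0_update(values, "s0", 1.0, "s1", alpha=0.5, gamma=0.9)
print(first, second)
```

A larger `alpha` moves the stored estimate further toward each new observation, so the learning rate controls how strongly a single experience changes the estimate.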