DCOPs Meet the Real World: Exploring Unknown Reward Matrices with Applications to Mobile Sensor Networks. Manish Jain, Matthew E. Taylor, Makoto Yokoo, Milind Tambe.


1 DCOPs Meet the Real World: Exploring Unknown Reward Matrices with Applications to Mobile Sensor Networks Manish Jain, Matthew E. Taylor, Makoto Yokoo, Milind Tambe

2 Motivation Real-world applications of mobile sensor networks ◦ Robots in an urban setting ◦ Autonomous underwater vehicles

3 Challenges Rewards are unknown Limited time horizon Anytime performance is important

4 Existing Models Distributed Constraint Optimization for sensor networks ◦ [Lesser03, Zhang03, …] Mobile sensor nets for communication ◦ [Cheng2005, Marden07, …] Factor graphs ◦ [Farinelli08, …] Swarm intelligence, potential games Other robotic approaches …

5 Contributions Propose new algorithms for DCOPs Seamlessly interleave distributed exploration and distributed exploitation Tested on physical hardware

6 Outline Background on DCOPs Solution Techniques Experimental Results Conclusions and Future Work

7 DCOP Framework [Figure: agents a1, a2, a3 in a chain; each link, (a1, a2) and (a2, a3), carries a reward table with entries 10, 0, 0, 6 over the agents' possible values.]
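The framework on this slide can be sketched in a few lines of Python. The agent names and the 10/0/0/6 rewards follow the slide's example; reading the four entries as a 2x2 table over two binary values, and the helper names `total_reward` and `solve_exhaustively`, are illustrative assumptions:

```python
# Sketch of the chain DCOP from the slide: agents a1, a2, a3 with binary
# constraints on links (a1, a2) and (a2, a3). The four table entries are
# read as a 2x2 matrix over each agent's two possible values (an assumption).
from itertools import product

TABLE = {(0, 0): 10, (0, 1): 0, (1, 0): 0, (1, 1): 6}
CONSTRAINTS = [("a1", "a2", TABLE), ("a2", "a3", TABLE)]

def total_reward(assignment):
    """Sum the reward of every constraint under a full assignment."""
    return sum(t[(assignment[i], assignment[j])] for i, j, t in CONSTRAINTS)

def solve_exhaustively(agents=("a1", "a2", "a3"), domain=(0, 1)):
    """Brute-force the optimal assignment (fine at this toy scale)."""
    best = max(product(domain, repeat=len(agents)),
               key=lambda vals: total_reward(dict(zip(agents, vals))))
    return dict(zip(agents, best))

print(solve_exhaustively())   # {'a1': 0, 'a2': 0, 'a3': 0}, reward 20
```

In the real problem these tables are unknown to the agents, which is exactly why the exploration strategies later in the talk are needed.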

8 Applying DCOP (DCOP construct → domain equivalent) ◦ Agents → Robots ◦ Agent values → Set of possible locations ◦ Reward on the link → Signal strength between neighbors ◦ Objective: maximize net reward → Objective: maximize net signal strength

9 k-Optimality [Pearce07] 1-optimal solutions: no single agent can improve the reward by changing only its own value. [Figure: the chain DCOP with its reward tables; two 1-optimal solutions shown, with R = 12 and R = 6.]

10 MGM-Omniscient [Figure: chain of agents a1, a2, a3 with the shared reward table 10, 0, 0, 6.]

11 MGM-Omniscient [Figure: an agent computes a gain of 10 for its best unilateral move.]

12 MGM-Omniscient [Figure: gain messages 12 and 10 exchanged between neighbors.]

13 MGM-Omniscient Only one agent per neighborhood is allowed to change per round, which makes MGM a monotonic algorithm. [Figure: gains 12 and 10; agents a1, a2, a3 with their current values.]
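The MGM round illustrated on slides 10-13 can be sketched as follows. The reward table and chain are the slide's example; the synchronous loop, the id-based tie-break, and all function names are simplifying assumptions:

```python
# One MGM round on the chain DCOP: each agent computes the gain of its best
# unilateral move, shares it with neighbors, and only the agent with the
# largest gain in its neighborhood actually moves. Because no two
# neighbors ever move together, the global reward never decreases.
TABLE = {(0, 0): 10, (0, 1): 0, (1, 0): 0, (1, 1): 6}
NEIGHBORS = {"a1": ["a2"], "a2": ["a1", "a3"], "a3": ["a2"]}
DOMAIN = (0, 1)

def local_reward(agent, value, assignment):
    """Reward on all links incident to `agent` if it takes `value`."""
    return sum(TABLE[(value, assignment[n])] for n in NEIGHBORS[agent])

def best_move(agent, assignment):
    """Best unilateral value and its gain over the agent's current value."""
    current = local_reward(agent, assignment[agent], assignment)
    value = max(DOMAIN, key=lambda v: local_reward(agent, v, assignment))
    return value, local_reward(agent, value, assignment) - current

def mgm_round(assignment):
    """One synchronous MGM round; returns the updated assignment."""
    moves = {a: best_move(a, assignment) for a in NEIGHBORS}
    new = dict(assignment)
    for agent, (value, gain) in moves.items():
        # Move only if the gain is positive and beats every neighbor's
        # gain (ties broken by agent id -- an assumed convention).
        if gain > 0 and all((gain, agent) > (moves[n][1], n)
                            for n in NEIGHBORS[agent]):
            new[agent] = value
    return new

print(mgm_round({"a1": 0, "a2": 1, "a3": 0}))   # only a2 moves: gain 20
```

Started from the all-1 assignment, no agent has a positive gain, which is precisely the 1-optimal local optimum of slide 9.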

14 Solution Techniques Static Estimation ◦ SE-Optimistic ◦ SE-Realistic Balanced Exploration using Decision Theory ◦ BE-Backtrack ◦ BE-Rebid ◦ BE-Stay

15 Static Estimation Techniques SE-Optimistic ◦ Always assume that exploration is better ◦ Greedy Approach

16 Static Estimation Techniques SE-Optimistic ◦ Always assume that exploration is better ◦ Greedy Approach SE-Realistic ◦ More conservative: assume exploration gives the mean reward ◦ Faster convergence
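The two static estimates can be contrasted in a short sketch. The bound of 100 and mean of 50 are made-up illustrative numbers; in the domain they would come from the signal-strength model, which this sketch assumes is known:

```python
# Static estimation: unexplored links get a fixed estimated reward, so the
# agents can run MGM as if the reward matrix were fully known.
MAX_REWARD = 100   # SE-Optimistic: an unexplored link is assumed this good
MEAN_REWARD = 50   # SE-Realistic: an unexplored link is assumed average

def estimated_reward(link, explored, *, optimistic):
    """Reward fed to MGM: the measured value if explored, else an estimate."""
    if link in explored:
        return explored[link]   # already measured: use the real signal strength
    return MAX_REWARD if optimistic else MEAN_REWARD

measured = {("a1", "a2"): 37}
print(estimated_reward(("a1", "a2"), measured, optimistic=True))    # 37
print(estimated_reward(("a2", "a3"), measured, optimistic=True))    # 100
print(estimated_reward(("a2", "a3"), measured, optimistic=False))   # 50
```

Under SE-Optimistic every unexplored position looks better than any measured one, so agents greedily keep exploring; under SE-Realistic an above-average measured position beats the estimate, so agents settle sooner.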

17 Balanced Exploration Techniques

18 Balanced Exploration Techniques BE-Backtrack ◦ Decision-theoretic limit on exploration ◦ Tracks the previous best location R_b ◦ State of the agent: (R_b, T)

20 Balanced Exploration Techniques Utility of Exploration [equation figure]

21 Balanced Exploration Techniques Utility of Backtrack after Successful Exploration [equation figure]

22 Balanced Exploration Techniques Utility of Backtrack after Unsuccessful Exploration [equation figure]
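The utilities on slides 20-22 were equation figures. A simplified one-step-lookahead version of the idea, NOT the paper's exact equations, can be sketched with a hypothetical reward distribution:

```python
# Simplified decision-theoretic sketch in the spirit of BE-Backtrack.
# State: (R_b, T) -- the best reward seen so far and the rounds remaining.
# Rewards of unexplored positions are drawn from an assumed known
# distribution (hypothetical numbers below).
DIST = {0: 0.25, 40: 0.5, 100: 0.25}   # reward -> probability

def backtrack_utility(r_b, t):
    """Return to the best known position and collect R_b for t rounds."""
    return r_b * t

def explore_utility(r_b, t):
    """Spend one round exploring, then act greedily: keep the new spot if
    it beats R_b, otherwise backtrack for the remaining t - 1 rounds."""
    if t <= 0:
        return 0.0
    return sum(p * max(r, r_b) * (t - 1) for r, p in DIST.items())

def decide(r_b, t):
    """Explore only when its expected utility beats sitting on R_b."""
    if explore_utility(r_b, t) > backtrack_utility(r_b, t):
        return "explore"
    return "backtrack"

print(decide(40, 10))   # explore: plenty of time to recover from a bad draw
print(decide(40, 2))    # backtrack: the horizon is too short to gamble
```

This captures the qualitative behavior the slides describe: exploration is worthwhile early, while a short remaining horizon or a high R_b pushes the agent to backtrack.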

23 Balanced Exploration Techniques BE-Rebid ◦ Allows agents to backtrack ◦ Re-evaluates at every time-step ◦ Allows for on-the-fly reasoning ◦ Uses the same equations as BE-Backtrack

24 Balanced Exploration Techniques BE-Stay ◦ Agents unable to backtrack ◦ Dynamic programming approach
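The dynamic program behind BE-Stay can be sketched in simplified form (the paper's derivation differs in detail). With no backtracking allowed, an agent at a spot worth r_cur either stays there for the rest of the horizon or moves on, gambling one round on a fresh draw from an assumed known reward distribution:

```python
# BE-Stay-style dynamic program (simplified sketch, hypothetical numbers).
from functools import lru_cache

DIST = ((0, 0.25), (40, 0.5), (100, 0.25))   # (reward, probability) pairs

@lru_cache(maxsize=None)
def value(r_cur, t):
    """Max expected cumulative reward with t rounds left, no backtracking."""
    if t <= 0:
        return 0.0
    stay = r_cur * t                                   # commit to this spot
    move = sum(p * value(r, t - 1) for r, p in DIST)   # exploring costs a round
    return max(stay, move)

print(value(100, 5))   # 500: at a great spot, staying dominates
```

Because a move forfeits the current position, the stay/move threshold rises as the horizon shrinks, which is why BE-Stay is the most conservative of the three balanced-exploration variants.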

25 Results

26 Results [Graph: learning curve; 20 agents, chain topology, 100 rounds]

27 Results (simulation) [Graph: chain topology, 100 rounds]

28 Results (simulation) [Graph: 10 agents, random graphs with 15-20 links]

29 Results (simulation) [Graph: 20 agents, 100 rounds]

30 Results (physical robots)

31 Results (physical robots) [Graph: 4 robots, 20 rounds]

32 Conclusions Provided new algorithms for DCOPs that address real-world challenges Demonstrated improvement on physical hardware

33 Future Work Scaling up the evaluation ◦ different approaches ◦ different parameter settings Examine alternate metrics ◦ battery drain ◦ throughput ◦ cost of movement Verify the algorithms in other domains

34 Thank You manish.jain@usc.edu http://teamcore.usc.edu/manish


