1 Exact Solutions of Interactive POMDPs Using Behavioral Equivalence
Fifth International Conference on Autonomous Agents and Multi-agent Systems (AAMAS-06)
Speaker: Prashant Doshi, University of Georgia
Authors: B. Rathnasabapathy, Prashant Doshi, and Piotr Gmytrasiewicz
2 Overview
● I-POMDP
  – Framework for sequential decision making by an agent in a multi-agent setting
  – Takes the perspective of an individual in an interaction
● Problem
  – Cardinality of the interactive state space → infinite
    ● The other agent's models (incl. beliefs) are part of an agent's state space (interactive epistemology)
● An algorithm for solving I-POMDPs exactly
  – Aggregate behaviorally equivalent models of other agents
3 Background – Properties of POMDPs and I-POMDPs
● Finitely nested
  – Beliefs are nested up to a finite strategic level l
  – Level 0 models are POMDPs
● The value function of a POMDP and of a finitely nested I-POMDP is piecewise linear and convex (PWLC)
● Agents' behaviors in POMDPs and finitely nested I-POMDPs can be represented using policy trees
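The PWLC property means the value function over beliefs is the maximum over a finite set of linear functions (alpha vectors), one per policy-tree root. A minimal Python sketch under that reading; the vectors and belief below are illustrative, not taken from the slides:

import numpy as np

# PWLC value function: V(b) = max_alpha (alpha . b), with one alpha vector
# per policy-tree root. Vectors and belief below are illustrative only.
def pwlc_value(belief, alpha_vectors):
    values = [float(np.dot(alpha, belief)) for alpha in alpha_vectors]
    best = int(np.argmax(values))
    return values[best], best   # value of the belief, index of the best policy tree

alphas = [np.array([-1.0, -1.0]),
          np.array([10.0, -100.0]),
          np.array([-100.0, 10.0])]
print(pwlc_value(np.array([0.5, 0.5]), alphas))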
4 Interactive POMDPs – Definition
● Interactive state space IS_i = S × M_j, with M_j = Θ_j ∪ SM_j
  – S: set of physical states
  – Θ_j: set of intentional models of agent j
  – SM_j: set of subintentional models of agent j
● Intentional models contain the other agent's beliefs
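To make the structure of an interactive state concrete, here is a minimal Python sketch; the class and field names are illustrative, and the frame is only a stand-in for j's action set, observation set, and transition/observation/reward functions:

from dataclasses import dataclass
from typing import Tuple

# An intentional model of agent j: j's belief plus j's frame.
@dataclass(frozen=True)
class IntentionalModel:
    belief: Tuple[float, ...]   # j's belief (a flat tuple here, for illustration)
    frame: str                  # stand-in for j's frame

# An interactive state of agent i: a physical state paired with a model of j.
@dataclass(frozen=True)
class InteractiveState:
    physical_state: str
    model_of_j: IntentionalModel

example = InteractiveState("s1", IntentionalModel((0.5, 0.5), "frame-of-j"))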
5 Example: Single-Agent Tiger Problem (figure: the two-door tiger domain, with rewards +10 and -100)
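For reference, the standard single-agent tiger model in compact form; only the +10/-100 payoffs appear on the slide, while the 0.85 listening accuracy and the -1 listening cost are the conventional values and are assumptions here:

# Standard single-agent tiger problem (conventional formulation; the 0.85
# accuracy and -1 listening cost are assumed, not taken from the slide).
STATES = ["tiger-left", "tiger-right"]
ACTIONS = ["listen", "open-left", "open-right"]
OBSERVATIONS = ["growl-left", "growl-right"]

def reward(state, action):
    if action == "listen":
        return -1.0
    opened_tiger_door = (action == "open-left") == (state == "tiger-left")
    return -100.0 if opened_tiger_door else 10.0

def observation_prob(obs, state, action):
    if action != "listen":
        return 0.5                                  # opening a door gives no information
    heard_correctly = (obs == "growl-left") == (state == "tiger-left")
    return 0.85 if heard_correctly else 0.15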
6 Behaviorally Equivalent Models (figure: equivalence classes of beliefs, labelled by the policies P1, P2, P3 they induce)
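The aggregation idea can be sketched as grouping candidate models of j by the policy each model induces; in the sketch below, solve is a placeholder for a level-(l-1) solver and the policies are assumed hashable:

from collections import defaultdict

def behavioral_equivalence_classes(models_of_j, solve):
    # Two models are behaviorally equivalent iff they induce the same policy,
    # so grouping by the policy yields the equivalence classes.
    classes = defaultdict(list)
    for model in models_of_j:
        classes[solve(model)].append(model)
    return dict(classes)   # policy -> behaviorally equivalent models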
7 Equivalence Classes of Interactive States
● Definition
  – Combination of a physical state and an equivalence class of models
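Continuing the sketch, the equivalence classes of interactive states are simply the cross product of the physical states with the behavioral equivalence classes of j's models; names here are illustrative:

from itertools import product

def ecis(physical_states, model_classes_of_j):
    # Each element pairs a physical state with one equivalence class of j's
    # models, identified here by the policy that defines the class.
    return list(product(physical_states, model_classes_of_j))

example = ecis(["tiger-left", "tiger-right"], ["P1", "P2", "P3"])   # 6 classes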
8 Lossless Aggregation
● In a finitely nested I-POMDP, a probability distribution over the interactive states provides a sufficient statistic for the past history of i's observations
● Transformation of the interactive state space into behavioral equivalence classes is value-preserving
● The optimal policy of the transformed finitely nested I-POMDP remains unchanged
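One way to write the aggregation and the value-preservation claim, with notation adapted for this summary (a paraphrase of the slide, not the paper's exact lemma):

\[
  \hat{b}_i(s,\mathcal{C}) \;=\; \sum_{m_j \in \mathcal{C}} b_i(s, m_j),
  \qquad \mathcal{C} \subseteq M_j \ \text{a behavioral equivalence class},
\]
\[
  V^{*}(b_i) \;=\; \hat{V}^{*}(\hat{b}_i),
\]

so a policy that is optimal on the aggregated space is also optimal on the original one.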
9 Solving I-POMDPs Exactly
Procedure Solve-IPOMDP ( AGENT i, Belief Nesting L ) : Returns Policy
  If L = 0 Then
    Return { Policy := Solve-POMDP ( AGENT i ) }
  Else
    For all AGENT j ≠ AGENT i
      Policy_j := Solve-IPOMDP ( AGENT j, L-1 )
    End
    M_j := Behavioral-Equivalence-Models ( Policy_j )
    ECIS_i := S × { ×_j M_j }
    Policy := Modified-GIP ( ECIS_i, A_i, T_i, Ω_i, O_i, R_i )
    Return Policy
  End
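A minimal Python rendering of the procedure above; env bundles the domain and the solver callbacks (solve_pomdp, solve_modified_gip, behavioral_equivalence_models), all of which are placeholders rather than the paper's implementations:

from itertools import product

def solve_ipomdp(agent_i, level, env):
    # Level-0 models are ordinary POMDPs.
    if level == 0:
        return env.solve_pomdp(agent_i)

    # Solve every other agent one level down and keep one representative
    # model per behavioral equivalence class of that agent's models.
    model_classes = []
    for agent_j in env.other_agents(agent_i):
        policy_j = solve_ipomdp(agent_j, level - 1, env)
        model_classes.append(env.behavioral_equivalence_models(policy_j))

    # ECIS_i = S x (cross product of the other agents' model classes).
    ecis_i = list(product(env.physical_states, product(*model_classes)))

    # Solve the transformed, now finite, model (modified GIP on the slide).
    return env.solve_modified_gip(agent_i, ecis_i)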
10 Multi-Agent Persistent-Tiger Problem (figure: rewards +10 and -100; observations {Growl Left, Growl Right} × {Creak Right, Creak Left, Silence})
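Agent i's observation each step combines a growl (about the tiger) with a creak (informally, about j's actions); the six joint observations can be enumerated directly:

from itertools import product

GROWLS = ["growl-left", "growl-right"]
CREAKS = ["creak-right", "creak-left", "silence"]

# Joint observation set of agent i in the multi-agent persistent-tiger problem.
OBSERVATIONS_I = list(product(GROWLS, CREAKS))   # 2 x 3 = 6 observations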
11 Beliefs on ECIS (figure; panel label: Agent j's policy)
12 Agent i's policy in the presence of another agent j
● The policy becomes more diverse as i's ability to observe j's actions improves
14 Conclusions
● A method that enables exact solution of finitely nested interactive POMDPs
● Aggregate agent models into behavioral equivalence classes
  – The discretization is lossless
● Interesting behaviors emerge in the multi-agent Tiger problem
15 Thank You. Questions? Please stop by my poster.