1 Exact Solutions of Interactive POMDPs Using Behavioral Equivalence
Fifth International Conference on Autonomous Agents and Multi-agent Systems (AAMAS-06)
Speaker: Prashant Doshi, University of Georgia
Authors: B. Rathnasabapathy, Prashant Doshi, and Piotr Gmytrasiewicz
2 Overview
● I-POMDP
  – Framework for sequential decision making by an agent in a multi-agent setting
  – Takes the perspective of an individual in an interaction
● Problem
  – Cardinality of the interactive state space → infinite
    ● The other agent's models (incl. beliefs) are part of an agent's state space (interactive epistemology)
● An algorithm for solving I-POMDPs exactly
  – Aggregate behaviorally equivalent models of other agents
3 Background – Properties of POMDPs and I-POMDPs
● Finitely nested
  – Beliefs are nested up to a finite strategic level l
  – Level 0 models are POMDPs
● The value function of a POMDP and of a finitely nested I-POMDP is piecewise linear and convex (PWLC)
● Agents' behaviors in POMDPs and finitely nested I-POMDPs can be represented using policy trees
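The PWLC property means the value function over beliefs is the maximum over a finite set of linear functions (alpha vectors), one per policy-tree root. A minimal Python sketch under that reading; the vectors and belief below are illustrative, not taken from the slides:

import numpy as np

# PWLC value function: V(b) = max_alpha (alpha . b), with one alpha vector
# per policy-tree root. Vectors and belief below are illustrative only.
def pwlc_value(belief, alpha_vectors):
    values = [float(np.dot(alpha, belief)) for alpha in alpha_vectors]
    best = int(np.argmax(values))
    return values[best], best   # value of the belief, index of the best policy tree

alphas = [np.array([-1.0, -1.0]),
          np.array([10.0, -100.0]),
          np.array([-100.0, 10.0])]
print(pwlc_value(np.array([0.5, 0.5]), alphas))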
4 Interactive POMDPs – Definition
● Interactive state space IS_i = S × M_j, with M_j = Θ_j ∪ SM_j
  – S: set of physical states
  – Θ_j: set of intentional models of agent j
  – SM_j: set of subintentional models of agent j
● Intentional models contain the other agent's beliefs
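To make the structure of an interactive state concrete, here is a minimal Python sketch; the class and field names are illustrative, and the frame is only a stand-in for j's action set, observation set, and transition/observation/reward functions:

from dataclasses import dataclass
from typing import Tuple

# An intentional model of agent j: j's belief plus j's frame.
@dataclass(frozen=True)
class IntentionalModel:
    belief: Tuple[float, ...]   # j's belief (a flat tuple here, for illustration)
    frame: str                  # stand-in for j's frame

# An interactive state of agent i: a physical state paired with a model of j.
@dataclass(frozen=True)
class InteractiveState:
    physical_state: str
    model_of_j: IntentionalModel

example = InteractiveState("s1", IntentionalModel((0.5, 0.5), "frame-of-j"))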
5 Example: Single-Agent Tiger Problem (figure: the two-door tiger domain, with rewards +10 and -100)
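For reference, the standard single-agent tiger model in compact form; only the +10/-100 payoffs appear on the slide, while the 0.85 listening accuracy and the -1 listening cost are the conventional values and are assumptions here:

# Standard single-agent tiger problem (conventional formulation; the 0.85
# accuracy and -1 listening cost are assumed, not taken from the slide).
STATES = ["tiger-left", "tiger-right"]
ACTIONS = ["listen", "open-left", "open-right"]
OBSERVATIONS = ["growl-left", "growl-right"]

def reward(state, action):
    if action == "listen":
        return -1.0
    opened_tiger_door = (action == "open-left") == (state == "tiger-left")
    return -100.0 if opened_tiger_door else 10.0

def observation_prob(obs, state, action):
    if action != "listen":
        return 0.5                                  # opening a door gives no information
    heard_correctly = (obs == "growl-left") == (state == "tiger-left")
    return 0.85 if heard_correctly else 0.15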
6 Behaviorally Equivalent Models (figure: equivalence classes of beliefs, labelled by the policies P1, P2, P3 they induce)
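The aggregation idea can be sketched as grouping candidate models of j by the policy each model induces; in the sketch below, solve is a placeholder for a level-(l-1) solver and the policies are assumed hashable:

from collections import defaultdict

def behavioral_equivalence_classes(models_of_j, solve):
    # Two models are behaviorally equivalent iff they induce the same policy,
    # so grouping by the policy yields the equivalence classes.
    classes = defaultdict(list)
    for model in models_of_j:
        classes[solve(model)].append(model)
    return dict(classes)   # policy -> behaviorally equivalent models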
7 Equivalence Classes of Interactive States
● Definition
  – Combination of a physical state and an equivalence class of models
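Continuing the sketch, the equivalence classes of interactive states are simply the cross product of the physical states with the behavioral equivalence classes of j's models; names here are illustrative:

from itertools import product

def ecis(physical_states, model_classes_of_j):
    # Each element pairs a physical state with one equivalence class of j's
    # models, identified here by the policy that defines the class.
    return list(product(physical_states, model_classes_of_j))

example = ecis(["tiger-left", "tiger-right"], ["P1", "P2", "P3"])   # 6 classes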
8 Lossless Aggregation
● In a finitely nested I-POMDP, a probability distribution over the interactive states provides a sufficient statistic for the past history of i's observations
● Transformation of the interactive state space into behavioral equivalence classes is value-preserving
● The optimal policy of the transformed finitely nested I-POMDP remains unchanged
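One way to write the aggregation and the value-preservation claim, with notation adapted for this summary (a paraphrase of the slide, not the paper's exact lemma):

\[
  \hat{b}_i(s,\mathcal{C}) \;=\; \sum_{m_j \in \mathcal{C}} b_i(s, m_j),
  \qquad \mathcal{C} \subseteq M_j \ \text{a behavioral equivalence class},
\]
\[
  V^{*}(b_i) \;=\; \hat{V}^{*}(\hat{b}_i),
\]

so a policy that is optimal on the aggregated space is also optimal on the original one.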
9 Solving I-POMDPs Exactly
Procedure Solve-IPOMDP ( AGENT i, Belief Nesting L ) : Returns Policy
  If L = 0 Then
    Return { Policy := Solve-POMDP ( AGENT i ) }
  Else
    For all AGENT j ≠ AGENT i
      Policy_j := Solve-IPOMDP ( AGENT j, L-1 )
    End
    M_j := Behavioral-Equivalence-Models ( Policy_j )
    ECIS_i := S × { ×_j M_j }
    Policy := Modified-GIP ( ECIS_i, A_i, T_i, Ω_i, O_i, R_i )
    Return Policy
  End
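A minimal Python rendering of the procedure above; env bundles the domain and the solver callbacks (solve_pomdp, solve_modified_gip, behavioral_equivalence_models), all of which are placeholders rather than the paper's implementations:

from itertools import product

def solve_ipomdp(agent_i, level, env):
    # Level-0 models are ordinary POMDPs.
    if level == 0:
        return env.solve_pomdp(agent_i)

    # Solve every other agent one level down and keep one representative
    # model per behavioral equivalence class of that agent's models.
    model_classes = []
    for agent_j in env.other_agents(agent_i):
        policy_j = solve_ipomdp(agent_j, level - 1, env)
        model_classes.append(env.behavioral_equivalence_models(policy_j))

    # ECIS_i = S x (cross product of the other agents' model classes).
    ecis_i = list(product(env.physical_states, product(*model_classes)))

    # Solve the transformed, now finite, model (modified GIP on the slide).
    return env.solve_modified_gip(agent_i, ecis_i)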
10 Multi-Agent Persistent-Tiger Problem (figure: rewards +10 and -100; observations {Growl Left, Growl Right} × {Creak Right, Creak Left, Silence})
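Agent i's observation each step combines a growl (about the tiger) with a creak (informally, about j's actions); the six joint observations can be enumerated directly:

from itertools import product

GROWLS = ["growl-left", "growl-right"]
CREAKS = ["creak-right", "creak-left", "silence"]

# Joint observation set of agent i in the multi-agent persistent-tiger problem.
OBSERVATIONS_I = list(product(GROWLS, CREAKS))   # 2 x 3 = 6 observations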
11 Beliefs on ECIS (figure; panel label: Agent j's policy)
12 Agent i's policy in the presence of another agent j
● The policy becomes more diverse as i's ability to observe j's actions improves
14 Conclusions
● A method that enables exact solution of finitely nested interactive POMDPs
● Aggregate agent models into behavioral equivalence classes
  – The discretization is lossless
● Interesting behaviors emerge in the multi-agent Tiger problem
15 Thank You. Questions? Please stop by my poster.